multibyte-character support for rosjava

asked 2011-10-12 20:31:09 -0600

Kei Okada

updated 2011-10-17 23:38:38 -0600

Java supports multi-byte characters such as those used in Japanese, Korean, or Chinese. This means that the length() method of the java.lang.String class returns the number of characters (strictly, UTF-16 code units), not the number of bytes in the encoded string. See http://rosettacode.org/wiki/String_length#Java

java.lang.String tmp1 = "robot";
System.out.println(tmp1.length());           // -> 5
System.out.println(tmp1.getBytes().length);  // -> 5
java.lang.String tmp2 = "ロボット";
System.out.println(tmp2.length());           // -> 4
System.out.println(tmp2.getBytes().length);  // -> 12 (when the platform default charset is UTF-8)
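Since getBytes() without an argument uses the platform default charset, the byte count above can differ between machines. The following is only a small sketch of how the three notions of "length" can be compared with an explicit charset; the class and method names (StringLengthSketch, utf16Units, codePoints, encodedBytes) are made up for illustration and are not part of rosjava.

```java
import java.nio.charset.Charset;

// Illustrative helper, not rosjava code: compares three notions of string length.
class StringLengthSketch {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points (differs from length() for surrogate pairs).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes after encoding with an explicitly named charset.
    static int encodedBytes(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String robot = "\u30ED\u30DC\u30C3\u30C8"; // "ロボット", written with escapes
        System.out.println(utf16Units(robot));
        System.out.println(codePoints(robot));
        System.out.println(encodedBytes(robot, "UTF-8"));
        System.out.println(encodedBytes(robot, "UTF-16LE"));
    }
}
```

Naming the charset explicitly makes the byte count independent of the machine's locale settings, which matters when the count is written into a message header.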

Thus, using Japanese text in Talker.java in rosjava_tutorial_pubsub crashes with the following error:

Exception in thread "Thread-3" java.nio.BufferOverflowException
at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:165)
at java.nio.ByteBuffer.put(ByteBuffer.java:813)
at org.ros.message.Message$Serialization.writeString(Unknown Source)
at org.ros.message.std_msgs.String.serialize(Unknown Source)
at org.ros.message.Message.serialize(Unknown Source)

My solution is to change String.java as follows, but I am not confident in it since I am not a Java programmer.

public int serializationLength() {
   int __l = 0;
   //__l += 4 + data.length();
   __l += 4 + data.getBytes().length;
   return __l;
}
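To illustrate why sizing the buffer from data.length() overflows, here is a minimal standalone sketch of a length-prefixed string write. It is not the actual rosjava Message.Serialization code (which is not shown in this thread); the class Utf8StringSketch and its methods are hypothetical, and the 4-byte little-endian length prefix followed by UTF-8 bytes is an assumption based on the struct.pack('<I%ss') format quoted from talker.py.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.Charset;

// Hypothetical sketch, not rosjava code: length-prefixed UTF-8 string serialization.
class Utf8StringSketch {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // 4 bytes for the length prefix plus the encoded payload size.
    static int serializationLength(String data) {
        return 4 + data.getBytes(UTF8).length;
    }

    // Writes the byte count, then the bytes, into a buffer sized from the byte count.
    static ByteBuffer serialize(String data) {
        byte[] payload = data.getBytes(UTF8);
        ByteBuffer buffer = ByteBuffer.allocate(4 + payload.length).order(ByteOrder.LITTLE_ENDIAN);
        buffer.putInt(payload.length);
        buffer.put(payload);
        buffer.flip(); // prepare the buffer for reading
        return buffer;
    }

    // Reads the length prefix, then decodes that many bytes as UTF-8.
    static String deserialize(ByteBuffer buffer) {
        int length = buffer.order(ByteOrder.LITTLE_ENDIAN).getInt();
        byte[] payload = new byte[length];
        buffer.get(payload);
        return new String(payload, UTF8);
    }

    // Reproduces the failure mode: the buffer is sized from data.length() (UTF-16 units).
    static boolean overflowsWhenSizedByCharCount(String data) {
        byte[] payload = data.getBytes(UTF8);
        ByteBuffer buffer = ByteBuffer.allocate(4 + data.length()).order(ByteOrder.LITTLE_ENDIAN);
        buffer.putInt(payload.length);
        try {
            buffer.put(payload);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String robot = "\u30ED\u30DC\u30C3\u30C8"; // "ロボット"
        System.out.println(overflowsWhenSizedByCharCount("robot"));
        System.out.println(overflowsWhenSizedByCharCount(robot));
        System.out.println(deserialize(serialize(robot)).equals(robot));
    }
}
```

In this sketch the length prefix and the buffer size are both derived from the same encoded byte array, so they cannot disagree, and the charset is named explicitly rather than left to the platform default.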

Download the attached file multichar_pubsub.jpg and rename its extension from .jpg to .tgz.

1 Answer

answered 2011-10-16 08:13:57 -0600

kwc

Thanks for the report. I will try and get this patched soon.

What is your recommendation for rospy? For performance reasons, rospy does not do any transformation on strings inside of messages. Generally, the safe policy is to always encode your strings as UTF-8 before serialization.

Comments

Check talker.py in the attached file. Either length=len(_x); buff.write(struct.pack('<I%ss'%length,length,_x)) or length=len(_x.encode('utf-8','ignore')); buff.write(struct.pack('<I%ss'%length,length,_x.encode('utf-8','ignore'))) works. I'm not a Python programmer, so there may be a better solution.
Kei Okada ( 2011-10-17 23:28:18 -0600 )
No longer sufficient to discuss here, moved to ticket: https://code.ros.org/trac/ros/ticket/3713
kwc ( 2011-10-18 07:26:29 -0600 )
