[omniORB] CodeSets implementation

Duncan Grisby duncan at grisby.org
Mon Dec 22 17:00:12 GMT 2003


On Friday 19 December, ondrej at frcatel.fri.utc.sk wrote:

[...]
>   While the WResponse() function
> works just fine, the Respond() version fails with a
> DATA_CONVERSION,DATA_CONVERSION_BadInput exception. The reason is that
> the ORB tries to marshal the win1250-encoded string as if it were UCS-2.
> But it is not, and the lookup for the corresponding char fails; in
> particular, it fails at the first non-ASCII character, 0x9a. Since the
> Unicode character 0x009a (which is not really Unicode here!) is not part
> of win1250, the lookup is unsuccessful and the exception is raised. In my
> opinion, the problem lies in the marshalString function (and its variants
> and un* friends) at cs-8bit.cc:227
> 
> void
> omniCodeSet::TCS_C_8bit::marshalString(cdrStream& stream,
> 				       _CORBA_ULong len,
> 				       const omniCodeSet::UniChar* us)
> {
>   len++;
>   len >>= stream;
> 
>   _CORBA_Char          c;
>   omniCodeSet::UniChar uc;
> 
>   for (_CORBA_ULong i=0; i<len; i++) {
>     uc = us[i];
>     c = pd_fromU[(uc & 0xff00) >> 8][uc & 0x00ff];
>     if (uc && !c)  OMNIORB_THROW(DATA_CONVERSION,
> 				DATA_CONVERSION_BadInput,
> 				(CORBA::CompletionStatus)stream.completion());
>     stream.marshalOctet(c);
>   }
> }
> 
> As you can see, `uc` is always considered to be unicode (or UCS-2)
> character but when ORBs negotiate 8bit TCS_C, no conversion takes place.
> `us` is in my case 8bit-coded win1250 string (stored in wide integers). As
> I see it, the codeset handling framework should be redesigned to work
> properly, because this can't work in any way.

You must be doing something wrong. The us string really _is_ meant to
be a UCS-2 string. You're looking at the transmission code set. If the
marshalString function is called, it is expecting a native code set to
have already converted the input string into UCS-2 for conversion into
the transmission code set. If you've set the native code set to be
windows-1250, marshalString should not be called at all. The ORB
should be using fastMarshalString instead, avoiding the trip via
UCS-2. Even if it were going via UCS-2, the NCS object should have
correctly converted the cp-1250 into UCS-2, ready for the TCS to
convert it back.
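
To make the two-stage path concrete, here is a minimal standalone
sketch. It is not omniORB code: the function names and the tiny
three-entry cp-1250 table are made up for illustration, and the real
TCS uses two-level lookup tables (pd_fromU) rather than a switch. It
shows that a string converted to UCS-2 by the native code set step
goes through the transmission step cleanly, while a string whose
bytes are merely widened to 16 bits fails at 0x9a, which is the
symptom described above.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

using UniChar = std::uint16_t;  // one UCS-2 code unit

// Hypothetical, heavily truncated Windows-1250 table (three entries only).
static UniChar cp1250ToUcs2(unsigned char b) {
  if (b < 0x80) return b;  // ASCII maps to itself
  switch (b) {
    case 0x8a: return 0x0160;  // LATIN CAPITAL LETTER S WITH CARON
    case 0x9a: return 0x0161;  // LATIN SMALL LETTER S WITH CARON
    case 0x9e: return 0x017e;  // LATIN SMALL LETTER Z WITH CARON
    default: throw std::runtime_error("byte not in this demo table");
  }
}

// Reverse mapping; returns 0 when there is no cp-1250 equivalent.
static unsigned char ucs2ToCp1250(UniChar uc) {
  if (uc < 0x80) return static_cast<unsigned char>(uc);
  switch (uc) {
    case 0x0160: return 0x8a;
    case 0x0161: return 0x9a;
    case 0x017e: return 0x9e;
    default: return 0;
  }
}

// Native code set step: cp-1250 bytes -> UCS-2.
std::vector<UniChar> nativeToUcs2(const std::string& s) {
  std::vector<UniChar> out;
  for (char ch : s) out.push_back(cp1250ToUcs2(static_cast<unsigned char>(ch)));
  return out;
}

// Transmission code set step: UCS-2 -> cp-1250 bytes. Like the quoted
// marshalString, it rejects any non-zero input that maps to zero.
std::string ucs2ToTransmission(const std::vector<UniChar>& us) {
  std::string out;
  for (UniChar uc : us) {
    unsigned char c = ucs2ToCp1250(uc);
    if (uc && !c) throw std::runtime_error("DATA_CONVERSION: bad input");
    out.push_back(static_cast<char>(c));
  }
  return out;
}

// The mistake: storing cp-1250 bytes in wide integers without converting.
std::vector<UniChar> widenWithoutConversion(const std::string& s) {
  std::vector<UniChar> out;
  for (char ch : s) out.push_back(static_cast<unsigned char>(ch));
  return out;
}
```

Passing the result of nativeToUcs2() to ucs2ToTransmission() gives
back the original bytes. Passing the result of
widenWithoutConversion() throws as soon as it reaches 0x009a, since
that value has no cp-1250 equivalent.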

I can assure you that the code set conversion framework works fine.
What does your client code look like?

Cheers,

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --


