[DO-SIG] Re: [omniORB] omniORBpy, Python 2.0 and unicode

Duncan Grisby dgrisby@uk.research.att.com
Tue, 12 Dec 2000 10:42:38 +0000


On Monday 11 December, "Martin v. Loewis" wrote:

> It provides a wrapper by means of the CORBA.wstr function, although
> I'd expect applications to rely on the fact that the output of
> CORBA.wstr really is a Unicode object. Once there is implementation
> experience with the Python Unicode type in the IDL mapping, it might
> be reasonable to formally codify that wstring *is* the Unicode type.

I agree that the mapping should be updated to require the Unicode type
for wstring. As far as I know, none of the Python ORBs support wstring
yet, so it won't break anything.

> > Strings can, however, be in any supported code set, not just ISO
> > 8859-1. That includes UTF-8, so the whole of the Unicode space (and
> > more) can be supported.
> 
> How does the application express which encoding a string is in?

With an omniORB-specific function, omniORB.nativeCharCodeSet. The
function can only be called once, before the ORB is initialised. The
ORB has to know the native code set so it can put it in the profiles
of IORs it exports. The default is, of course, ISO 8859-1. The only
other thing I expect people to use is UTF-8, since that can cover most
things, and is important for Java interoperability. Once a code set
has been chosen, all strings are assumed to use that code set.
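
To illustrate why a single native code set has to be agreed up front,
here is a minimal sketch using only Python's standard codecs. It does
not call omniORB at all; the helper name fits() and the sample strings
are assumptions made for the example, and it is written for a current
Python 3 interpreter rather than Python 2.0.

```python
# Illustration only: standard-library codecs, no omniORB calls.
text = "L\u00f6wis"                      # "Loewis" spelled with o-umlaut

latin1 = text.encode("iso-8859-1")       # one byte per character
utf8 = text.encode("utf-8")              # the o-umlaut takes two bytes

# The same bytes read back differently under the wrong code set,
# so both sides must agree on which code set plain strings use.
misread = utf8.decode("iso-8859-1")

def fits(s, codeset):
    """Return True if every character of s can be encoded in codeset."""
    try:
        s.encode(codeset)
        return True
    except UnicodeEncodeError:
        return False

euro = "\u20ac"                          # EURO SIGN, absent from ISO 8859-1
latin1_ok = fits(euro, "iso-8859-1")     # not representable
utf8_ok = fits(euro, "utf-8")            # representable
```

The byte strings differ in length for the same text, and decoding
UTF-8 bytes as ISO 8859-1 yields garbage rather than an error, which
is the kind of silent mismatch a fixed native code set avoids.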

Cheers,

Duncan.

-- 
 -- Duncan Grisby  \  Research Engineer  --
  -- AT&T Laboratories Cambridge          --
   -- http://www.uk.research.att.com/~dpg1 --