[omniORB] omniORBpy, Python 2.0 and unicode

Duncan Grisby dgrisby@uk.research.att.com
Mon, 11 Dec 2000 12:44:21 +0000


I'm cross-posting this to the Python do-sig, since I think it is
relevant to the standard Python mapping, not just omniORBpy.

On Sunday 10 December, uche.ogbuji@fourthought.com wrote:

> Based on quick and dirty tests, if you pass a Python unicode object
> in a string-type argument using omniORBpy and omniORB (latest CVS),
> you get a CORBA BAD_PARAM exception.
>
> The Python/CORBA binding doesn't say anything about unicode.
> Understandable since it predates Python/Unicode, but it seems to
> make sense that Python unicode strings should be accepted as string
> data.  For one thing, this would parallel the Java binding.
> 
> Am I right that one can't use unicode objects for string parameters
> in omniORBpy?  Are there any plans to change this?  Using omniORB
> for XML processing (in 4Suite Server), one runs into many unicode
> objects and it would be quite a burden to precede every CORBA
> invocation with code for character encoding.

At present, parameters described as string in IDL must be Python
strings. omniORB 3 supports only GIOP 1.0, so the only string data
that can be transmitted is ISO 8859-1.

The next major releases of omniORB and omniORBpy (4.0 and 2.0
respectively) will fully support CORBA's code set negotiation, and the
wstring type. In the current implementation, wstring maps to Python
unicode, but string still maps only to Python string.
Strings can, however, be in any supported code set, not just ISO
8859-1. That includes UTF-8, so the whole of the Unicode space (and
more) can be supported.
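
To make the code set point concrete, here is a small sketch using only
Python's built-in codecs. It is written in Python 3 terms, where str
plays the role of the unicode type discussed here and bytes plays the
role of the plain string type; the sample text is made up, and nothing
in it touches omniORBpy itself.

```python
# Python 3 sketch: str stands in for Python 2 unicode, bytes for the
# Python 2 string type. The sample text is an arbitrary illustration.
text = "caf\u00e9 \u20ac5"  # e-acute is in ISO 8859-1, the euro sign is not

# UTF-8 can represent every Unicode code point, so this always succeeds.
utf8_bytes = text.encode("utf-8")

# ISO 8859-1 covers only code points 0-255, so the euro sign fails.
try:
    latin1_bytes = text.encode("iso-8859-1")
except UnicodeEncodeError:
    latin1_bytes = None

# Text restricted to the Latin-1 range encodes without trouble.
latin1_only = "caf\u00e9".encode("iso-8859-1")

# Decoding with the same code set recovers the original text.
round_trip = utf8_bytes.decode("utf-8")
```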

It would not be much effort to extend omniORBpy so it accepted unicode
objects when it was expecting strings, but I'm not sure it's a good
idea. Following the general Python mantra of "explicit is better than
implicit", I'd lean towards forcing the programmer to convert their
unicode objects to strings in their chosen encoding, rather than
having the ORB do it. I think it's analogous to rejecting floating
point values where integers are expected (although integers _are_
accepted where floating point values are expected).
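
As a rough sketch of the explicit approach and of the integer/float
analogy, the helpers below show the kind of checking involved. They are
hypothetical names invented for illustration and are not omniORBpy
code; they are written in Python 3 terms, with str standing in for
unicode and bytes for the plain string type.

```python
# Illustrative only: these helpers are hypothetical and are not part of
# omniORBpy. Python 3 str stands in for Python 2 unicode, bytes for the
# Python 2 string type.

def as_idl_string(value, encoding=None):
    """Return bytes for an IDL string parameter, never guessing an encoding."""
    if isinstance(value, bytes):
        return value  # already an encoded string; pass it through untouched
    if isinstance(value, str):
        if encoding is None:
            # Explicit is better than implicit: the caller must choose.
            raise TypeError("unicode text needs an explicit encoding")
        return value.encode(encoding)
    raise TypeError("expected a string, got %s" % type(value).__name__)


def as_idl_long(value):
    """Accept integers only; a float is rejected rather than truncated."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("expected an integer, got %s" % type(value).__name__)
    return value


def as_idl_double(value):
    """Accept floats, and also integers, which are widened to float."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("expected a number, got %s" % type(value).__name__)
    return float(value)
```

With helpers like these, the caller picks the encoding at the call
site, for example as_idl_string(name, "utf-8"), so the choice of code
set is always visible in the calling code.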

I'm not totally convinced, though. Does anyone else have an opinion on
the matter?

Cheers,

Duncan.

PS. The adventurous can try out omniORB 4 and omniORBpy 2 by checking
out the omni4_0_develop and omnipy2_develop branches from CVS. Be
warned that there are some known bugs which will bite you, but you
should be able to try out the code set negotiation stuff. There will
be some very significant changes to the code base before release.

-- 
 -- Duncan Grisby  \  Research Engineer  --
  -- AT&T Laboratories Cambridge          --
   -- http://www.uk.research.att.com/~dpg1 --