[omniORB] RE: Strange Error

Duncan Grisby duncan at grisby.org
Fri Nov 12 14:27:22 GMT 2004


On Wednesday 3 November, Wilson Jimmy - jiwils wrote:

> Here's the requested trace.  I should note that setting "lcdMode" to "1"
> causes this corbaloc to behave just as the working corbaloc specifying IIOP
> 1.1, but I would still like to understand what is going on.

I've picked through what's going on, and it's a bug in TAO (or the
CORBA specification process, depending on how you look at it).

I'll quote bits of your trace, with some explanation of what's going
on...

> omniORB: Creating ref to remote: key<NameService>
>  target id      : IDL:omg.org/CORBA/Object:1.0
>  most derived id:

The application has done string_to_object on
corbaname::1.2@ds20clu3.conway.acxiom.com:3434 so it builds a
reference with the key NameService, and base CORBA::Object repository
id.

> omniORB: Creating Python ref to unknown: key<NameService>
>  target id      : IDL:omg.org/CORBA/Object:1.0
>  most derived id:

Since you're using Python, that reference gets converted to one
suitable for Python.

At this point, the client code tries to narrow to a
NamingContextExt...

> omniORB: omniRemoteIdentity deleted.
> omniORB: ObjRef() -- deleted.
> omniORB: Client attempt to connect to
> giop:tcp:ds20clu3.conway.acxiom.com:3434

We connect to the machine.

> omniORB: Python thread state scavenger start.
> omniORB: AsyncInvoker: thread id = 2 has started. Total threads = 1
> omniORB: Scavenger task execute.
> omniORB: Client opened connection to giop:tcp:139.61.179.111:3434
> omniORB: sendChunk: to giop:tcp:139.61.179.111:3434 103 bytes
> omniORB:
> 4749 4f50 0102 0100 5b00 0000 0200 0000 GIOP....[.......
> 0300 0000 0000 6c02 0b00 0000 4e61 6d65 ......l.....Name
> 5365 7276 6963 6504 0600 0000 5f69 735f Service....._is_
> 6100 6403 0000 0000 2b00 0000 4944 4c3a a.d.....+...IDL:
> 6f6d 672e 6f72 672f 436f 734e 616d 696e omg.org/CosNamin
> 672f 4e61 6d69 6e67 436f 6e74 6578 7445 g/NamingContextE
> 7874 3a31 2e30 00                       xt:1.0.

We ask if the reference with key NameService is a NamingContextExt by
calling _is_a on it.
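
As a rough illustration, the first 12 octets of that dump are the GIOP
message header and can be decoded by hand. The sketch below uses only
Python's struct module; parse_giop_header and MESSAGE_TYPES are names
made up for this example, not part of omniORB.

```python
import struct

# GIOP message types, indexed by the value of the eighth header octet.
MESSAGE_TYPES = ["Request", "Reply", "CancelRequest", "LocateRequest",
                 "LocateReply", "CloseConnection", "MessageError", "Fragment"]

def parse_giop_header(data):
    """Decode the fixed 12-octet GIOP header (hypothetical helper)."""
    magic, major, minor, flags, msg_type = struct.unpack_from("<4sBBBB", data, 0)
    if magic != b"GIOP":
        raise ValueError("not a GIOP message")
    # In GIOP 1.1 and later, bit 0 of the flags octet means little-endian.
    order = "<" if flags & 0x01 else ">"
    (body_size,) = struct.unpack_from(order + "I", data, 8)
    return {
        "version": (major, minor),
        "type": MESSAGE_TYPES[msg_type],
        "body_size": body_size,
        "total_size": body_size + 12,
    }

# First 12 octets of the _is_a request in the trace.
header = parse_giop_header(bytes.fromhex("4749 4f50 0102 0100 5b00 0000"))
print(header)
```

The body size of 0x5b (91) plus the 12-octet header gives the 103
bytes that sendChunk reports.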

> omniORB: inputMessage: from giop:tcp:139.61.179.111:3434 256 bytes
> omniORB:
> 4749 4f50 0102 0101 f400 0000 0200 0000 GIOP............
> 0300 0000 0000 0000 2b00 0000 4944 4c3a ........+...IDL:
> 6f6d 672e 6f72 672f 436f 734e 616d 696e omg.org/CosNamin
> 672f 4e61 6d69 6e67 436f 6e74 6578 7445 g/NamingContextE
> 7874 3a31 2e30 0000 0100 0000 0000 0000 xt:1.0..........
> ac00 0000 0101 0200 0f00 0000 3133 392e ............139.
> 3631 2e31 3739 2e31 3131 006f 6a0d 2e61 61.179.111.oj..a
> 3300 0000 1401 0f00 4e55 5000 0000 1500 3.......NUP.....
> 0000 0001 0000 0000 4e61 6d65 5365 7276 ........NameServ
> 6963 6500 0118 0000 0100 0000 4e61 6d65 ice.........Name
> 5365 7276 6963 6500 0300 0000 0000 0000 Service.........
> 0800 0000 0100 0000 004f 4154 0100 0000 .........OAT....
> 1400 0000 0100 0000 0100 0100 0000 0000 ................
> 0901 0100 0000 0000 024f 4154 2000 0000 .........OAT ...
> 0100 0000 0100 0000 0f00 0000 3133 392e ............139.
> 3631 2e31 3739 2e31 3131 0000 6a0d ffff 61.179.111..j...

Rather than replying with yes or no, we get a location forward message
telling us to try a different object reference.
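
The status can be read straight from the dump. In GIOP 1.2 the Reply
header that follows the 12-octet message header starts with the
request id, the reply status and the service context count, each an
unsigned long. A minimal sketch, with REPLY_STATUS and
parse_reply_header as hypothetical names for this example:

```python
import struct

# GIOP 1.2 reply status values, indexed by their numeric value.
REPLY_STATUS = ["NO_EXCEPTION", "USER_EXCEPTION", "SYSTEM_EXCEPTION",
                "LOCATION_FORWARD", "LOCATION_FORWARD_PERM",
                "NEEDS_ADDRESSING_MODE"]

# First 32 octets of the 256-byte reply from the trace.
FIRST_REPLY = bytes.fromhex(
    "4749 4f50 0102 0101 f400 0000 0200 0000"
    "0300 0000 0000 0000 2b00 0000 4944 4c3a"
)

def parse_reply_header(msg):
    """Return (request_id, status_name, service_context_count)."""
    # Bit 0 of the flags octet (offset 6) selects little-endian.
    order = "<" if msg[6] & 0x01 else ">"
    request_id, status, n_contexts = struct.unpack_from(order + "III", msg, 12)
    return request_id, REPLY_STATUS[status], n_contexts

print(parse_reply_header(FIRST_REPLY))
```

With an empty service context list the header ends at octet 24, which
is already a multiple of 8, so the body (the forwarded IOR, starting
with the 0x2b length of its repository id) follows with no padding.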

> omniORB: Creating ref to remote:
> key<....NUP.............NameService.........NameService>
>  target id      : IDL:omg.org/CORBA/Object:1.0
>  most derived id: IDL:omg.org/CosNaming/NamingContextExt:1.0

Create a reference to the new object.

> omniORB: GIOP::LOCATION_FORWARD -- retry request.
> omniORB: omniRemoteIdentity deleted.
> omniORB: ObjRef(IDL:omg.org/CosNaming/NamingContextExt:1.0) -- deleted.
> omniORB:  send codeset service context: (ISO-8859-1,UTF-16)
> omniORB: Client attempt to connect to giop:tcp:139.61.179.111:3434
> omniORB: Client opened connection to giop:tcp:139.61.179.111:3434

Connect to the new reference.

> omniORB: sendChunk: to giop:tcp:139.61.179.111:3434 167 bytes
> omniORB:
> 4749 4f50 0102 0100 9b00 0000 0200 0000 GIOP............
> 0300 0000 0000 734e 3300 0000 1401 0f00 ......sN3.......
> 4e55 5000 0000 1500 0000 0001 0000 0000 NUP.............
> 4e61 6d65 5365 7276 6963 6500 0118 0000 NameService.....
> 0100 0000 4e61 6d65 5365 7276 6963 6504 ....NameService.
> 0600 0000 5f69 735f 6100 0000 0100 0000 ...._is_a.......
> 0100 0000 0c00 0000 0100 0000 0100 0100 ................
> 0901 0100 6f67 7261 2b00 0000 4944 4c3a ....ogra+...IDL:
> 6f6d 672e 6f72 672f 436f 734e 616d 696e omg.org/CosNamin
> 672f 4e61 6d69 6e67 436f 6e74 6578 7445 g/NamingContextE
> 7874 3a31 2e30 00                       xt:1.0.

Send the _is_a again.

> omniORB: inputMessage: from giop:tcp:139.61.179.111:3434 25 bytes
> omniORB:
> 4749 4f50 0102 0101 0d00 0000 0200 0000 GIOP............
> 0000 0000 0000 0000 01                  .........

Yes, it's a NamingContextExt.

> omniORB: Creating Python ref to unknown: key<NameService>
>  target id      : IDL:omg.org/CosNaming/NamingContextExt:1.0
>  most derived id:

Create a Python object reference to the NamingContextExt.

> omniORB: omniRemoteIdentity deleted.
> omniORB: ObjRef() -- deleted.
> omniORB: LocateRequest to remote: key<NameService>
> omniORB: sendChunk: to giop:tcp:139.61.179.111:3434 35 bytes
> omniORB:
> 4749 4f50 0102 0103 1700 0000 0400 0000 GIOP............
> 0000 0000 0b00 0000 4e61 6d65 5365 7276 ........NameServ
> 6963 65                                 ice

Do a LocateRequest on the object to make sure it exists. This step is
actually redundant, since we've already contacted it once. If you were
just using C++, this bit wouldn't happen, but omniORBpy doesn't get
the information from the C++ side about having already talked to the
object. Perhaps it should. Anyway, this is an OK thing to do.

> omniORB: inputMessage: from giop:tcp:139.61.179.111:3434 256 bytes
> omniORB:
> 4749 4f50 0102 0104 f400 0000 0400 0000 GIOP............
> 0200 0000 0000 0000 2b00 0000 4944 4c3a ........+...IDL:
> 6f6d 672e 6f72 672f 436f 734e 616d 696e omg.org/CosNamin
> 672f 4e61 6d69 6e67 436f 6e74 6578 7445 g/NamingContextE
> 7874 3a31 2e30 0000 0100 0000 0000 0000 xt:1.0..........
> ac00 0000 0101 0200 0f00 0000 3133 392e ............139.
> 3631 2e31 3739 2e31 3131 006f 6a0d 2e61 61.179.111.oj..a
> 3300 0000 1401 0f00 4e55 5000 0000 1500 3.......NUP.....
> 0000 0001 0000 0000 4e61 6d65 5365 7276 ........NameServ
> 6963 6500 0118 0000 0100 0000 4e61 6d65 ice.........Name
> 5365 7276 6963 6500 0300 0000 0000 0000 Service.........
> 0800 0000 0100 0000 004f 4154 0100 0000 .........OAT....
> 1400 0000 0100 0000 0100 0100 0000 0000 ................
> 0901 0100 0000 0000 024f 4154 2000 0000 .........OAT ...
> 0100 0000 0100 0000 0f00 0000 3133 392e ............139.
> 3631 2e31 3739 2e31 3131 0000 6a0d ffff 61.179.111..j...

omniORBpy went back to the original object reference, so TAO tried to
reply with an OBJECT_FORWARD reply to the LocateRequest. This is where
it all goes wrong. Section 15.4.6.2 of the CORBA 2.6 specification
says:

  "LocateReply bodies are marshaled immediately following the
   LocateReply header."

Unfortunately, the CORBA 2.5 spec says:

  "In GIOP version 1.0 and 1.1, Locate reply bodies are marshaled into
   the CDR encapsulation of the containing Message immediately
   following the Reply Header. In GIOP version 1.2, the Reply Body is
   always aligned on an 8-octet boundary."

omniORB is expecting the CORBA 2.6 behaviour; TAO is doing the
alignment specified in CORBA 2.5. The difference is in these bytes:

> 4749 4f50 0102 0104 f400 0000 0400 0000 GIOP............
> 0200 0000 0000 0000 2b00 0000 4944 4c3a ........+...IDL:
            ^^^^^^^^^

Those zeros are padding that TAO put in to align the rest of the
message on an 8-octet boundary. omniORB reads them as meaning the
repository id that comes next has zero length. After that, it's out of
sync with the marshalled stream, so it misinterprets the bytes. Sooner
or later, it interprets something as a very long length, and...

> omniORB: throw MARSHAL from exception.cc:443 (YES,MARSHAL_SequenceIsTooLong)

...gives up.
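
To make the two readings concrete, here is a small sketch that reads
the start of that LocateReply both ways. It assumes the GIOP 1.2
LocateReply header is just a request id and a locate status (8 octets
after the 12-octet message header). read_type_id_length is a name
invented for this example and is not how omniORB's unmarshalling code
is actually structured.

```python
import struct

# First 32 octets of the LocateReply from the trace.
LOCATE_REPLY = bytes.fromhex(
    "4749 4f50 0102 0104 f400 0000 0400 0000"
    "0200 0000 0000 0000 2b00 0000 4944 4c3a"
)

HEADER_END = 12 + 4 + 4   # message header + request id + locate status

def read_type_id_length(msg, align_body):
    """Return (offset, length) of the IOR's repository id length field."""
    order = "<" if msg[6] & 0x01 else ">"
    offset = HEADER_END
    if align_body:
        # CORBA 2.5 wording: round up to the next 8-octet boundary.
        offset = (offset + 7) & ~7
    (length,) = struct.unpack_from(order + "I", msg, offset)
    return offset, length

print(read_type_id_length(LOCATE_REPLY, align_body=True))   # TAO's layout
print(read_type_id_length(LOCATE_REPLY, align_body=False))  # omniORB's reading
```

With alignment the length is 43 at offset 24, which matches
"IDL:omg.org/CosNaming/NamingContextExt:1.0" plus its terminating NUL.
Without it the length is 0 at offset 20, and everything after that is
read from the wrong position.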


Unfortunately, both TAO and omniORB are behaving correctly with regard
to the CORBA spec, just different versions of it. I really don't know
how to resolve this. It's not possible for omniORB to figure out that
there's extra padding because the padding octets can have any value.
By the time the MARSHAL exception is thrown, it's too late to go back
and try again with the padding. In unlucky cases, the misaligned data
may even unmarshal without error, just with completely bogus values.
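
The trace itself has an example of padding with arbitrary values. In
the first _is_a request above, the alignment gaps in the GIOP 1.2
request header hold leftover non-zero octets. The offsets below are
worked out by hand from the request header layout (request id,
response flags, reserved octets, addressing disposition, object key,
operation, service contexts), so treat them as a reading of the dump
rather than anything omniORB reports:

```python
# First 64 octets of the _is_a request from the trace.
REQUEST = bytes.fromhex(
    "4749 4f50 0102 0100 5b00 0000 0200 0000"
    "0300 0000 0000 6c02 0b00 0000 4e61 6d65"
    "5365 7276 6963 6504 0600 0000 5f69 735f"
    "6100 6403 0000 0000 2b00 0000 4944 4c3a"
)

# Assumed padding positions (each field below is aligned to 4 octets):
#   22-23  before the object key length
#   39     before the operation name length
#   50-51  before the service context count
PADDING_OFFSETS = [22, 23, 39, 50, 51]

padding = bytes(REQUEST[i] for i in PADDING_OFFSETS)
print(padding.hex())
```

None of those octets is zero, so a receiver cannot recognise padding
by its content.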

The CORBA 3.0 spec has the same wording as 2.6, so I guess TAO is
going to have to change sooner or later.

Cheers,

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --


