[omniORB] Assertion failure in giopImpl12 (omniORB4)

Chris Newbold cnewbold@laurelnetworks.com
Fri Dec 6 19:42:02 2002


On Thu, 2002-12-05 at 11:23, Duncan Grisby wrote:
> On Thursday 5 December, Chris Newbold wrote:
> 
> > I believe that tracing was with -ORBtraceLevel 30, though not
> > -ORBtraceThreadId. I assume the latter isn't going to magically show
> > GIOP traffic... Where does the GIOP traffic go? Through the trace logger
> > or somewhere else?
> 
> Looking at the code, I see that the GIOP dump always goes to stderr,
> rather than through the logger. I guess I should change that...

Okay, I've now got tracing which includes the GIOP traffic and, I think,
have figured out what's going on.

In our scenario, we're running a number of processes all on one machine.
In the "real world", these processes are intended to run on separate
machines, but it's convenient for testing to plop them all on the same
machine.

All of the processes are busy talking to each other when we decide to
kill one of them off. This results in the other two frantically
attempting to make calls to the now defunct process to re-establish
connectivity. Each of these attempts consumes another ephemeral port
(since omniORB creates a new socket for each attempt). Now, two
processes with multiple threads each all doing this means that we chew
through the entire ephemeral port range fairly quickly.

The server process which we killed used to be listening on port M (4026
in the tracing below). The client that eventually crashes is going
through ephemeral ports trying to connect back, starting at port N.

The client uses N to connect to M and fails; then N + 1, N + 2 and so
on. Eventually N == M and the client connects his socket back on itself!
In the tracing you can see the client connect and send a 96-byte request
which he promptly receives back again :-)

Dec  6 11:14:31.031290 rcpd[10870:10960]: omniORB                   D omniORB:  send codeset service context: (ISO-8859-1,UTF-16)
Dec  6 11:14:31.031700 rcpd[10870:10960]: omniORB                   D omniORB: throw giopStream::CommFailure from giopStream.cc:1045(0,NO,TRANSIENT_ConnectFailed)
Dec  6 11:14:31.032065 rcpd[10870:10960]: omniORB                   D omniORB: throw TRANSIENT from omniObjRef.cc:732 (NO,TRANSIENT_ConnectFailed)
Dec  6 11:14:31.036226 rcpd[10870:10960]: omniORB                   D omniORB:  send codeset service context: (ISO-8859-1,UTF-16)
Dec  6 11:14:31.036714 rcpd[10870:10960]: omniORB                   D omniORB: throw giopStream::CommFailure from giopStream.cc:1045(0,NO,TRANSIENT_ConnectFailed)
Dec  6 11:14:31.037064 rcpd[10870:10960]: omniORB                   D omniORB: throw TRANSIENT from omniObjRef.cc:732 (NO,TRANSIENT_ConnectFailed)
Dec  6 11:14:31.037913 rcpd[10870:10960]: omniORB                   D omniORB: omniRemoteIdentity deleted.
Dec  6 11:14:31.038130 rcpd[10870:10960]: omniORB                   D omniORB: ObjRef(IDL:mpls2/ManagedInterface:1.0) -- deleted.
Dec  6 11:14:31.038366 rcpd[10870:10960]: omniORB                   D omniORB:  send codeset service context: (ISO-8859-1,UTF-16)
Dec  6 11:14:31.038785 rcpd[10870:10960]: omniORB                   D omniORB: throw giopStream::CommFailure from giopStream.cc:1045(0,NO,TRANSIENT_ConnectFailed)
Dec  6 11:14:31.039138 rcpd[10870:10960]: omniORB                   D omniORB: throw TRANSIENT from omniObjRef.cc:732 (NO,TRANSIENT_ConnectFailed)
Dec  6 11:14:31.040250 rcpd[10870:10960]: omniORB                   D omniORB:  send codeset service context: (ISO-8859-1,UTF-16)
Dec  6 11:14:31.040733 rcpd[10870:10960]: omniORB                   D omniORB: throw giopStream::CommFailure from giopStream.cc:1045(0,NO,TRANSIENT_ConnectFailed)
Dec  6 11:14:31.041088 rcpd[10870:10960]: omniORB                   D omniORB: throw TRANSIENT from omniObjRef.cc:732 (NO,TRANSIENT_ConnectFailed)
Dec  6 11:14:31.041939 rcpd[10870:10960]: omniORB                   D omniORB: omniRemoteIdentity deleted.
Dec  6 11:14:31.042135 rcpd[10870:10960]: omniORB                   D omniORB: ObjRef(IDL:mpls2/ManagedInterface:1.0) -- deleted.
Dec  6 11:14:31.042356 rcpd[10870:10960]: omniORB                   D omniORB:  send codeset service context: (ISO-8859-1,UTF-16)
Dec  6 11:14:31.042765 rcpd[10870:10960]: omniORB                   D omniORB: Client opened connection to giop:tcp:172.18.0.84:4026
Dec  6 11:14:31.042924 rcpd[10870:10960]: omniORB                   D omniORB: sendChunk: to giop:tcp:172.18.0.84:4026 96 bytes
Dec  6 11:14:31.044508 rcpd[10870]: 4749 4f50 0102 0100 5400 0000 0200 0000 GIOP....T.......
Dec  6 11:14:31.044779 rcpd[10870]: 0300 0000 0000 0000 1700 0000 ff4c 534c .............LSL
Dec  6 11:14:31.044903 rcpd[10870]: 50fe 21cc f03d 0200 2a7e 0030 3030 3030 P.!..=..*~.00000
Dec  6 11:14:31.045021 rcpd[10870]: 3030 6200 0d00 0000 5f67 6574 5f69 6649 00b....._get_ifI
Dec  6 11:14:31.045137 rcpd[10870]: 6e64 6578 0000 0000 0100 0000 0100 0000 ndex............
Dec  6 11:14:31.045253 rcpd[10870]: 0c00 0000 0100 0000 0100 0100 0901 0100 ................
Dec  6 11:14:31.045508 rcpd[10870:10960]: omniORB                   D omniORB: inputMessage: from giop:tcp:172.18.0.84:4026 96 bytes
Dec  6 11:14:31.046113 rcpd[10870]: 4749 4f50 0102 0100 5400 0000 0200 0000 GIOP....T.......
Dec  6 11:14:31.046330 rcpd[10870]: 0300 0000 0000 0000 1700 0000 ff4c 534c .............LSL
Dec  6 11:14:31.046450 rcpd[10870]: 50fe 21cc f03d 0200 2a7e 0030 3030 3030 P.!..=..*~.00000
Dec  6 11:14:31.046609 rcpd[10870]: 3030 6200 0d00 0000 5f67 6574 5f69 6649 00b....._get_ifI
Dec  6 11:14:31.046768 rcpd[10870]: 6e64 6578 0000 0000 0100 0000 0100 0000 ndex............
Dec  6 11:14:31.046888 rcpd[10870]: 0c00 0000 0100 0000 0100 0100 0901 0100 ................
Dec  6 11:14:31.047148 rcpd[10870:10960]: omniORB                   D omniORB: Assertion failed.  This indicates a bug in the application using
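
As a sanity check on the dump above, the first 12 bytes of each message
can be decoded as a standard GIOP header. The following is only a rough
sketch: the struct and function names (giop_header, parse_giop_header,
trace_header) are made up for illustration and are not omniORB code. It
assumes the usual header layout: "GIOP" magic, major and minor version,
a flags octet whose low bit means little-endian, a message type octet,
and a 4-byte body size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields of the fixed 12-byte GIOP message header (illustrative struct). */
struct giop_header {
    uint8_t  major;          /* GIOP major version */
    uint8_t  minor;          /* GIOP minor version */
    int      little_endian;  /* bit 0 of the flags octet */
    uint8_t  msg_type;       /* 0 = Request, 1 = Reply, ... */
    uint32_t body_size;      /* bytes following the 12-byte header */
};

/* Parse a GIOP header from buf.
 * Returns 0 on success, -1 if buf is too short or lacks the "GIOP" magic. */
int parse_giop_header(const uint8_t *buf, size_t len, struct giop_header *out)
{
    if (len < 12 || memcmp(buf, "GIOP", 4) != 0)
        return -1;

    out->major = buf[4];
    out->minor = buf[5];
    out->little_endian = buf[6] & 0x01;
    out->msg_type = buf[7];

    /* The size field uses the byte order announced by the flags octet. */
    if (out->little_endian)
        out->body_size = (uint32_t)buf[8] | ((uint32_t)buf[9] << 8) |
                         ((uint32_t)buf[10] << 16) | ((uint32_t)buf[11] << 24);
    else
        out->body_size = ((uint32_t)buf[8] << 24) | ((uint32_t)buf[9] << 16) |
                         ((uint32_t)buf[10] << 8) | (uint32_t)buf[11];
    return 0;
}

/* First 12 bytes of both dumps in the trace: 4749 4f50 0102 0100 5400 0000 */
const uint8_t trace_header[12] = {
    0x47, 0x49, 0x4f, 0x50, 0x01, 0x02, 0x01, 0x00, 0x54, 0x00, 0x00, 0x00
};
```

Fed the bytes from the trace, this gives GIOP 1.2, little-endian, message
type 0 (Request) and a body of 84 bytes, i.e. 12 + 84 = 96 bytes in total.
The inbound message is byte-for-byte the outbound one and is itself a
Request, which fits the client reading back its own request.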

Fixing giopImpl12 not to assert will certainly prevent the crash: you'll
get a protocol error, close port M and move on to M + 1 and you'll be
okay.

Perhaps using some logic to avoid this situation explicitly would be in
order: if (getsockname(...) == getpeername(...)) { /* Wheee! */ }
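
Spelled out a little more, the check could look something like the
sketch below. This is not a patch against omniORB; the function names
(same_endpoint, is_self_connected) are invented here, and it assumes an
IPv4 TCP socket that has already been connected.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Return 1 if both IPv4 endpoints have the same address and port, else 0. */
int same_endpoint(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return a->sin_family == AF_INET && b->sin_family == AF_INET &&
           a->sin_addr.s_addr == b->sin_addr.s_addr &&
           a->sin_port == b->sin_port;
}

/* Return 1 if fd's local endpoint equals its peer endpoint (the socket is
 * connected to itself), 0 if the endpoints differ, and -1 if either
 * address cannot be obtained (bad descriptor, socket not connected). */
int is_self_connected(int fd)
{
    struct sockaddr_in local, peer;
    socklen_t local_len = sizeof local;
    socklen_t peer_len = sizeof peer;

    memset(&local, 0, sizeof local);
    memset(&peer, 0, sizeof peer);

    if (getsockname(fd, (struct sockaddr *)&local, &local_len) < 0)
        return -1;
    if (getpeername(fd, (struct sockaddr *)&peer, &peer_len) < 0)
        return -1;

    return same_endpoint(&local, &peer);
}
```

A caller would run is_self_connected(fd) right after connect() succeeds
and, on a result of 1, close the socket and treat the attempt as a
failed connect rather than sending anything on it.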

-- 
====( Chris Newbold  <cnewbold@laurelnetworks.com> )==========================
      Laurel Networks, Inc. voice: +1 412 809 4200 fax: +1 412 809 4201
"If you fool around with a thing for very long you will screw it up." --Murphy
------------------------------------------------------------------------------