[omniORB] possible bug found (was Re: spurious COMM_FAILURE when using fixed ports)

Thu Jan 22 15:59:48 UTC 2026

Hi,

With the help of logging, gdb and strace I found the difference: it is the timeout setting.

Now it's easy to reproduce with the binaries in examples/call_back.

If you start the server like this:

./cb_server -ORBclientCallTimeOutPeriod 6000 -ORBtraceLevel 40 > ior.txt

and then run the client twice on the same fixed port:

IOR=`cat ior.txt`

./cb_client -ORBendPoint giop:tcp:localhost:53234  -ORBtraceLevel 40 $IOR

./cb_client -ORBendPoint giop:tcp:localhost:53234  -ORBtraceLevel 40 $IOR

the second client will always receive the (MAYBE,COMM_FAILURE_WaitingForReply) exception.

This is generated in the server while calling the callback.

The reason is in tcpConnection::Recv in tcpConnection.cc. If there is no timeout set, it will simply call recv(), which will then read the CloseConnection message from the first client run and rebuild the connection properly.

Once a timeout is set, it will call poll(), which gets this:

  poll([{fd=22, events=POLLIN}], 1, 60000) = 1 ([{fd=22, revents=POLLIN|POLLERR|POLLHUP}])

In this case no recv() is attempted; instead an error is thrown and propagated. I'll try to do a recv() there anyway, to process what is still in the buffer, and see if that helps.
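
As a minimal sketch of the idea (this is not omniORB's actual code; recv_with_timeout and demo_read_after_peer_close are hypothetical names made up for illustration), the poll() branch could still call recv() whenever POLLIN is set, even if POLLERR or POLLHUP is reported alongside it. The demo uses a local socketpair instead of a TCP connection, so it only exercises the case where the peer has sent a last message and then closed:

```cpp
#include <cassert>
#include <cstddef>
#include <poll.h>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper, NOT part of omniORB: wait for input with a timeout,
// but still try recv() when POLLIN is reported together with POLLERR/POLLHUP.
// Returns bytes read (>0), 0 if the peer closed with nothing left to read,
// and -1 on timeout or error.
ssize_t recv_with_timeout(int fd, char* buf, std::size_t len, int timeout_ms) {
    assert(buf != nullptr && len > 0);

    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLIN;

    int rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0) {
        return -1;  // 0 means timeout, -1 means poll() itself failed
    }
    if (pfd.revents & POLLIN) {
        // Data is still buffered: read it first, whatever else is flagged.
        return recv(fd, buf, len, 0);
    }
    if (pfd.revents & POLLHUP) {
        // Hang-up with nothing buffered: let recv() report end of stream
        // (0) or the pending error (-1).
        return recv(fd, buf, len, 0);
    }
    return -1;  // POLLERR or POLLNVAL without readable data
}

// Simulates a peer that sends one last message and then closes its end,
// roughly like the first client run leaving a CloseConnection behind.
std::string demo_read_after_peer_close() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return std::string();
    }
    const char msg[] = "CloseConnection";
    const std::size_t msg_len = sizeof msg - 1;
    if (send(sv[1], msg, msg_len, 0) != static_cast<ssize_t>(msg_len)) {
        close(sv[0]);
        close(sv[1]);
        return std::string();
    }
    close(sv[1]);  // peer is gone; its message is still queued on sv[0]

    char buf[64];
    ssize_t n = recv_with_timeout(sv[0], buf, sizeof buf, 1000);
    close(sv[0]);
    return n > 0 ? std::string(buf, static_cast<std::size_t>(n))
                 : std::string();
}
```

Whether the same ordering is right inside tcpConnection::Recv, in particular when POLLERR is set as in the strace output above, is exactly what I still have to check.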

Would this be a possible solution or would it be better handled elsewhere?

Greetings,

   Michael

On 1/22/26 10:13, Michael Teske via omniORB-list wrote:
>
> Hi,
>
>
> Now I understand at least the working case from the example. The client calls sendCloseConnection from giopServer.cc (because the client is the "server", CORBA-wise, for the callback connection), which unfortunately is not trace-logged.
>
> The server receives this message but does not process it yet. It is processed later, in giopImpl12::inputQueueMessage, when the server tries to send the next callback on the same connection to the next instance of the client, which uses the same callback port.
>
> This leads to the server output already mentioned:
>
> omniORB: (5) 2026-01-21 10:34:09.773590: Reset rope addresses (current address giop:tcp:[::1]:53234)
> omniORB: (5) 2026-01-21 10:34:09.773595: Orderly connection shutdown: giop:tcp:[::1]:53234 <-------
> omniORB: (5) 2026-01-21 10:34:09.773599: throw giopStream::CommFailure from giopImpl12.cc:192(1,NO,COMM_FAILURE_WaitingForReply)
>
> This is not forwarded to the caller but instead the connection is re-established as it should be.
>
> What's left is to find out what happens in our case, where we get
>>
>> omniORB: (199) 2026-01-21 10:04:51.861953: Reset rope addresses (current address giop:tcp:[::1]:53234)
>> omniORB: (199) 2026-01-21 10:04:51.861962: Error in network receive (start of message): giop:tcp:[::1]:53234
>> omniORB: (199) 2026-01-21 10:04:51.861966: throw giopStream::CommFailure from giopStream.cc:857(0,MAYBE,COMM_FAILURE_WaitingForReply)
>>
> I'll add some more logging to the library and let you know what I find.
>
> Regards, Michael