[omniORB] Timed out waiting for rendezvousers to terminate

Mike Richmond mike.richmond at globalgraphics.com
Fri Sep 21 18:18:56 BST 2007


I am using omniORB 4.1.0 and am seeing a problem whereby my app takes 
a long time (about 10 seconds) to quit.  I've tracked this down to a 
timeout while waiting for the rendezvouser to terminate 
(giopServer.cc:683).  I've tried extending the wait by changing the 
value of timeout in gdb to 30 seconds, but the wait still times out.  
Before and after the wait the rendezvouser thread is in this state:

Thread 3 (process 993 thread 0x160b):
#0  0x9001a1cc in select ()
#1  0x0207fe42 in omni::do_select (maxfd=17, r=0xb0101cd4, w=0x0, e=0x0, t=0x0) at SocketCollection.cc:1161
#2  0x02080136 in omni::SocketCollection::Select (this=0x493c258) at SocketCollection.cc:1239
#3  0x020a2c82 in omni::tcpEndpoint::AcceptAndMonitor (this=0x493c250, func=0x206a334 <omni::giopRendezvouser::notifyReadable(void*, omni::giopConnection*)>, cookie=0x493c700) at ./tcp/tcpEndpoint.cc:613
#4  0x0206a425 in omni::giopRendezvouser::execute (this=0x493c700) at giopRendezvouser.cc:97
#5  0x020bcd95 in omniAsyncWorker::real_run (this=0x493c730) at invoker.cc:234
#6  0x02023811 in omniAsyncWorkerInfo::run (this=0xb0101ef4) at invoker.cc:282
#7  0x020bd033 in omniAsyncWorker::run (this=0x493c730) at invoker.cc:161
#8  0x015d3862 in omni_thread_wrapper (ptr=0x493c730) at posix.cc:451
#9  0x90024227 in _pthread_body ()

Adding a breakpoint shows that omni::SocketCollection::Select() does 
not return.

However, if I step through the rendezvouser terminate() method, and in 
particular through tcpAddress->Poke(), then 
omni::SocketCollection::Select() does return.  In tcpAddress->Poke(), 
::connect() reports EINPROGRESS and CLOSESOCKET() returns 0.

My theory is that it is possible to close the socket in 
tcpAddress->Poke() before it has "done enough to poke the endpoint". 
In support of this theory I observe that 
omni::SocketCollection::Select() returns if I sleep for a short time 
before closing the socket in tcpAddress->Poke(), or if I undefine 
USE_NONBLOCKING_CONNECT.

My machine is pretty quick - a 2 x 2.66 GHz Dual-Core Intel Xeon Mac 
Pro running Mac OS X 10.4.10.  Unfortunately I don't know enough 
about sockets to know if this is a problem with tcpAddress->Poke(), 
or with the socket implementation on Mac OS X.  After some googling I 
tried adding a loop calling getsockopt( sock, SOL_SOCKET, SO_ERROR, 
&err, &len ) between ::connect() and CLOSESOCKET(), but getsockopt() 
returned 0 with err = 0 on the first call, so the loop exited at once 
and did not fix my problem.
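
A possible reason the getsockopt() loop exits at once: SO_ERROR only 
reports a pending error, so it reads 0 while a connect is still in 
progress as well as after it has succeeded.  The usual idiom is to check 
for writability first and read SO_ERROR only afterwards.  A minimal sketch 
of that check, assuming POSIX sockets; connect_state and the ConnectState 
values are hypothetical names, not part of omniORB:

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/socket.h>

enum ConnectState { CONNECT_PENDING, CONNECT_OK, CONNECT_FAILED };

// Hypothetical helper: classify a non-blocking connect() that returned
// EINPROGRESS, without blocking the caller.
ConnectState connect_state(int sock) {
    assert(sock >= 0 && sock < FD_SETSIZE);  // select() cannot handle more

    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(sock, &wfds);
    timeval tv = {0, 0};                     // zero timeout: poll only

    int n = ::select(sock + 1, nullptr, &wfds, nullptr, &tv);
    if (n < 0) return CONNECT_FAILED;
    if (n == 0) return CONNECT_PENDING;      // not writable: still in progress

    // Writable: the attempt has finished, so SO_ERROR is now meaningful.
    int err = 0;
    socklen_t len = sizeof(err);
    if (::getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
        return CONNECT_FAILED;
    return CONNECT_OK;
}
```

A loop between ::connect() and CLOSESOCKET() would then repeat while the 
result is CONNECT_PENDING, ideally with a bound on the total wait.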

Any suggestions?

Mike Richmond
Global Graphics Software Ltd


