[omniORB] [omniORB 2.8.0] Connection handling bug?

Simha, Rakshit RSimha at metasolv.com
Fri Jun 3 12:48:02 BST 2005


Hi folks,

I am seeing some interesting behaviour with omniORB 2.8.0 
in the following situation:

- Solaris 8; omniORB 2.8.0; client compiled using SunPRO; 
  server compiled using gcc; client and server located on 
  same host; both client and server have set max. TCP
  connections per server to (say) 5 before initializing
  the ORB.

- multi-threaded client makes CORBA call to server; while
  server is processing the call, the client thread blocks:
  -----------------  lwp# 34 / thread# 100  --------------------
   fee1bfb8 recv     (f, 9aba18, 2008, 0)
   ff1d53fc unsigned tcpSocketStrand::ll_recv(void*,unsigned) (9289a0, 9aba18, 2008, 9aba30, 9aba18, 9aba18) + c4
   ff1d2048 void reliableStreamStrand::fetch(unsigned long) (9289a0, 0, fffffff8, 9aba1f, ff1d1f78, 9aba18) + 58
   ff1d1c90 Strand::sbuf reliableStreamStrand::receive(unsigned,unsigned char,int, unsigned char) (9289a0, 2000, 0, 8, 1, 1) + 84
   ff1bea88 void NetBufferedStream::receive(unsigned,unsigned char) (fe080f08, 8, 1, 8, ff1d1f78, 0) + 11c
   ff1bf218 void*NetBufferedStream::align_and_get_bytes(omni::alignment_t,unsigned,unsigned char) (8, 1, 8, 1, 0, 9a9a40) + 54
   ff1be3c8 void NetBufferedStream::get_char_array(unsigned char*,int,omni::alignment_t,unsigned char) (fe080f08, fe080e40, 8, 1, 1, 0) + 134
   ff1b3bc8 GIOP::ReplyStatusType GIOP_C::ReceiveReply() (fe080f08, fe080f08, 3, 7437d9, 7d6dc8, 3f) + f4
   ff1d0448 void OmniProxyCallWrapper::invoke(omniObject*,OmniProxyCallDesc&) (83b13c, fe080fd4, 7437d9, 0, ff1d11a8, ff20cb74) + 128
   004525e0 void Common::_proxy_Transaction::commit() (83b198, 271078, 927118, 87f9b8, fef6a07c, 20) + 60
   00267398 void ProxyAgentDevice::transactionEnd() (8988b8, fe0812f4, 1, 0, 72e2ac, 61d836e4) + 2d8
   0026537c void ProxyAgentDevice::transaction() (8988b8, 271078, 1, 0, fe0818cb, 0) + 16c
   00263d00 void ProxyAgentDevice::configure() (8988b8, fe081874, 1, 925478, fe081943, 0) + 760
   002958ac void ProxyAgentDevicePoller::doConfig() (827c70, 827dd0, 0, fe0819eb,0, 0) + 324
   002a0398 void DeferredCall<ProxyAgentDevicePoller>::timerCallback(const Scheduler::Timer*) (827dc0, 92a680, fe081a93, 0, 0, 33d634e1) + 80
   00335b68 void Scheduler::ScheduledTimer::triggerTimer(Scheduler::Timer*)const (97abc8, 92a680, 0, fffffff8, 0, 81fa39) + 28
   00334a38 void*Scheduler::run_undetached(void*) (8fe0f0, 0, ff26c000, 0, 8fe0f0, 11c0c) + 630
   ff35269c omni_thread_wrapper (8fe0f0, fecf5d38, 0, 5, 1, fe401000) + e4
   ff25b01c _thread_start (8fe0f0, 0, 0, 0, 0, 0) + 40

- while this thread is still blocked, another client 
  thread makes another CORBA call to the server, to a 
  different remote object. Most of the time, this call
  goes through.  But once in a while, the call "freezes":
  --------------------------  thread# 36  --------------------
   ff2481ac cond_wait (fe181d98, 0, 0, ff26c000, 0, 0) + 11c
   ff248070 pthread_cond_wait (9289b0, 7fc2e0, 1, 0, 72e2ac, 0) + 8
   ff352094 void omni_condition::wait() (9289a8, 44d00, 1, 0, 3039, ffff8000) + 18
   ff1c835c void Strand::Sync::RdLock(unsigned char) (fe180f68, 9289a0, ff1c89cc, fef555ec, 9289a0, ffffffff) + 50
   ff1c8170 Strand::Sync::Sync #Nvariant 1(Rope*,unsigned char,unsigned char) (fe180f68, 7fc2d8, 1, 1, 4ea70, fe180fbc) + 60
   ff1be128 NetBufferedStream::NetBufferedStream(Rope*,unsigned char,unsigned char,unsigned) (fe180f68, 7fc2d8, 1, 1, 0, bc) + 28
   ff1b3768 GIOP_C::GIOP_C #Nvariant 1(Rope*) (fe180f68, 7fc2d8, 6, 1, fef6a07c, bc) + 14
   ff1d039c void OmniProxyCallWrapper::invoke(omniObject*,OmniProxyCallDesc&) (83b00c, fe181034, 6, 0, ff1d11a8, ff20cb74) + 7c
   00451ba8 void Common::_proxy_Transaction::begin() (83b068, 271078, 866900, 8fff50, fef6a07c, 20) + 60
   00266834 void ProxyAgentDevice::transactionStart() (883808, fe1812f4, 1, 0, 72e2ac, 61d836e4) + 1ac
   0026533c void ProxyAgentDevice::transaction() (883808, 271078, 1, 0, fe1818cb, 0) + 12c
   00263d00 void ProxyAgentDevice::configure() (883808, fe181874, 1, 967ed0, fe181943, 0) + 760
   002958ac void ProxyAgentDevicePoller::doConfig() (85f168, 85f2c8, 0, fe1819eb, 0, 0) + 324
   002a0398 void DeferredCall<ProxyAgentDevicePoller>::timerCallback(const Scheduler::Timer*) (85f2b8, 8773f8, fe181a93, 0, 0, 2d3f9822) + 80
   00335b68 void Scheduler::ScheduledTimer::triggerTimer(Scheduler::Timer*)const (9603f0, 8773f8, 0, fffffff8, 0, 888191) + 28
   00334a38 void*Scheduler::run_undetached(void*) (84e0d0, 0, ff26c000, 0, 84e0d0, 11c0c) + 630
   ff35269c omni_thread_wrapper (84e0d0, fec25d38, 0, 5, 1, fe401000) + e4
   ff25b01c _thread_start (84e0d0, 0, 0, 0, 0, 0) + 40

- This sometimes happens when there is one connection 
  already between client and server, and sometimes when 
  there are more (provided the total number of connections
  remains <= 5, in this instance).  It happens more often on a
  multi-CPU host.

- When this happens, the only way to get the second CORBA
  call to go through is to wait for the first call to 
  complete.  This severely impacts the software's 
  predictability and throughput.

From what I can see, the creation of a GIOP_C object
results in an attempt to lock the Rope given to this
object (by the grandparent class Strand::Sync). This
Rope is unlocked in GIOP_C dtor.  If there is an
existing GIOP_C object for thread 100, it will have
the lock on the Rope. If the same Rope is passed to
the GIOP_C object in thread 36, then this ctor will
block trying to acquire the read-lock on the Rope
(since the object in thread 100 already has the
write-lock on this Rope).

But what I don't understand is: why does this not
happen all the time?  Is a different Rope handed to
each GIOP_C under normal circumstances?
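
To make the suspected mechanism concrete, here is a minimal sketch of
the lock-in-ctor / unlock-in-dtor pattern described above.  It is not
omniORB source: ToyRope, ToySync and second_call_waited_for_first are
hypothetical names, it uses standard C++ threads rather than omnithread,
and it assumes a single lock per Rope (it does not model several
connections behind one Rope).  It only shows why a second caller that is
handed the same locked object has to wait for the first call to finish.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the shared object (NOT omniORB's Rope/Strand).
class ToyRope {
public:
    void acquire() {
        std::unique_lock<std::mutex> lk(mu_);
        // Blocks here while another in-flight call holds the object,
        // analogous to the cond_wait seen in the second stack trace.
        cv_.wait(lk, [this] { return !busy_; });
        busy_ = true;
    }
    void release() {
        {
            std::lock_guard<std::mutex> lk(mu_);
            busy_ = false;
        }
        cv_.notify_one();
    }
private:
    std::mutex mu_;
    std::condition_variable cv_;
    bool busy_ = false;
};

// Hypothetical RAII guard: takes the lock in the ctor, drops it in the dtor.
class ToySync {
public:
    explicit ToySync(ToyRope& r) : rope_(r) { rope_.acquire(); }
    ~ToySync() { rope_.release(); }
    ToySync(const ToySync&) = delete;
    ToySync& operator=(const ToySync&) = delete;
private:
    ToyRope& rope_;
};

// Returns true if the second call obtained the object only after the
// first call had finished its (simulated) wait for the reply.
bool second_call_waited_for_first() {
    ToyRope rope;
    std::atomic<bool> first_holds{false};
    std::atomic<bool> first_done{false};
    bool second_saw_first_done = false;

    std::thread first([&] {
        ToySync guard(rope);   // first call locks the object
        first_holds = true;
        // Simulated slow server: the caller sits waiting for the reply.
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        first_done = true;
    });                        // guard destroyed here, lock released

    std::thread second([&] {
        while (!first_holds) std::this_thread::yield();
        ToySync guard(rope);   // blocks until the first guard is destroyed
        second_saw_first_done = first_done;
    });

    first.join();
    second.join();
    return second_saw_first_done;
}
```

In this model the second call always waits, which is the open question:
the real ORB presumably has some way of handing out a different
connection most of the time, and the sketch does not attempt to show
how.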

I do realize the version of omniORB is very old - but
there are constraints that prevent an upgrade in the
near term.  I'll settle for an explanation of how
this is supposed to work and any pointers on where I
can start digging.  If this is an issue recognized and
solved in a subsequent release, that would be great &
I'll appreciate any info on the fix location in CVS.

Regards,
Rak.

---
Rakshit Simha
Extension: 7-43/8324              MetaSolv Software Canada, Inc.
Phone: (613) 287-8324             360 Legget Dr., Kanata, ON K2K 3N1
Fax: (613) 287-8288               http://www.metasolv.com


