[omniORB] Working omniORB2 on linux, how?

Sai-Lai Lo S.Lo@uk.research.att.com
27 Oct 1999 19:33:00 +0100


Morten,

It's interesting that problem reports tend to occur in clusters.
The assertion failure you saw was reported just a few weeks ago by
Andy Carlson. Andy pointed out that the assertion failure occurs when
the ORB fails to create a new thread to serve a new connection. The ORB has
code to handle this gracefully, but there is a bug in the unwinding. Here is
the fix:

src/lib/omniORB2/orbcore/tcpSocketMTfactory.cc:

class tcpSocketWorker : public omni_thread {
public:
  tcpSocketWorker(tcpSocketStrand* s, tcpSocketMTincomingFactory* f) : 
          omni_thread(s), pd_factory(f), pd_sync(s,0,0) 
    {
      start();               // <--
      s->decrRefCount();     // <-- swap the two statements
    }
   .....
};

I ran your test on a Red Hat 5.2 (Linux 2.2.12) system. Your test creates so
many threads that all the process slots (400) are used up. Hence the
pthread_create calls fail and the ORB stumbles on the bug above.
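
For illustration only, here is a minimal sketch of detecting a failed
pthread_create so that the caller can unwind cleanly. This is not the
omniORB code: worker_main and try_start_worker are hypothetical names, and
the worker body is just a stand-in for a per-connection service loop. Note
that pthread_create reports failure through its return value (EAGAIN when
thread or process resources are exhausted), not through errno.

```cpp
#include <pthread.h>
#include <cstdio>
#include <cstring>

// Hypothetical worker body; stands in for serving one connection.
static void* worker_main(void* arg) {
  int* served = static_cast<int*>(arg);
  ++*served;
  return 0;
}

// Try to start a worker thread. Returns 0 on success, otherwise the error
// number returned by pthread_create (e.g. EAGAIN when no more threads can
// be created). The caller decides how to release its resources on failure.
static int try_start_worker(pthread_t* tid, int* served) {
  int rc = pthread_create(tid, 0, worker_main, served);
  if (rc != 0) {
    std::fprintf(stderr, "pthread_create failed: %s\n", std::strerror(rc));
  }
  return rc;
}
```

A caller would check the return value and, on a non-zero result, drop the
new connection and release whatever it had reserved for the worker instead
of assuming the thread is running.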

Even with the fix applied, ps -x shows a number of process slots (the things
that represent threads on Linux) hanging around forever. I think this may be
caused by glibc (the Linux thread runtime?) not handling resource exhaustion
gracefully. I have not checked whether glibc-2.1 behaves better.

The change has been committed to the 2.8 and 3.0 trees. It is available
via the public CVS or the overnight snapshot. (I should have committed the
change some time ago.)


Sai-Lai



-- 
Sai-Lai Lo                                   S.Lo@uk.research.att.com
AT&T Laboratories Cambridge           WWW:   http://www.uk.research.att.com 
24a Trumpington Street                Tel:   +44 1223 343000
Cambridge CB2 1QA                     Fax:   +44 1223 313542
ENGLAND