[omniORB] high rate requests with small timeout cause excessive connection/thread usage

Serguei Kolos Serguei.Kolos at cern.ch
Sat Jul 5 18:33:45 BST 2008


Hello

I have an application that acts as a router of oneway messages: it
receives them from many clients and forwards them to several
subscribers at a high rate, using a small timeout (10 milliseconds).
The timeout is kept this small to make the router more responsive to
its clients. If a forwarding request times out, the message is placed
in a buffer, and a dedicated router thread tries to re-send it.
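
To make the scheme above concrete, here is a minimal sketch of the
forward-then-buffer logic. It is only an illustration, not the actual
router code: Router, SendFn, forward and retryOnce are hypothetical
names, and the callable stands in for a oneway invocation that either
completes within the 10 ms timeout (returns true) or times out
(returns false).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for one forwarding attempt: true if the message
// was delivered within the timeout, false if the attempt timed out.
using SendFn = std::function<bool(const std::string&)>;

class Router {
public:
  explicit Router(SendFn send) : send_(std::move(send)) {}

  // Fast path, called for every incoming message.
  // Returns true if delivered directly, false if the message was buffered.
  bool forward(const std::string& msg) {
    if (send_(msg)) return true;
    std::lock_guard<std::mutex> lock(mutex_);
    buffer_.push_back(msg);
    return false;
  }

  // One pass of the dedicated re-send thread: tries each buffered message
  // once and keeps those that time out again. Returns the number delivered.
  std::size_t retryOnce() {
    std::deque<std::string> pending;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      pending.swap(buffer_);
    }
    std::size_t delivered = 0;
    std::deque<std::string> failed;
    for (const std::string& msg : pending) {
      if (send_(msg)) {
        ++delivered;
      } else {
        failed.push_back(msg);
      }
    }
    {
      // Put failed messages back ahead of anything buffered in the
      // meantime, so the original order is preserved.
      std::lock_guard<std::mutex> lock(mutex_);
      buffer_.insert(buffer_.begin(), failed.begin(), failed.end());
    }
    return delivered;
  }

  // Number of messages currently waiting to be re-sent.
  std::size_t buffered() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return buffer_.size();
  }

private:
  SendFn send_;
  mutable std::mutex mutex_;
  std::deque<std::string> buffer_;
};
```

The send is done outside the lock in both paths, so a slow subscriber
delays only the caller that hit the timeout and not the other threads
that are buffering messages.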
What I have noticed is that on slow subscribers omniORB creates new
threads rapidly and very soon reaches the limit. Investigating the
problem more deeply, I found that this happens because omniORB on the
router side creates a new Strand object whenever an attempt to send a
request times out. When a new Strand object is created, it opens a new
connection to the receiver. With the small timeout this happens much
more rapidly than the destruction of the old connections (and threads).
Just as a proof of principle, I modified the giopStream::errorOnSend
function in the following way:

  if (rc == 0) {
    // Timeout.
    // We do not use the return code from the function.
    // pd_strand->state(giopStrand::DYING);  // this is the old line
    // The next two lines are the new ones, used to prevent creation
    // of a new Strand object.
    pd_strand->state(giopStrand::TIMEDOUT);
    ((GIOP_C*)this)->state(IOP_C::Idle);
    retry = 0;
    minor = TRANSIENT_CallTimedout;
  }

and the issue disappeared: the number of threads stayed constant on
the subscriber side. It seems that this way the old Strand is reused
after a timeout and no new Strands are created. But then I realized
that the fix by itself is incorrect, since it caused problems in some
other applications.
Can you please suggest a proper fix for this issue?
I'm using omniORB 4.0.7 on SLC4 Linux (kernel 2.6) with gcc 3.4.4.
Subscribers are running in thread-per-connection mode with
maxServerThreadPerConnection = 20.

Cheers,
Sergei

PS: Here is a fragment of the receiver output when running with traceLevel 10:
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43601 because of this rule: "* unix,ssl,tcp"
omniORB: AsyncInvoker: thread id = 295 has started. Total threads = 295
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43602 because of this rule: "* unix,ssl,tcp"
omniORB: AsyncInvoker: thread id = 296 has started. Total threads = 296
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43603 because of this rule: "* unix,ssl,tcp"
omniORB: AsyncInvoker: thread id = 297 has started. Total threads = 297
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43604 because of this rule: "* unix,ssl,tcp"
omniORB: AsyncInvoker: thread id = 298 has started. Total threads = 298
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43605 because of this rule: "* unix,ssl,tcp"
omniORB: AsyncInvoker: thread id = 299 has started. Total threads = 299
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43606 because of this rule: "* unix,ssl,tcp"
omniORB: AsyncInvoker: thread id = 300 has started. Total threads = 300
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43607 because of this rule: "* unix,ssl,tcp"
omniORB: AsyncInvoker: thread id = 301 has started. Total threads = 301
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43608 because of this rule: "* unix,ssl,tcp"
omniORB: AsyncInvoker: thread id = 302 has started. Total threads = 302
omniORB: Accepted connection from giop:tcp:137.138.xx.yyy:43609 because of this rule: "* unix,ssl,tcp"
omniORB: Exception trying to start new thread.
omniORB: Cannot create a worker for this endpoint: giop:tcp:137.138.xx.yyy:43406 from giop:tcp:137.138.xx.yyy:43610
omniORB: Exception trying to start new thread.
omniORB: Cannot create a worker for this endpoint: giop:tcp:137.138.xx.yyy:43406 from giop:tcp:137.138.xx.yyy:43612
omniORB: Exception trying to start new thread.
omniORB: Cannot create a worker for this endpoint: giop:tcp:137.138.xx.yyy:43406 from giop:tcp:137.138.xx.yyy:43614
omniORB: Exception trying to start new thread.
omniORB: Cannot create a worker for this endpoint: giop:tcp:137.138.xx.yyy:43406 from giop:tcp:137.138.xx.yyy:43615



