[omniORB] Omnithreads suspension

Tom Haggie THaggie@img.seagatesoftware.com
Wed, 3 Jun 1998 06:34:15 -0700


We're having problems with a thread pool in our application, running on
Windows NT (version 4.00.1381).

Originally, it seemed that signals were getting lost between threads - a
request for work was being put on the queue and a signal sent to announce
that data was available, but no threads were waking, despite the fact that
they appeared to be waiting for the signal. We changed the signal to a
broadcast so that all of the pool threads would wake when a request was
placed on the queue, and this seemed to improve things and give us more
information, but it hasn't solved the problem.

Most of the time, things work as expected - a request is placed on the
queue; broadcast() is called on the "request present" condition; the free
pool threads wake up one at a time, with the first thread to wake picking
up the request and the others seeing that the queue is now empty and going
back to sleep. The request is completed normally and the thread doing the
work goes back to sleep to wait for more. However, occasionally one of the
threads fails to wake up and is never seen alive again. Eventually, the
pool is reduced to one live thread and our application deadlocks because
we need two active threads - one worker thread can place a request for
another worker to carry out. (It also deadlocks on shutdown because the
pool has to wait for its threads to exit before destructing, and the
inactive threads never wake up or exit.) At the moment, we're running with
only one client, so there's never more than a single request on the queue
(obviously this will change later).
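
For reference, the wait/broadcast pattern described above can be sketched
as follows. This is only a minimal illustration, not our actual code: it
uses the standard C++ std::mutex and std::condition_variable as stand-ins
for the omnithreads mutex and condition classes, with notify_all() playing
the role of broadcast(). The names WorkerPool, submit, shutdown and
run_demo are made up for the example. The points it tries to show are that
the queue is only changed while the mutex is held, and that each worker
re-checks the queue in a loop after waking, so a wake-up that arrives
before the wait starts is not lost.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical worker pool; all names here are illustrative assumptions.
class WorkerPool {
public:
    explicit WorkerPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }

    ~WorkerPool() { shutdown(); }

    // Place a request on the queue and wake every waiting worker.
    void submit(int request) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(request);       // queue changes only under the mutex
        }
        request_present_.notify_all();  // stand-in for broadcast()
    }

    // Ask the workers to exit once the queue is drained, then join them.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        request_present_.notify_all();
        for (auto& t : workers_)
            if (t.joinable()) t.join();
    }

    int processed() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return processed_;
    }

private:
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lock(mutex_);
            // Re-check the condition after every wake-up: other workers may
            // have taken the request first, and wake-ups can be spurious.
            while (queue_.empty() && !stopping_)
                request_present_.wait(lock);
            if (queue_.empty())
                return;                 // stopping and nothing left to do
            assert(!queue_.empty());
            int request = queue_.front();
            queue_.pop();
            lock.unlock();
            (void)request;              // the real work would happen here
            lock.lock();
            ++processed_;
        }
    }

    mutable std::mutex mutex_;
    std::condition_variable request_present_;
    std::queue<int> queue_;
    bool stopping_ = false;
    int processed_ = 0;
    std::vector<std::thread> workers_;
};

// Hypothetical driver: submit `requests` items to `workers` threads and
// return how many were processed once the pool has shut down.
int run_demo(std::size_t workers, int requests) {
    WorkerPool pool(workers);
    for (int i = 0; i < requests; ++i)
        pool.submit(i);
    pool.shutdown();
    return pool.processed();
}
```

In this sketch the workers drain the queue before exiting, so shutdown
cannot strand a request that was queued just before the stop flag was set.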

Looking in a debugger, the inactive threads are still present but
suspended. Nothing is actually calling the Windows SuspendThread() function
directly, so we are a bit puzzled as to what is going on.

The broadcast which occasionally fails to wake all the free workers is
called by another worker thread. Requests can also be made from our main
thread (not created by the omnithreads), but these don't seem to cause
problems.

Any ideas? We are using omniORB 2.5.1 and the associated omnithreads
library.

Thanks,

Richard Wilkinson