[omniORB] Lockup on Solaris 2.5.1 with Y2K patch

Sai-Lai Lo S.Lo@uk.research.att.com
Mon, 21 Jun 1999 22:06:29 +0100


I'm seeing omniORB2 lock up on a Solaris 2.5.1 machine that has been patched
to be Y2K compliant.

I want to see if anyone can reproduce this problem on a similar platform.

Please try this experiment with omniORB 2.7.1:

Start eg3_impl.

Run eg3_clt in a loop:

$ while [ 1 ]; do eg3_clt; done
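
For anyone who would rather not watch the loop by hand, here is a rough
sketch of a variant that counts iterations and stops at the first run that
does not finish within a time limit. The function name run_until_hang, the
30 second limit and the iteration cap are my own assumptions for
illustration, not part of omniORB, and it assumes a POSIX-style shell such
as ksh or bash, since it uses $(( )) arithmetic. The stuck process is left
running on purpose so that a debugger can be attached to it.

```shell
# Hypothetical helper (not part of omniORB): run a command repeatedly and
# stop at the first iteration that does not finish within a time limit.
# Usage: run_until_hang TIMEOUT_SECONDS MAX_ITERATIONS COMMAND [ARGS...]
# e.g.   run_until_hang 30 1000 eg3_clt
run_until_hang() {
    limit=$1
    max=$2
    shift 2
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        "$@" &              # start the client in the background
        pid=$!
        waited=0
        # Poll once per second until the client exits or the limit passes.
        while kill -0 "$pid" 2>/dev/null; do
            if [ "$waited" -ge "$limit" ]; then
                # Leave the process alive so it can be inspected.
                echo "iteration $i: pid $pid still running after ${limit}s, possible lockup"
                return 1
            fi
            sleep 1
            waited=$((waited + 1))
        done
        wait "$pid"         # collect the exit status of the finished client
    done
    echo "completed $max iterations without a lockup"
    return 0
}
```

The time limit is only a heuristic: a slow but healthy client would also
trip it, so pick a limit well above the normal run time of eg3_clt.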

After a random number of iterations, eg3_clt locks up. Looking at it in
the debugger, I see:

1. There is a deadlock: one thread calling pthread_exit (in fact this is
   the outscavenger thread) and another, Solaris-internal, thread are both
   trying to grab the same mutex, and both are blocked.
2. The main thread is blocked on a join, waiting for the outscavenger thread
   to exit. That never happens, so the whole process hangs.

I've repeated the test with Solaris 2.7 and with unpatched Solaris 2.5.1, and
there is no lockup. I can't repeat the test on the same hardware with the
unpatched Solaris 2.5.1, but I'm sure it worked well before the patch.

The workaround is simply to disable the cleanup of the outscavenger thread
when the process exits. However, I would like to know for certain whether
this is a problem with the Solaris Y2K patches. Moreover, if it is a bug in
the thread runtime, it may cause lockups at other times, not just on process
exit...

If you have this platform, please run the test and let me know the result,
whether positive or negative.


Thanks.

Sai-Lai