[omniORB] omniNames crashes on OpenVMS Alpha with multiple kernel threads enabled

Bruce Visscher visschb@rjrt.com
Mon, 05 Jun 2000 17:17:46 -0400


Hello omniORBers,

I hate to bring this up again, but...

I have been experiencing some extremely rare problems with strands being
deleted early in omniNames 2.8.0 on OpenVMS Alpha 7.1-2 (with DECthreads
ECOs) when multiple kernel threads are enabled.

I had reported this problem earlier, but stated that I still had some
VMS "issues".  I have since resolved these issues and they had nothing
to do with the crashes I have seen in omniNames (we uncovered a bug in
the std iostreams library that caused it to be thread unsafe).

Specifically, the problems always seem to occur in Strand::decrRefCount.

I have seen this assertion fail:

  assert(pd_refcount >= 0);

I have also had access violations (address=0x00000000) in the statement:

  pd_rope->pd_lock.lock();

The violation occurs inside the call to pthread_mutex_lock.  I believe
this means the pd_rope member is null, which in turn suggests that the
strand has already been destroyed (there's a pd_rope=0 in the destructor,
perhaps because of MSC double destruction bugs).
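
To make the two symptoms concrete, here is a minimal sketch of the
pattern as I understand it.  The Rope and Strand classes below are
hypothetical stand-ins that only borrow the member names mentioned above;
they are not the actual omniORB sources, and std::mutex is used in place
of the real lock type.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in: owns the lock that guards its strands' counts.
struct Rope {
    std::mutex pd_lock;
};

// Hypothetical stand-in for a reference-counted strand.
class Strand {
public:
    explicit Strand(Rope* r) : pd_rope(r), pd_refcount(0) {}

    // Mirrors the pd_rope=0 in the destructor described above.
    ~Strand() { pd_rope = nullptr; }

    void incrRefCount() {
        std::lock_guard<std::mutex> guard(pd_rope->pd_lock);
        ++pd_refcount;
    }

    // Returns the count after the decrement.
    int decrRefCount() {
        // Calling this on an already destroyed strand is undefined
        // behaviour; if pd_rope reads back as null, the lock attempt
        // faults at a near-zero address (the second symptom).
        std::lock_guard<std::mutex> guard(pd_rope->pd_lock);
        --pd_refcount;
        // An unbalanced extra decrement trips this (the first symptom).
        assert(pd_refcount >= 0);
        return pd_refcount;
    }

private:
    Rope* pd_rope;     // owner whose lock protects pd_refcount
    int pd_refcount;   // number of outstanding references
};
```

With balanced calls (two incrRefCount followed by two decrRefCount) the
count returns to zero and nothing fires.  Either symptom would then point
to one decrRefCount too many, or to a decrRefCount that arrives after the
strand has been deleted.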

The most recent example of this occurred when
omniORB_Ripper::run_undetached invoked p->decrRefCount() just after the
p->real_shutdown().

The problem doesn't seem to occur when multiple kernel threads are
disabled.  However, my experience has been that enabling this option
tends to expose existing thread safety issues.

Unfortunately, this problem has proven to be extremely difficult to
reproduce.  I have to run omniNames with multiple dedicated clients
beating on it for hours to get it to crash.  No pattern has emerged.

Does anyone have any clues?

Bruce
-- 
All generalities are false - including this one.

Bruce Visscher                                        visschb@rjrt.com