[omniORB] Race between deactivate and outstanding invocations

Chris Newbold chris.newbold@laurelnetworks.com
Thu, 12 Oct 2000 11:49:07 -0400


I believe I've found a race condition between object deactivation (via
the POA) and outstanding method invocations; I'm not 100% confident
in my analysis, but...

We're currently running 3.0.0, but I made my diagnosis looking at the
3.0.2 source code. First, the symptom:

----------------------------------------------------------

Oct 11 01:38:28 npbd[5314]:  971242708.441742 15375 omniORB                   D Assertion failed.  This indicates a bug in omniORB.
Oct 11 01:38:28 npbd[5314]:  file: ../objectAdapter.cc
Oct 11 01:38:28 npbd[5314]:  line: 311
Oct 11 01:38:28 npbd[5314]:  info: pd_nDetachedObjects > 0
Oct 11 01:38:28 npbd[5314]:
Oct 11 01:38:28 npbd[5314]: Aborted
Oct 11 01:38:28 npbd[5314]: PID = 5947
                Backtrace:
                #0   0x40456c68  __restore
                #1   0x40456d41  __kill+17
                #2   0x404580d8  abort+200
                #3   0x402953c3  omniORB::fatalException::fatalException(char const *, int, char const *)+55
                #4   0x4026887b  omni::assertFail(char const *, int, char const *)+247
                #5   0x4024a8fb  omniObjAdapter::met_detached_object(void)+79
                #6   0x40254389  omniOrbPOA::lastInvocationHasCompleted(omniLocalIdentity *)+569
                #7   0x402abe13  omniLocalIdentity_RefHolder::~omniLocalIdentity_RefHolder(void)+159
                #8   0x402482e3  omniLocalIdentity::dispatch(GIOP_S &)+155
                #9   0x4027a5e7  GIOP_S::HandleRequest(bool)+963
                #10  0x40279dd5  GIOP_S::dispatcher(Strand *)+449
                #11  0x4029bd70  tcpSocketWorker::_realRun(void *)+116
                #12  0x402b7cfb  omniORB::giopServerThreadWrapper::run(void (*)(void *), void *)+35
                #13  0x4029bce8  tcpSocketWorker::run(void *)+64
                #14  0x402feab1  omni_thread_wrapper+273

--------------------------------------------------------------

The asserting thread is completing a method invocation on an object.
While this invocation was in progress, another thread called
deactivate_object() on that object's POA, passing that object's OID.

At the time of the assertion failure, deactivate_object() has not
yet returned.

So, I started looking at what happens in deactivate_object() and
found that, while holding the internal lock, deactivate() is called
on the omniLocalIdentity for the object (poa.cc:832). Further
along, the internal lock is dropped (line 857) and detached_object()
is called.

The race condition is that once deactivate_object() has called
deactivate() on the omniLocalIdentity and dropped the internal lock,
the thread handling the invocation can observe, in the
omniLocalIdentity_RefHolder destructor (localIdentity.cc:78), that
the omniLocalIdentity has been deactivated.

However, deactivate_object() has not yet called detached_object(),
so pd_nDetachedObjects in omniObjAdapter has not been updated, and
the assertion in met_detached_object() fails on the invocation
thread.
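
To make the window concrete, here is a minimal single-threaded C++
model of the two orderings. It is only a sketch under stated
assumptions: RaceModel and all of its member names are hypothetical
stand-ins rather than omniORB code, the two mutexes only mark where
locks are held, and the idea that a completing invocation consumes one
detached-object count is an assumption of the model. The steps are
replayed in a fixed order instead of on real threads so the outcome is
deterministic.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical model of the reported race. None of these names are omniORB's
// real API; they only mirror the roles described in the report.
struct RaceModel {
    std::mutex internal_lock;           // stands in for the POA's internal lock
    std::mutex counter_lock;            // guards the detached-object count
    bool identity_deactivated = false;  // set when the identity is deactivated
    int n_detached_objects = 0;         // stands in for pd_nDetachedObjects

    // First half of deactivate_object(): deactivate the identity while
    // holding the internal lock. The lock is released on return.
    void deactivate_identity() {
        std::lock_guard<std::mutex> guard(internal_lock);
        identity_deactivated = true;
    }

    // Second half of deactivate_object(): runs after the lock is dropped.
    void detached_object() {
        std::lock_guard<std::mutex> guard(counter_lock);
        ++n_detached_objects;
    }

    // What the invocation thread does when it finishes. Returns false where
    // the real code would hit the "pd_nDetachedObjects > 0" assertion.
    bool invocation_completed() {
        bool deactivated;
        {
            std::lock_guard<std::mutex> guard(internal_lock);
            deactivated = identity_deactivated;
        }
        if (!deactivated) return true;  // object still active, nothing to do
        std::lock_guard<std::mutex> guard(counter_lock);
        if (n_detached_objects <= 0) return false;  // assertion would fail
        --n_detached_objects;  // assumption: completion consumes one count
        return true;
    }
};

// Ordering from the report: the invocation completes in the gap between
// dropping the internal lock and calling detached_object().
bool interleaving_from_report() {
    RaceModel m;
    m.deactivate_identity();
    bool ok = m.invocation_completed();
    m.detached_object();
    return ok;
}

// Ordering where deactivate_object() finishes both halves before the
// invocation completes, so the count is already in place.
bool interleaving_without_overlap() {
    RaceModel m;
    m.deactivate_identity();
    m.detached_object();
    return m.invocation_completed();
}
```

In this model only the first ordering trips the check, which matches
the symptom above: the invocation thread sees the deactivated state
before the count it depends on has been incremented.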

Sorry for the long-winded narration; hopefully it makes some sense...

-Chris Newbold
Laurel Networks, Inc.