[omniORB] omni::createObjRef thread safety

Luke Deller ldeller at iress.com.au
Tue Oct 27 18:27:27 GMT 2009


Hi,

We've been getting an occasional but recurring crash under load in the omni::createObjRef function in omniInternal.cc.  It looks to me like a thread safety issue.  I have some some confidence that the error is indeed in omniORB rather than our own code because the attached patch to omniORB seems to have stopped the crash from occurring.

This is using a snapshot of omniORB (and omniORBpy) from 20080908, but I think that there have been no relevant changes since then.

>From analysing the crash dump files (Windows "mini-dump" files), I can see that the crash is caused by memory corruption in the "omniIdentity" object pointed to by the "id" local variable, resulting in an access violation on line 1064 or 1069 of omniInternal.cc (varies slightly according to the nature of the corruption)

1064:  omniObjRef* objref = pof->newObjRef(ior, id);
1065:  if( target_intf_not_confirmed )  objref->pd_flags.type_verified = 0;
1066:
1067:  {
1068:    omni_optional_lock sync(*internalLock, locked, locked);
1069:    id->gainRef(objref);
1070:  }

The call stack is always as follows:

[1]  omniORB413_vc9_rt!omni::createObjRef
[2]  omniORB413_vc9_rt!omniObjRef::_unMarshal
[3]  omniORB413_vc9_rt!CORBA::Object::_unmarshalObjRef
[4]  omniDynamic413_vc9_rt!CORBA::Object_Member::operator<<=
[5]  omniDynamic413_vc9_rt!omni::fastCopyUsingTC
[6]  omniDynamic413_vc9_rt!omni::tcParser::copyStreamToStream
[7]  omniDynamic413_vc9_rt!CORBA::Any::operator<<=
[8]  _omniOTS!CosTransactions::PropagationContext::operator<<=
[9]  _omniOTS!GetTxnContext
[10] omniORB413_vc9_rt!omni::GIOP_S::handleRequest
[11] omniORB413_vc9_rt!omni::GIOP_S::dispatcher

See that this is called from the "serverReceiveRequest" interceptor installed by omniOTS (which would be called very often from multiple threads).

It is trying to unmarshal the CosTransactions::PropagationContext struct.  Here is the IDL (note that the implementation_specific_data field in CosTransactions::PropagationContext contains an omniOTS::SpecificData struct).

    module CosTransactions { // from standard OMG IDL
        struct PropagationContext {
            unsigned long timeout;
            TransIdentity current;
            sequence <TransIdentity> parents;
            any implementation_specific_data;
        };
    };
    module omniOTS { // from omniOTS source code
        struct SpecificData {
            octet magic[7];
            short version_major;
            short version_minor;
            CosTransactions::Control ctrl;
        };
    };

I can see from inspecting variables in the crash dump that it is trying to unmarshal the final CosTransactions::Control object reference.  I can't tell you yet whether this object is in-process or not - either case could arise.

In the omni::createObjRef function I notice that if locked=0 then there is a gap during which omni::internalLock is NOT held, after "omniIdentity* id" has been obtained from omni::createIdentity but before it is used (at which point I am seeing that it is occasionally corrupted).  My theory is that omni::createIdentity is returning a valid object but that the object is deallocated or otherwise rendered invalid by another thread during this gap.  Is this possible?

See the attached patch which seems to stop the crash happening.  It is not an ideal solution but I think it does support my theory of what the problem is?

Regards,
Luke.

**********************************************************************************************
Important Note
This email (including any attachments) contains information which is confidential and may be subject to legal privilege.  If you are not the intended recipient you must not use, distribute or copy this email.  If you have received this email in error please notify the
sender immediately and delete this email. Any views expressed in this email are not necessarily the views of IRESS Market Technology Limited.

It is the duty of the recipient to virus scan and otherwise test the information provided before loading onto any computer system.
IRESS Market Technology Limited does not warrant that the information is free of a virus or any other defect or error.
**********************************************************************************************
-------------- next part --------------
A non-text attachment was scrubbed...
Name: omni-objreflocking.patch
Type: application/octet-stream
Size: 493 bytes
Desc: omni-objreflocking.patch
Url : http://www.omniorb-support.com/pipermail/omniorb-list/attachments/20091027/59b80382/omni-objreflocking.obj


More information about the omniORB-list mailing list