[omniORB] omniORB bug

Marcus.Bullingham@aculab.com Marcus.Bullingham@aculab.com
Wed, 28 Feb 2001 09:32:13 -0000


We have a problem with omniORB 3.0.2 that manifests on Windows NT (Intel)
in a release build; it probably applies to other platforms and other
versions of omniORB.

The method 'CORBA::TypeCode::_nil()' in the file
'omni/src/lib/omniORB2/dynamic/typecode.cc' allocates a 'TypeCode'
object on the heap; it is possible for some other code in the same
module to gain access to this object and treat it as a 'TypeCode_base'
object, leading to corruption of the heap. In our case, it corrupts
the mutex returned by 'omni::nilRefLock()', which eventually causes
the program to hang. The corruption occurs while the constructors of
static objects are running, before 'main' is entered.
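
To illustrate why treating the nil object as a 'TypeCode_base' can damage the heap, here is a minimal sketch. It assumes that 'TypeCode_base' derives from 'TypeCode' and adds its own data members (such as 'pd_mark'), so those members lie beyond the end of a heap block sized for a plain 'TypeCode'. 'NilLike' and 'BaseLike' are hypothetical stand-ins, not the real omniORB classes, and the code only measures the layout; it never performs the bad write.

```cpp
#include <cstddef>

// Hypothetical stand-in for the small object allocated by _nil().
struct NilLike {
  unsigned long magic;
};

// Hypothetical stand-in for TypeCode_base: same base, extra state.
struct BaseLike : NilLike {
  int   mark;        // analogous to pd_mark
  int   ref_count;
  void* cached[4];
};

// Byte offset of 'mark' inside a genuine BaseLike object.
std::size_t mark_offset() {
  BaseLike b;
  return static_cast<std::size_t>(
      reinterpret_cast<char*>(&b.mark) - reinterpret_cast<char*>(&b));
}

// True if writing 'mark' through a NilLike* that was cast to BaseLike*
// would touch bytes past the end of the NilLike allocation.
bool write_would_overflow() {
  return mark_offset() + sizeof(int) > sizeof(NilLike);
}
```

Under these assumptions 'write_would_overflow()' returns true: the write lands in whatever the allocator placed after the small object, which is consistent with a neighbouring object such as a mutex being overwritten.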

We have the following construct in an IDL file 'kvset.idl':
<<
    struct S3kvsPair {
        S3symbol key;

        union KVSValue switch(KVSType) 
        {
        case ctkvs_valtypINT:               Int int_; 
	...
        case ctkvs_valtypKVS:               sequence<S3kvsPair> kvs_;
	...
        case ctkvs_valtypSYMBOLARRAY:       S3symbolSeq symbolSeq_;
        } value;
    };
>>

In the generated file 'kvsetDynKK.cc', we have:
<<
static CORBA::PR_unionMember _0RL_unionMember_ECTF_mS3kvsPair_mKVSValue[] =
{
  {"int_", _0RL_tc_ECTF_mInt, ECTF::ctkvs_valtypINT},
	...
  {"kvs_", CORBA::TypeCode::PR_recursive_sequence_tc(0, 2),
   ECTF::ctkvs_valtypKVS},
	...
  {"symbolSeq_", _0RL_tc_ECTF_mS3symbolSeq, ECTF::ctkvs_valtypSYMBOLARRAY}
};

static CORBA::TypeCode_ptr _0RL_tc_ECTF_mS3kvsPair_mKVSValue =
  CORBA::TypeCode::PR_union_tc("IDL:ectf.org/ECTF/S3kvsPair/KVSValue:1.0",
                               "KVSValue", _0RL_tc_ECTF_mKVSType,
                               _0RL_unionMember_ECTF_mS3kvsPair_mKVSValue, 16);

static CORBA::PR_structMember _0RL_structmember_ECTF_mS3kvsPair[] = {
  {"key", _0RL_tc_ECTF_mS3symbol},
  {"value", _0RL_tc_ECTF_mS3kvsPair_mKVSValue}
};

static CORBA::TypeCode_ptr _0RL_tc_ECTF_mS3kvsPair =
  CORBA::TypeCode::PR_struct_tc("IDL:ectf.org/ECTF/S3kvsPair:1.0",
                                "S3kvsPair", _0RL_structmember_ECTF_mS3kvsPair, 2);
>>

Corruption of the heap occurs during the invocation of
'CORBA::TypeCode::PR_struct_tc()'. This calls the constructor for
'TypeCode_struct', which calls the 'NP_complete_recursive_sequences()'
method, which recursively descends the typecode structures, eventually
calling 'TypeCode_collector::markLoops' on the 'TypeCode' object
allocated by 'TypeCode::_nil()'. The first such call sets
'TypeCode_base::pd_mark' to 0, but further calls occur later and do
more damage.
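
The failure path above can be sketched as a recursive marking pass that has to recognise the nil object before writing to it. This is only an illustration of the idea, not omniORB's code or its actual fix: 'Node', 'NilNode', 'is_nil' and 'reset_marks' are hypothetical names, and the sketch assumes the nil sentinel is a full-sized node so that it can be inspected safely.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a typecode node; not an omniORB class.
struct Node {
  virtual ~Node() {}
  virtual bool is_nil() const { return false; }
  int mark = -1;                 // analogous to pd_mark
  std::vector<Node*> children;   // member typecodes; may refer back to a parent
};

// Hypothetical nil sentinel: a complete Node that identifies itself.
struct NilNode : Node {
  bool is_nil() const override { return true; }
};

// Set mark to 0 on every reachable node, skipping null pointers and the
// nil sentinel. A node whose mark is already 0 is treated as visited, which
// stops the recursion on self-referential structures.
// Returns the number of nodes actually written.
std::size_t reset_marks(Node* n) {
  if (n == nullptr || n->is_nil()) return 0;  // never write to the nil object
  if (n->mark == 0) return 0;                 // already visited
  n->mark = 0;
  std::size_t written = 1;
  for (Node* c : n->children) written += reset_marks(c);
  return written;
}
```

With a root that has one ordinary child, one reference to the nil sentinel and one reference back to itself, 'reset_marks' writes to two nodes and leaves the sentinel untouched.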

Relevant code from 'omni/src/lib/omniORB2/dynamic/typecode.cc':

<<
CORBA::TypeCode_ptr
CORBA::TypeCode::_nil()
{
  static TypeCode* _the_nil_ptr = 0;
  if( !_the_nil_ptr ) {
    omni::nilRefLock().lock();
    if( !_the_nil_ptr )  _the_nil_ptr = new TypeCode;
    omni::nilRefLock().unlock();
  }
  return _the_nil_ptr;
}
>>

Regards,

Marcus.