[omniORB] Event producer/consumer app crash

Sai-Lai Lo s.lo@uk.research.att.com
Thu, 24 May 2001 14:31:10 +0100


Hard to tell what could be the problem. Just some suggestions:

1. The same Any should not be marshalled by more than one thread at the same
   time. This is an implementation restriction. Looks like you have already
   avoided that by creating copies of the Any per channel.
2. The Any's copy constructor copies the data from the original but shares the
   same typecode. However, the reference count of the typecode object is
   incremented, so there is no chance of the typecode being released until
   all the Anys are deleted.
3. LCD mode implies that the typecode in an Any is alias-expanded before it is
   marshalled. The trace you have supplied points to aliasExpand, which is
   doing exactly that. If you set LCD mode with -ORBlcdMode, you could try
   disabling the alias expansion by doing this in your code after ORB_init():

      omniORB::tcAliasExpand = 0;

   Hopefully, Java IDL is able to deal with typecodes that are not
   alias-expanded.
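
To make point 2 concrete, here is a minimal sketch of the "copy the data, share the typecode" behaviour. It is only a model and not omniORB code: ToyTypeCode, ToyAny and refsAfterOneCopy are hypothetical names, and std::shared_ptr is assumed as a stand-in for omniORB's own typecode reference counting, which may differ in detail.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not omniORB classes.
struct ToyTypeCode {
    std::string name;
};

class ToyAny {
public:
    ToyAny(std::shared_ptr<ToyTypeCode> tc, std::vector<unsigned char> data)
        : tc_(std::move(tc)), data_(std::move(data)) {}

    // The implicit copy constructor deep-copies data_ (std::vector) and
    // shares tc_ (std::shared_ptr), incrementing its reference count.

    const ToyTypeCode* typecode() const { return tc_.get(); }
    const unsigned char* bytes() const { return data_.data(); }
    long typecodeRefs() const { return tc_.use_count(); }

private:
    std::shared_ptr<ToyTypeCode> tc_;   // shared between copies
    std::vector<unsigned char> data_;   // private to each copy
};

// Returns the typecode reference count observed after making one copy.
long refsAfterOneCopy() {
    ToyAny original(std::make_shared<ToyTypeCode>(ToyTypeCode{"Event"}),
                    {1, 2, 3});
    ToyAny copy(original);
    assert(copy.typecode() == original.typecode());  // same typecode object
    assert(copy.bytes() != original.bytes());        // separate data buffers
    return original.typecodeRefs();
}
```

In this model the typecode stays alive until the last ToyAny that refers to it is destroyed, which is the property described in point 2.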

Sai-Lai


-----Original Message-----
From: owner-omniorb-list@uk.research.att.com
[mailto:owner-omniorb-list@uk.research.att.com]On Behalf Of Rowe, Simon
Sent: 22 May 2001 11:51
To: omniorb-list@uk.research.att.com
Subject: [omniORB] Event producer/consumer app crash


We have been trying to hunt down a problem in one of our applications with
little success. Can anyone give us some insight into what the underlying
cause is?

Our application is somewhat like a "push-only" event channel: incoming data
is broadcast to each registered consumer (apart from some coarse filtering,
which is omitted from this discussion to keep things simple).

Each registered consumer has an associated output queue. Each incoming Any
is copied using the CORBA Any copy constructor for storage on the output
queues - a separate copy is made for each output queue.

A thread pool is used to dispatch the Any copies by removing them from the
relevant queue and calling push() on the associated consumer. We have had a
number of experienced multithread-savvy programmers cast their eyes over our
code and we believe (but I guess cannot be certain) that this aspect of our
code is fine. This view is supported by having run Purify against the code
for prolonged periods without complaint.
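
As a rough illustration of the design described above, the following sketch gives each consumer its own queue, stores a separate copy of every incoming event on each queue, and drains the queues from worker threads. All names (Event, ConsumerQueue, broadcast, drainAll, demoDeliveries) are hypothetical; std::string stands in for CORBA::Any, and one thread per queue is assumed in place of a real thread pool. It is not the application's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical names throughout; std::string stands in for CORBA::Any.
using Event = std::string;

class ConsumerQueue {
public:
    void enqueue(const Event& e) {
        std::lock_guard<std::mutex> lock(m_);
        pending_.push_back(e);  // this queue stores its own copy
    }
    bool tryDequeue(Event& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (pending_.empty()) return false;
        out = std::move(pending_.front());
        pending_.pop_front();
        return true;
    }
private:
    std::mutex m_;
    std::deque<Event> pending_;
};

// Broadcast: one separate copy of the event per registered consumer.
void broadcast(std::vector<ConsumerQueue>& queues, const Event& e) {
    for (auto& q : queues) q.enqueue(e);
}

// Drains every queue with one worker thread per queue (a simplification of
// a thread pool) and returns the total number of events delivered.
std::size_t drainAll(std::vector<ConsumerQueue>& queues) {
    std::vector<std::size_t> delivered(queues.size(), 0);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < queues.size(); ++i) {
        workers.emplace_back([&queues, &delivered, i] {
            Event e;
            while (queues[i].tryDequeue(e)) {
                ++delivered[i];  // push(e) on the consumer would go here
            }
        });
    }
    for (auto& w : workers) w.join();
    std::size_t total = 0;
    for (std::size_t n : delivered) total += n;
    return total;
}

// Usage: two events broadcast to three consumers give six deliveries.
std::size_t demoDeliveries() {
    std::vector<ConsumerQueue> queues(3);
    broadcast(queues, "first");
    broadcast(queues, "second");
    std::size_t total = drainAll(queues);
    assert(total == 6);
    return total;
}
```

Each worker only ever touches its own queue's copies, so no event object is handled by two threads at once. Whether the copies share anything internally depends on the Any implementation, which this sketch does not model.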

On occasion our pushed data is a sequence of Anys, itself encoded as an Any
(we use this batching technique to improve throughput).

The segmentation fault (as shown in the Purify output) occurs when
attempting to delete the copy of the Any after having called push(). We
suspect that all the copies of the Anys point to the same data under the
hood and that perhaps there is a race condition here - it often takes
prolonged periods to reproduce the problem.

We are running a fully patched omniORB 2.8.0 with LCD mode enabled (as we
require interoperability with Java IDL). The problem occurs on Windows NT,
Linux and Solaris.


      FMR: Free memory read
      This is occurring while in thread 36:
            TypeCode_base*TypeCode_base::aliasExpand(TypeCode_base*) [typecode.cc:989]
            unsigned CORBA::Any::NP_alignedSize(unsigned)const [any.cc:224]
            unsigned long _0RL_pc_bd959cc57b2e138f_00000000::alignedSize(unsigned long) [CosEventCommSK.cc:100]
            void OmniProxyCallWrapper::invoke(omniObject*,OmniProxyCallDesc&) [proxyCall.cc:88]
            void CosEventComm::_proxy_PushConsumer::push(const CORBA::Any&) [CosEventCommSK.cc:127]
            abaccess::LiteEventChannelSendComponent::SendResult abaccess::LiteProxyPushSupplier::singleDeferredSendAny(CosEventComm::PushConsumer*,const CORBA::Any&) [LiteProxyPushSupplier.cc:444]
            void abaccess::LiteProxyPushSupplier::multipleDeferredSendAny() [LiteProxyPushSupplier.cc:389]
            void abaccess::LiteProxyPushDeliveryTask::doTask() [LiteProxyPushSupplier.cc:67]
            void abaccess::LiteTaskManager::run() [LiteTask.cc:177]
            void abaccess::LiteTaskManager::threadBody(void*) [LiteTask.cc:43]
            omni_thread_wrapper [posix.cc:406]
            _thread_start  [libthread.so.1]
      Reading 4 bytes from 0x264044 in the heap.
      Address 0x264044 is 181 bytes past end of a freed block at 0x263d90 of 512 bytes.
      This block was allocated from thread 36:
            malloc         [rtlib.o]
            c2n6Fi_Pv___1  [libCrun.so.1]
            void*operator new(unsigned) [rtlib.o]
            void*operator new[](unsigned) [rtlib.o]
            void MemBufferedStream::grow(unsigned) [mbufferedStream.cc:209]
            void*MemBufferedStream::align_and_put_bytes(omni::alignment_t,unsigned) [bufferedStream.h:723]
            void MemBufferedStream::put_char_array(const unsigned char*,int,omni::alignment_t) [mbufferedStream.cc:230]
            void marshal_ss<MemBufferedStream>(char**,unsigned long,__type_0&) [stringtypes.h:400]
            void _CORBA_Sequence__String::operator>>=(MemBufferedStream&)const [corbaString.cc:288]
            void TypeCode_enum::NP_marshalComplexParams(MemBufferedStream&,TypeCode_offsetTable*)const [typecode.cc:2719]
            void TypeCode_marshaller::marshal(TypeCode_base*,MemBufferedStream&,TypeCode_offsetTable*) [typecode.cc:4004]
            void TypeCode_struct::NP_marshalComplexParams(MemBufferedStream&,TypeCode_offsetTable*)const [typecode.cc:1953]
            void TypeCode_marshaller::marshal(TypeCode_base*,MemBufferedStream&,TypeCode_offsetTable*) [typecode.cc:4004]
            void TypeCode_struct::NP_marshalComplexParams(MemBufferedStream&,TypeCode_offsetTable*)const [typecode.cc:1953]
            void TypeCode_marshaller::marshal(TypeCode_base*,NetBufferedStream&,TypeCode_offsetTable*) [typecode.cc:3700]
      There have been 694 frees since this block was freed from thread 36:
            free           [rtlib.o]
            c2k6FPv_v___1  [libCrun.so.1]
            void operator delete(void*) [rtlib.o]
            void operator delete[](void*) [rtlib.o]
            MemBufferedStream::~MemBufferedStream() [mbufferedStream.cc:147]
            void TypeCode_marshaller::marshal(TypeCode_base*,MemBufferedStream&,TypeCode_offsetTable*) [typecode.cc:4038]
            void TypeCode_struct::NP_marshalComplexParams(MemBufferedStream&,TypeCode_offsetTable*)const [typecode.cc:1953]
            void TypeCode_marshaller::marshal(TypeCode_base*,MemBufferedStream&,TypeCode_offsetTable*) [typecode.cc:4004]
            void TypeCode_struct::NP_marshalComplexParams(MemBufferedStream&,TypeCode_offsetTable*)const [typecode.cc:1953]
            void TypeCode_marshaller::marshal(TypeCode_base*,NetBufferedStream&,TypeCode_offsetTable*) [typecode.cc:3700]
            void CORBA::TypeCode::marshalTypeCode(CORBA::TypeCode*,NetBufferedStream&) [typecode.cc:323]
            void CORBA::Any::operator>>=(NetBufferedStream&)const [any.cc:179]
            void fastCopyUsingTC<MemBufferedStream,NetBufferedStream>(TypeCode_base*,__type_0&,__type_1&) [tcParser.cc:231]
            void fastCopyUsingTC<MemBufferedStream,NetBufferedStream>(TypeCode_base*,__type_0&,__type_1&) [tcParser.cc:344]
            void tcParser::copyTo(NetBufferedStream&,int) [tcParser.cc:499]