[omniORB] "Unrecoverable error" in omniORB

Jochen Gern jochen.gern at acterna.com
Thu Oct 30 17:27:45 GMT 2003


Hello,

I encountered the following error with omniORB 4.0.0 and omniORB 4.0.2 when
using them under vxWorks; the latter shows the error much less often:

omniORB: Unexpected exception caught by giopRendezvouser
omniORB: Unrecoverable error for this endpoint: giop:tcp:10.49.75.107:1024,
it will no longer be serviced.

My investigation showed that, under circumstances I have not been able to
pin down, omniAsyncWorker objects delete themselves by means of the exit()
method of their base class omni_thread while they are still registered in
the idle queue of the omniAsyncInvoker. If omniAsyncInvoker then tries to
use such an invalid idle queue entry to execute a task, this leads to the
error mentioned above.

As a workaround, I added code to omniAsyncWorker's destructor which
unregisters the omniAsyncWorker from the idle queue. Below is the modified
code region (file invoker.cc):

  ~omniAsyncWorker() {

    if (omniORB::trace(10)) {
      omniORB::logger l;
      l << "AsyncInvoker: thread id = " << pd_id
      << " has exited. Total threads = " << pd_pool->pd_totalthreads
      << "\n";
    }

    delete pd_cond;
    pd_pool->pd_lock->lock();

    // BEGIN OF THE ADDITIONAL CODE SECTION
    {
        // Walk the idle queue and unlink this worker if it is still in it.
        omniAsyncWorker** pp = &pd_pool->pd_idle_threads;
        while (*pp && *pp != this) {
            pp = &((*pp)->pd_next);
        }
        if (*pp)
        {
            *pp = pd_next;
            std::cout << "omniAsyncWorker " << this
                      << " deleted while still in idle queue" << std::endl
                      << "Task: " << taskName(taskIdSelf()) << " (0x"
                      << std::hex << taskIdSelf() << std::dec << ")"
                      << std::endl
                      << "File: " << __FILE__ << ", line: " << __LINE__
                      << std::endl;
        }
        pd_next = 0;
    }
    // END OF THE ADDITIONAL CODE SECTION

    if (--pd_pool->pd_totalthreads == 0)
      pd_pool->pd_cond->signal();

    pd_pool->pd_lock->unlock();
  }
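
For readers who do not have invoker.cc at hand, the unlink step in the
added section can be shown in isolation. The following is only a minimal
sketch and not omniORB code: Worker, unlink_worker and list_length are
hypothetical names, and the list is assumed to be a null-terminated singly
linked list like the one reached through pd_idle_threads and pd_next. In
the real destructor the pool lock is held while the list is modified; the
sketch leaves locking out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for omniAsyncWorker; only the intrusive link matters.
struct Worker {
    int     id;
    Worker* next;   // plays the role of pd_next
};

// Unlink `target` from the list whose head pointer is stored at `*head`.
// Returns true if the node was found and removed, false if it was not queued.
bool unlink_worker(Worker** head, Worker* target) {
    assert(head != nullptr);
    // `pp` always points at the pointer that refers to the current node,
    // so the head and the interior nodes are handled by the same code path.
    Worker** pp = head;
    while (*pp != nullptr && *pp != target) {
        pp = &((*pp)->next);
    }
    if (*pp == nullptr) {
        return false;            // target was not in the list
    }
    *pp = target->next;          // bypass the node
    target->next = nullptr;      // leave no dangling link behind
    return true;
}

// Count the nodes in the list (used only to check the result).
std::size_t list_length(const Worker* head) {
    std::size_t n = 0;
    for (const Worker* w = head; w != nullptr; w = w->next) {
        ++n;
    }
    return n;
}
```

Because the walk goes through a pointer to the link rather than to the node,
removing the head needs no special case. Calling unlink_worker for a worker
that is not queued simply returns false, which corresponds to the case in
which the warning above is not printed.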

I further noticed that the warning text is printed much more often than the
error occurred. So I think that in most cases an invalid entry never reaches
the head of the idle queue and is therefore never used again.

The change seems to work well, but as I'm not familiar with omniORB I'm not
sure whether this is the right solution. Maybe the deletion of
omniAsyncWorker objects is illegal? Maybe the error is in the implementation
of omniAsyncWorker::real_run()?
I would be very pleased if one of the omniORB experts could have a look at
the problem. For additional information, feel free to contact me.

Regards

Jochen Gern




