[omniORB] Urgent: omniORB::fatalException in omni2.6.1

Randy Shoup rshoup@tumbleweed.com
Fri, 06 Aug 1999 10:18:07 -0700


Hello, all -- 

  A key customer of ours has been experiencing a suspicious problem for
a while, which we now believe is traceable to a fatalException in
omniORB.  Of course, it is only reproducible on their machine! :-)  We
are using version 2.6.1.

  The symptoms are these:  After several hours of (rather light) load,
the following exception is thrown from Rope_iterator::operator() ():

	throw omniORB::fatalException(__FILE__,__LINE__,
			"Rope_iterator::operator() () tries to delete a strand that is not idle.");

  From other discussions on the list, we assumed that this was related
to the scavengers.  We (believe that we have) turned the scavengers
completely off with:

	omniORB::idleConnectionScanPeriod(omniORB::idleOutgoing,0); 
	omniORB::idleConnectionScanPeriod(omniORB::idleIncoming,0); 

  The problem persists.  BTW, this is related to a problem we were
having a few weeks back, and so we have backported some 2.7.1
strand/scavenger code to 2.6.1.  This seemed to make the problem better,
but clearly has not fixed it completely.  (There have been several more
issues discovered with the scavenger, at least in 2.7.1, since we
applied this patch.)  It is certainly possible that we broke something
when applying the patch.

I have several questions:

(1) We turn the scavengers off after the ORB object is created, but
before the BOA object is created.  Is this sufficient to make sure that
the scavengers stop running?  (From examining the omniORB code, I think
the answer is yes, but now I am not 100% sure.)

(2) What else could cause this fatalException?  It seems to occur
because of a mismatch in the "idle" states between the Rope and the
Strand -- the Rope is idle, but the Strand is not.  Is there any other
way that a Rope could be set to idle, and the Strand not be set to idle,
other than by the action of the scavenger?  Idleness appears to be
related to the reference counts on these objects, so perhaps there is a
problem there?

(3) Could we fix the mismatch of "idle" states in another way -- i.e.,
could we perhaps un-idle the Rope if we discover one of the Rope's
Strands is not idle?  I am wondering here if we could avoid throwing
this exception altogether by cleaning up the inconsistency
automatically.
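
To make question (3) concrete, here is a rough sketch of the idea.  It
does not use the real omniORB classes: ToyStrand, ToyRope,
reconcileIdle() and reapIfIdle() are hypothetical names invented for
illustration, and modelling "idle" as a zero reference count is an
assumption on my part, not something confirmed from the omniORB source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the
// real omniORB Strand/Rope classes.
struct ToyStrand {
    int refCount = 0;  // assumption: a strand is idle when nobody holds it
    bool isIdle() const { return refCount == 0; }
};

struct ToyRope {
    std::vector<ToyStrand> strands;
    bool markedIdle = false;  // the rope's own record of its idle state

    // If any strand is still busy, clear the rope's idle mark instead of
    // treating the mismatch as fatal. Returns true only when the rope is
    // marked idle and every strand agrees.
    bool reconcileIdle() {
        for (const ToyStrand& s : strands) {
            if (!s.isIdle()) {
                markedIdle = false;  // "un-idle" the rope
                return false;
            }
        }
        return markedIdle;
    }
};

// Deletes the strands only when the rope is consistently idle.
// Returns the number of strands deleted.
int reapIfIdle(ToyRope& rope) {
    if (!rope.reconcileIdle()) {
        return 0;  // inconsistent or busy: leave everything alone
    }
    int deleted = static_cast<int>(rope.strands.size());
    rope.strands.clear();
    return deleted;
}
```

With this shape, a rope that is marked idle while one strand still has a
nonzero count is simply un-idled and skipped, and is reaped on a later
pass once all of its strands are idle.  Whether that is safe in the real
code depends on why the states diverge in the first place, which is what
question (2) is asking.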


We are a bit at the end of our Rope (pun intended) with this problem, so
we would appreciate any suggestions!  I'd be happy to provide more
details if required.

Thanks,
-- Randy
_________________________________________________________________  
Randy Shoup                                     (650)216-2038  
Software Architect                              rshoup@tumbleweed.com  
Tumbleweed Communications Corporation