[omniORB] Callback loop stalling: sample code

David Riddoch djr@uk.research.att.com
Thu, 2 Nov 2000 20:59:36 +0000 (GMT)


Hi Kevin,

> Now, in your situation, you have two machines, A and B. A passes a
> reference to a call-back object to B. B performs the call-back,
> causing a connection to be opened back to A, and a new thread to start
> on A. B then drops the reference to the call-back object; it is the
> only reference to an object in A's address space, so the connection is
> closed, and A's thread exits. Then the whole thing repeats.
> 
> The key to the problem is that A is repeatedly creating and deleting
> threads. FreeBSD obviously isn't coping very well with that. Linux
> also gets upset, although only after a few thousand calls, when your
> program receives a COMM_FAILURE.
> 
> There are a few things we could do in omniORB to help with this
> situation, and we might think about it for future versions.

I've recently been doing some work on the omniORB transport, and looked
at this issue.  The following patch allows Ropes to be scavenged (just
like Strands are), so if an object reference goes away and then comes
back again soon, the Rope (and any connections that were open) is likely
to still be around.  Thus in your application, the connection from B to A
should remain open, and A will no longer repeatedly create and delete
threads.

This is untested, but I think it should work okay for you.  Note that
with this patch applied, Ropes are only ever deleted by the scavenger,
so if you are not running the scavenger, Ropes will never be deleted.
For this reason the patch will probably not make it into the current
version of omniORB, but something along these lines is likely to happen
in a future version.
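
To make the idea concrete, here is a toy model of a scavenger pass.  It
is only a sketch and not the omniORB code: ToyRope, ToyStrand and
scavenge_pass are made-up names, the reference count and idle countdown
are simplified stand-ins for is_idle() and the strand "clicks", and
expired strands are removed at once, whereas real strands are shut down
and linger in a dying state for a while.

```cpp
// Toy model only: ToyRope, ToyStrand and scavenge_pass are hypothetical
// names for illustration, not omniORB classes or functions.
#include <cassert>
#include <cstddef>
#include <list>

struct ToyStrand {
  int clicks;  // idle countdown, decremented on every scavenger pass
};

struct ToyRope {
  int refs;                      // object references currently using the rope
  std::list<ToyStrand> strands;  // open connections
  bool is_idle() const { return refs == 0; }
};

// One scavenger pass: expire strands whose countdown has run out, then
// delete any rope that is idle and has no strands left.
// Returns the number of ropes deleted in this pass.
inline std::size_t scavenge_pass(std::list<ToyRope>& ropes) {
  std::size_t deleted = 0;
  for (auto r = ropes.begin(); r != ropes.end(); ) {
    assert(r->refs >= 0);
    for (auto s = r->strands.begin(); s != r->strands.end(); ) {
      if (--s->clicks < 0)
        s = r->strands.erase(s);  // stands in for s->shutdown()
      else
        ++s;
    }
    if (r->is_idle() && r->strands.empty()) {
      r = ropes.erase(r);  // idle and no strands left: kill the rope
      ++deleted;
    } else {
      ++r;
    }
  }
  return deleted;
}
```

In this model a rope whose last reference is dropped survives until its
strands have counted down, so a reference that comes back before then
finds the connection still open.  If scavenge_pass is never called,
nothing is ever removed from the list, which is the caveat above.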


Cheers,
David



Index: include/omniORB3/rope.h
===================================================================
RCS file: /project/omni/cvsroot/omni/include/omniORB3/rope.h,v
retrieving revision 1.1.2.4
diff -c -r1.1.2.4 rope.h
*** include/omniORB3/rope.h	2000/06/27 15:43:31	1.1.2.4
--- include/omniORB3/rope.h	2000/11/02 20:45:49
***************
*** 711,717 ****
--- 711,719 ----
  };
  
  
+ class omniORB_Scavenger;
  
+ 
  class Rope {
  public:
    Rope(Anchor *a,
***************
*** 799,804 ****
--- 801,807 ----
    friend class Strand_iterator;
    friend class Rope_iterator;
    friend class Strand::Sync;
+   friend class omniORB_Scavenger;
  
  protected:
  
Index: src/lib/omniORB2/orbcore/scavenger.cc
===================================================================
RCS file: /project/omni/cvsroot/omni/src/lib/omniORB2/orbcore/scavenger.cc,v
retrieving revision 1.10.6.4
diff -c -r1.10.6.4 scavenger.cc
*** src/lib/omniORB2/orbcore/scavenger.cc	2000/01/07 14:51:14	1.10.6.4
--- src/lib/omniORB2/orbcore/scavenger.cc	2000/11/02 20:44:34
***************
*** 361,374 ****
  	Rope *r;
  	while ((r = next_rope())) {
  	  // For each rope, scan all the strands
! 	  Strand_iterator next_strand(r);
! 	  Strand *s;
! 	  while ((s = next_strand())) {
! 	    if (!s->_strandIsDying() && 
! 		Strand::Sync::clicksDecrAndGet(s) < 0) {
! 	      s->shutdown();
  	    }
  	  }
  	}
        }
      }
--- 361,381 ----
  	Rope *r;
  	while ((r = next_rope())) {
  	  // For each rope, scan all the strands
! 	  int can_delete;
! 	  {
! 	    Strand_iterator next_strand(r);
! 	    Strand *s;
! 	    while ((s = next_strand())) {
! 	      if (!s->_strandIsDying() && 
! 		  Strand::Sync::clicksDecrAndGet(s) < 0) {
! 		s->shutdown();
! 	      }
  	    }
+ 	    can_delete = r->is_idle(1) && r->pd_head == 0;
  	  }
+ 	  // If the rope is idle, and all its strands are
+ 	  // gone, kill it.
+ 	  if( can_delete )  delete r;
  	}
        }
      }
Index: src/lib/omniORB2/orbcore/strand.cc
===================================================================
RCS file: /project/omni/cvsroot/omni/src/lib/omniORB2/orbcore/strand.cc,v
retrieving revision 1.10.6.9
diff -c -r1.10.6.9 strand.cc
*** src/lib/omniORB2/orbcore/strand.cc	2000/06/22 10:37:50	1.10.6.9
--- src/lib/omniORB2/orbcore/strand.cc	2000/11/02 20:38:15
***************
*** 808,845 ****
        if (rp) 
  	{
  	  pd_r = pd_r->pd_next;
- 	  if (rp->is_idle(1)) 
- 	    {
- 	      // This Rope is not used by any object reference
- 	      // First close down all the strands before calling
- 	      // the dtor of the Rope.
- 	      LOGMESSAGE(10,"Rope_iterator","delete unused Rope.");
-               CORBA::Boolean can_delete;
- 	      {
- 		omni_mutex_lock sync(rp->pd_lock);
- 		Strand_iterator next_strand(rp,1);
- 		Strand* p;
- 		while ((p = next_strand())) {
- 		  if (p->is_unused(1)) {
- 		    p->_setStrandIsDying();
- 		  }
- 		  else {
- 		    LOGMESSAGE(0,"Rope_iterator","Detected Application error. An object reference returned to the application has been released but it is currently being used to do a remote call. This thread will now raise a omniORB::fatalException.");
- 		  }
- 		}
- 		// notice that Strand_iterator does not return
- 		// strands that are in the dying state. So there
- 		// may still be (dying) strands associated with the Rope
- 		// even when Strand_iterator returns no strands.
- 		// The only way to ensure there is no strand left is to
- 		// look for pd_head == 0.
-                 can_delete = ((!rp->pd_head) ? 1 : 0);
- 	      }
- 	      if (can_delete) {
- 		delete rp;
- 	      }
- 	      continue;
- 	    }
  	}
        break;
      }
--- 808,813 ----