[omniORB] Canceling a blocking function

Tres Seaver tseaver@palladion.com
Wed, 05 Apr 2000 10:53:24 -0500


Han Kiliccote wrote:
> <I wrote>
> > I guess I'm dense, but I can't see the point of distributing such
> > a request across 10E5 peers -- each client will spend vastly more time
> > negotiating with its peers than it would spend processing the whole
> > request internally.  In particular, what is the point of cancelling
> > requests?  The chances of being able to signal a cancellation to the
> > peer before the peer completes the request are vanishingly small.
> >
> > Looking back at your original question, I guess you may not need to
> > signal cancellation to peers -- you seem merely to want to free up
> > client-side resources associated with the request.  You also seem to
> > be talking only to 10E2 of the possible 10E5 at a time:  How do you
> > manage this without some form of centralization?  Random-selection
> > from a known universe of object references requires either a centralized
> > server which performs the selection, or one which serves up the entire
> > list of OR's to clients (ICK!).  Do you intend to "compute" the OR's,
> > somehow?
> 
> Excellent points. This is why what we are doing is called a research
> project. Our solution involves generating a virtual interconnection network
> (e.g., a hypercube) on top of the existing network. This way every client
> (or server, there is no difference) knows/maintains connectivity with only a
> small percentage (e.g., 20-1000 out of 10E5) of other clients but their
> combined effort maintains the network reliably.
> 
> The location of the data/services is selected deterministically, so that when
> users need to read the data or request a service, they can find which 100 out
> of 10E5 store it (or serve the service) and use the interconnection network
> to eliminate dependencies on individuals.

Ok, here is a solution which doesn't rely on any centralized event channel
server (it effectively places an "event channel" within each client); a rough
code sketch follows the steps below:

 * Before pushing the request out to the 10E2 servers, construct a callback
   object to collect the responses.  Have its notification method record each
   response under a mutex and signal a condition variable on which the client
   waits.

 * Push the request out to the peers, passing the callback object as a
   parameter.  Drop into a loop, waiting on the condition, and break out
   of it when you have received "enough" responses.  Extract the
   collected results, destroy the callback object, and continue
   processing.

 * Peers push the response to the callback object on request completion.
   If the callback has already been destroyed, they should receive a
   CORBA::OBJECT_NOT_EXIST exception, which they can safely ignore.  Actually,
   the peer should ignore any CORBA exception -- you are working with what is
   effectively "best effort" semantics anyway.
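
To make that concrete, here is a rough sketch of what such a collector could
look like with omniORB's C++ mapping.  The ResponseCollector interface, its
push() operation, and the "enough" threshold are purely illustrative -- nothing
here is an omniORB API beyond the standard POA mapping and the omnithread
classes (omni_mutex, omni_condition) that ship with omniORB:

    // Assumed IDL, run through omniidl's C++ back-end, which generates
    // collector.hh / collectorSK.cc:
    //
    //   interface ResponseCollector {
    //     void push(in string peer_id, in string result);
    //   };

    #include <vector>
    #include <string>
    #include <omnithread.h>
    #include "collector.hh"

    class Collector_i : public POA_ResponseCollector,
                        public PortableServer::RefCountServantBase {
    public:
      Collector_i(unsigned long needed)
        : needed_(needed), cond_(&mutex_) {}

      // Invoked by the peers: record the result and wake the waiting client.
      void push(const char* peer_id, const char* result) {
        omni_mutex_lock l(mutex_);
        results_.push_back(std::string(result));
        if (results_.size() >= needed_)
          cond_.signal();
      }

      // Called by the client thread once the requests have been pushed out;
      // blocks until "enough" responses have arrived or the timeout expires.
      bool wait_for_enough(unsigned long timeout_secs) {
        unsigned long abs_sec, abs_nsec;
        omni_thread::get_time(&abs_sec, &abs_nsec, timeout_secs, 0);
        omni_mutex_lock l(mutex_);
        while (results_.size() < needed_) {
          if (!cond_.timedwait(abs_sec, abs_nsec))
            return false;                     // timed out
        }
        return true;
      }

    private:
      unsigned long            needed_;
      omni_mutex               mutex_;
      omni_condition           cond_;
      std::vector<std::string> results_;
    };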

This design won't scale well beyond the number of peers you are planning
for (actually, I have only used it with dozens of peers, rather than
hundreds, and I was waiting for *all* responses, with a timeout).  It does
have the nice property that it requires no new threads in the client, as
long as the ORB can process the incoming callback invocations while the
"main" request is blocked.

> I can send a copy of the paper we are submitting to IEEE SRDS if you are
> interested.

I'd be glad to look at it.

> > > I thought omniORB does not support AMI. Am I wrong?
> >
> > I don't know -- TAO has increasingly good support for it; 
> > comp.soft-sys.ace is where I have seen the benefits of AMI discussed.
> 
> We have already invested a lot of time in omniORB (and we really like it). I'm
> looking for a solution that uses omniORB. For example, can we send a dummy
> message to the port that waits for the responses to cancel these calls? Will
> this work? And of course, how do we find the port number that the waiting
> thread uses?
> 
> Or can I kill the thread and expect not to leave omniORB in an unstable
> state?

I wouldn't try that -- hard thread kills are dangerous for lots of reasons. 
Even if it happens to work in some circumstances, it can never be portable or
"safe":  think of locks held by the killed thread, for instance -- the OS
may or may not release them (most don't).

-- 
=========================================================
Tres Seaver  tseaver@digicool.com   tseaver@palladion.com