[omniORB] Long discourse/proposal: omniOrb Etherealize/Incarnate races

David Riddoch djr@uk.research.att.com
Thu, 12 Jul 2001 11:07:20 +0100 (BST)

Hi William,

Thanks for the epic!  In short, you have identified a genuine problem, and
your suggested fix is about right.  The main difficulty is in implementing
it in a way that doesn't increase per-object memory usage or damage
performance too much.

Your analysis of the omniORB code is spot-on.  Where you've make a query
I've added an explaination.  More info below ...


On Wed, 11 Jul 2001, William H Jones wrote:

> Summary
>   Synchronization between closely-spaced etherealization and incarnate
> operations on a single object is less than completely perfect.  Objects
> which propagate state between incarnations can be corrupted.  Object
> systems which interlock between etherealization and incarnation (most
> easily, on the first etherealization/incarnation cycle) can deadlock.
> Discussion of the Present State of OmniOrb303
>   The sychronization between servant activator etherealize and incarnate
> functionality seems to be somewhat less than fully exact.
>   As near as I can determine, the standard process of omniORB303 is as
> follows.
> 1.  Some outside mechanism (in my case, a Least Recently Used Evictor
>     mechanism) determines that an object is of some appropriate state
>     and invokes the deactivate_object member function in presentation
>     by the Portable Object Adapter (POA) serving the servant.
> 2.  After some preliminary actions to determine the validity of the
>     proposed operation and otherwise prepare for actions to come,
>     deactivate_object invokes omni::deactivateObject, which takes the
>     following further actions.
>     a)  The omniLocalIdentity object associated with the deactivating
>         servant is identified through a hash table.
>     b)  The omniLocalIdentity object is examined to determine whether
>         or not there are, in fact, local object references to the
>         servant (presumably the result of _this() operations).  If this
>         is the case, the following further actions are taken.
>           i)  A new omniLocalIdentity object is obtained having the same
>               key values as those of the identity object previously
>               identified, but with none of the other connections to the
>               existing servant.  The newly-obtained identity object is
>               substituted for the original in the hash table, implicitly
>               removing the original identity object from the hash table.
>          ii)  A loop is executed to convert each existing local object
>               reference over to a remote object reference through a
>               defaultLoopBack rope facility (whatever that is).
>               I'm not sure what this does, but I suppose it might circumvent
>               direct object reference to servant object member function calls
>               and route things back through dispatch_to_sa so that the
>               incarnate mechanism will be invoked should a local method
>               delivery occur.

In omniORB 3, local references to objects that are not in the Active
Object Map (hash table) use the TCP loopback.  This was done because it
allows a number of simplifications to various parts of the ORB.  Certain
call paths only have to be deal with considering the `remote' case (and
don't have to worry about the `local' case).  The servant activator stuff
is an example.

>         If there were no local object references, the following alternative
>         actions are taken.
>           i)  The identified omniLocalIdentity object is extracted from the
>               hash table.
>         Note that, regardless of which set of actions is taken, the
>         original, identified omniLocalIdentity object is extracted from
>         the hash table.  A new local identity giving a remote location
>         for the object, but no connection to the actual local servant
>         components, may or may not exist in the hash table.
>     c)  A backward edge from the servant to the original identity object
>         is broken and a pointer to the identity object is returned as the
>         functional result of the omni::deactivateObject operation.
> 3.  The deactivate member function in presentation by the returned (by
>     pointer) omniLocalIdentity object is invoked.  This serves merely to
>     decrement a pending method (?) counter maintained by the identity
>     object.
>     If this counter reaches 0, there are no method deliveries pending
>     for (or currently being executed by) the servant and, thus, the
>     servant would be considered idle.
> 4.  The removeFromOAObjList member function in presentation by the returned
>     omniLocalIdentity object is invoked to extract that object from a
>     doubly-linked list maintained by the POA (?) of objects which it is
>     serving.
>     I'm not sure what this list does, but it seems to have more to do with
>     rounding up servants when the POA dies than with dispatching methods
>     to the servant.

Yes, this list exists solely so we can find all the objects managed by a
POA when that POA is shutdown.

> 5.  If the object appears to be idle (as revealed by the method count
>     maintained by the identity object), the following actions are taken.
>     a)  If a servant activator can be identified, a task to etherealize
>         the servant is queued; otherwise, the reference count of the
>         servant is simply decremented, possibly resulting in the
>         destruction of the servant should this be, in fact, the last
>         reference to the servant.
>     b)  The identify object is discarded by invoking its own die member
>         function.
>     If the object is not idle, the following alternative actions are
>     taken.
>     a)  detached_object is invoked, which simply increments a counter
>         of objects that have been detached from the POA.  I'm sure this
>         is wonderful, but I don't really understand just what flows
>         from this as a result.

This is important because when shutting down a POA we have to know when
all objects in that POA have finished being etherealised.  When a
`detached object' is etherealised, the counter is decremented, and the
appropriate synchronisation with the thread that is shutting down the POA
then happens (if the POA is being shutdown).

> 6.  At this point, if you are not terribly lucky, a method delivery for
>     the object now queued for etherealization now arrives.  Apparently,
>     nothing in the delivery process checks for the existence of the object
>     until it arrives at the dispatch_to_sa member function presented by
>     the POA.  Various checks to assure the validity of the proposed
>     method delivery, etc., are performed.
>     One of these preparatory operations is the obtaining of a lock on 
>     the mutual exclusion lock serializing access to the attached servant
>     activator object.  Why this lock is obtained at such an early point,
>     especially since it is not yet certain that the servant activator
>     object will actually be accessed, is unclear.

The reason the lock is taken early is to avoid deadlock.  Because this
lock is held whilst making an up-call into application code, it is
possible that _any_ of the ORBs locks might be taken when the application
code calls back down into the ORB.  Therefore the servant_activator_lock
has to come first in the partial order of locks taken.

Thus the servant_activator_lock has to be taken before any other ORB
locks.  You will notice that it is taken just before taking POA.pd_lock,
and so no earlier than it absolutely has to be.

> 7.  The hash table from which the original omniLocalIdentity object was
>     extracted is examined to determine if such a local identity object
>     presently exists.  If such an identity object exists and it actually
>     has a connection to a servant, then the method is dispatched to the
>     servant (through the identity object) and functional return is made;
>     however, if no such identity exists, or if the identity exists, but
>     does not have a connection to the servant, as in the case of the
>     new identity substituted in the has table to satisfy the needs of
>     local object references, then the following actions are taken.
>     a)  The incarnate member function in presentation by the servant
>         activator attached to the POA is invoked to create a fresh
>         incarnation of the servant.
>         Note that this invokation on the servant activator is protected
>         by a mutual exclusion lock upon which a lock was obtain during
>         the preparation phase of the dispatch_to_sa member function.
>     b)  The method is delivered to the newly-reincarnated servant and
>         functional return is made.  Possible exceptions to the method
>         execution are fielded, converted as appropriate, and rethrown.
> The Difficulty
> 1.  The problem between etherealization and incarnation operations arises
>     because the appropriate omniLocalIdentity (with the connections to the
>     local servant object) is extracted from the hash table (the active
>     object map?) just prior to (or, if method deliveries are still pending,
>     possibly considerably before) the queueing of the etherealization task.
> 2.  The absense of this identity object from the hash table is then taken
>     by dispatch_to_sa as a sign that etherealization has occurred when,
>     in fact there is some finite probability that the etherealization act,
>     itself, has only been queued and is still pending.
> 3.  A mutual exclusion lock is held on the entire servant activator object
>     while the dispatch_to_sa/incarnate complex is executing.  If a queued
>     etherealize action attempts to execute during that time (even if it is
>     not etherealizing the object being simulatenously incarnated), it will
>     be blocked until that incarnation is completed.  This can lead to the
>     reversal of the sequence: etherealizing an object after its (supposedly)
>     subseqent reincarnation.
> 4.  If the act of incarnation requires that the previous etherealization
>     be complete (for example, to obtain the prior state of the object in
>     its re-incarnated form, as is the case in my own application), then
>     this mis-synchronization of the two acts leads to probably invalid
>     operation and, possibly, to a deadlock.
>     a)  If an etherealization/incarnation cycle occurred prior to the
>         pending etherealization task, a mis-synchronized incarnate can
>         run to completion, but will pick up a possibly out-of-date object
>         state.  Presuming that the pending etherealization does run while
>         the re-incarnated object fields method deliveries, the subsequent
>         etherealization of the re-incarnated object will overwrite that
>         state with the now corrupted state of the object.
>     b)  If no previous etherealization recorded a previous object state,
>         the incarnate operation may hang or fail for lack of an object
>         state to restore.  Attempts to wait for the etherealization task
>         to complete are futile since the dispatch_to_sa/incarnate operation
>         holds a blocking lock on the servant activator object.  Should
>         the etherealization task be dequeued and executed, it will block
>         pending the completion of the incarnate task, which itself is
>         waiting for the completion of the block etherealization task.

You are absolutely right, this is a fault in omniORB.  Someone noticed
this back in November '99


Here's my reply at the time:


I'm afraid that I didn't fully appreciate back then just how bad the
effect would be.  Your explaination above makes it very clear, thanks.

>   On can, of course, scoff at the improbability of such a race condition
> ever resulting in such a deadlock.  The window of opportunity is narrow,
> comprising merely the time necessary for an independent thread to dequeue
> the etherealization task and execute it.  Scoff, scoff, scoff.  It, quite
> naturally, happened to me on my first test case.

;-)  Actually I think that it is quite likely to happen, so I'm not
suprised you encountered it!  It just depends on whether the dispatch
thread or the etherealisation thread gets scheduled first.

> Solutions
>   Unfortunately, this is a little tougher.  I am hesitant to propose code
> adjustments since I have only been studying omniOrb303 source code for a
> few days.
>   Further, commentary in the source code is sparse and alludes, to the
> extent that I understand it, to a place-holder in the active object map
> (the hash table?).  Commentary concludes with the laconic
>   "Ignore for now..."

The solution is certainly not trivial, and that bit of the ORB is
suprisingly subtle and complex!

>   Finally, perhaps omniORB4 has treated this problem?

I don't know, but probably not yet.  Duncan?

>   Perhaps the simplest solution that occurs to this newbie mind is as
> follows.
> 1.  Add a pointer in the omniLocalIdentity object to a pending
>     etherealization task element and a backward pointer from that
>     element to the identity object.  By default, both pointers will
>     be null.
> 2.  Leave an identification object in the hash table corresponding to the
>     etherealizing servant.
>     a)  In the event that object references to the servant exist, this can
>         be the new identity object substituted by the present code.
>     b)  If no such object references existed, a new identity object with
>         only the appropriate object keys (and no servant connections) can
>         be left in the hash table.
>         I have not investigated the full ramifications of this.  I hope that
>         leaving an unconnected-in-any-way identification object won't break
>         everything in sight and beyond.  There is, unfortunately, a lot of
>         code somewhat beyond the visible horizon.
> 3.  Without regard to which option left an omniLocalIdentity object in the
>     hash table, adjust add_object_to_etherialization_queue code to form
>     connections from the identification object to the etherealization task
>     element when that task is queued.
> 4.  Adjust code in the etherealization task element doit() code to break
>     the connections between identification and task element when the task
>     completes.
> 5.  Adjust code in omniOrbPOA::dispatch_to_sa to examine any found
>     omniLocalIdentity object for the presense of a etherealization task.
>     Presumably there is a lock of the task queue that will have to be held
>     before this examination can be made.  A lock on the servant activator
>     should further synchronize access to these decision elements since
>     that will either pervent the task from running if it is not already,
>     or delay the dispatch_to_sa code until that running task completes.
>     If such an etherealization task is found to exist, dequeue it and
>     run it to completion from dispatch_to_sa before proceeding on to
>     examine the identification object.  If the identification object
>     reveals no connection to a servant, incarnate may now be run with
>     the assurance that the corresponding etherealize has been completed.
> 6.  (Optional) Adjust omniOrbPOA::dispatch_to_sa code to delay acquisition
>     of the servant_activator_lock until it is clear that incarnate will
>     necessarily be run.  

As explained above, not possible.  The only way to mitigate this problem
is to not be holding a lock when we make the up-call, and use some a
counter and condvar to provide the mutual exclusion.  Of course, this
means taking more locks overall, and so may not be a win.

>     I do not see an immediate reason to suspend etherealization operations
>     each time a method is to be dispatched to a servant in the purview of
>     the servant activator.  A method may initiate object deactivation, but
>     I see no cause of conflict between that and an ongoing etherealization
>     of some other object.

Dictated by the standard I'm afraid.  incarnate() and etherealize() must
be serealised and mutually exclusive (CORBA 2.3 11.3.5).

Anyway, your suggested fix is certainly along the right lines, and roughly
in line with what I came up with.  But I'm not convinced it is a complete
solution just yet.  And given that omniORB 4 is in the offing, it is
may be worth targetting any major effort at that ...