[omniORB] Timed-out operation retries instead of issuing COMM_FAILURE

Claudio Natoli claudio.natoli@memetrics.com
Wed, 4 Jul 2001 17:21:18 +1000


Hello all,

I had a moment today to diagnose more fully (i.e. by capturing a full trace
and following the omniORB source) the behaviour that I described in my
previous message.

During the call to the proxy object's Foo method,
omniRemoteIdentity::dispatch is invoked twice. 

	On the first dispatch call, the call to
giop_c.isReUsingExistingConnection returns true. When the omniORB_Scavenger
thread detects a time-out during this call (after 5 seconds or so), it calls
shutdown on the strand, which ultimately causes an omniConnectionBroken
exception to be raised within the context of the dispatching thread, which
is caught within omniRemoteIdentity::dispatch. This catch block, noting that
giop_c.isReUsingExistingConnection returned true in this invocation of
dispatch, re-throws a TRANSIENT exception, which by the default behaviour of
the proxy object causes a retry.

	On the second (retry) dispatch call, the call to
giop_c.isReUsingExistingConnection returns false. When the remote call times
out again, the omniRemoteIdentity::dispatch catch block therefore throws a
COMM_FAILURE.
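
To make the sequence above concrete, here is a toy model of it in plain C++.
It is only a sketch of the control flow as I traced it, not omniORB code: the
exception types, dispatch_once and invoke_with_default_retry are hypothetical
stand-ins, and the remote call is assumed to time out on every attempt.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-ins for the CORBA system exceptions (not the omniORB classes).
struct Transient : std::runtime_error {
    Transient() : std::runtime_error("TRANSIENT") {}
};
struct CommFailure : std::runtime_error {
    CommFailure() : std::runtime_error("COMM_FAILURE") {}
};
// Models omniConnectionBroken, raised after the scavenger shuts the strand down.
struct ConnectionBroken {};

// One dispatch attempt. Assumption: the remote call always times out.
void dispatch_once(bool reusing_existing_connection) {
    try {
        throw ConnectionBroken{};
    } catch (const ConnectionBroken&) {
        // A broken reused connection is reported as TRANSIENT,
        // a broken fresh connection as COMM_FAILURE.
        if (reusing_existing_connection) throw Transient();
        throw CommFailure();
    }
}

// Models the proxy's default behaviour: retry on TRANSIENT.
// Returns the exceptions seen, in order.
std::vector<std::string> invoke_with_default_retry() {
    std::vector<std::string> trace;
    bool reusing = true;  // the first attempt reuses an existing connection
    for (;;) {
        try {
            dispatch_once(reusing);
            return trace;  // not reached in this model
        } catch (const Transient& e) {
            assert(reusing);  // TRANSIENT only arises on a reused connection
            trace.push_back(e.what());
            reusing = false;  // the retry opens a fresh connection
        } catch (const CommFailure& e) {
            trace.push_back(e.what());
            return trace;  // this is what the caller finally sees
        }
    }
}
```

In this model the caller only ever sees the final COMM_FAILURE, but two
dispatch attempts (and two time-out periods) have taken place by then.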


Now our question is: Is this behaviour correct and by-design?


What we had expected in this scenario, having read the omniORB connection
model docs, was that the first (only!) invocation of
omniRemoteIdentity::dispatch ought to result in a COMM_FAILURE some
clientCallTimeOutPeriod seconds after the method invocation on the proxy
object. 

However, I strongly suspect, given the widespread use of omniORB (not to
mention our limited experience with CORBA ;-), that we have misunderstood
the documentation and that the behaviour we are observing is by-design. 

As an aside, like many systems, the majority of operations in our system are
in no way idempotent, and although the particular semantics of failure modes
are not overly important (in our case anyway), understanding those semantics
exactly is crucial.
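
A toy illustration of why this matters to us (again plain C++, not omniORB
code; Account, call_that_times_out_after_execution and balance_after are
hypothetical names). It assumes the worst case, in which the server has
already executed the request by the time the client-side time-out fires. I
have not verified whether that is what happens in our trace.

```cpp
#include <cassert>

// Hypothetical server-side state with a non-idempotent operation.
struct Account {
    long balance = 0;
    void deposit(long amount) { balance += amount; }
};

// Assumption: the request reaches the server and is applied, but the reply
// does not arrive in time, so the client sees a time-out (returns false).
bool call_that_times_out_after_execution(Account& account, long amount) {
    account.deposit(amount);
    return false;
}

// Balance after the client has made up to max_attempts attempts at one
// logical deposit, stopping early if an attempt succeeds.
long balance_after(int max_attempts, long amount) {
    assert(max_attempts >= 1);
    Account account;
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        if (call_that_times_out_after_execution(account, amount)) break;
    }
    return account.balance;
}
```

Under that assumption, balance_after(1, 100) gives 100, whereas
balance_after(2, 100) gives 200: a single transparent retry applies the
deposit twice.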

Hoping not to be told off too badly for our (presumed) misunderstanding :-)
Claudio