[omniORB] Handling Physical Network Failures

Rob Kovalchik robk@marketorder.com
Fri, 23 Apr 1999 11:53:27 -0700


We have a client that must be able to detect when it can no longer
communicate with its server objects on a remote host, even when the cause
is a physical network failure.

If the network fails (e.g. we pull out the network cable) before the client
makes a call, we get a COMM_FAILURE exception, which we can handle, and all
is well.  If the network fails after the client makes a call but before we
get a response from the server, the client blocks in the call forever,
sometimes even after the network is restored.  This is similar to the recent
discussion on oneway calls, except that we are using normal calls and
talking about a physical network failure in the middle of the call.

Processing on the server for our application can take up to a few minutes
for some requests, increasing the likelihood of getting into this situation.

My questions:

1. How hard does an object try to return the results of a method invocation?

2. Does the scavenging of idle connections apply in this case, or can I
assume that the connection will not be shut down by either the client or
server since there is a call-in-flight (albeit a long call)?

3. How can I detect in the server object that the method results cannot be
returned?

4. If this problem cannot be solved directly, is there a way for the client
to either:
    a) limit the time a call may take to complete, returning an error if it
does not finish in time, or
    b) cancel a call-in-flight on one thread from a second thread?
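
To make option (a) concrete, here is roughly the behaviour we are after,
sketched outside the ORB in plain standard C++.  This is only an
illustration under assumptions: slow_remote_call and call_with_timeout are
hypothetical names, not omniORB functions, and the sketch does not cancel
anything.  The worker thread stays blocked in the call; the caller simply
stops waiting for it.

```cpp
#include <chrono>
#include <exception>
#include <future>
#include <memory>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for a blocking remote invocation (not an omniORB call).
int slow_remote_call(std::chrono::milliseconds work_time) {
    std::this_thread::sleep_for(work_time);
    return 42;
}

// Runs `call` on a detached worker thread and waits at most `limit` for it.
// Throws std::runtime_error on timeout. The worker is NOT cancelled: it keeps
// running (or blocking) and its eventual result is discarded.
// Assumes `call` returns a non-void value.
template <typename F>
auto call_with_timeout(F call, std::chrono::milliseconds limit)
    -> decltype(call()) {
    using Result = decltype(call());

    // The promise is shared so it outlives this function if we time out.
    auto promise = std::make_shared<std::promise<Result>>();
    std::future<Result> result = promise->get_future();

    std::thread([promise, call]() mutable {
        try {
            promise->set_value(call());
        } catch (...) {
            // Forward any exception from the call to the waiting thread.
            promise->set_exception(std::current_exception());
        }
    }).detach();

    if (result.wait_for(limit) == std::future_status::timeout) {
        throw std::runtime_error("call timed out");
    }
    return result.get();  // rethrows if the call itself threw
}
```

A caller would wrap the invocation in a lambda, for example
call_with_timeout([] { return slow_remote_call(std::chrono::milliseconds(10)); },
std::chrono::milliseconds(2000)), and treat the std::runtime_error as the
"call never completed" error.  The obvious drawback is that each timed-out
call leaves a thread stuck in the ORB, which is why we would prefer support
for this in the ORB itself.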

Thanks for any help,
Rob