[omniORB] Failure to connect to server/execute user functions

Judy Anderson yduj@harlequin.com
Thu, 11 Mar 99 17:27:07 EST


We're having an intermittent connection problem which of course only
occurs when we use our large application, and doesn't seem to occur
when we try to reproduce it with a small test case.

In particular, what's happening is that upon the client's attempt to
get a handle on the server object, we will intermittently receive a
communications failure.  Originally this was happening about 1/3 of
the time, but it seems to have receded back to 5% (worse, from a
debugging standpoint).  So, I have a batch file which launches 40
clients with start/b, and one or two of them will fail to connect.
Interestingly, if I crank the debug level on the server, I get
messages that show me that all 40 clients succeeded in getting a
thread started for them, but only 38 of them will get the printout
from my user function.  So, something is going wrong somewhere
intermittently, between the receiving of the connection, and the
calling of the user function.  The clients are, of course, identical,
and so there is no particular reason for one or another to fail.  It
is simply random.
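
The launch-and-count procedure above can be sketched roughly as follows.
This is Python rather than the actual batch file (which is not shown
here), and it assumes a hypothetical client command that exits with a
nonzero status when it receives the communications failure. The names
launch_clients and summarize are illustrative only.

```python
import subprocess
import sys


def launch_clients(cmd, count=40):
    """Start `count` copies of `cmd` at once, then collect their exit codes."""
    # Start every client before waiting on any of them, so they all hit
    # the server concurrently (similar in spirit to `start /b` in a batch file).
    procs = [subprocess.Popen(cmd) for _ in range(count)]
    return [p.wait() for p in procs]


def summarize(exit_codes):
    """Return (number of successes, indices of the clients that failed)."""
    # Assumption: a client signals a failed connection with a nonzero exit code.
    failed = [i for i, code in enumerate(exit_codes) if code != 0]
    return len(exit_codes) - len(failed), failed


if __name__ == "__main__" and len(sys.argv) > 1:
    # The client command line is taken from the arguments; it is a placeholder.
    ok, failed = summarize(launch_clients(sys.argv[1:]))
    print(f"{ok} clients succeeded; failed client indices: {failed}")
```

Running this in a loop and comparing the failure count against the
number of user-function printouts in the server log would give a rough
failure rate for an intermittent problem like this one.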

Likely, the communications failure is simply an indication that there
was a problem -- as I have whined before, when there is a bug, I don't
get into the debugger, but rather the thread handling the operation
simply aborts or exits or something, and throws a communications
failure to the client, and so it is difficult to tell exactly what
might be wrong.

You can imagine that I am somewhat frustrated by this bug.  Does
anybody have any clues?

I know!  We'll just tell our users, "be gentle with the server, it has
a delicate constitution."

					Judy Anderson "yduJ"
					yduJ@harlequin.com
					617-374-2547