[omniORB] Failure to connect to server/execute user functions

Sai-Lai Lo S.Lo@uk.research.att.com
12 Mar 1999 10:41:08 +0000


Remember that by default we have the scavengers scanning incoming and
outgoing connections every 30 seconds. They may close down connections when
they decide that the connections are idle. To eliminate this factor in
your debugging, turn off the scavengers by doing this:

  omniORB::idleConnectionScanPeriod(omniORB::idleIncoming,0);
  omniORB::idleConnectionScanPeriod(omniORB::idleOutgoing,0);

If what you are doing is very computationally intensive, it may be the case
that some server threads never get scheduled before the incoming scavenger
decides the connection is idle. If that is the case, it may be worth putting
a thread yield call in appropriate places to kick the NT scheduler into
action.
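
As a rough illustration of the yield idea, here is a minimal sketch of a
long computation that gives up the CPU every so often. It uses
std::this_thread::yield() from standard C++ purely as a stand-in so the
example is self-contained; in a real server you would use whatever yield
call your thread library provides. The function name sum_of_squares and
the yield_every parameter are made up for this example and are not part
of omniORB.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for a long-running user function on the server.
// Sums i*i for i in [0, n) and yields the CPU after every `yield_every`
// iterations so that other runnable threads get a chance to be scheduled.
std::uint64_t sum_of_squares(std::uint64_t n, std::uint64_t yield_every) {
    assert(yield_every > 0);  // a zero interval would divide by zero below
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i * i;
        if ((i + 1) % yield_every == 0) {
            // Hint to the scheduler that other threads may run now.
            std::this_thread::yield();
        }
    }
    return total;
}
```

How often to yield is a tuning matter: too rarely and the other threads
may still starve, too often and the computation slows down for no gain.
A yield is only a hint to the scheduler, so this may reduce the problem
rather than remove it.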

Sai-Lai


>>>>> Judy Anderson writes:

> We're having an intermittent connection problem which of course only
> occurs when we use our large application, and doesn't seem to occur
> when we try to reproduce it with a small test case.

> In particular, what's happening is that upon the client's attempt to
> get a handle on the server object, we will intermittently receive a
> communications failure.  Originally this was happening about 1/3 of
> the time, but it seems to have receded back to 5% (worse, from a
> debugging standpoint).  So, I have a batch file which launches 40
> clients with start/b, and one or two of them will fail to connect.
> Interestingly, if I crank the debug level on the server, I get
> messages that show me that all 40 clients succeeded in getting a
> thread started for them, but only 38 of them will get the printout
> from my user function.  So, something is going wrong somewhere
> intermittently, between the receiving of the connection, and the
> calling of the user function.  The clients are, of course, identical,
> and so there is no particular reason for one or another to fail.  It
> is simply random.

> Likely, the communications failure is simply an indication that there
> was a problem -- as I have whined before, when there is a bug, I don't
> get into the debugger, but rather the thread handling the operation
> simply aborts or exits or something, and throws a communications
> failure to the client, and so it is difficult to tell exactly what
> might be wrong.

> You can imagine that I am somewhat frustrated by this bug.  Does
> anybody have any clues?

> I know!  We'll just tell our users, "be gentle with the server, it has
> a delicate constitution."

> 					Judy Anderson "yduJ"
> 					yduJ@harlequin.com
> 					617-374-2547




-- 
Sai-Lai Lo                                   S.Lo@uk.research.att.com
AT&T Laboratories Cambridge           WWW:   http://www.uk.research.att.com 
24a Trumpington Street                Tel:   +44 223 343000
Cambridge CB2 1QA                     Fax:   +44 223 313542
ENGLAND