[omniORB] Lost one-way calls

Gary D. Duzan gdd0@gte.com
Thu, 22 Apr 1999 09:57:45 -0400

   Right. I was just scanning the source code looking for the
behavior. It is a bit tough to sort through everything, but it appears
that you are correct and the scavenger algorithm doesn't take
dispatch time into account, so if a call takes longer than the
scavenger interval it will shut down your connection, discarding any
oneway messages queued up behind it in the TCP buffer. It seems to me
that it would be worth adding a "dispatching" flag to Strand and
having Strand::Sync::WrTimedLock() always set heartbeat to zero while
that flag is set. That way connections shouldn't be shut down
prematurely.
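
   To make the idea concrete, here is a rough toy model of it. This is
not omniORB's code: Connection, scavengerScan(), survivingOneways() and
the maxIdleScans threshold are all names I made up for illustration,
and the real Strand/scavenger logic is more involved. It only assumes
that the scavenger bumps a per-connection heartbeat counter on each
scan and closes the connection once the counter reaches a limit.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Toy stand-in for one connection; NOT omniORB's actual Strand class.
struct Connection {
    int  heartbeat   = 0;      // scavenger scans since the last activity
    bool dispatching = false;  // proposed flag: an upcall is in progress
    bool open        = true;
    std::deque<std::string> pending;  // oneway requests still queued
};

// One periodic scavenger scan. Returns true if the connection was closed.
// honourDispatching == false models the behaviour described above,
// honourDispatching == true models the proposed fix.
bool scavengerScan(Connection& c, int maxIdleScans, bool honourDispatching) {
    assert(maxIdleScans > 0);
    if (!c.open) return false;
    if (honourDispatching && c.dispatching) {
        c.heartbeat = 0;        // a running upcall counts as activity
        return false;
    }
    if (++c.heartbeat >= maxIdleScans) {
        c.open = false;
        c.pending.clear();      // queued oneways are lost with the connection
        return true;
    }
    return false;
}

// Simulate one long upcall that spans `scans` scavenger scans while
// `queued` oneway requests wait behind it. Returns how many survive.
std::size_t survivingOneways(bool honourDispatching, int scans, int queued) {
    Connection c;
    c.dispatching = true;
    for (int i = 0; i < queued; ++i) c.pending.push_back("oneway");
    for (int i = 0; i < scans; ++i) scavengerScan(c, 2, honourDispatching);
    c.dispatching = false;
    return c.pending.size();
}
```

   In this model, survivingOneways(false, 3, 5) loses all five queued
requests because the second scan closes the connection mid-dispatch,
while survivingOneways(true, 3, 5) keeps all five.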

					Gary Duzan
					GTE Laboratories

In Message <9F5DA2D8E614D211A21100805FA72C3AE4EA@DRKBFFTC0006> ,
   "Zurek, Jan" <Jan.Zurek@dresdner-bank.com> wrote:

=>we had exactly the same behaviour. Some of the calculation processes also
=>run on the local machine and it happens that they don't receive the
=>oneway-calls! So it doesn't matter if the processes are running on the same
=>or on different machines.
=>There are approx. 400 one-way calls for every calculation process during the
=>simulation. We suspect that it may depend on the idle time of a connection
=>when a calculation needs some time and no communication happens.
=>Plamen Neykov, Jan Zurek, Matthias Fengler
=>> -----Original Message-----
=>> From:	Teemu Torma [SMTP:tot@trema.com]
=>> Sent:	22 April 1999 14:11
=>> To:	Duncan Grisby
=>> Cc:	omniorb-list@orl.co.uk
=>> Subject:	Re: [omniORB] Lost one-way calls
=>> Duncan Grisby <dgrisby@uk.research.att.com> writes:
=>> > Oneway calls cannot throw any sort of exceptions, so if they fail you
=>> > never get to know about it. In fact, there are no guarantees that a
=>> > oneway call will actually happen. That said, omniORB does try quite
=>> > hard to send oneway calls -- can you give more details about what you
=>> > are doing when this problem occurs?  Also, is there a good reason why
=>> > you're using oneways rather than normal operations?
=>> I have also seen lost one-way calls without any reasonable
=>> explanation.  I understand that they are not guaranteed, but using a TCP
=>> connection on the local machine without any communication failures one
=>> would assume that nothing gets lost.
=>> I don't think that this is related to communication failure or lost
=>> exceptions, because in my case no exceptions were raised, and there
=>> were no communication failures.  Using normal calls everything works
=>> as expected. The one-way calls are simply silently lost somewhere, not
=>> very often (maybe 1 out of 50), but still.
=>> Teemu