[omniORB] Lost one-way calls

Sai-Lai Lo S.Lo@uk.research.att.com
22 Apr 1999 16:18:48 +0100


>>>>> Gary D Duzan writes:

>    Right. I was just scanning the source code looking for the
> behavior. It is a bit tough to sort through everything, but it appears
> that you are correct and the scavenger algorithm doesn't take into
> account dispatch time, so if a call takes longer than the scavenger
> interval it will shutdown your connection, discarding any oneway
> messages queued up behind it in the TCP buffer. It seems to me that it
> would be worth it to add a "dispatching" flag to Strand and have
> Strand::Sync::WrTimedLock() always set heartbeat to zero if it is set.
> That way we shouldn't have any prematurely shutdown connections.

Let me fill in a few more details.

On the server side this is what happens:

The server thread blocks on socket recv(). When it has got a complete
request and is about to dispatch an upcall, it acquires the WrLock().
Once the WrLock() is acquired, the scavenger thread won't touch the
connection. Therefore the connection would not go away while the server
thread is working inside an object implementation.

It is, however, possible for the scavenger to shut down the connection
while the server thread is still receiving parts of a request
message. This is because the WrLock() is not held at that point.
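
To make the above concrete, here is a minimal sketch of the locking
behaviour. It is not the omniORB source: ToyConnection, try_wr_lock(),
wr_unlock(), scavenge_once() and the max_idle_scans limit are made-up
names for illustration, and the "lock" is just an atomic flag so the
model can be stepped through in a single thread.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a server-side connection (not omniORB's Strand).
struct ToyConnection {
    std::atomic<bool> wr_locked{false};  // true while an upcall is dispatched
    int idle_scans = 0;                  // scavenger passes that found it unlocked
    bool closed = false;
};

// The server thread calls this once it has a complete request.
bool try_wr_lock(ToyConnection& c) { return !c.wr_locked.exchange(true); }

// The server thread calls this when the upcall has returned.
void wr_unlock(ToyConnection& c) { c.wr_locked.store(false); }

// One scavenger pass. Returns true if this pass shut the connection down.
// max_idle_scans is an assumed policy knob, not a real omniORB parameter.
bool scavenge_once(ToyConnection& c, int max_idle_scans) {
    assert(max_idle_scans > 0);
    // A locked connection is inside an upcall, so the scavenger leaves it alone.
    if (c.closed || !try_wr_lock(c)) return false;
    // Not locked: the server thread is idle or still reading a request.
    bool shut = (++c.idle_scans >= max_idle_scans);
    if (shut) c.closed = true;
    wr_unlock(c);
    return shut;
}
```

In this model a connection that holds the lock survives any number of
scavenger passes, while one that is only part-way through receiving a
request is closed after max_idle_scans passes.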

The reason behind this behaviour is to protect the server from
denial-of-service attacks. Think of a malicious or buggy client that sends
only part of a request message and then stops. The server should not wait
indefinitely for bits that will never come. The current behaviour is
that the scavenger cleans up these wedged connections.

I do take your point that it would be better for the scavenger to sync the
timeout with the moment the server thread receives the first byte of a
request message. However, this would not solve the race condition in
general, because the scavenger may close the connection at the same time as
the client starts sending bytes down its TCP socket. The client may have
finished the send() call before the TCP close status is propagated back
to the client's OS.
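
The client side of this race can be seen with plain POSIX sockets. The
following is only a sketch under stated assumptions: a POSIX system with
a working loopback interface, and a made-up helper name
send_after_peer_close(). It has nothing omniORB-specific in it. send()
reports success once the bytes have been copied into the local socket
buffer, so the first send() issued after the peer has closed would
normally still succeed, and the oneway message is silently lost.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: returns what the client's send() reports when it is
// issued after the server end has already closed the connection.
// Returns -2 if the socket setup itself fails.
long send_after_peer_close() {
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) return -2;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // let the kernel pick a free port
    socklen_t len = sizeof(addr);
    if (bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(listener, 1) < 0 ||
        getsockname(listener, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        close(listener);
        return -2;
    }

    int client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0 ||
        connect(client, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        if (client >= 0) close(client);
        close(listener);
        return -2;
    }

    int server = accept(listener, nullptr, nullptr);
    if (server < 0) {
        close(client);
        close(listener);
        return -2;
    }

    // The "scavenger" closes the server end of the idle connection.
    close(server);

    // The client, unaware of the close, sends a oneway request.
    const char msg[] = "oneway request";
    ssize_t n = send(client, msg, sizeof(msg), 0);

    close(client);
    close(listener);
    return static_cast<long>(n);
}
```

A positive return value here means the client was told the bytes were
sent even though nothing on the server side will ever read them. With a
oneway call there is no reply to wait for, so the client has no later
point at which it would notice the loss.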


Sai-Lai


-- 
Sai-Lai Lo                                   S.Lo@uk.research.att.com
AT&T Laboratories Cambridge           WWW:   http://www.uk.research.att.com 
24a Trumpington Street                Tel:   +44 223 343000
Cambridge CB2 1QA                     Fax:   +44 223 313542
ENGLAND