[omniORB] deadlock with distributed callback application

Lars Immisch lars@ibp.de
Tue, 5 Jun 2001 16:42:09 +0200


Dear Christof,

Thank you.

> omniORB only creates one thread on the server side for each incoming
> connection - this is fine as long as each client creates a new connection
> for concurrent requests to the same server. But with oneway requests the
> client side doesn't know when the server has finished processing the request
> (as there is no reply) and AFAIK omniORB doesn't create a new connection for
> oneways (or mark the connection as used) -- it just assumes that processing
> of oneway requests doesn't take long on the server.

That is what I thought, but I am not yet convinced that it is what I am seeing :-)

> The problem starts when the server is still busy with processing a oneway
> request while the client wants to send another two-way request (over the
> same connection). The client then has to wait for the reply from the server,
> but the server doesn't even see the request as it is still busy with
> processing the oneway request.
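
For illustration, a toy model of the scenario described above, using plain
Python threads and queues rather than any omniORB call. The Connection class
and its oneway/twoway methods are made up for this sketch; the only thing it
assumes is that a single thread serves all requests on a connection in order.

```python
import queue
import threading


class Connection:
    """Hypothetical stand-in: one worker thread serves this connection."""

    def __init__(self):
        self._requests = queue.Queue()
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        # Requests are handled strictly one after another.
        while True:
            handler, reply = self._requests.get()
            result = handler()
            if reply is not None:      # twoway: hand the result back
                reply.put(result)

    def oneway(self, handler):
        # No reply expected, so the caller returns immediately.
        self._requests.put((handler, None))

    def twoway(self, handler, timeout=None):
        # The caller blocks until the worker gets around to this request.
        reply = queue.Queue(maxsize=1)
        self._requests.put((handler, reply))
        return reply.get(timeout=timeout)  # raises queue.Empty on timeout


conn = Connection()
release = threading.Event()
conn.oneway(release.wait)              # a oneway that takes a long time

try:
    conn.twoway(lambda: "pong", timeout=0.2)
    blocked = False
except queue.Empty:
    blocked = True                     # stuck behind the oneway

release.set()                          # the oneway finishes
answer = conn.twoway(lambda: "pong", timeout=2.0)
print(blocked, answer)
```

In this model the twoway call only gets through once the oneway handler has
returned. A real deadlock would need the oneway handler itself to be waiting
on something the blocked caller has to do.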

I thought of that, but I deliberately don't invoke twoway requests from oneway
requests back to the client. This is why I suspected the LocateRequests were
causing my problem - if switched on, they turn a oneway request into a twoway
request, and I couldn't see whether they created a new connection. I need to
look into this more closely.
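
To make the suspicion concrete, here is a toy model of it in plain Python
(threads and queues, no omniORB calls). It is only a sketch of how things
could go wrong, not a confirmed description of what omniORB does. The Peer
class, the "notify"/"ack"/"locate" message names and LOCATE_TIMEOUT are all
invented for the example. It assumes that each peer serves its incoming
connection with one thread, that oneway handlers call back with a oneway, and
that a locate request goes out over the same, already busy connection.

```python
import queue
import threading

LOCATE_TIMEOUT = 0.5  # stands in for "blocks forever" so the demo terminates


class Peer:
    """Toy process: one thread serves the single incoming connection."""

    def __init__(self, locate_first):
        self.locate_first = locate_first
        self.other = None
        self.inbox = queue.Queue()        # requests arriving on the connection
        self.stalled = threading.Event()  # set if a locate got no reply
        self.done = threading.Event()     # set when the notify handler returns

    def start(self):
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            kind, reply = self.inbox.get()
            if kind == "locate":
                reply.put(True)
            elif kind == "notify":
                self.send_oneway("ack")   # oneway callback from a oneway handler
                self.done.set()
            # "ack" needs no processing

    def send_oneway(self, kind):
        if self.locate_first:
            # The locate is a twoway: wait for the other side to answer.
            reply = queue.Queue(maxsize=1)
            self.other.inbox.put(("locate", reply))
            try:
                reply.get(timeout=LOCATE_TIMEOUT)
            except queue.Empty:
                self.stalled.set()
                return
        self.other.inbox.put((kind, None))


def run(locate_first):
    a, b = Peer(locate_first), Peer(locate_first)
    a.other, b.other = b, a
    # Queue one oneway in each direction before either side starts serving.
    a.inbox.put(("notify", None))
    b.inbox.put(("notify", None))
    a.start()
    b.start()
    a.done.wait(5)
    b.done.wait(5)
    return a.stalled.is_set() or b.stalled.is_set()


stall_with_locate = run(locate_first=True)
stall_without_locate = run(locate_first=False)
print(stall_with_locate, stall_without_locate)
```

With the locate step enabled, each serving thread waits for a reply that only
the other, equally busy thread could send, so at least one side runs into the
timeout. Without it, the pure oneway callbacks never block.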

> > Our 'real' system deadlocks immediately when verifyObjectExistsAndType is  
> > enabled. When I look where it is hanging, both processes are blocked on the
> > select in tcpSocketStrand::ll_recv called from the _locateRequest inside
> > the omniObjRef::_invoke.
>
> And I guess they are processing oneway requests from each other...

Not as far as I can see - when the deadlock occurs, all other threads are
idle, i.e. blocked where they are expected to block.

> > My suspicion was that in this case, the LocateRequest is sent over a reused
> > connection, and the other oneway invocation gets into the way. But I
> > haven't been able to verify that - mainly because my attempts to recreate
> > the problem
>
> Hmm, I have hacked together some simple Python scripts that show a possible
> deadlock situation with oneways:

Thanks. It does indeed deadlock nicely, but it is not the problem I have here.
I am still investigating, though currently rather indirectly.

I will post a summary once I know what is going on.

Thanks,

Lars