[omniORB] deadlock

Duncan Grisby duncan at grisby.org
Tue Sep 27 17:55:06 BST 2016


On Fri, 2016-09-16 at 09:48 +0200, Michael Teske via omniORB-list wrote:

> we recently experienced a deadlock in omniORB 4.2.1. I list the stack traces of the
> relevant threads here:

[...]

> #4  omni::omniOrbPOA::synchronise_request (this=this@entry=0x14c7140, lid=lid@entry=0x14d8080) at poa.cc:2906

> This leads to Thread 14 having pd_lock and wanting *omni::internalLock
>           and Thread 96 having *omni::internalLock and wanting pd_lock.   
> 
> Mutexes should always be locked in the same order.

Indeed they should. POA::pd_lock comes before omni::internalLock in the
partial lock order, so the code in synchronise_request is wrong to try
to acquire them in the opposite order. It's quite rare for
synchronise_request to be called at all, because it's only invoked when
a POA is in holding state, so this bug slipped through the net.
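
To illustrate the ordering rule in general terms, here is a minimal
sketch using std::mutex. It is not omniORB code: poa_lock and
internal_lock are hypothetical stand-ins for POA::pd_lock and
omni::internalLock, and path_a, path_b and run_demo are names made up
for the example. Both paths take the locks in the same order, so
neither thread can end up holding one lock while waiting for the other.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins, not the real omniORB objects.
std::mutex poa_lock;       // plays the role of POA::pd_lock (acquired first)
std::mutex internal_lock;  // plays the role of omni::internalLock (acquired second)

int shared_counter = 0;    // guarded by internal_lock

// Every path that needs both locks takes poa_lock before internal_lock.
void path_a() {
    std::lock_guard<std::mutex> outer(poa_lock);
    std::lock_guard<std::mutex> inner(internal_lock);
    ++shared_counter;
}

void path_b() {
    // Same order as path_a. Swapping these two lines would recreate
    // the kind of lock inversion described in the report.
    std::lock_guard<std::mutex> outer(poa_lock);
    std::lock_guard<std::mutex> inner(internal_lock);
    ++shared_counter;
}

// Runs both paths concurrently and returns the final counter value.
int run_demo(int iterations) {
    shared_counter = 0;
    std::thread t1([iterations] { for (int i = 0; i < iterations; ++i) path_a(); });
    std::thread t2([iterations] { for (int i = 0; i < iterations; ++i) path_b(); });
    t1.join();
    t2.join();
    return shared_counter;
}
```

Calling run_demo(1000) from a main function should finish without
hanging, since both threads respect the same order.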

The right fix is actually not to acquire pd_lock at all. The
omniLocalIdentity object it is checking is protected by
omni::internalLock, so it doesn't need pd_lock. I've checked a fix in to
the 4_2 branch.
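
The idea behind that kind of fix can be sketched as follows. This is
not the actual change in poa.cc: Identity, is_usable and deactivate are
hypothetical names, and internal_lock merely stands in for
omni::internalLock. The point is only that code touching state guarded
by one lock needs that lock and no other, which removes the second
lock from the picture and with it any ordering problem.

```cpp
#include <mutex>

// Hypothetical stand-in for omni::internalLock.
std::mutex internal_lock;

// Hypothetical stand-in for omniLocalIdentity.
struct Identity {
    bool deactivated = false;  // guarded by internal_lock
};

// The state read here is guarded by internal_lock alone, so that is
// the only lock taken. No second lock means no ordering to get wrong.
bool is_usable(const Identity& id) {
    std::lock_guard<std::mutex> guard(internal_lock);
    return !id.deactivated;
}

// Writers take the same single lock.
void deactivate(Identity& id) {
    std::lock_guard<std::mutex> guard(internal_lock);
    id.deactivated = true;
}
```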

Thanks for the bug report.

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --




