[omniORB] omniORB exception and application crash in giopServer::removeConnectionAndWorker

Thu Jan 25 16:37:22 GMT 2018

On Mon, 2018-01-22 at 09:50 +0100, Markus Czernek via omniORB-list
wrote:

> we are using omniORB 4.2.2 on Windows Server 2012R2/2016 (x86 builds)
> for inter server process communication.
> Since we switched from orb 4.1.6 to orb 4.2.2 we saw several process
> crashes with the same callstack inside the orb implementation:

[...]
> omniORB422_vc9_rt!omni::giopServer::removeConnectionAndWorker+0x70
> [orb422src\omniorb\dist\src\lib\omniorb\orbcore\giopserver.cc @ 1053]

[...]
> FAULTING_SOURCE_CODE:  
>  
> > 1053:     pd_lock.lock();

[...]
> From source
> src\lib\omniorb\orbcore\giopserver.cc line 1048
> 
> conn->clearSelectable(); seems to be the line where the crash occurs.
> Could it be that the object giopConnection* conn is already in
> destruction so that calling
> conn->clearSelectable();
> results in a pure virtual function call?

Well actually, the crash dump says it is crashing on the line that
locks the mutex. Maybe that is just because it is offset a line, but
maybe it is a sign that the memory has become corrupted and this code
is an innocent victim of something trampling on it.

It is very hard to say what is causing the problem. Certainly your
suggestion that the connection object has already been deleted is
plausible, but I am not aware of anyone else encountering this.

Are you able to trigger the crash with omniORB running at
-ORBtraceLevel 25? If so, that will give a good idea of what is
happening at the time.

If that produces too much logging (or prevents the crash due to changed
timing!), we can look at producing some targeted logging to indicate
what is going on.

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --