[omniORB] Application gets stuck when using thread pool connection mode with bidirectional GIOP

kolos serguei.kolos at cern.ch
Mon Jan 31 13:11:56 UTC 2022


Hi Duncan,

Thank you for your help. Please find a log file in Google Drive:
https://drive.google.com/file/d/17Ux9HVmyBe8U-CkJk__gFgcWijU6SoOW/view?usp=sharing

This application, called RootController, receives commands from another process. On each command it starts a predefined number of processes on a set of remote computers by sending requests to the CORBA servers running on those computers. This is exactly where the BIDIR communication happens, because the processes are started asynchronously: the RootController asks the remote CORBA servers to start new processes, and when the processes have started, these servers send notifications back to the RootController.

When the BIDIR protocol is used, the RootController very often gets into a state where it takes more than a minute to process a new command. The error shows up as the following pattern in the log file:
First, a message is printed indicating a new connection attempt from the application that sends commands to this one. It looks like:

  omniORB: (1) 2022-01-31 13:36:02.679839: Server accepted connection from giop:tcp:188.184.2.99:46166

But the actual processing of this request happens only a minute later:

  omniORB: (42) 2022-01-31 13:37:01.521949: Accepted connection from giop:tcp:188.184.2.99:46166 because of this rule: "* unix,tcp,bidir"
  omniORB: (32) 2022-01-31 13:37:01.522292: Dispatching remote call 'makeTransition' to: root/rc/commander<SwRodTest.rc.commander.RootController> (active)

I kept the RootController running for a while and made multiple attempts to send commands to it, so the log file contains several cases where external commands were processed with a very long delay.
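
In case it helps when going through the log, here is a minimal Python sketch of how such cases can be picked out. It pairs each "Server accepted connection from ..." line with the later "Accepted connection from ... because of this rule" line for the same peer and reports the gap between them. The name find_delays and the one-second threshold are my own choices for illustration, not anything provided by omniORB, and the regular expressions assume the line shape shown in the excerpt above (traceThreadId and traceTime enabled).

```python
import re
from datetime import datetime

# Line shape assumed from the log excerpt above:
#   omniORB: (<thread id>) <date> <time>: <message>
LINE = re.compile(
    r"^omniORB: \((?P<thread>\d+)\) "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<msg>.*)$"
)
# First message: the connection has been accepted at the socket level.
ACCEPTED = re.compile(r"^Server accepted connection from (?P<peer>\S+)")
# Second message: a thread has picked the connection up and applied a rule.
HANDLED = re.compile(
    r"^Accepted connection from (?P<peer>\S+) because of this rule"
)


def find_delays(lines, threshold_seconds=1.0):
    """Return (peer, seconds) pairs where handling lagged behind acceptance.

    Hypothetical helper: threshold_seconds is an arbitrary cut-off.
    """
    pending = {}  # peer address -> timestamp of the "Server accepted" line
    delays = []
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
        msg = match["msg"]

        accepted = ACCEPTED.match(msg)
        if accepted:
            pending[accepted["peer"]] = ts
            continue

        handled = HANDLED.match(msg)
        if handled and handled["peer"] in pending:
            gap = (ts - pending.pop(handled["peer"])).total_seconds()
            if gap >= threshold_seconds:
                delays.append((handled["peer"], gap))
    return delays


# The two lines quoted above, used as sample input.
SAMPLE = [
    "omniORB: (1) 2022-01-31 13:36:02.679839: Server accepted connection "
    "from giop:tcp:188.184.2.99:46166",
    "omniORB: (42) 2022-01-31 13:37:01.521949: Accepted connection from "
    'giop:tcp:188.184.2.99:46166 because of this rule: "* unix,tcp,bidir"',
]

for peer, gap in find_delays(SAMPLE):
    print(f"{peer}: handled {gap:.3f} s after being accepted")
```

To run it over the full log, pass the open file object instead of SAMPLE, since find_delays only needs an iterable of lines.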

Just for the record, here is a log file of the same RootController application with bidirectional GIOP disabled. In this case everything works as expected.
https://drive.google.com/file/d/1ZzRyRSH1MWg_LxquFNSkb-3Hax1sU59j/view?usp=sharing

Cheers,
Serguei

> On 30 Jan 2022, at 15:24, Duncan Grisby <duncan at grisby.org> wrote:
> 
> Hi Serguei,
> 
> It's not a known issue, although the combination of bidirectional with thread pool mode is somewhat unusual. Bidirectional GIOP requires extra threads to manage the interactions on the bidirectional connections, so something is probably interacting badly.
> 
> Please can you get traces from the server with these tracing options:
> 
>   traceLevel 25
>   traceInvocations 1
>   traceInvocationReturns 1
>   traceThreadId 1
>   traceTime 1
> 
> That will give a lot of output, but it should make it possible to see what is going on.
> 
> Duncan.
> 
> 
> On Fri, 2022-01-28 at 16:05 +0100, kolos via omniORB-list wrote:
>> Hi guys,
>> 
>> Can anyone please help me with the following problem?
>> 
>> We are using omniORB 4.2.5 on CentOS 7 Linux and we have recently tried to use bidir GIOP for some of our
>> services. Everything works fine for the servers that use thread-per-connection mode, but the applications that
>> use thread-pool mode get stuck when the number of external connections reaches the number of threads in the
>> pool. There is a clear correlation between these numbers, which we verified by varying the number of threads in the
>> pool. The symptom is that when the number of external connections reaches the number of threads in the pool, the
>> server starts accumulating connections in the CLOSE_WAIT state and any attempt to communicate with it results
>> in a timeout. After a while (a few tens of seconds) all connections in the CLOSE_WAIT state disappear and the
>> server becomes responsive again.
>> 
>> Is this a known behaviour? If not, where should I look to investigate and possibly fix this problem?
>> 
>> Cheers,
>> Serguei
>> _______________________________________________
>> omniORB-list mailing list
>> omniORB-list at omniorb-support.com
>> https://www.omniorb-support.com/mailman/listinfo/omniorb-list
> 
> -- 
>  -- Duncan Grisby --
>   -- duncan at grisby.org --
>    -- http://www.grisby.org --
> 
