[omniORB] high Windows sockets strike again

Fri Sep 24 14:08:39 BST 2004

Hi,
    this time I have to reply to myself :-(
I apologize for this, but further investigation showed that I was wrong
to blame the socket number for the failure. High socket numbers are handled
correctly in fd_set; what actually fails is getpeername(), even though
select() succeeds (it reports one socket ready, i.e. connected).
The value of ERRNO is WSAENOTCONN, despite what select() reported.
I also noticed that redoing the whole connection usually succeeds.
The overall context is a burst of threads (say 20) fired at once, each one
attempting to connect to the same target object (actually hosted by another
process on the same host). Only a variable number of them succeed
(around 12).
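
Since redoing the whole connection usually succeeds, a bounded retry might
serve as a stopgap. The following is only a minimal C++ sketch of that idea,
not omniORB code: connectWithRetry and AttemptResult are hypothetical names,
and the caller-supplied attempt callable is assumed to open a fresh socket,
connect, wait in select() and then verify the peer with getpeername(),
closing the socket and reporting NotConnected when that check fails with
WSAENOTCONN.

```cpp
#include <cassert>
#include <functional>

// Outcome of one complete connection attempt (hypothetical type).
enum class AttemptResult { Connected, NotConnected, FatalError };

// Hypothetical helper, not part of omniORB.
// Calls `attempt` until it reports Connected, gives up on FatalError,
// and retries on NotConnected, at most `maxAttempts` times in total.
// Returns the number of attempts used on success, or 0 on failure.
int connectWithRetry(const std::function<AttemptResult()>& attempt,
                     int maxAttempts) {
    assert(maxAttempts > 0);
    for (int i = 1; i <= maxAttempts; ++i) {
        switch (attempt()) {
        case AttemptResult::Connected:
            return i;          // verified connection
        case AttemptResult::FatalError:
            return 0;          // retrying would not help
        case AttemptResult::NotConnected:
            break;             // select() said ready, peer check failed: retry
        }
    }
    return 0;                  // attempts exhausted
}
```

The explicit upper bound on attempts is there so that a persistent failure
cannot turn into an endless loop.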
Thanks,

Renzo


----- Original Message ----- 
From: "Renzo Tomaselli" <renzo.tomaselli at tecnotp.it>
To: "Omniorb list" <omniorb-list at omniorb-support.com>
Sent: Thursday, September 23, 2004 4:07 PM
Subject: [omniORB] high Windows sockets strike again


> Hi all,
>     a couple of months ago I raised the issue of not being able to run
> OmniORB 4.04 on some Windows platforms because of high socket numbers we
> occasionally receive. Thus we switched back to 4.03.
> Now it seems that a similar problem occurs from time to time even on 4.03.
> We use our own alternative transport for security reasons, but that damned
> tcp must stay there anyway for legacy reasons (Duncan says :-)
> Then we noticed that from time to time, outgoing connections were
> performed from tcp as well.
> Following the whole story through the MSVC debugger, I noticed that
> sometimes we get high socket numbers (say higher than 3000). Since
> FD_SETSIZE is set to 2048, select fails.
> This made our transport fail, which caused giopRope to fall back to
> tcp, which got a lower socket number and connected successfully.
> I'm considering patching our transport by looping over socket() until I
> get something < 2048, but I would prefer a cleaner solution.
> I don't even know whether this approach would lead to a potential infinite
> loop or not.
> Btw, I guess that when having just one transport, this loop occurs anyway
> by rolling over pd_address_in_use (see giopRope.cc, line 559).
> Any comment is welcome,
>
> Renzo Tomaselli