[omniORB] Py_ServantLocator Core Dump

Gary Pennington Gary.Pennington@uk.sun.com
Mon, 18 Sep 2000 18:55:35 +0100


Hi,

I've done some more investigating. Still no debug build, unfortunately, but
I have got some information.

I tried replacing the body of my postinvoke method with a simple pass, i.e.

    def postinvoke(self, oid, poa, operation, cookie, serv):
        #For certain operations, we need to persist changes to the servant
        #Again use the factory to do this
        if operation in self.persistList:
            log("postinvoke: Persisting servant for "+str(oid))
            self.factory.persist(serv)

was converted into

    def postinvoke(self, oid, poa, operation, cookie, serv):
        pass


I then repeated my tests many times. When I use the pass implementation
I never get a single core dump. However, as soon as I revert to the original
implementation the core dumps resume.

The persist method takes quite a long time. Could there be a race condition
somewhere that is being triggered by the comparatively long time spent in
postinvoke?

Some explanation.

Most operations on the objects accessed do not need to be persisted, i.e. they
do not change the state of the object. However, some operations do require
persistence and I thought it would be neat to have the postinvoke check to
see if persisting the state of the servant was required. It would appear that
this is causing problems. Has anyone any idea why?

More information

self.factory.persist acquires a lock on entry and releases it on exit.
To get the servant's state, I use __getstate__ to copy the serv dict object
and delete the "__omni_svt" entry from the copied dictionary.
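
Roughly, the persist path described above amounts to something like the
sketch below. The class names, the pickle-based in-memory store and the lock
type are assumptions made up for illustration; only persist, __getstate__ and
the "__omni_svt" key come from the description above.

```python
import pickle
import threading


class Servant:
    """Hypothetical stand-in; a real servant derives from an omniORBpy skeleton."""

    def __init__(self, name):
        self.name = name
        # Placeholder for the internal entry kept under "__omni_svt".
        self.__dict__["__omni_svt"] = object()

    def __getstate__(self):
        # Work on a copy so the live servant keeps its "__omni_svt" entry.
        state = self.__dict__.copy()
        state.pop("__omni_svt", None)
        return state


class Factory:
    """Hypothetical factory with a lock held for the whole of persist()."""

    def __init__(self):
        self._lock = threading.Lock()
        self.store = {}  # assumed in-memory store, keyed by servant name

    def persist(self, serv):
        # Locked on entry, released on exit (including on exceptions).
        with self._lock:
            self.store[serv.name] = pickle.dumps(serv.__getstate__())
```

The point of copying before deleting is that the servant passed to postinvoke
is never modified; only the copy loses its "__omni_svt" entry.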

I guess I can work around this problem by invoking the update directly from
the methods that cause updates, but that is not as elegant, so I would like
to make the postinvoke approach work if it's possible.
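
A minimal sketch of that workaround, assuming a decorator is acceptable: each
state-changing method persists the servant itself once it has finished, and
read-only methods are left alone. The names persists, RecordingFactory and
Account are hypothetical and exist only for this example.

```python
import functools


def persists(method):
    """Call self.factory.persist(self) after the wrapped method returns."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        self.factory.persist(self)
        return result
    return wrapper


class RecordingFactory:
    """Hypothetical factory that only records which servants were persisted."""

    def __init__(self):
        self.persisted = []

    def persist(self, serv):
        self.persisted.append(serv)


class Account:
    """Hypothetical servant with one read-only and one updating operation."""

    def __init__(self, factory):
        self.factory = factory
        self.balance = 0

    def get_balance(self):
        # Read-only: no persistence needed.
        return self.balance

    @persists
    def deposit(self, amount):
        # Changes state, so the decorator persists afterwards.
        self.balance += amount
```

This keeps the list of persisting operations next to the methods themselves
rather than in a persistList checked from postinvoke.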

Gary

Gary Pennington wrote:

> Duncan Grisby wrote:
>
> > On Friday 15 September, Gary Pennington wrote:
> >
> > > I have some code which implements the Evictor pattern using Servant
> > > Locators (lifted from Henning/Vinoski and adapted to Python). The
> > > application is multi-threaded and I am using multiple POA and
> > > POAManagers to try and optimize throughput on SMP boxes.
> >
> > It doesn't have anything to do with your core dump, but do you realise
> > that Python has a global interpreter lock?  Only one thread can be
> > executing Python code at a time -- every few bytecode instructions,
> > the interpreter lock is unlocked to switch to a different thread. This
> > means that when your server is running Python code, only one processor
> > will be running at a time. Other processors will still be able to do
> > other omniORB tasks, like the early stages of operation dispatch, so
> > you will still get some advantages from the multiple processors.
>
> I was aware of the Python global lock. I implemented as I did for two
> reasons :-
>
> 1. Maybe in the future this limitation will be removed and my code will
> still be around and will be "ready to go".
> 2. Large parts of the processing are outside of Python, and they can be
> multi-threaded (as you describe above.)
>
> >
> >
> > > When the system is under heavy load, i.e. load testing, I get a core
> > > dump in the server process. The core indicates that the problem arises
> > > in :-
> > >
> > > -----------------  lwp# 12 / thread# 162  --------------------
> > >  ff0afa5c
> > > __0fRPy_ServantLocatorKpostinvokeRC65OPortableServerIObjectIdP65OPorta
> > > bleServerDPOAPCcPvP65OPortableServerLServantBase (36a930, fe00fa3c,
> > > 33fbe0, fe00
> > > fbe4, a087c, 4ba610) + 40
> >
> > I don't know what's going on here. There is very little code in
> > Py_ServantLocator::postinvoke() which isn't calls into other
> > functions. If anything was invalid, I would expect it to coredump in
> > some sub-function. Can you build a version of omniORBpy with debugging
> > information so we can see exactly where the error happens?
>
> I tried, I can't get omniORB to compile on my system. I need to spend more
> time investigating this and will try to put together a debug build next
> week so that I can investigate it thoroughly.
>
> Gary
>
> >
> >
> > Cheers,
> >
> > Duncan.
> >
> > --
> >  -- Duncan Grisby  \  Research Engineer  --
> >   -- AT&T Laboratories Cambridge          --
> >    -- http://www.uk.research.att.com/~dpg1 --