[omniORB] Problem shutting down server on AIX 4.3.2

Thomas Zumbiehl zumbiehl@bvassociates.fr
Mon Aug 5 12:41:01 2002


I ran some tests using the eg2 example from the omniORB distribution.
Here is what I did:
 - I modified the echo.idl, eg2_clt and eg2_impl files to add a stop()
method to the interface, and to have the client call that method after
echo(). In eg2_impl I did two things: I implemented the stop() method to
shut down the ORB, and I added a signal handler for SIGUSR1. I also added
a "shutdown" thread that waits for an event: in case (a) it waits on a
pthread_cond (POSIX semaphores are not implemented on AIX 4.3.2); in
case (b) it waits on a global flag.
 - I compiled the ORB and the eg2 examples with g++ and with Visual Age
6.
 - I ran the following tests:
   1. Launch eg2_impl in the background, wait 1 s, then send it the
SIGUSR1 signal.
   2. Launch eg2_impl in the background, then start eg2_clt (which calls
the stop() method on the interface).
   3. In both cases, repeat 10,000 times.

Here are the results:
  - test1-a (pthread_cond) with Visual Age 6: 206 core dumps,
  - test1-a with g++: 155 core dumps,
  - test1-b (global flag) with Visual Age 6: no core dump,
  - test1-b with g++: could not be run,
  - test2 with Visual Age 6: no core dump,
  - test2 with g++: 100% core dumps; I could not get past 326 runs
because eg2_impl would not stop. It seems that my g++ is buggy. I
will redo the test with gcc-3.1 retrieved from the UCLA library.

Anyway, it seems that one of the best ways to shut down a server on an AIX
machine is to use a global flag, as you said.

I will continue testing and keep you informed.

Cheers,
Thomas

-----Original Message-----
From: dpg1@grisby.org [mailto:dpg1@grisby.org] On behalf of Duncan Grisby
Sent: Friday, 2 August 2002 13:48
To: Thomas Zumbiehl
Cc: OmniORB List (E-mail)
Subject: Re: [omniORB] Problem shutting down server on AIX 4.3.2


On Tuesday 30 July, "Thomas Zumbiehl" wrote:

> I have software built on a 3-tier architecture: one client, one main server
> and one slave server. I am trying to give the slave server a smooth
> shutdown. I do this by catching the USR1, USR2 and TERM signals. When
> I get one of those, I do an ORB shutdown (that was my first attempt). This
> works fine on Linux 2.2 and Solaris 7, but it doesn't work on AIX (I get an
> abort after the ORB has been shut down).
> I tried the solution of a "shutdown thread" waiting on a semaphore. I didn't
> get any better result.

You can't call shutdown from inside a signal handler. You've just been
lucky so far if it hasn't blown up on Linux and Solaris. When you say
you've tried with a thread waiting on a semaphore, do you mean a POSIX
semaphore?  An omni_semaphore won't do. The only primitive that is
guaranteed signal-safe is the POSIX semaphore (and even then it isn't
always). Have you got any AIX documentation to say what's signal safe
and what isn't?
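
For illustration, the POSIX semaphore approach could look roughly like the
sketch below. sem_post() is on the POSIX list of async-signal-safe
functions, so the handler only posts and the waiting thread does the real
work. All names here (g_shutdown_sem, post_shutdown_sem and so on) are
hypothetical, and the sketch assumes a platform that actually provides
unnamed POSIX semaphores, which per Thomas's report AIX 4.3.2 does not.

```cpp
#include <cerrno>
#include <csignal>
#include <semaphore.h>

// Hypothetical global semaphore shared by the handler and the waiter.
static sem_t g_shutdown_sem;

// Signal handler: only calls sem_post(), preserving errno around it.
extern "C" void post_shutdown_sem(int) {
    int saved_errno = errno;
    sem_post(&g_shutdown_sem);
    errno = saved_errno;
}

// Initialises the semaphore with a count of zero; call before installing
// the handler.
bool init_shutdown_sem() {
    return sem_init(&g_shutdown_sem, 0, 0) == 0;
}

// Installs the handler for one signal number, e.g. SIGUSR1.
void install_shutdown_handler(int signo) {
    std::signal(signo, post_shutdown_sem);
}

// Blocks until the handler has posted. sem_wait() can return early with
// EINTR when a signal arrives, so retry in that case.
bool wait_for_shutdown_sem() {
    while (sem_wait(&g_shutdown_sem) == -1) {
        if (errno != EINTR) return false;
    }
    return true;
}
```

A dedicated shutdown thread would call wait_for_shutdown_sem() and, once it
returns true, shut the ORB down from ordinary thread context rather than
from the handler.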

As a real hack, if you don't need to shut down immediately you get a
signal, just quite soon, you could use a global flag to indicate that
you should shut down. Then you have a thread that wakes up every
second or so and checks the flag. I suppose that could still cause
problems on a multi-processor machine if the shutdown thread spotted
the flag changing while the signal handler was still running, but it
would probably be OK.
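
A minimal sketch of the global-flag idea follows. It assumes a C++11
compiler, which is newer than the compilers discussed in this thread, and
uses a lock-free std::atomic<int> so that the flag can be written from the
handler and read from another thread. The names g_stop_requested,
on_shutdown_signal and wait_for_stop_request are hypothetical, and the
shutdown action is passed in as a callback so the sketch does not depend on
any ORB headers.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <csignal>
#include <functional>
#include <thread>

// Lock-free atomics may be used from signal handlers, and reading one from
// another thread is not a data race.
static_assert(ATOMIC_INT_LOCK_FREE == 2, "need a lock-free atomic int");
static std::atomic<int> g_stop_requested{0};

// Signal handler: does nothing except set the flag.
extern "C" void on_shutdown_signal(int) {
    g_stop_requested.store(1);
}

// Installs the handler for one signal number, e.g. SIGUSR1.
void install_flag_handler(int signo) {
    std::signal(signo, on_shutdown_signal);
}

// Body of the shutdown thread: wakes up every poll_interval, checks the
// flag, and runs shutdown_action in normal thread context once it is set.
void wait_for_stop_request(const std::function<void()>& shutdown_action,
                           std::chrono::milliseconds poll_interval) {
    assert(shutdown_action);  // an empty callback is a programming error
    while (g_stop_requested.load() == 0) {
        std::this_thread::sleep_for(poll_interval);
    }
    shutdown_action();
}
```

In a server, shutdown_action would be whatever performs the ORB shutdown,
and the poll interval trades shutdown latency against wake-ups ("every
second or so" in the text above).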

As a final suggestion, can you use a CORBA call to kill the slave
server, rather than giving it a signal?

Cheers,

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan@grisby.org     --
   -- http://www.grisby.org --