[omniORB] Lockup on Solaris 2.5.1 with Y2K patch

David.Chung@USPTO.GOV David.Chung@USPTO.GOV
Wed, 23 Jun 1999 12:20:36 -0400


	Prior to the entry of main(), several global objects are created.
However, there is no interdependence among object constructors -- so,
any initialization prior to entry to main() probably is not the problem.

	After main()'s exit but prior to the actual termination of your
code, the static killer object will be destroyed.  For your code to
work, the destruction of killer must begin prior to the termination of
the worker thread AND the destruction of the mutex and the thread object
itself.  This may not be the order in which the actual object cleanup is
done -- the cleanup order is compiler and linker dependent.
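
	One way to stop depending on that order is to let a single object
own the thread together with the mutex and condition variable, so that
one destructor spells out the sequence: signal, join, then destroy.  The
following is only a sketch under that assumption.  WorkerOwner,
worker_owner_entry and the exits counter are hypothetical names, not part
of the original test program, and unlike the original the stop flag is
read with the mutex held.

```cpp
#include <cassert>
#include <pthread.h>
#include <time.h>

extern "C" void* worker_owner_entry(void* arg);

// Hypothetical owner: keeps the thread together with the mutex and condition
// variable it uses, so a single destructor spells out the cleanup order.
class WorkerOwner {
public:
    // exits: incremented by the worker just before it returns (illustration only).
    explicit WorkerOwner(int* exits) : stop_(false), exits_(exits) {
        int rc = pthread_mutex_init(&mutex_, 0);
        assert(rc == 0);
        rc = pthread_cond_init(&cond_, 0);
        assert(rc == 0);
        rc = pthread_create(&thread_, 0, worker_owner_entry, this);
        assert(rc == 0);
        (void)rc;  // avoids an "unused" warning when NDEBUG disables assert
    }

    ~WorkerOwner() {
        // 1. Ask the worker to stop and wake it if it is waiting.
        pthread_mutex_lock(&mutex_);
        stop_ = true;
        pthread_cond_signal(&cond_);
        pthread_mutex_unlock(&mutex_);
        // 2. Wait until the worker has really finished.
        pthread_join(thread_, 0);
        // 3. Only now destroy what the worker was using.
        pthread_cond_destroy(&cond_);
        pthread_mutex_destroy(&mutex_);
    }

    // Body of the worker thread: wakes about once a second until told to stop.
    void run() {
        pthread_mutex_lock(&mutex_);
        while (!stop_) {
            struct timespec abs;
            clock_gettime(CLOCK_REALTIME, &abs);
            abs.tv_sec += 1;
            pthread_cond_timedwait(&cond_, &mutex_, &abs);
        }
        pthread_mutex_unlock(&mutex_);
        ++*exits_;
    }

private:
    WorkerOwner(const WorkerOwner&);             // not copyable
    WorkerOwner& operator=(const WorkerOwner&);  // not assignable

    pthread_mutex_t mutex_;
    pthread_cond_t  cond_;
    pthread_t       thread_;
    bool            stop_;
    int*            exits_;
};

// Thread entry point with C linkage, as pthread_create expects.
extern "C" void* worker_owner_entry(void* arg) {
    static_cast<WorkerOwner*>(arg)->run();
    return 0;
}
```

	With something like this, one WorkerOwner (static or created in
main()) would stand in for the separate killer, mutex, cond and thr
globals, and the cleanup order among them would no longer be left to the
compiler and linker.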

	Take MSVC++ 5.0 for example.  According to the MSVC manual, static
objects are supposed to be destroyed in the reverse of the order of
their creation.  It seems simple enough -- but then, in which order are
static objects created?  This happens to depend on the order of linkage
of the source files.  It also follows that maintaining the order of
destruction with multiple source files will depend on the order of
their linkage.
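
	A common way around link-order dependence is to wrap each static in
an accessor function, so that it is constructed on first use rather than
at some link-dependent moment before main().  The sketch below assumes
standard C++ behaviour (function-local statics are constructed on the
first call and destroyed in reverse order of construction); I have not
checked how MSVC++ 5.0 handles it.  Tracked, construction_log,
sync_primitives and killer_instance are hypothetical names used only to
make the order visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log, used only to make the construction order visible.
std::vector<std::string>& construction_log() {
    static std::vector<std::string> log;
    return log;
}

// Stand-in for any object whose construction order matters.
struct Tracked {
    explicit Tracked(const std::string& name) {
        construction_log().push_back(name);
    }
};

// Stand-in for the mutex/condition variable the killer depends on.
Tracked& sync_primitives() {
    static Tracked instance("sync_primitives");
    return instance;
}

// Stand-in for the killer object.
Tracked& killer_instance() {
    // Touch the dependency first: it is then constructed earlier and,
    // being destroyed in reverse order, outlives the killer at exit.
    sync_primitives();
    assert(!construction_log().empty());
    static Tracked instance("killer");
    return instance;
}
```

	Whichever source file calls killer_instance() first, and in whatever
order the files are linked, sync_primitives is constructed before killer.
Note that thread-safe initialization of such local statics is only
guaranteed from C++11 on, so with older compilers the first call should
happen before any additional threads are started.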

------------------------------------------------------------------------

	It seems to me that you are creating and destroying threads "behind
the scenes", so that the application programmers who link against or use
your code need not be concerned about initializing or destroying the
worker thread in main().  To keep creation and destruction of the thread
invisible to those who merely want to use your library without caring
about its thread activities, you might want to supplement your technique
with something else -- perhaps your worker thread can acquire the exit
code of your main() and terminate on zero condition.


> -----Original Message-----
> From:	jfpoilpret@hn.vnn.vn [SMTP:jfpoilpret@hn.vnn.vn]
> Sent:	Tuesday, June 22, 1999 8:41 PM
> To:	omniorb-list@uk.research.att.com
> Subject:	Re: [omniORB] Lockup on Solaris 2.5.1 with Y2K patch
>
> Hi,
>
> It seems to me that in your sample there is something you do not (cannot)
> master: when the "theInstance" of class killer is created.
> Anyway, it is certain it is created BEFORE main(), i.e. before you get
> the opportunity to initialize the cond and mutex. I'm sorry because I
> don't know much about pthreads, so I don't know if this kind of
> initializing is necessary in this case (maybe it only initializes
> structures to 0; in such a case, it is just the same as the way global
> variables will be initialized before main() starts). In fact, maybe this
> is the issue: are you sure that cond and mutex are all zeros before
> "theInstance" gets instantiated?
>
> Just my two cents ;-)
>
>     Jean-François
>
> -----Original Message-----
> From: Sai-Lai Lo <S.Lo@uk.research.att.com>
> To: omniorb-list@uk.research.att.com <omniorb-list@uk.research.att.com>
> Date: Tuesday, 22 June 1999 06:05
> Subject: Re: [omniORB] Lockup on Solaris 2.5.1 with Y2K patch
>
>
> >The problem seems to occur only on one machine at our site. I'm able to
> >produce a lockup with this simple test:
> >
> >The problematic machine is a dual processor ultra 2. I've tested the same
> >program on a 14-processor, a 4-processor, and various uniprocessor solaris
> >machines. Only one machine has this problem. So the problem may just be
> >a special combination of patches. In any case, it is worth checking if this
> >is causing problem with your machines.
> >
> >Sai-Lai
> >
> >------------------- cut here  -------------------------------------------
> >// $ CC -o thr thr.cc -mt -lposix4
> >// $ while [ 1 ]; do ./thr; done
> >#include <pthread.h>
> >#include <iostream.h>
> >#include <time.h>
> >
> >pthread_cond_t  cond;
> >pthread_mutex_t mutex;
> >pthread_t       thr;
> >int             died = 0;
> >
> >extern "C"
> >void*
> >worker(void* ptr)
> >{
> >  while (!died) {
> >    struct timespec abs;
> >    clock_gettime(CLOCK_REALTIME,&abs);
> >    abs.tv_sec += 1;
> >    pthread_mutex_lock(&mutex);
> >    pthread_cond_timedwait(&cond,&mutex,&abs);
> >    pthread_mutex_unlock(&mutex);
> >  }
> >  cerr << "worker exit" << endl;
> >  pthread_exit(0);
> >  return 0;
> >}
> >
> >class killer {
> >public:
> >  ~killer() {
> >    void** status = 0;
> >    pthread_mutex_lock(&mutex);
> >    died = 1;
> >    pthread_mutex_unlock(&mutex);
> >    pthread_join(thr,status);
> >    cerr << "killer done." << endl;
> >  }
> >  static killer theInstance;
> >};
> >
> >killer killer::theInstance;
> >
> >
> >
> >int
> >main(int,char**)
> >{
> >  pthread_cond_init(&cond,0);
> >  pthread_mutex_init(&mutex,0);
> >
> >  pthread_attr_t attr;
> >
> >  pthread_attr_init(&attr);
> >
> >  pthread_create(&thr,&attr,worker,0);
> >  return 0;
> >}
> >--------------------------------------------------------------------
> >
> >
> >
> >--
> >Sai-Lai Lo                                   S.Lo@uk.research.att.com
> >AT&T Laboratories Cambridge           WWW:   http://www.uk.research.att.com
> >24a Trumpington Street                Tel:   +44 1223 343000
> >Cambridge CB2 1QA                     Fax:   +44 1223 313542
> >ENGLAND
> >
> >
> >
>