[omniORB] Lockup on Solaris 2.5.1 with Y2K patch

Bruce Visscher visschb@rjrt.com
Wed, 23 Jun 1999 12:00:33 -0400



Poilpret Jean Francois wrote:

> Hi,
>
> It seems to me that there is something in your sample that you do not (cannot) control: when "theInstance" of class killer is created.
> Anyway, it is certainly created BEFORE main(), i.e. before you get the opportunity to initialize the cond and mutex. I'm sorry, I don't know much about pthreads, so I don't know whether this kind of initialization is necessary in this case (maybe it only initializes structures to 0; in that case, it is just the same as the way global variables are initialized before main() starts). In fact, maybe this is the issue: are you sure that cond and mutex are all zeros before "theInstance" gets instantiated?
>
> Just my two cents ;-)
>

Actually, the pthread data structures shouldn't be a problem, IMHO.  They are all PODs so should be initialized to 0 automatically at program startup (I don't know about Sun, but this is normally fairly easy to get right: just tell the linker to create a 0 filled section of memory when the program is first activated).
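
A minimal sketch of how one could check the zero-fill claim on a given platform. The names g_mutex, g_cond, all_zero and statics_are_zeroed are made up for illustration and are not part of the test program quoted below. Note that this only checks the byte contents; POSIX itself only promises a usable mutex or condition variable after pthread_mutex_init/pthread_cond_init or the PTHREAD_*_INITIALIZER macros, so all-zero bytes being a valid initial state is a platform assumption.

```cpp
#include <pthread.h>
#include <cstddef>

// Objects with static storage duration, declared like the globals in
// the quoted test program (names here are hypothetical).
static pthread_mutex_t g_mutex;
static pthread_cond_t  g_cond;

// Returns true if every byte of the n-byte object at p is zero.
static bool all_zero(const void* p, std::size_t n)
{
    const unsigned char* bytes = static_cast<const unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i) {
        if (bytes[i] != 0) {
            return false;
        }
    }
    return true;
}

// Checks that the uninitialized statics above are zero-filled.
// Only meaningful if called before anything initializes or locks them.
bool statics_are_zeroed()
{
    return all_zero(&g_mutex, sizeof g_mutex)
        && all_zero(&g_cond,  sizeof g_cond);
}
```

Calling statics_are_zeroed() first thing in main() (or from the constructor of a static object) would show whether the memory really is zero at that point on the machine in question.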

I was wondering about the usage of cerr, though.  It is accessed from the destructor of a static variable.  I believe this is supposed to work (but the wording of the standard seems a little ambiguous to me) but maybe there's a bug?  You could essentially be trying (under the covers) to call atexit from a function (killer::~killer) that was registered via atexit.
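
One defensive pattern, sketched here with standard <iostream> rather than the <iostream.h> used in the quoted program: give the class an std::ios_base::Init member. Constructing an Init object guarantees the standard streams have been constructed, and because members are destroyed only after the destructor body has run, the write to cerr happens while that Init object still exists. The class name reporter and the helper capture_reporter_output are hypothetical, and this does not reproduce or diagnose the Solaris lockup; it only illustrates the cerr-in-destructor concern.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the `killer` class in the quoted program.
class reporter {
public:
    // The destructor body runs before the members are destroyed, so
    // keep_streams_ is still alive while we write to cerr.
    ~reporter() { std::cerr << "reporter done." << std::endl; }

private:
    // Constructing an ios_base::Init ensures cin/cout/cerr/clog exist.
    std::ios_base::Init keep_streams_;
};

// Demonstration helper: runs a reporter's destructor with cerr
// temporarily redirected into a string, and returns what was written.
std::string capture_reporter_output()
{
    std::ostringstream captured;
    std::streambuf* old_buf = std::cerr.rdbuf(captured.rdbuf());
    {
        reporter r;
    }   // r's destructor runs here
    std::cerr.rdbuf(old_buf);   // restore the real cerr buffer
    return captured.str();
}
```

A static instance (static reporter theInstance; at namespace scope) would exercise the same destructor during program exit, which is the situation in the quoted test.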

>
>     Jean-François
>
> -----Original Message-----
> From: Sai-Lai Lo <S.Lo@uk.research.att.com>
> To: omniorb-list@uk.research.att.com <omniorb-list@uk.research.att.com>
> Date: mardi 22 juin 1999 06:05
> Subject: Re: [omniORB] Lockup on Solaris 2.5.1 with Y2K patch
>
> >The problem seems to occur only on one machine at our site. I'm able to
> >produce a lockup with this simple test:
> >
> >The problematic machine is a dual processor ultra 2. I've tested the same
> >program on a 14-processor, a 4-processor, and various uniprocessor solaris
> >machines. Only one machine has this problem. So the problem may just be
> >a special combination of patches. In any case, it is worth checking if this
> >is causing problems with your machines.
> >
> >Sai-Lai
> >
> >------------------- cut here  -------------------------------------------
> >// $ CC -o thr thr.cc -mt -lposix4
> >// $ while [ 1 ]; do ./thr; done
> >#include <pthread.h>
> >#include <iostream.h>
> >#include <time.h>
> >
> >pthread_cond_t  cond;
> >pthread_mutex_t mutex;
> >pthread_t       thr;
> >int             died = 0;
> >
> >extern "C"
> >void*
> >worker(void* ptr)
> >{
> >  while (!died) {
> >    struct timespec abs;
> >    clock_gettime(CLOCK_REALTIME,&abs);
> >    abs.tv_sec += 1;
> >    pthread_mutex_lock(&mutex);
> >    pthread_cond_timedwait(&cond,&mutex,&abs);
> >    pthread_mutex_unlock(&mutex);
> >  }
> >  cerr << "worker exit" << endl;
> >  pthread_exit(0);
> >  return 0;
> >}
> >
> >class killer {
> >public:
> >  ~killer() {
> >    void** status = 0;
> >    pthread_mutex_lock(&mutex);
> >    died = 1;
> >    pthread_mutex_unlock(&mutex);
> >    pthread_join(thr,status);
> >    cerr << "killer done." << endl;
> >  }
> >  static killer theInstance;
> >};
> >
> >killer killer::theInstance;
> >
> >
> >
> >int
> >main(int,char**)
> >{
> >  pthread_cond_init(&cond,0);
> >  pthread_mutex_init(&mutex,0);
> >
> >  pthread_attr_t attr;
> >
> >  pthread_attr_init(&attr);
> >
> >  pthread_create(&thr,&attr,worker,0);
> >  return 0;
> >}
> >--------------------------------------------------------------------
> >
> >
> >
> >--
> >Sai-Lai Lo                                   S.Lo@uk.research.att.com
> >AT&T Laboratories Cambridge           WWW:   http://www.uk.research.att.com
> >24a Trumpington Street                Tel:   +44 1223 343000
> >Cambridge CB2 1QA                     Fax:   +44 1223 313542
> >ENGLAND
> >
> >
> >