From: Jeff Waller
To: cygwin-developers AT cygwin DOT com
Date: Mon, 04 Jun 2001 05:09:32 -0400
Subject: pthread_cond_timedwait, etc
Message-ID: <3B1B504C.8090808@141monkeys.org>

The attempt to port bind9 has uncovered some serious problems with pthread. Probably one of the major differences between bind 8 and bind 9 is the use of threads, not only in named but also in the resolver. In fact, the lightweight resolver is a hard link to named, which is another problem with the build process, BTW.

Also, it seems dig and nslookup are threaded. They share some of the same source code, and it appears that they segfault in exactly the same place: the call to pthread_cond_timedwait. Using gdb, it appears that the segfault occurs on the last line of the following; note the FIXME comment.

    // FIXME: pshared mutexs have the cond count in the shared memory area.
    // We need to accomodate that.
    int
    __pthread_cond_timedwait (pthread_cond_t * cond, pthread_mutex_t * mutex,
                              const struct timespec *abstime)
    {
    // and yes cond_access here is still open to a race. (we increment, context swap,
    // broadcast occurs - we miss the broadcast. the functions aren't split properly.
      int rv;
      if (!abstime)
        return EINVAL;
      pthread_mutex **themutex = NULL;
      if (*mutex == PTHREAD_MUTEX_INITIALIZER)
        __pthread_mutex_init (mutex, NULL);
      if ((((pshared_mutex *)(mutex))->flags & SYS_BASE == SYS_BASE))
        // a pshared mutex
        themutex = __pthread_mutex_getpshared (mutex);
      if (!verifyable_object_isvalid (*themutex, PTHREAD_MUTEX_MAGIC))
        return EINVAL;

Even taken out of context as it is, this is obviously buggy: themutex is initialized to NULL and is only reassigned to a "valid" value if the mutex is a pshared mutex. If it is not, themutex is left equal to NULL, and the *themutex in the last line dereferences a NULL pointer. And in fact, when the above

    pthread_mutex **themutex = NULL;

is replaced with

    pthread_mutex_t *themutex = mutex;

to mimic the initialization that takes place in pthread_cond_wait, the segmentation fault goes away and dig runs part-way successfully, but not totally:

    $ ./dig 141monkeys.org

    ; <<>> DiG 9.1.2 <<>> 141monkeys.org
    ;; global options: printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38854
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;141monkeys.org. IN A

    ;; AUTHORITY SECTION:
    141monkeys.org. 60 IN SOA bubba.141monkeys.org. root.141monkeys.org. 36 300 120 21600 60

    ;; Query time: 270 msec
    ;; SERVER: 192.168.1.1#53(192.168.1.1)
    ;; WHEN: Mon Jun 4 04:17:02 2001
    ;; MSG SIZE rcvd: 79

    0 [unknown (0xFFFCEE81)] dig 199739 pthread_cond::BroadCast: Broadcast called with invalid mutex

From grepping through thread.cc, this message is raised in pthread_cond::BroadCast (), which appears to be called from

    int
    __pthread_cond_broadcast (pthread_cond_t * cond)
    {
      if (!verifyable_object_isvalid (*cond, PTHREAD_COND_MAGIC))
        return EINVAL;

      (*cond)->BroadCast ();

      return 0;
    }

Perhaps it is a matter of not getting the mutex properly from the shared area, as is done in __pthread_cond_timedwait? Unfortunately, the exact context could not be determined: using gdb caused the program to freeze, and eventually the machine had to be rebooted.
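
For anyone who wants to see the themutex initialization issue in isolation, here is a minimal sketch in plain C of the one-line change described above. Everything named demo_* (and the magic value) is a hypothetical stand-in invented for illustration, not code from thread.cc; the only things carried over from the quoted function are the shape of the logic and the assumption that a mutex handle is a pointer to the underlying object.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Cygwin internals quoted in this message.
   None of these definitions are taken from thread.cc. */
#define DEMO_MUTEX_MAGIC 0x4d555458u    /* made-up magic value */

struct demo_mutex { unsigned magic; };
typedef struct demo_mutex *demo_mutex_t; /* assumed: a handle is a pointer */

/* Stand-in for verifyable_object_isvalid(): non-NULL and magic matches. */
static int demo_isvalid(const struct demo_mutex *obj, unsigned magic)
{
    return obj != NULL && obj->magic == magic;
}

/* Stand-in for __pthread_mutex_getpshared(): simply the identity here. */
static demo_mutex_t *demo_getpshared(demo_mutex_t *mutex)
{
    return mutex;
}

/* Mirrors the proposed change: start from the caller's mutex and override
   it only in the process-shared case, so that the validity check never
   dereferences a NULL themutex. Returns 0 on success, EINVAL otherwise. */
int demo_resolve_mutex(demo_mutex_t *mutex, int is_pshared, demo_mutex_t **out)
{
    assert(out != NULL);                /* caller must supply a result slot */
    if (mutex == NULL)
        return EINVAL;

    demo_mutex_t *themutex = mutex;     /* the quoted code had "= NULL" here */
    if (is_pshared)
        themutex = demo_getpshared(mutex);

    if (!demo_isvalid(*themutex, DEMO_MUTEX_MAGIC))
        return EINVAL;

    *out = themutex;
    return 0;
}
```

With this shape, a non-pshared mutex takes the same path through the validity check as a pshared one, and an uninitialized or corrupt handle comes back as EINVAL rather than a segfault. Whether the real function needs anything more than this is something I cannot tell from the excerpt alone.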
====================================================
====================================================

OK, so much for the background; now the question. Apparently, from the comments and the to-do list, the pthread implementation is not complete. Could someone give me, or point me to, some documentation that describes the architecture of cygwin and how threads fit into it? Also, what part is done or generally considered solid by now? Also, what IS the shared area, BTW?

-Jeff

P.S. Oh yes, newbie here, if the last question didn't give me away.