Mail Archives: cygwin-developers/2001/11/08/12:10:11

Mailing-List: contact cygwin-developers-help AT sourceware DOT cygnus DOT com; run by ezmlm
List-Subscribe: <mailto:cygwin-developers-subscribe AT sources DOT redhat DOT com>
List-Archive: <http://sources.redhat.com/ml/cygwin-developers/>
List-Post: <mailto:cygwin-developers AT sources DOT redhat DOT com>
List-Help: <mailto:cygwin-developers-help AT sources DOT redhat DOT com>, <http://sources.redhat.com/ml/#faqs>
Sender: cygwin-developers-owner AT sources DOT redhat DOT com
Delivered-To: mailing list cygwin-developers AT sources DOT redhat DOT com
Date: Thu, 8 Nov 2001 12:09:56 -0500
From: Christopher Faylor <cgf AT redhat DOT com>
To: cygwin-developers AT cygwin DOT com
Subject: Re: Debugging problem in peek_pipe in select.cc
Message-ID: <20011108120956.A2730@redhat.com>
Reply-To: cygwin-developers AT cygwin DOT com
Mail-Followup-To: cygwin-developers AT cygwin DOT com
References: <20011108155542 DOT 19905 DOT qmail AT lizard DOT curl DOT com>
Mime-Version: 1.0
In-Reply-To: <20011108155542.19905.qmail@lizard.curl.com>
User-Agent: Mutt/1.3.21i

On Thu, Nov 08, 2001 at 10:55:42AM -0500, Jonathan Kamens wrote:
>I'm trying to debug why "make -j2" continues to hang for us
>occasionally even after cgf's recent fix to the code in this area.
>
>After deploying a cygwin1.dll with his fix, I ran two builds in a row
>which both hung.  I didn't get much useful information out of them, so
>I set things up to be able to debug better in case of future hangs,
>and then started running builds.
>
>I ran a whole bunch of builds over several days and none of them
>hung.  Finally, one of them hung, and then one of my coworkers killed
>and restarted it before I could debug it :-).
>
>Shortly after that, I finally got another build to hang, and I'm
>looking at that one now.  Here's the current roadblock preventing me
>from understanding what's going on....
>
>I attached to a hung process.  The top of its stack trace in thread 1
>looks like this:
>
>  #0  0x77f67a5b in ?? ()
>  #1  0x61053b08 in peek_pipe (s=0x24aeee4, ignra=0, guard_mutex=0x1dc)
>      at /u/jik/cygwin-cvs/src/winsup/cygwin/select.cc:453
>  #2  0x61053eba in fhandler_pipe::ready_for_read (this=0x61544920, fd=6, 
>      howlong=4294967295, ignra=0)
>      at /u/jik/cygwin-cvs/src/winsup/cygwin/select.cc:512
>  #3  0x61062b97 in _read (fd=6, ptr=0x24aeff2, len=1)
>      at /u/jik/cygwin-cvs/src/winsup/cygwin/syscalls.cc:315
>  #4  0x6108cbce in read (fd=6, buf=0x24aeff2, cnt=1)
>      at /u/jik/cygwin-cvs/src/newlib/libc/syscalls/sysread.c:15
>
>Line 453 of select.cc is a call to PeekNamedPipe.  According to the
>MSDN documentation for PeekNamedPipe, it never hangs.  So, thinking
>that frame 0 must be the PeekNamedPipe invocation, I typed "frame 0"
>and then "finish" in a "gdb -nw" window (running inside an ssh session
>to the Windows servers), and now it's hung.  How can that be?  I don't
>get it.

Actually, the point of my adding a mutex to peek_pipe was to prevent
PeekNamedPipe from blocking.  It can block in pathological situations
when another thread/process is doing a blocking read.  From your backtrace,
it looks like you are running an older version of the sources.  I have been
making a lot of changes to select to try to fix this problem.

One change in particular allowed me to run "make -j2" for more than 24 hours
with no hang.

I'm sorry that I didn't specifically send you email about this.

>^C has no effect at this point, so I can't get gdb to stop the process
>and tell me where it is now.

If Cygwin is in a blocking Win32 API call, then ^C will not work.
ready_for_read is specifically designed not to block, so that signals
still work during blocking reads.
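
As a rough illustration of that design (a sketch under assumptions, not the actual ready_for_read), a blocking read can be built from a non-blocking readiness check that is polled in a loop, with a check for pending signals between polls. The names wait_ready, WaitResult, data_ready, and signal_pending are hypothetical; in Cygwin the readiness check would correspond to peek_pipe, and the signal check to whatever the DLL uses to notice an arrived signal.

```cpp
#include <chrono>
#include <functional>
#include <thread>

enum class WaitResult { Ready, Interrupted, TimedOut };

// Poll a non-blocking readiness check until data is ready, a signal is
// pending, or the timeout expires. Because no call in the loop blocks
// indefinitely, a pending signal is noticed within one poll interval.
WaitResult wait_ready(const std::function<bool()>& data_ready,
                      const std::function<bool()>& signal_pending,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds poll_interval =
                          std::chrono::milliseconds(10)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (data_ready())
            return WaitResult::Ready;        // safe to issue the real read now
        if (signal_pending())
            return WaitResult::Interrupted;  // let the signal be handled
        if (std::chrono::steady_clock::now() >= deadline)
            return WaitResult::TimedOut;
        std::this_thread::sleep_for(poll_interval);
    }
}
```

The trade-off in a loop like this is latency versus CPU use: a shorter poll interval reacts faster to data and signals but wakes up more often.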

If you are still seeing hangs in the most recent sources, then there is
still some kind of race with the guard mutex in peek_pipe.  That is
where you will need to investigate.

cgf
