From: David Carter
Date: Thu, 30 Mar 2006 18:45:03 -0600
To: cygwin AT cygwin DOT com
Subject: Re: problems with gawk 3.1.5-3 hanging -- more info

Igor Peshansky wrote:
> On Thu, 30 Mar 2006, David Carter wrote:
>> It appears to me that by opening the file as O_TEXT, that gawk is
>> hanging because it is waiting for that LF char to follow the CR (which
>> never comes). Does this sound likely to you?
>
> If this theory were true, "echo -ne 'aa\rb' | gawk '{print $0}'" would
> hang. It doesn't for me, even with textmode pipes...

Yes, I realized this myself soon after posting. Your echo command
doesn't hang for me either. As I said in my original post, this is one
of those annoying bugs that if I try to make it hang interactively, it
always works correctly (never hangs), but if I try to do it with my
regular script, it (usually, but not always) hangs. This is another clue
that my initial "theory" was incorrect: if it were true, the program
would hang regardless.

Here's an example line, callable from a prompt, that usually hangs:

  $ rsync -Pv sourcefile rmachine:/rpath/ | \
      gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

To test this, I recommend using a source/remote combination for rsync
that will take about 30 seconds to a minute to complete.
This will create enough output for gawk to replicate the issue.

If this hangs (it may not hang the first time; give it 2 or 3 runs),
you'll stop getting output to stdout and it will just sit there. If you
go to another prompt and run ps, you'll see that rsync is done running
but gawk is still sitting there. CTRL+C in the window running the script
does nothing. You need to kill the gawk process from another bash
prompt.

> Try saving the output of rsync to file and running gawk over that
> separately...

Good idea. Per your advice, I tried doing something like the following:

  $ rsync -Pv sourcefile rmachine:/rpath/ > rsync.out
  $ cat rsync.out | \
      gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

Surprisingly, that code never hangs. Also, this never hangs:

  $ rsync -Pv sourcefile rmachine:/rpath/ | xxd | xxd -r | \
      gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

However, this usually hangs:

  $ rsync -Pv sourcefile rmachine:/rpath/ | cat | \
      gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

> Also, if gawk really hangs, you can run it under strace to
> see exactly what it was doing up to the hang (but please don't post the
> strace output unless you're asked to do so by Corinna or CGF).

I tried something like the following:

  $ rsync -Pv sourcefile rmachine:/rpath/ | strace \
      gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

But, unfortunately, this never hangs. So I tried this:

  $ ( sleep 10; rsync -Pv sourcefile rmachine:/rpath/ ) | \
      gawk 'BEGIN { RS="\r|\n" } {print $0; fflush();}'

and then, during the 10-second sleep, I go to another window and start
strace on the gawk PID. This hangs (usually). Looking at the strace
output, the last thing gawk does is:

  87 22612601 [read_pipe] gawk 188 fhandler_base::read: returning 1, text mode

Every time it hangs, I get "read returning 1, text mode". If I look at
the strace output for the successful (non-hanging) executions, I never
get a "read returning 1, text mode."
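
The manual procedure described earlier (run the pipeline a few times,
watch for gawk outliving rsync, then kill it from another prompt) could
be automated with a small wrapper. What follows is only a sketch under
stated assumptions: count_hangs is a hypothetical helper, not something
from this thread; it relies on GNU coreutils timeout being available;
and it assumes the hung process responds to the default TERM signal. It
also sends the consumer's output to stderr so that only the count lands
on stdout. Whether the extra wrapper process changes the timing enough
to hide the hang, the way running gawk under strace did, is untested.

```shell
# Hypothetical helper (not from the original post): run
#   PRODUCER | CONSUMER
# RUNS times and print how many runs had to be killed by timeout(1).
# Assumes GNU coreutils timeout, which exits with status 124 when the
# time limit is reached and it terminates the command.
# Usage: count_hangs RUNS LIMIT_SECONDS PRODUCER_CMD CONSUMER_CMD
count_hangs() {
    runs=$1 limit=$2 producer=$3 consumer=$4
    hangs=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        i=$((i + 1))
        status=0
        # The pipeline's exit status is that of its last command (timeout).
        # Consumer output goes to stderr so stdout carries only the count.
        sh -c "$producer" | timeout "$limit" sh -c "$consumer" >&2 || status=$?
        if [ "$status" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
    done
    echo "$hangs"
}

# Example (commented out; needs the rsync setup from the post):
# consumer='gawk '\''BEGIN { RS="\r|\n" } {print $0; fflush();}'\'
# count_hangs 5 300 'rsync -Pv sourcefile rmachine:/rpath/' "$consumer"
```

The limit should be comfortably longer than a normal transfer (300
seconds here for a transfer of about a minute), so that a slow run is
not miscounted as a hang.
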

All of this makes me wonder if:

  a) rsync is perhaps doing something with its stdout file descriptor
     that it shouldn't be doing, or
  b) gawk is perhaps doing something with its stdin file descriptor
     that it shouldn't be doing.

If a), then why doesn't it break when I just redirect the output of
rsync to a file?

If b), then what is it about piping the output of rsync to gawk that is
different (from gawk's point of view) from when I just save the rsync
output to a file and then send the contents of the file to gawk?

And another thing...why would any of this make any difference if gawk
opens the file as O_TEXT vs O_BINARY?

> HTH,

It was a great help. Thanks, Igor. Any other light you can shed is much
appreciated.

Regards;
David Carter