Message-ID: <5E85899A.9090408@tlinx.org>
Date: Wed, 01 Apr 2020 23:43:38 -0700
From: L A Walsh
To: Jay Libove
Cc: "cygwin AT cygwin DOT com"
Subject: Re: bug report: shell expansion in argv[] processing sensitive to LANG, e.g. "ls: cannot access '*.pdf': No such file or directory", but works okay in bash
List-Id: General Cygwin discussions and problem reports

On 2020/03/24 00:18, Jay Libove via Cygwin wrote:
> Problem:
> Under certain circumstances (see Steps to Reproduce, below) Cygwin programs' built-in argv[] globbing will produce unexpected:
> "{programName}: cannot access '{glob pattern}': No such file or directory"
> e.g.
> "ls: cannot access '*.pdf': No such file or directory"
> .. despite the fact that e.g. *.pdf definitely exists.
>
----
This isn't a bug or a problem; it is working as expected. Cygwin programs don't have built-in argv[] globbing or processing. The problem you are seeing arises because you are calling Cygwin programs from a Windows shell.
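
As a rough illustration of glob expansion done by a shell rather than by the program, here is a minimal Python sketch. It is not Cygwin or bash code; it only mimics bash's default rule that a pattern is replaced by the names it matches and is passed through unchanged when nothing matches. The helper name shell_expand and the file names are made up for the example.

```python
import glob
import os
import tempfile


def shell_expand(pattern, directory):
    """Hypothetical helper: imitate bash's default treatment of one glob word.

    Returns the sorted matching names, or the pattern itself (as a literal
    argument) when nothing in `directory` matches.
    """
    # glob.escape protects any special characters in the directory path itself.
    full_pattern = os.path.join(glob.escape(directory), pattern)
    matches = sorted(os.path.basename(p) for p in glob.glob(full_pattern))
    # bash without nullglob/failglob leaves an unmatched pattern untouched.
    return matches if matches else [pattern]


with tempfile.TemporaryDirectory() as d:
    # Create two empty files to match against (illustrative names).
    for name in ("normal.pdf", "notes.txt"):
        open(os.path.join(d, name), "w").close()

    expanded = shell_expand("*.pdf", d)   # pattern matches a file
    unmatched = shell_expand("*.doc", d)  # pattern matches nothing

print(expanded)   # the program would receive fixed filenames
print(unmatched)  # the program would receive the literal pattern
```

In the second case the program is handed the literal string, which is the situation that leads to a "cannot access '*.doc'" style of message from a tool like ls.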

On Windows, every program has to be built with its own glob processing. On Unix, glob processing happens in the shell, so Unix-type programs (Linux and Cygwin) do no glob processing of their own: they rely on the shell (bash, csh, dash, etc.) to do it.

If you run 'ls *.pdf' in bash, bash expands *.pdf into arguments that don't contain a glob (if the glob matches a file), so 'ls' sees only fixed filenames and no globs.

When you run 'ls' from the Windows shell, cmd.exe doesn't expand glob characters into anything, so 'ls' sees a literal file name of '*.pdf'. On Linux you can name a file '*.pdf' (an asterisk is a valid filename character). Unless you have a file named, literally, '*.pdf', ls won't find it.

Cygwin does simulate this. Example:

> cd /tmp
/tmp> touch \*.pdf
/tmp> ls *.pdf
*.pdf
/tmp> cmd
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\tmp>ls *.pdf
ls *.pdf
'*.pdf'
^^ note that Windows now finds *.pdf because there is a file named '*.pdf' (quotes added by 'ls').

Does this explain your issue, or am I not understanding it?

Thanks
(I'm not a Cygwin author; just answering the question)
Linda

> Steps to Reproduce:
> * Have some files in the local directory with accented characters in the names, e.g.:
> C:> mkdir c:\temp\test
> C:> cd c:\temp\test
> C:> touch héllo.pdf
> C:> touch gòodbye.pdf
> C:> touch normal.pdf
> * DON'T have the LANG= environment variable set to anything
> * NOT in bash or Cygwin Terminal, but rather within Windows CMD.exe, execute a Cygwin command which needs to do file name globbing because the Windows CMD.exe shell does not do so for it, e.g.
> C:> ls *.pdf
> C:> cat *.pdf
> These will produce "ls: cannot access '*.pdf': No such file or directory"
> Although, curiously,
> C:> ls *or*
> does correctly produce:
> normal.pdf
>
> Also, display output of the accented characters is incomplete:
> C:> ls
> 'g'$'\303\262''odbye.pdf' 'h'$'\303\251''llo.pdf' normal.pdf
> C:> bash
> jay_l AT DESKTOP-I9MRIE3 /cygdrive/c/Temp
> $ ls
> 'g'$'\303\262''odbye.pdf' 'h'$'\303\251''llo.pdf' normal.pdf
>
>
> Analysis:
> I've verified that it's not about case sensitivity. That is, it's not a matter of ls *.pdf vs. ls *.PDF.
> If these test commands are run either under bash.exe or within a Cygwin Terminal window, the problem does not occur.
> I've verified that the Windows system locale (per Windows' Region setting) actually doesn't matter. (I've reproduced this both on systems in Region Spain with language English-International and English-Ireland, and in a VM with a bog standard vanilla US English Windows).
>
> Credits to Paul for suggesting deleting files one by one until the problem goes away, and to Andrey for pointing out `locale` and the LANG= setting.
>
> Set LANG=en_US.UTF-8, e.g.
> C:> set LANG=en_US.UTF-8
> .. and the problem goes away.
> C:> ls *.pdf
> gòodbye.pdf
> héllo.pdf
> normal.pdf
> C:> ls
> gòodbye.pdf
> héllo.pdf
> normal.pdf
>
> Interestingly, Andrey mentioned that he sets LANG=ru_RU.CP866 and he doesn't see the problem. When I tried that exact setting, I still had the problem.
> So it's maybe not just that LANG must be set to *something*, but that somehow LANG must be set to something that matches something in Windows? (Sorry, I know that's nearly uselessly vague).
>
>
> In summary, it appears that the argv[] globbing code which gets compiled in to Cygwin programs functions a bit differently than the shell globbing code within bash.exe.
> And this produces unexpected globbing failures.
>
>
> Thanks to all the Cygwin maintainers for this amazing software, for so many years!
> -Jay
>
> ------------------------------------------------------------------------
>
> --
> Problem reports:      https://cygwin.com/problems.html
> FAQ:                  https://cygwin.com/faq/
> Documentation:        https://cygwin.com/docs.html
> Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple