Mail Archives: cygwin/2018/03/22/17:56:52

Subject: Re: Quotes around command-line argument that has unicode characters are not removed
To: cygwin AT cygwin DOT com
References: <08d9621d-b9a0-c0d7-b58b-581ab957a08c AT mail DOT ru> <20180322152437 DOT a37c3dd3b778bba765e2124c AT inbox DOT ru> <162182215 DOT 20180322162501 AT yandex DOT ru>
From: "Dmitry Katsubo via cygwin" <cygwin AT cygwin DOT com>
Reply-To: Dmitry Katsubo <dma_k AT mail DOT ru>
Message-ID: <93d66ed8-4dea-ddec-e731-43301ce57271@mail.ru>
Date: Thu, 22 Mar 2018 22:56:35 +0100

On 2018-03-22 14:25, Andrey Repin wrote:
> Greetings, Mikhail Usenko!
> 
>> In bare cmd.exe native-msvcrt binary is working OK with quoted non-ascii
>> arguments, while cygwin-flavor binary is not. But I don't know exactly which
>> level here: cmd.exe or msvcrt.dll/cygwin1.dll is responsible for
>> such a behavior.

Thanks, Mikhail! I generally agree with you. If you follow the links I've
provided in my original mail, you can see that cmd.exe does not do any argument
splitting. I also see that from this method signature [1]:

build_argv (char *cmd, char **&argv, int &argc, int winshell)

which basically takes a string as input and returns an array of strings plus
the number of arguments as output. So the splitting is done either by
msvcrt.dll or by cygwin1.dll, and they have different ways of doing that,
which is OK provided it is documented and done consistently. I refer back to
dcrt0.cc where the voodoo is done. In particular, in line 165 [2] it checks
whether execution was started from bare Windows, and behaves differently in
that case.
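
To illustrate what "takes a string and returns an array of strings" means in
practice, here is a minimal sketch of a quote-aware splitter. It is NOT the
code from dcrt0.cc: the name split_command_line is made up for this example,
and it assumes the simplest possible rules (double quotes group words and are
dropped, no backslash escapes, no globbing).

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not Cygwin's build_argv: splits one command-line
// string into arguments. Double quotes group words and are removed from
// the result; backslash escapes and wildcard expansion are ignored.
std::vector<std::string> split_command_line(const std::string &cmd)
{
    std::vector<std::string> args;
    std::string current;
    bool in_quotes = false;   // true while inside a "..." run
    bool have_token = false;  // lets "" produce an empty argument

    for (char c : cmd) {
        if (c == '"') {
            in_quotes = !in_quotes;   // toggle grouping, drop the quote itself
            have_token = true;
        } else if ((c == ' ' || c == '\t') && !in_quotes) {
            if (have_token) {         // whitespace outside quotes ends a token
                args.push_back(current);
                current.clear();
                have_token = false;
            }
        } else {
            current += c;             // any other byte, including UTF-8 bytes
            have_token = true;
        }
    }
    if (have_token)
        args.push_back(current);
    return args;
}
```

With these assumed rules the string test "текст плюс.txt" splits into two
arguments and the second one carries no quotes. Working byte by byte is safe
for UTF-8 input because no byte of a multi-byte sequence can equal a quote or
a space.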

On 2018-03-22 12:24, Andrey Repin wrote:
> Run it in bash. I'm pretty sure you will see your results more consistent.

When "test.exe" is run from bash, it behaves correctly because, as you said,
bash did most of the dirty work. I also tried a workaround like the one below,
but it does not work:

D:\cli> bash -c "./test 'текст плюс.txt'"
bash: ./test 'текст плюс.txt': No such file or directory

> Locale settings affecting Cygwin binary.
> 
> If you
> set LANG=ru_RU.CP866
> (f.e.)
> before invoking cygwin testcase in native CMD, you will likely see it
> working better.

Thanks for this advice, Andrey. I see that it reacts, but it works worse :)
I think it tells Cygwin to output characters in CP866, but the console is UTF-8:

D:\cli> set LANG=ru_RU.CP866

D:\cli> test "текст плюс.txt"
param 0 = test
param 1 = ⥪▒▒ ▒▒▒▒.txt
Failed to open '⥪▒▒ ▒▒▒▒.txt': No such file or directory

But... ta-da! I made it work like this:

D:\cli> set LANG=ru_RU.UTF-8

D:\cli> test "текст плюс.txt"
param 0 = test
param 1 = текст плюс.txt
File 'текст плюс.txt' was opened

Hooray, it worked!

> Alternatively, you could try
> chcp 65001

That does not help:

D:\cli> chcp 65001
Active code page: 65001

D:\cli> test "текст плюс.txt"
param 0 = test
param 1 = "текст плюс.txt"
Failed to open '"текст плюс.txt"': No such file or directory
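
For anyone who wants to reproduce the transcripts, a test program along these
lines would print the same kind of output. The source of test.exe is not part
of this message, so this is only a sketch reconstructed from the output shown
here; the function name report_args is made up for the example.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical reconstruction of the test program's logic: print every
// argument as received, then try to open the first real argument as a file.
// Returns 0 on success (or when there is nothing to open), 1 on failure.
int report_args(int argc, char **argv, std::FILE *out)
{
    for (int i = 0; i < argc; ++i)
        std::fprintf(out, "param %d = %s\n", i, argv[i]);

    if (argc < 2)
        return 0;  // no file name given, nothing to open

    std::FILE *f = std::fopen(argv[1], "r");
    if (!f) {
        // If the quotes were not stripped, they are part of the file name
        // here and the open fails with ENOENT.
        std::fprintf(out, "Failed to open '%s': %s\n",
                     argv[1], std::strerror(errno));
        return 1;
    }
    std::fprintf(out, "File '%s' was opened\n", argv[1]);
    std::fclose(f);
    return 0;
}
```

A main that just does "return report_args(argc, argv, stdout);" is enough to
turn this into a command-line test.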

[1] https://github.com/openunix/cygwin/blob/master/winsup/cygwin/dcrt0.cc#L297
[2] https://github.com/openunix/cygwin/blob/master/winsup/cygwin/dcrt0.cc#L165

-- 
With best regards,
Dmitry
