Mail Archives: cygwin/2009/05/29/17:22:02
Alexey Borzenkov wrote:
> No, the bug is not that it gets the wrong number of arguments. In fact,
> Windows has no concept of arguments, only the C runtime does, which parses
> the command line. If the command line is truncated, then the C runtime will
> have missing arguments when it tries to parse it.
Sorry, I had meant to comment on this previously but hit send too soon.
I think the problem I'm running into is:
- I give cygwin 1.7's bash a string that is in my system default code page.
- cygwin 1.7 thinks the string is actually UTF-8 and tries to convert it
as UTF-8 into UTF-16, resulting in a truncated command line that is
passed to the child process.
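
To illustrate why a code-page string can fail a UTF-8 conversion, here is a minimal sketch of a strict UTF-8 check. The helper name utf8_valid_prefix is hypothetical and is not taken from Cygwin; it only reports how many leading bytes are well-formed UTF-8, which is where a strict converter would give up. The sample input below is also an assumption: I don't know the exact bytes involved, but a lone 0xA9 (the copyright sign in code page 1252) is one byte that can never appear on its own in UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper (not Cygwin code): returns the number of leading
 * bytes of s that form well-formed UTF-8. A strict UTF-8 decoder would
 * be unable to continue past this point. */
size_t utf8_valid_prefix(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t need;                        /* continuation bytes expected */
        unsigned char lo = 0x80, hi = 0xBF; /* allowed range of 2nd byte */
        size_t k;

        if (c < 0x80)                    need = 0;
        else if (c >= 0xC2 && c <= 0xDF) need = 1;
        else if (c >= 0xE0 && c <= 0xEF) need = 2;
        else if (c >= 0xF0 && c <= 0xF4) need = 3;
        else break;   /* 0x80-0xC1 and 0xF5-0xFF cannot start a sequence */

        if (c == 0xE0) lo = 0xA0;           /* no overlong 3-byte forms */
        if (c == 0xED) hi = 0x9F;           /* no UTF-16 surrogates */
        if (c == 0xF0) lo = 0x90;           /* no overlong 4-byte forms */
        if (c == 0xF4) hi = 0x8F;           /* nothing above U+10FFFF */

        if (need > len - i - 1)
            break;                          /* sequence cut off by end of input */

        for (k = 1; k <= need; k++) {
            unsigned char t = s[i + k];
            if (k == 1 ? (t < lo || t > hi) : (t < 0x80 || t > 0xBF))
                break;                      /* not a valid continuation byte */
        }
        if (k <= need)
            break;
        i += need + 1;
    }
    return i;
}
```

For the 22 bytes of "before \xA9 2009 after" this returns 7, the length of "before ", while the same text with the sign encoded as \xC2\xA9 is accepted in full.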
Here's some more investigation:
$ cat bug.c
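#include <stdio.h>
#include <wchar.h>

/* wmain is the MSVC wide-character entry point. */
int wmain(int argc, wchar_t *argv[], wchar_t *envp[])
{
    int i;
    for (i = 0; i < argc; i++)
        wprintf(L"%d: %s\n", i, argv[i]); /* MSVC: %s is a wide string here */
    return 0;
}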
... and compiled using MSVC ....
$ ./bug arg1 "before `cat copyright.txt` after" arg3
0: E:\cygwin1.7\tmp\bug.exe
1: arg1
2: before
So note that even with what seems to be a Unicode-aware child process,
I'm still getting a truncated command line. In fact, calling
GetCommandLineW() directly seems to give a truncated command line
as well.
Regards,
-Edward
Alexey Borzenkov wrote:
> On Sat, May 30, 2009 at 12:10 AM, Edward Lam <edward AT sidefx DOT com> wrote:
>> Thanks for explaining the UTF8 changes in cygwin 1.7. However, the decision
>> to use UTF-8 for the C locale is questionable.
>
> Not at all, because utf-8, as far as I understand, is used for
> communication with the system in this context, and does not force
> anything to the application. Most modern unixes use utf-8 nowadays, it
> means that even if you have a C locale your terminal outputs text in
> utf-8, your input is utf-8, your filenames are utf-8 (well, not
> really, but the rest of the system sees them that way). Same stuff
> here, except that launching non-cygwin processes is communication with
> the system as well, and it needs conversion. And where there is
> conversion there is always possible loss of data. One way or the other.
>
>> It seems to me that it would be much safer to use the SYSTEM DEFAULT code
>> page (ie. the return value of the system GetACP() function) for CYGWIN
> instead, ensuring compatibility for the large class of native Windows
>> applications that are non-Unicode, non-CodePage aware.
>
> It might be safe for you, but not for other people. If you have a
> Russian default codepage and ever need to work with Chinese/Japanese
> filenames and cygwin uses default codepage for filesystem operations
> (as in 1.5 right now), then you are really screwed. In my opinion
> utf-8 is a silver bullet here, and I'm very glad it went that way.
>
>> I think it's very bad that changing LANG can result in a truncated *command
>> line*, that has nothing to do with printf. The printf in the code was just
>> for testing. The HUGE bug is that the application gets the WRONG NUMBER OF
>> ARGUMENTS.
>
> No, the bug is not that it gets the wrong number of arguments. In fact,
> Windows has no concept of arguments, only the C runtime does, which parses
> the command line. If the command line is truncated, then the C runtime will
> have missing arguments when it tries to parse it.
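
As an illustration of that point, here is a rough sketch of how a runtime might split one command-line string into arguments. The function split_command_line is hypothetical and deliberately simplified: whitespace separates arguments and a double quote toggles quoting. The real MSVC runtime rules also deal with backslashes, which this sketch ignores.

```c
/* Hypothetical, simplified splitter (not the MSVC runtime's code).
 * Splits cmdline in place and stores up to max_args pointers in argv.
 * Returns the number of arguments found. */
int split_command_line(char *cmdline, char *argv[], int max_args)
{
    int argc = 0;
    char *p = cmdline;

    while (*p && argc < max_args) {
        char *out;
        int in_quotes = 0;

        while (*p == ' ' || *p == '\t')
            p++;                          /* skip separators */
        if (!*p)
            break;

        out = p;
        argv[argc++] = out;
        while (*p && (in_quotes || (*p != ' ' && *p != '\t'))) {
            if (*p == '"')
                in_quotes = !in_quotes;   /* drop the quote character */
            else
                *out++ = *p;
            p++;
        }
        if (*p)
            p++;                          /* step past the separator */
        *out = '\0';                      /* terminate this argument */
    }
    return argc;
}
```

Given the full line bug arg1 "before text after" arg3, this yields four arguments. Given a line that was cut off inside the quoted part, such as bug arg1 "before followed by a space and nothing else, it yields only three, the last one being "before " with the rest simply missing.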
>
> I mentioned wprintf because recently I was wondering why
> mkpasswd/mkgroup had a strange truncating behavior with Russian
> usernames and it turned out that wprintf, when it can't encode some
> characters, stops right there and returns an error code. But, honestly,
> who ever checks return codes from printf?
>
> Something similar might be happening here. When the command line is
> constructed, some function is called that can't encode some character
> and returns an error status, but it's never checked, and you get a
> truncated command line.
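
A minimal sketch of the kind of check being described, using the standard wcstombs() as a stand-in. Which function Cygwin actually calls when it builds the command line is not known from this thread, so encode_checked below is purely a hypothetical helper for illustration.

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts wide text to the current locale's
 * multibyte encoding. Returns 0 on success, or -1 if a character cannot
 * be encoded or the result does not fit in dst with its terminator. */
int encode_checked(const wchar_t *src, char *dst, size_t dstsize)
{
    size_t n = wcstombs(dst, src, dstsize);

    if (n == (size_t)-1)
        return -1;   /* unencodable character: dst must not be used */
    if (n == dstsize)
        return -1;   /* buffer filled with no room for the terminator */
    return 0;
}
```

In the default "C" locale, a Cyrillic character such as U+0416 cannot be encoded, so the call fails. A caller that ignores the return value and uses the buffer anyway ends up with whatever partial output happened to be written, which would look like truncation.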
>
> And btw, I'm not a cygwin developer, I'm just a speculating user
> right now, because I haven't looked for this problem in the code.
>
> --
> Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
> Problem reports: http://cygwin.com/problems.html
> Documentation: http://cygwin.com/docs.html
> FAQ: http://cygwin.com/faq/
>