Subject: Re: Formatting command line arguments when starting a Cygwin process from a native process
To: cygwin AT cygwin DOT com, David Allsopp
From: Peter Rosin
Date: Mon, 9 May 2016 11:43:19 +0200

Hi!

On 2016-05-07 09:45, David Allsopp wrote:
> Andrey Repin wrote:
>> Greetings, David Allsopp!
>
> And greetings to you, too!
>
>
>
>>> I'm not using cmd, or any shell for that matter (that's actually the
>>> point) - I am in a native Win32 process invoking a Cygwin process
>>> directly using the Windows API's CreateProcess call. As it happens,
>>> the program I have already has the arguments for the Cygwin process
>>> in an array, but Windows internally requires a single command line
>>> string (which is not in any related to Cmd).
>>
>> Then all you need is a rudimentary quoting.
>
> Yes, but the question still remains what that rudimentary quoting is - i.e.
> I can see how to quote spaces which appear in elements of argv, but I cannot
> see how to quote double quotes!
>
>> The rest will be handled by getopt when the command line is parsed.
>
> That's outside my required level - I'm interested in Cygwin's emulation
> handling the difference between an operating system which actually passes
> argc and argv when creating processes (Posix exec/spawn) and Windows (which
> only passes a single string command line). The Microsoft C Runtime and
> Windows have a "clear" (at least by MS standards) specification of how that
> single string gets converted to argv, I'm trying to determine Cygwin's -
> getopt definitely isn't part of that.
>
>>>> However, I've found Windows's interpretation to be inconsistent, so
>>>> often have to play with it to find what the "right combination" is
>>>> for a particular instance.
>>>>
>>>> I find echoing the parameters to a temporary text file and then
>>>> using the file as input to be more reliable and easier to
>>>> troubleshoot, and it breaks apart whether it is Windows cli
>>>> inconsistencies or receiving program issues very nicely with the
>>>> text file content as an intermediary
>>
>>> This is an OK tack, but I don't wish to do this by experimentation
>>> and get caught out later by a case I didn't think of, so what I'm
>>> trying to determine is *exactly* how the Cygwin DLL processes the
>>> command line via its source code so that I can present it with my
>>> argv array converted to a single command line and be certain that
>>> the Cygwin will
>> recover the same argv DLL.
>>
>>> My reading of the relevant sources suggests that with globbing
>>> disabled, backslash escape sequences are *never* interpreted (since
>>> the quote function returns early - dcrt0.cc, line 171). If there is
>>> no way of encoding the double quote character, then perhaps I have
>>> to run with globbing enabled but ensure that the globify function
>>> will never actually expand anything - but as that's a lot of work, I
>>> was wondering
>> if I was missing something with the simpler "noglob" case.
>>
>> The point being, when you pass the shell and enter direct process
>> execution, you don't need much of shell magic at all.
>> Shell conventions designed to ease interaction between system and
>> operator.
>> But you have a system talking to the system, you can be very literal.
>
> Indeed, which is why I'm trying to avoid the shell! But I can't be entirely
> literal, because Posix and Windows are not compatible, so I need to
> determine precisely how Cygwin's emulation works... and so far, it doesn't
> seem to be a terribly clearly defined animal!
>
> So, resorting to C files to try to demonstrate it further. spawn.cc seems to
> suggest that there should be some kind of escaping available, but I'm
> struggling to follow the code.
> Consider these two:
>
> callee.c
> #include <stdio.h>
> int main (int argc, char* argv[])
> {
>   int i;
>
>   printf("argc = %d\n", argc);
>   for (i = 0; i < argc; i++) {
>     printf("argv[%d] = %s\n", i, *argv++);
>   }
>   return 0;
> }
>
> caller.c
> #include <stdio.h>
> #include <windows.h>
>
> int main (void)
> {
>   LPTSTR commandLine;
>   STARTUPINFO startupInfo = {sizeof(STARTUPINFO), NULL, NULL, NULL, 0, 0,
> 0, 0, 0, 0, 0, 0, 0, 0, NULL, NULL, NULL, NULL};
>   PROCESS_INFORMATION process = {NULL, NULL, 0, 0};
>
>   commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz *";
>   if (!CreateProcess("callee.exe", commandLine, NULL, NULL, FALSE, 0,
> NULL, NULL, &startupInfo, &process)) {
>     printf("Error spawning process!\n");
>     return 1;
>   } else {
>     WaitForSingleObject(process.hProcess, INFINITE);
>     CloseHandle(process.hThread);
>     CloseHandle(process.hProcess);
>     return 0;
>   }
> }
>
> If you compile as follows:
>
> $ gcc -o callee callee.c
> $ i686-w64-mingw32-gcc -o caller caller.c
> $ export CYGWIN=noglob # Or the * will be expanded
> $ ./caller
>
> and the output is as required:
> argc = 6
> argv[0] = callee
> argv[1] = @te
> st
> argv[2] = fo@o
> argv[3] = bar baz
> argv[4] = fliggle
> argv[5] = *
>
> But if I want to embed an actual " character in any of those arguments, I
> cannot see any way to escape it which actually works at the moment. For
> example, if you change commandLine in caller.c to be "callee.exe test\\\"
> argument" then the erroneous output is:
>
> argc = 2
> argv[0] = callee
> argv[1] = test\ argument
>
> where the required output is
>
> argc = 3
> argv[0] = callee
> argv[1] = test"
> argv[2] = argument
>
> Any further clues appreciated. Is it actually even a bug?!

I think cygwin emulates posix shell style command line parsing when
invoked from a Win32 process (like you do).
So, try single quotes:

commandLine = "callee.exe \"@\"te\"\n\"st fo@o bar\" \"baz baz '*' '\"\\'\"'";

I get this (without noglob):

argc = 7
argv[0] = callee
argv[1] = @te
st
argv[2] = fo@o
argv[3] = bar baz
argv[4] = baz
argv[5] = *
argv[6] = "'"

Cheers,
Peter
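
The single-quote suggestion above can be sketched as a small C helper for the calling side. This is only an illustration under assumptions: quote_arg is a hypothetical name, not part of Cygwin or the Windows API, and the quoting rule (wrap the argument in single quotes and write an embedded ' as \') is inferred solely from the one command line and output shown above. How Cygwin treats a literal backslash inside single quotes, and what happens with CYGWIN=noglob set, is not covered by that example and is not handled here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: wrap one argv element in single quotes for a
 * CreateProcess command line aimed at a Cygwin program.
 * Assumption (inferred from the example in this thread only): an embedded
 * single quote can be written as \' inside the quoted argument.
 * Backslashes in arg are copied unchanged; that case is untested.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or (size_t)-1 if the result does not fit in out. */
size_t quote_arg(const char *arg, char *out, size_t outsz)
{
    size_t need = 2; /* opening and closing quote */
    size_t n = 0;
    const char *p;

    /* First pass: compute the required length. */
    for (p = arg; *p != '\0'; p++)
        need += (*p == '\'') ? 2 : 1;
    if (need + 1 > outsz)
        return (size_t)-1;

    /* Second pass: emit the quoted argument. */
    out[n++] = '\'';
    for (p = arg; *p != '\0'; p++) {
        if (*p == '\'')
            out[n++] = '\\';
        out[n++] = *p;
    }
    out[n++] = '\'';
    out[n] = '\0';
    return n;
}
```

Calling quote_arg on the three-character argument "'" (double quote, single quote, double quote) produces the same '"\'"' fragment used in the command line above. A caller would quote each argv element this way and join the results with single spaces before passing the string to CreateProcess; checking the result against callee.exe for each kind of input is still advisable.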