Mail Archives: cygwin/2025/06/25/11:01:03

Subject: Re: readdir() returns inaccessible name if file was created with
invalid UTF-8
To: cygwin AT cygwin DOT com
References: <96f2253b-791b-b8a0-97dd-8d257eefb9b1 AT t-online DOT de>
Message-ID: <03c4fae7-7322-572c-ae72-52e300f0b438@t-online.de>
Date: Wed, 25 Jun 2025 16:59:04 +0200
From: Christian Franke via Cygwin <cygwin AT cygwin DOT com>
Reply-To: cygwin AT cygwin DOT com
Cc: Christian Franke <Christian DOT Franke AT t-online DOT de>

On Sun, 15 Sep 2024 19:47:11 +0200, Christian Franke wrote:
> If a file name contains an invalid (truncated) UTF-8 sequence, open() 
> does not refuse to create the file. Later readdir() returns a 
> different name which could not be used to access the file.
>
> Testcase with U+1F321 (Thermometer):
>
> $ uname -r
> 3.5.4-1.x86_64
>
> $ printf $'\U0001F321' | od -A none -t x1
>  f0 9f 8c a1
>
> $ touch 'file1-'$'\xf0\x9f\x8c\xa1''.ext'
>
> $ touch 'file2-'$'\xf0\x9f\x8c''.ext'
>
> $ touch 'file3-'$'\xf0\x9f\x8c'
>
> $ ls -1
> ls: cannot access 'file2-.?ext': No such file or directory
> ls: cannot access 'file3-': No such file or directory
> 'file1-'$'\360\237\214\241''.ext'
> file2-.?ext
> file3-
>
>
> Name mapping according to "fhandler_disk_file::readdir" strace lines:
>
> "file1-\xF0\x9F\x8C\xA1.ext" -(open)-> L"file1-\xD83C\xDF21.ext" 
> -(readdir)->
> "file1-\xF0\x9F\x8C\xA1.ext"
>
> "file2-\xF0\x9f\x8C.ext" -(open)-> L"file2-\xD83C\xF02Eext" -(readdir)->
> "file2-.\xE1\x9E\xB3ext"
>
> "file3-\xF0\x9F\x8C" -(open)-> L"file3-\xD83C\xF000" -(readdir)->
> "file3-"
>
> Issue found because 'stress-ng --filename ...' could not cleanup its 
> temp directory.
>

A closer look many months later, with Cygwin 3.7.0-0.137.g756669312c97 and 
current upstream stress-ng, reveals a related problem which is 
possibly more serious:

In cases like file3-... above, the converted Windows path ends with 
0xF000. This suggests that the terminating null is accidentally 
converted to the 0xF0xx range.
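
To illustrate that hypothesis, here is a minimal Python sketch that 
models the mapping seen in the strace lines. It is not Cygwin's 
conversion code. It assumes that the high surrogate is emitted once the 
first three bytes of a 4-byte UTF-8 sequence are known, and that the 
byte following the truncated sequence is then mapped to 0xF000 | byte. 
The function names are hypothetical.

```python
def high_surrogate_from_prefix(b0: int, b1: int, b2: int) -> int:
    """High surrogate implied by the first three bytes of a 4-byte
    UTF-8 sequence. The missing fourth byte only affects the low
    surrogate, so the high surrogate is already determined."""
    partial = ((b0 & 0x07) << 18) | ((b1 & 0x3F) << 12) | ((b2 & 0x3F) << 6)
    return 0xD800 + ((partial - 0x10000) >> 10)


def to_private_use(byte: int) -> int:
    """Assumed mapping of a stray byte into the 0xF0xx range."""
    return 0xF000 | byte


# Complete sequence f0 9f 8c a1 (U+1F321) for comparison.
full_utf16 = "\U0001F321".encode("utf-16-be").hex()

# Truncated sequence f0 9f 8c followed by '.' (file2) or by NUL (file3).
high = high_surrogate_from_prefix(0xF0, 0x9F, 0x8C)
after_dot = to_private_use(ord("."))
after_nul = to_private_use(0x00)

print(full_utf16, hex(high), hex(after_dot), hex(after_nul))
```

Under these assumptions the model yields 0xD83C 0xF02E for file2 and 
0xD83C 0xF000 for file3, which matches the strace output quoted above.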

In some cases, the created Windows file name has random garbage after 
the 0xF000. Then even Cygwin is not able to access or unlink the file 
after creation.

In very rare cases, fortunately, the created Windows file is not even 
accessible from the Win32 layer because its name looks like
   L"file3-\xD83C\xF000garbage."
or
   L"file3-\xD83C\xF000garbage "
which is invalid on the Win32 layer due to the trailing '.' or space. A 
tool which removes the file via the Nt*() layer is then required.
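
As a rough illustration of the trailing '.' / space case, the following 
Python sketch flags such names and builds a path with the documented 
\\?\ prefix, which tells the Win32 API to skip its usual path 
normalization. Whether that prefix is sufficient to remove the files 
described here is an assumption and has not been tested. Both helper 
names are hypothetical.

```python
def needs_nt_layer(name: str) -> bool:
    """True if Win32 path normalization would alter this name,
    i.e. it ends with a '.' or a space."""
    return name.endswith((".", " "))


def extended_length_path(path: str) -> str:
    r"""Prefix an absolute Windows path with \\?\ (no-op if present)."""
    prefix = "\\\\?\\"
    return path if path.startswith(prefix) else prefix + path


# Names modeled on the examples above; "garbage" is a placeholder.
samples = [
    "file3-\ud83c\uf000garbage.",
    "file3-\ud83c\uf000garbage ",
    "file3-\ud83c\uf000",
]
flags = [needs_nt_layer(n) for n in samples]
```

A removal attempt would then pass the result of extended_length_path() 
to the delete call instead of the plain path.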

I could not produce a reproducible testcase, sorry.

'stress-ng --filename 1' succeeds, but may silently leave temp files 
behind. The next stress-ng release will report an error if unlink() of 
such a file fails.
Caution: Files created that way may not be removable with "onboard" 
tools, see above.
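
To check a directory for such leftovers, something like the following 
Python sketch could be used. It lists the names returned by readdir() 
(via os.listdir() on a bytes path) and reports those that cannot be 
accessed with lstat(). The function name is hypothetical, and this is 
not what stress-ng does internally.

```python
import os


def find_inaccessible(dirpath: bytes) -> list:
    """Return directory entries whose names are listed but cannot be
    accessed with lstat(). Bytes paths avoid any re-encoding of the
    names on the Python side."""
    broken = []
    for name in os.listdir(dirpath):
        try:
            os.lstat(os.path.join(dirpath, name))
        except OSError:
            broken.append(name)
    return broken
```

For example, find_inaccessible(b"/tmp/some-dir") returns an empty list 
if every listed name can be accessed.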

-- 
Regards,
Christian

