Mail Archives: cygwin/2025/02/15/02:43:03

References: <614771e9-592c-6154-d56d-13842b6fc6ac AT t-online DOT de>
<5ccdf4be-4e4b-1846-9fd6-cba29c9dbb11 AT t-online DOT de>
In-Reply-To: <5ccdf4be-4e4b-1846-9fd6-cba29c9dbb11@t-online.de>
Date: Sat, 15 Feb 2025 08:42:00 +0100
Message-ID: <CALXu0UcYLMhnOtQ2emvoH7=EpjuxcZmqf8+B8xY8YU8dxk7_Og@mail.gmail.com>
Subject: Re: SEEK_DATA should fail at EOF (was: coreutils-9.6-1 (TEST): cp:
infinite SEEK_SET/DATA/HOLE loop if file is compressed)
To: cygwin AT cygwin DOT com, illumos-dev <developer AT lists DOT illumos DOT org>
From: Cedric Blancher via Cygwin <cygwin AT cygwin DOT com>
Reply-To: Cedric Blancher <cedric DOT blancher AT gmail DOT com>
Sender: "Cygwin" <cygwin-bounces~archive-cygwin=delorie DOT com AT cygwin DOT com>

On Fri, 14 Feb 2025 at 12:25, Christian Franke via Cygwin
<cygwin AT cygwin DOT com> wrote:
>
> Christian Franke via Cygwin wrote:
> > Testcase:
> >
> > $ uname -r
> > 3.5.7-1.x86_64
> >
> > $ cygcheck -f /bin/cp.exe
> > coreutils-9.6-1
> >
> > $ for i in 1 2 3; do cat /bin/cygwin1.dll > file$i; done
> >
> > $ compact /C file2 # NTFS compression
> > ... (1.7 : 1) ...
> >
> > $ compact /C /EXE:LZX file3 # Compact OS LZX compression
> > ... (2.8 : 1) ...
> >
> > $ stat -c '%b %s %n' file?
> > 2928 2995253 file1
> > 1720 2995253 file2
> > 1044 2995253 file3
> >
> > $ cp file1 copy1 # OK
> >
> > $ cp file2 copy2 # Hangs
> > ...[^C]
> >
> > $ cp file3 copy3 # Hangs
> > ...[^C]
> >
> > $ md5sum file? copy?
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *file1
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *file2
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *file3
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *copy1
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *copy2
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *copy3
> >
> > $ (sleep 2; pskill strace) & strace cp file3 copy3
> > ...
> >    47 2004141 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 0) # SEEK_SET
> >    46 2004187 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253 # EOF
> >    47 2004234 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 3) # SEEK_DATA
> >    46 2004280 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> >    47 2004327 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 4) # SEEK_HOLE
> >    46 2004373 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> >    46 2004419 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 0)
> >    51 2004470 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> >    47 2004517 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 3)
> >    47 2004564 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> >    47 2004611 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 4)
> >    46 2004657 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> > Process strace killed.
> >
> >
> > file1/2 are detected as possible sparse files but the optimized copy
> > algorithm does not properly handle the non-sparse case.
>
> Should be "file2/3" of course.
>
> > Upstream bug?
> >
>
> Possibly not. A closer look shows that the main loop in
> copy.c:lseek_copy() expects that SEEK_DATA fails with ENXIO at EOF.
>
> https://github.com/coreutils/coreutils/blob/v9.6/src/copy.c#L543
>
>   lseek_copy(..., off_t ext_start, ...)
>   {
>     ...
>     while (0 <= ext_start)
>       {
>        ...
>        ext_start = lseek (src_fd, dest_pos, SEEK_DATA);
>        if (ext_start < 0 && errno != ENXIO)
>          goto cannot_lseek;
>       }
>     ...
>   }
>
> This works on Linux (checked on Debian 12) but Cygwin returns the offset
> if it is equal to the file size.
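
For illustration only, here is a minimal sketch of an extent walk that does not depend on SEEK_DATA failing with ENXIO at EOF. This is not the coreutils code: count_data_extents() is a hypothetical helper, and the fallback values 3 and 4 for SEEK_DATA/SEEK_HOLE are an assumption taken from the strace output above (they are also the Linux values). The loop stops when the offset reaches the file size or when SEEK_HOLE makes no forward progress, so an lseek() that returns the offset at EOF cannot make it spin.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback for headers that hide these constants; 3 and 4 are the
   Linux/Cygwin values (assumption, matches the strace output). */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Hypothetical helper: count the data extents of fd.
   Returns the count, or -1 on an lseek() error other than ENXIO. */
long count_data_extents(int fd)
{
    off_t size = lseek(fd, 0, SEEK_END);
    if (size < 0)
        return -1;

    long extents = 0;
    off_t pos = 0;

    while (pos < size) {              /* explicit EOF guard */
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0) {
            if (errno == ENXIO)
                break;                /* no more data after pos */
            return -1;
        }
        if (data >= size)
            break;                    /* offset returned at EOF instead of ENXIO */

        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;
        if (hole <= data)
            break;                    /* no forward progress: do not loop forever */

        extents++;
        pos = hole;
    }
    return extents;
}
```

Calling count_data_extents(fd) on a small non-sparse file should report a single extent whether the platform signals EOF with ENXIO or by returning the offset.
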
>
> Recent POSIX says:
> "[ENXIO] The whence argument is SEEK_HOLE or SEEK_DATA, and offset is
> greater than or equal to the file size"
> https://pubs.opengroup.org/onlinepubs/9799919799/functions/lseek.html
>
> But (at least older) Linux man pages suggest that Cygwin's behavior may
> also be correct:
> "In the simplest implementation, a filesystem can support the operations
> by making ... SEEK_DATA always return offset."
> "ENXIO - whence is SEEK_DATA or SEEK_HOLE, and offset is beyond the end
> of the file"
> https://man7.org/linux/man-pages/man2/lseek.2.html
>
> Hmm... does "beyond" mean '>=' or '>' ?
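
To compare platforms on exactly this '>=' versus '>' question, a small probe along the following lines might help. It is only a sketch: seek_data_fails_at_eof() is a hypothetical name, and the fallback value 3 for SEEK_DATA is an assumption based on the strace output above. It calls lseek() with SEEK_DATA at offset == file size and reports whether the call fails with ENXIO or returns an offset.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback for headers that hide this constant; 3 is the
   Linux/Cygwin value (assumption, matches the strace output). */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#endif

/* Hypothetical probe for the behavior at offset == file size.
   Returns  1 if lseek(fd, size, SEEK_DATA) fails with ENXIO,
            0 if it succeeds and returns an offset,
           -1 on any other error. */
int seek_data_fails_at_eof(int fd)
{
    off_t size = lseek(fd, 0, SEEK_END);
    if (size < 0)
        return -1;

    errno = 0;
    off_t result = lseek(fd, size, SEEK_DATA);
    if (result >= 0)
        return 0;                     /* offset returned at EOF */

    return errno == ENXIO ? 1 : -1;   /* ENXIO is the POSIX-specified error */
}
```

Going by the report above, this should return 1 on Linux and 0 on the affected Cygwin version; running the same probe on Solaris or illumos would answer the question for those systems.
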

CC: illumos-dev list. How do Solaris and illumos behave? Sun/Solaris
invented SEEK_DATA/SEEK_HOLE, so apart from what the OpenGroup/POSIX
specs say, Solaris should be the reference implementation.

Ced
-- 
Cedric Blancher <cedric DOT blancher AT gmail DOT com>
[https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur

