Date: Sat, 15 Feb 2025 08:42:00 +0100
Subject: Re: SEEK_DATA should fail at EOF (was: coreutils-9.6-1 (TEST): cp: infinite SEEK_SET/DATA/HOLE loop if file is compressed)
To: cygwin AT cygwin DOT com, illumos-dev
List-Id: General Cygwin discussions and problem reports
From: Cedric Blancher via Cygwin
Reply-To:
 Cedric Blancher

On Fri, 14 Feb 2025 at 12:25, Christian Franke via Cygwin wrote:
>
> Christian Franke via Cygwin wrote:
> > Testcase:
> >
> > $ uname -r
> > 3.5.7-1.x86_64
> >
> > $ cygcheck -f /bin/cp.exe
> > coreutils-9.6-1
> >
> > $ for i in 1 2 3; do cat /bin/cygwin1.dll > file$i; done
> >
> > $ compact /C file2 # NTFS compression
> > ... (1.7 : 1) ...
> >
> > $ compact /C /EXE:LZX file3 # Compact OS LZX compression
> > ... (2.8 : 1) ...
> >
> > $ stat -c '%b %s %n' file?
> > 2928 2995253 file1
> > 1720 2995253 file2
> > 1044 2995253 file3
> >
> > $ cp file1 copy1 # OK
> >
> > $ cp file2 copy2 # Hangs
> > ...[^C]
> >
> > $ cp file3 copy3 # Hangs
> > ...[^C]
> >
> > $ md5sum file? copy?
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *file1
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *file2
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *file3
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *copy1
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *copy2
> > 2954646a9a0fe4579c3fc1f44dd4bb6a *copy3
> >
> > $ (sleep 2; pskill strace) & strace cp file3 copy3
> > ...
> > 47 2004141 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 0) # SEEK_SET
> > 46 2004187 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253 # EOF
> > 47 2004234 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 3) # SEEK_DATA
> > 46 2004280 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> > 47 2004327 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 4) # SEEK_HOLE
> > 46 2004373 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> > 46 2004419 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 0)
> > 51 2004470 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> > 47 2004517 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 3)
> > 47 2004564 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> > 47 2004611 [main] cp 5546 lseek: 2995253 = lseek(3, 2995253, 4)
> > 46 2004657 [main] cp 5546 fhandler_base::lseek: setting file pointer to 2995253
> > Process strace killed.
> >
> >
> > file1/2 are detected as possible sparse files but the optimized copy
> > algorithm does not properly handle the non-sparse case.
>
> Should be "file2/3" of course.
>
> > Upstream bug?
> >
>
> Possibly not. A closer look shows that the main loop in
> copy.c:lseek_copy() expects that SEEK_DATA fails with ENXIO at EOF.
>
> https://github.com/coreutils/coreutils/blob/v9.6/src/copy.c#L543
>
> lseek_copy(..., off_t ext_start, ...)
> {
>   ...
>   while (0 <= ext_start)
>     {
>       ...
>       ext_start = lseek (src_fd, dest_pos, SEEK_DATA);
>       if (ext_start < 0 && errno != ENXIO)
>         goto cannot_lseek;
>     }
>   ...
> }
>
> This works on Linux (checked on Debian 12) but Cygwin returns the offset
> if it is equal to the file size.
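
For anyone who wants to check another platform, here is a minimal sketch
of a probe for the behaviour described above. It is not taken from
coreutils or Cygwin. The helper names (make_probe_file, seek_data_at_eof,
count_data_extents) and the /tmp path are made up for illustration, and
the fallback values 3 and 4 for SEEK_DATA/SEEK_HOLE are an assumption
based on the whence numbers visible in the strace output.

```c
#define _GNU_SOURCE             /* lets glibc expose SEEK_DATA / SEEK_HOLE */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA               /* assumption: same numbers as in the strace */
# define SEEK_DATA 3
# define SEEK_HOLE 4
#endif

/* Hypothetical helper: create an unlinked temp file holding len bytes.
   Returns an open descriptor, or -1 on failure. */
int make_probe_file(const void *data, size_t len)
{
    char path[] = "/tmp/seek_probe_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* the file lives until fd is closed */
    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Probe SEEK_DATA exactly at EOF.
   Returns 1 if lseek fails with ENXIO (what the quoted loop expects),
   0 if it returns an offset (the Cygwin behaviour reported above),
   -1 on any other error. */
int seek_data_at_eof(int fd)
{
    off_t size = lseek(fd, 0, SEEK_END);
    if (size < 0)
        return -1;
    off_t pos = lseek(fd, size, SEEK_DATA);
    if (pos >= 0)
        return 0;
    return errno == ENXIO ? 1 : -1;
}

/* Walk data extents like a sparse-aware copy would, but stop instead of
   spinning when no forward progress is made.
   Returns the extent count, -1 on an lseek error, -2 if stalled. */
long count_data_extents(int fd)
{
    long extents = 0;
    off_t pos = 0;
    for (;;) {
        off_t data = lseek(fd, pos, SEEK_DATA);
        if (data < 0)
            return errno == ENXIO ? extents : -1;  /* ENXIO: no more data */
        off_t hole = lseek(fd, data, SEEK_HOLE);
        if (hole < 0)
            return -1;
        if (hole <= data)
            return -2;          /* empty extent: SEEK_DATA did not fail at EOF */
        assert(hole > pos);     /* every completed pass moves forward */
        extents++;
        pos = hole;
    }
}
```

Calling seek_data_at_eof() on a descriptor from make_probe_file("hello", 5)
should distinguish the two behaviours discussed in this thread. With the
lseek results shown in the strace, count_data_extents() would return -2
rather than loop forever, because SEEK_DATA and SEEK_HOLE both return the
file size at EOF.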
>
> Recent POSIX says:
> "[ENXIO] The whence argument is SEEK_HOLE or SEEK_DATA, and offset is
> greater than or equal to the file size"
> https://pubs.opengroup.org/onlinepubs/9799919799/functions/lseek.html
>
> But (at least older) Linux man pages suggest that Cygwin behavior may be
> correct also:
> "In the simplest implementation, a filesystem can support the operations
> by making ... SEEK_DATA always return offset."
> "ENXIO - whence is SEEK_DATA or SEEK_HOLE, and offset is beyond the end
> of the file"
> https://man7.org/linux/man-pages/man2/lseek.2.html
>
> Hmm... does "beyond" mean '>=' or '>' ?

cc: illumos-dev@ list.

How does Solaris or Illumos behave? SUN/Solaris invented
SEEK_DATA/SEEK_HOLE, so this should be - aside from looking at the
OpenGroup/POSIX specs - the reference implementation.

Ced
-- 
Cedric Blancher [https://plus.google.com/u/0/+CedricBlancher/]
Institute Pasteur

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple