Mail Archives: cygwin/2025/02/24/12:06:29

Date: Mon, 24 Feb 2025 18:05:54 +0100
To: cygwin AT cygwin DOT com
Subject: Re: Bash 5 bug in Cygwin only: empty pattern inside a regular expression
From: Corinna Vinschen via Cygwin <cygwin AT cygwin DOT com>

On Feb 24 17:01, LLoyd via Cygwin wrote:
> Hello.
> 
> I'll try to keep this short:
> In Cygwin only, using bash 5.2.21-1 or 5.2.15-3 (the only 5.* versions
> available), "empty" in a regular expression is not properly matched
> and breaks the regular expression.
> It's not a quoting issue, I also tested with:
> regex='foo|'; [[ foo =~ $regex ]]
> 
> GNU bash, version 5.2.15(3)-release (x86_64-pc-cygwin)
> GNU bash, version 5.2.21(1)-release (x86_64-pc-cygwin)
> [[ foo =~ foo| ]] (is false, should be true)
> [[ foo =~ foo|a ]] (is true)
> [[ '' =~ foo| ]] (is false, should be true)
> 
> GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
> GNU bash, version 5.2.21(1)-release (x86_64-pc-linux-gnu)
> [[ foo =~ foo| ]] (is true)
> [[ foo =~ foo|a ]] (is true)
> [[ '' =~ foo| ]] (is true)

This isn't actually a bug in Cygwin's bash, but a characteristic of the
underlying FreeBSD-based regex library.  Per POSIX, empty branches in a
regular expression are undefined.  Quoting from the Open Group:

  The <vertical-line> is special except when used in a bracket
  expression (see 9.3.5 RE Bracket Expression). A <vertical-line>
  appearing first or last in an ERE, or immediately following a
  <vertical-line> or a <left-parenthesis>, or immediately preceding a
  <right-parenthesis>, produces undefined results.

  (https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap09.html#tag_09_04_03)
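The positions listed in that paragraph can be checked mechanically.  The
following is only a rough sketch: has_empty_branch is a hypothetical
helper name, not part of bash or any library, and it deliberately ignores
bracket expressions and backslash-escaped characters, so it can report
false positives for patterns such as 'a[|]' or 'a\|'.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if the ERE in $1 has a '|' in one of the
# positions POSIX calls undefined (first, last, after '|' or '(', or
# before ')').  Simplification: bracket expressions and escapes are not
# parsed, so '[|]' or '\|' may be flagged incorrectly.
has_empty_branch() {
    case $1 in
        '|'* | *'|' | *'||'* | *'(|'* | *'|)'*) return 0 ;;
        *) return 1 ;;
    esac
}

for re in 'foo|' 'foo|a' '(|a)' 'a||b'; do
    if has_empty_branch "$re"; then
        printf '%s: undefined per POSIX\n' "$re"
    else
        printf '%s: ok\n' "$re"
    fi
done
```

A check like this could run before a pattern is handed to [[ ... =~ ... ]]
in scripts that have to behave the same on Cygwin and Linux.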

Funnily enough, even `man 7 regex' contains a description along the lines
of POSIX:

  A (modern) RE is one(!) or more nonempty(!) branches, separated by
  '|'.  It matches anything that matches one of the branches.

  (https://man7.org/linux/man-pages/man7/regex.7.html)

Given that, empty branches are a glibc extension that other regex
libraries do not necessarily support.
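
If the intent is "match foo or nothing", one portable way to write it is
an optional group rather than an empty branch.  This is a minimal sketch
under that assumption; matches_foo_or_empty is a hypothetical helper
name, and the anchors are added here so that "nothing" means the empty
string only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $1 is empty or exactly "foo".
# '^(foo)?$' uses an optional group instead of the empty branch in
# 'foo|', so it stays within what POSIX defines for EREs.
matches_foo_or_empty() {
    local re='^(foo)?$'
    [[ $1 =~ $re ]]
}

matches_foo_or_empty foo && echo "foo matches"
matches_foo_or_empty ''  && echo "empty string matches"
matches_foo_or_empty bar || echo "bar does not match"
```

Keeping the pattern in a variable avoids any quoting questions on the
right-hand side of =~.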


Corinna
