Message-ID: <4100a583-7419-4bb6-bd19-aee154dbec3b@towo.net>
Date: Mon, 1 Jun 2026 18:12:55 +0200
Subject: Re: Thoughts on the wcwidth confusion
To: cygwin AT cygwin DOT com
References: <19c6f9b4-5f09-6929-891c-d25ebe48af82 AT wisemo DOT com>
From: Thomas Wolff via Cygwin
Reply-To: Thomas Wolff
List-Id: General Cygwin discussions and problem reports

On 01.06.2026 at 17:59, Thomas Wolff via Cygwin wrote:
> On 01.06.2026 at 17:34, Jakob Bohm via Cygwin wrote:
>> Dear list,
>>
>> Having read through the recent debate around the wcwidth() POSIX API,
>> wchar_t definitions, gcc-16 and cygwin, I have
>> an idea not mentioned in the list so far:
>>
>> Using C17 types char32_t and char16_t, the situation can be
>> summarized as follows:
>>
>> - Many, but not all POSIX systems define wchar_t as char32_t and thus
>>   wint_t as uint_least32_t
>>
>> - Win32 and thus Cygwin defines wchar_t as char16_t and thus wint_t as
>>   uint_least16_t
>>
>> - All systems considered treat wchar_t as Unicode, with Win32 supporting
>>   UTF-16 since NT 5.0 (Windows 2000).
>>
>> - For char16_t/UTF-16, wcwidth() should use the high surrogate to
>>   determine the range of Unicode symbols and return a width common to
>>   that range, then return 0 for the low surrogates, thereby allowing
>>   computation of string width without having to first assemble surrogates
>>   into full char32_t values.  Deciding if char32_t implementations should
>>   still lump groups of 4 Unicode rows for UTF-16 compatibility is up to
>>   each implementation.
> It's a neat idea to split the width calculation over the surrogates.
> Unfortunately it does not work this way because the width is not
> constant across the 1024-code-point block that shares one high
> surrogate. For example, U+1F4FC is Wide (W), U+1F4FD and U+1F4FE are
> Neutral (N, narrow), and U+1F4FF is W again.
> As a variant of your idea, wcwidth could return width 1 for every high
> surrogate, remember it, and if the subsequent invocation is a low
> surrogate, determine the combined width and return either 1 or 0.
> Not quite standard behaviour, I suspect, so maybe not a good idea for
> the purists, but maybe worth some discussion.

On the other hand, there are also combining characters outside the BMP,
so the only way this could work is to return width 0 for high surrogates
and then return the actual width of the combined character on the low
surrogate. That leaves the question of how to handle an (erroneously)
unpaired high surrogate...
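
To make that variant concrete, here is a minimal sketch, not a proposal for the actual newlib code. Everything named here is hypothetical: toy_c32width() stands in for a real width table and only knows the four code points mentioned above, and the state is passed explicitly (in the style of mbstate_t) instead of being remembered inside wcwidth(). Counting an unpaired surrogate as width 1 is just one possible answer to the open question, chosen here as an assumption.

```c
#include <stddef.h>
#include <uchar.h>

/* Hypothetical stand-in for a real width table: it only knows the code
 * points discussed in this thread and treats everything else as width 1
 * (a real table would also return 0 for combining characters). */
static int toy_c32width(char32_t c)
{
    if (c == 0x1F4FC || c == 0x1F4FF)
        return 2;               /* Wide */
    return 1;                   /* includes U+1F4FD and U+1F4FE */
}

/* Remembers a high surrogate seen by the previous call, or 0 if none. */
typedef struct {
    char16_t pending_high;
} c16width_state;

static int is_high(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Width contributed by one UTF-16 code unit, given the previous state. */
int c16width_r(char16_t u, c16width_state *st)
{
    int w = 0;

    if (st->pending_high != 0) {
        char16_t hi = st->pending_high;
        st->pending_high = 0;
        if (is_low(u)) {
            /* Complete pair: report the width of the combined character. */
            char32_t c = 0x10000
                + (((char32_t)(hi - 0xD800) << 10) | (char32_t)(u - 0xDC00));
            return toy_c32width(c);
        }
        w = 1;  /* assumption: an unpaired high surrogate counts as 1 */
    }
    if (is_high(u)) {
        st->pending_high = u;   /* contributes 0 until the low one arrives */
        return w;
    }
    if (is_low(u))
        return w + 1;           /* assumption: a stray low surrogate counts as 1 */
    return w + toy_c32width(u);
}

/* String width built on the per-unit function above. */
int c16swidth_sketch(const char16_t *s, size_t n)
{
    c16width_state st = { 0 };
    int total = 0;

    for (size_t i = 0; i < n; i++)
        total += c16width_r(s[i], &st);
    if (st.pending_high != 0)
        total += 1;             /* trailing unpaired high surrogate */
    return total;
}
```

With this sketch, feeding 0xD83D and then 0xDCFC (U+1F4FC) yields 0 and then 2, while 0xD83D followed by 0xDCFD (U+1F4FD) yields 0 and then 1. The explicit state argument is a deliberate choice: a wcwidth() that remembers the previous call internally would need hidden static or thread-local state.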
>>
>> A practical solution would be for Cygwin/newlib to provide new functions
>> c16width(), c32width(), c16swidth() and c32swidth(), each being the
>> explicit-size equivalent of the similarly named wc and wcs functions.
>>
>> Then wcwidth() can be a trivial inline alias of the explicit-size
>> equivalent for the compile target by having the newlib header check a
>> compiler or standard define indicating the chosen size of wchar_t.
>>
>> // possible wchar.h snippet
>> //
>> // C17+ required
>> // For C2Y+ this should go in uchar.h
>> //
>> #include <uchar.h>  // char16_t, char32_t
>>
>> int c16width(char16_t c);
>> int c32width(char32_t c);
>> int c16swidth(const char16_t *s, size_t n);
>> int c32swidth(const char32_t *s, size_t n);
>>
>> // ...
>>
>> // This belongs in wchar.h for C1x- compat
>> // (static inline, so no separate external definition is needed)
>> //
>> #if SOMETHING_MEANING_16bit_WCHAR_T
>> static inline int wcwidth(wchar_t c) {
>>   return c16width(c);
>> }
>> static inline int wcswidth(const wchar_t *s, size_t n)
>> {
>>   return c16swidth(s, n);
>> }
>> #else
>> static inline int wcwidth(wchar_t c) {
>>   return c32width(c);
>> }
>> static inline int wcswidth(const wchar_t *s, size_t n)
>> {
>>   return c32swidth(s, n);
>> }
>> #endif
>>
>>
>> Enjoy
>>
>> Jakob
>
>

-- 
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
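
On the size check in the quoted header snippet: one way such a condition could be written, as a sketch only, is to test the standard WCHAR_MAX macro in the preprocessor. The names SKETCH_WCHAR_IS_16BIT and sketch_wchar_bits() are made up for illustration, and the sketch assumes wchar_t is either 16 or 32 bits wide, as in the cases discussed in this thread.

```c
#include <stdint.h>   /* WCHAR_MAX */
#include <wchar.h>    /* wchar_t */

/* Hypothetical macro: 1 if wchar_t cannot hold values above 0xFFFF. */
#if WCHAR_MAX <= 0xFFFF
#define SKETCH_WCHAR_IS_16BIT 1
#else
#define SKETCH_WCHAR_IS_16BIT 0
#endif

/* Reports which branch the preprocessor selected: 16 or 32. */
int sketch_wchar_bits(void)
{
    return SKETCH_WCHAR_IS_16BIT ? 16 : 32;
}
```

GCC and Clang also predefine __SIZEOF_WCHAR_T__, which could serve the same purpose without including any header, at the cost of being compiler-specific.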