From patchwork Tue Jan 27 19:56:41 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 129075
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Paul Zimmermann, Joseph Myers, DJ Delorie
Subject: [PATCH 1/5] math: Sync log1pf with CORE-MATH
Date: Tue, 27 Jan 2026 19:56:41 +0000
Message-ID: <20260127195741.2513011-2-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260127195741.2513011-1-adhemerval.zanella@linaro.org>
References: <20260127195741.2513011-1-adhemerval.zanella@linaro.org>

Performance is similar overall, with minor regressions on some targets:

latency                 master     sync  improvement
x86_64                 38.2841  38.1375        0.38%
x86_64v2               37.7338  37.4292        0.81%
x86_64v3               31.3500  32.3576       -3.21%
aarch64                13.7384  13.9030       -1.20%
armhf-vfpv4            15.5730  16.5105       -6.02%
powerpc64le             7.6038   7.5757        0.37%

reciprocal-throughput   master     sync  improvement
x86_64                 12.4910  11.9683        4.18%
x86_64v2               12.2935  11.7614        4.33%
x86_64v3               11.5444  10.6369        7.86%
aarch64                 7.7262   7.8954       -2.19%
armhf-vfpv4             8.3502   8.8741       -6.27%
powerpc64le             3.5883   3.5259        1.74%

x86_64 / i686: gcc version 15.2.1 20260112, Ryzen 5900X
aarch64: gcc version 15.2.1 20251105, Neoverse-N1
armv7a-vfpv4: gcc version 15.2.1 20251105, Neoverse-N1
powerpc64le: gcc version 14.2.1 20241230, POWER10

The sync also reduces the internal table size; the 'size' output for
s_log1pf.os shows:

size                    master     sync  improvement
x86_64                    2078     1641       21.03%
x86_64v2                  2078     1641       21.03%
x86_64v3                  1975     1514       23.34%
aarch64                   1808     1336       26.11%
armhf-vfpv4               1716     1284       25.17%
powerpc64le               2132     1616       24.20%

Checked on aarch64-linux-gnu, arm-linux-gnueabihf,
powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu.
---
 SHARED-FILES                      |   2 +-
 math/auto-libm-test-in            |   1 +
 math/auto-libm-test-out-log1p     |  25 +++++
 sysdeps/ieee754/flt-32/s_log1pf.c | 153 +++++++++++++-----------------
 4 files changed, 91 insertions(+), 90 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES index ef66cde939..ffc2d6ec99 100644 --- a/SHARED-FILES +++ b/SHARED-FILES @@ -296,7 +296,7 @@ core-math: sysdeps/ieee754/flt-32/s_expm1f.c # src/binary32/log10p1/log10p1f.c revision bc385c2 sysdeps/ieee754/flt-32/s_log10p1f.c - # src/binary32/log1p/log1pf.c revision bc385c2 + # src/binary32/log1p/log1pf.c revision 24ef43a1 sysdeps/ieee754/flt-32/s_log1pf.c # src/binary32/log2p1/log2p1f.c revision bc385c2 sysdeps/ieee754/flt-32/s_log2p1f.c diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in index 2001baa605..4fd72b3ab6 100644 --- a/math/auto-libm-test-in +++ b/math/auto-libm-test-in @@ -7650,6 +7650,7 @@ log1p -0x4.f37d3c9ce0b14bdd86eb157df5d4p-4 log1p 0x7.2eca50c4d93196362b4f37f6e8dcp-4 log1p -0x6.3fef3067427e43dfcde9e48f74bcp-4 log1p 0x6.af53d00fd2845d4772260ef5adc4p-4 +log1p
-0x1.fffffcp-127 log2 1 log2 e diff --git a/math/auto-libm-test-out-log1p b/math/auto-libm-test-out-log1p index f7d3b35e6d..1b5326a2d1 100644 --- a/math/auto-libm-test-out-log1p +++ b/math/auto-libm-test-out-log1p @@ -2711,3 +2711,28 @@ log1p 0x6.af53d00fd2845d4772260ef5adc4p-4 = log1p tonearest ibm128 0x6.af53d00fd2845d4772260ef5acp-4 : 0x5.95f3ec4683fa14a354007a53e8p-4 : inexact-ok = log1p towardzero ibm128 0x6.af53d00fd2845d4772260ef5acp-4 : 0x5.95f3ec4683fa14a354007a53e8p-4 : inexact-ok = log1p upward ibm128 0x6.af53d00fd2845d4772260ef5acp-4 : 0x5.95f3ec4683fa14a354007a53eap-4 : inexact-ok +log1p -0x1.fffffcp-127 += log1p downward binary32 -0x3.fffff8p-128 : -0x4p-128 : inexact-ok underflow-ok errno-erange-ok += log1p tonearest binary32 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok underflow-ok errno-erange-ok += log1p towardzero binary32 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok underflow-ok errno-erange-ok += log1p upward binary32 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok underflow-ok errno-erange-ok += log1p downward binary64 -0x3.fffff8p-128 : -0x3.fffff80000002p-128 : inexact-ok += log1p tonearest binary64 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p towardzero binary64 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p upward binary64 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p downward intel96 -0x3.fffff8p-128 : -0x3.fffff80000000004p-128 : inexact-ok += log1p tonearest intel96 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p towardzero intel96 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p upward intel96 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p downward m68k96 -0x3.fffff8p-128 : -0x3.fffff80000000004p-128 : inexact-ok += log1p tonearest m68k96 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p towardzero m68k96 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p upward m68k96 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p downward binary128 
-0x3.fffff8p-128 : -0x3.fffff80000000000000000000002p-128 : inexact-ok += log1p tonearest binary128 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p towardzero binary128 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p upward binary128 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p downward ibm128 -0x3.fffff8p-128 : -0x3.fffff800000000000000000001p-128 : inexact-ok += log1p tonearest ibm128 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p towardzero ibm128 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok += log1p upward ibm128 -0x3.fffff8p-128 : -0x3.fffff8p-128 : inexact-ok diff --git a/sysdeps/ieee754/flt-32/s_log1pf.c b/sysdeps/ieee754/flt-32/s_log1pf.c index f03c60df34..4cd5fa10aa 100644 --- a/sysdeps/ieee754/flt-32/s_log1pf.c +++ b/sysdeps/ieee754/flt-32/s_log1pf.c @@ -4,7 +4,7 @@ Copyright (c) 2023, 2024 Alexei Sibidanov. This file is part of the CORE-MATH project -project (file src/binary32/log1p/log1pf.c revision bc385c2). +project (file src/binary32/log1p/log1pf.c revision 24ef43a1). 
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal @@ -48,45 +48,33 @@ as_special (float x) float __log1pf (float x) { + /* the reciprocal 1/(1+(j+0.5)/32) is rounded to 23 bits */ static const double x0[] = - { - 0x1.f81f82p-1, 0x1.e9131acp-1, 0x1.dae6077p-1, 0x1.cd85689p-1, - 0x1.c0e0704p-1, 0x1.b4e81b5p-1, 0x1.a98ef6p-1, 0x1.9ec8e95p-1, - 0x1.948b0fdp-1, 0x1.8acb90fp-1, 0x1.8181818p-1, 0x1.78a4c81p-1, - 0x1.702e05cp-1, 0x1.6816817p-1, 0x1.605816p-1, 0x1.58ed231p-1, - 0x1.51d07ebp-1, 0x1.4afd6ap-1, 0x1.446f865p-1, 0x1.3e22cbdp-1, - 0x1.3813814p-1, 0x1.323e34ap-1, 0x1.2c9fb4ep-1, 0x1.27350b9p-1, - 0x1.21fb781p-1, 0x1.1cf06aep-1, 0x1.1811812p-1, 0x1.135c811p-1, - 0x1.0ecf56cp-1, 0x1.0a6810ap-1, 0x1.0624dd3p-1, 0x1.0204081p-1 - }; - static const double lixb[] = - { - 0x1.fc0a8909b4218p-7, 0x1.77458f51aac89p-5, 0x1.341d793afb997p-4, - 0x1.a926d3a5ebd2ap-4, 0x1.0d77e7a8a823dp-3, 0x1.44d2b6c557102p-3, - 0x1.7ab89040accecp-3, 0x1.af3c94ecab3d6p-3, 0x1.e27076d54e6c9p-3, - 0x1.0a324e3888ad5p-2, 0x1.22941fc0c7357p-2, 0x1.3a64c56ae3fdbp-2, - 0x1.51aad874af21fp-2, 0x1.686c81d300eap-2, 0x1.7eaf83c7fa9b5p-2, - 0x1.947941aa610ecp-2, 0x1.a9cec9a3f023bp-2, 0x1.beb4d9ea4156ep-2, - 0x1.d32fe7f35e5c7p-2, 0x1.e7442617b817ap-2, 0x1.faf588dd5ed1p-2, - 0x1.0723e5c635c39p-1, 0x1.109f39d53c99p-1, 0x1.19ee6b38a4668p-1, - 0x1.23130d7f93c3bp-1, 0x1.2c0e9ec9b0b85p-1, 0x1.34e289cb35eccp-1, - 0x1.3d9026ad3d3f3p-1, 0x1.4618bc1eadbbbp-1, 0x1.4e7d8127dd8a9p-1, - 0x1.56bf9d5967092p-1, 0x1.5ee02a926936ep-1 - }; - static const double lix[] = - { - 0x1.fc0a890fc03e4p-7, 0x1.77458f532dcfcp-5, 0x1.341d793bbd1d1p-4, - 0x1.a926d3a6ad563p-4, 0x1.0d77e7a908e59p-3, 0x1.44d2b6c5b7d1ep-3, - 0x1.7ab890410d909p-3, 0x1.af3c94ed0bff3p-3, 0x1.e27076d5af2e6p-3, - 0x1.0a324e38b90e3p-2, 0x1.22941fc0f7966p-2, 0x1.3a64c56b145eap-2, - 0x1.51aad874df82dp-2, 0x1.686c81d3314afp-2, 0x1.7eaf83c82afc3p-2, - 
0x1.947941aa916fbp-2, 0x1.a9cec9a42084ap-2, 0x1.beb4d9ea71b7cp-2, - 0x1.d32fe7f38ebd5p-2, 0x1.e7442617e8788p-2, 0x1.faf588dd8f31fp-2, - 0x1.0723e5c64df4p-1, 0x1.109f39d554c97p-1, 0x1.19ee6b38bc96fp-1, - 0x1.23130d7fabf43p-1, 0x1.2c0e9ec9c8e8cp-1, 0x1.34e289cb4e1d3p-1, - 0x1.3d9026ad556fbp-1, 0x1.4618bc1ec5ec2p-1, 0x1.4e7d8127f5bb1p-1, - 0x1.56bf9d597f399p-1, 0x1.5ee02a9281675p-1 + { 0x1.f81f80p-1, 0x1.e9131cp-1, 0x1.dae608p-1, 0x1.cd8568p-1, + 0x1.c0e070p-1, 0x1.b4e81cp-1, 0x1.a98ef8p-1, 0x1.9ec8e8p-1, + 0x1.948b10p-1, 0x1.8acb90p-1, 0x1.818180p-1, 0x1.78a4c8p-1, + 0x1.702e04p-1, 0x1.681680p-1, 0x1.605818p-1, 0x1.58ed24p-1, + 0x1.51d080p-1, 0x1.4afd6cp-1, 0x1.446f88p-1, 0x1.3e22ccp-1, + 0x1.381380p-1, 0x1.323e34p-1, 0x1.2c9fb4p-1, 0x1.27350cp-1, + 0x1.21fb78p-1, 0x1.1cf06cp-1, 0x1.181180p-1, 0x1.135c80p-1, + 0x1.0ecf58p-1, 0x1.0a6810p-1, 0x1.0624dcp-1, 0x1.020408p-1 }; + + /* the logarithm of the reciprocal is offset by 0x1.7654p-37 so + log1p_fast(x) - log1p(x) > 0 */ + static const double lix[] = { + 0x1.fc0b0b1599ce4p-7, 0x1.77457a64a42abp-5, 0x1.341d74627847dp-4, + 0x1.a926d8a568810p-4, 0x1.0d77e8cd667aap-3, 0x1.44d2b38d15679p-3, + 0x1.7ab886a16b2b3p-3, 0x1.af3c9b686996dp-3, 0x1.e27075e30cc37p-3, + 0x1.0a3250a767d98p-2, 0x1.229423bd2662ep-2, 0x1.3a64c596c3292p-2, + 0x1.51aadd530e505p-2, 0x1.686c85e9e0177p-2, 0x1.7eaf7df859caep-2, + 0x1.94793ee2403b3p-2, 0x1.a9cec5a9cf512p-2, 0x1.beb4d3baa086fp-2, + 0x1.d32fe2a03d8b5p-2, 0x1.e744257d97431p-2, 0x1.faf58cf7bdfe7p-2, + 0x1.0723e6d1e5599p-1, 0x1.109f3b52ec2f3p-1, 0x1.19ee6a7693fc5p-1, + 0x1.23130d9c03597p-1, 0x1.2c0e9cc4604f1p-1, 0x1.34e28bd9e5837p-1, + 0x1.3d9028a72cd5fp-1, 0x1.4618b9c1dd52dp-1, 0x1.4e7d825b8d20bp-1, + 0x1.56bf9fab56a02p-1, 0x1.5ee02ab258ccap-1 + }; static const double b[] = { 0x1p+0, @@ -98,14 +86,23 @@ __log1pf (float x) 0x1.24adeca50e2bcp-3, -0x1.001ba33bf57cfp-3 }; + static const double c[] = + { + 0x1.ffffffe1eac82p-1, -0x1.ffffff7da1724p-2, 0x1.5564d8fa59d0cp-2, + -0x1.001219d3dba2ap-2 + }; 
double z = x; uint32_t ux = asuint (x); + if (__glibc_unlikely (ux >= 0xbf800000u)) + return as_special (x); // x<=-1, x=-inf, x=-nan uint32_t ax = ux & (~0u >> 1); - if (__glibc_likely (ax < 0x3c880000)) - { - if (__glibc_unlikely (ax < 0x33000000)) - { + if (__glibc_unlikely (ax >= 0x7f800000u)) + return as_special (x); // x=+inf, x=+nan + if (__glibc_likely (ax < 0x3c880000u)) + { // |x| < 0x1.1p-6 + if (__glibc_unlikely (ax < 0x33000000u)) + { // |x| < 0x1p-25 if (!ax) return x; return fmaf (x, -x, x); @@ -113,65 +110,43 @@ __log1pf (float x) double z2 = z * z, z4 = z2 * z2; double f = z2 * ((b[1] + z * b[2]) + z2 * (b[3] + z * b[4]) - + z4 * ((b[5] + z * b[6]) + z2 * b[7])); + + z4 * (b[5] + z * (b[6] + z * b[7]))); double r = z + f; - if (__glibc_unlikely ((asuint64 (r) & 0xfffffffll) == 0)) + if (__glibc_unlikely ((asuint64 (r) & 0xfffffffll) == 0)) r += 0x1p14 * (f + (z - r)); return r; } else { - if (__glibc_unlikely (ux >= 0xbf800000u || ax >= 0x7f800000)) - return as_special (x); - uint64_t tp = asuint64 (z + 1); - int e = tp >> 52; - uint64_t m52 = tp & (~(uint64_t) 0 >> 12); - unsigned int j = (tp >> (52 - 5)) & 31; - e -= 0x3ff; - double xd = asdouble (m52 | ((uint64_t) 0x3ff << 52)); - z = xd * x0[j] - 1; - static const double c[] = - { - -0x1.3902c33434e7fp-43, 0x1.ffffffe1cbed5p-1, -0x1.ffffff7d1b014p-2, - 0x1.5564e0ed3613ap-2, -0x1.0012232a00d4ap-2 - }; + uint64_t tp = asuint64 (z + 1.0); + int e = (tp >> 52) - 0x3ff; + uint64_t m52 = tp & (~0ull >> 12); + unsigned j = (tp >> (52 - 5)) & 31; + double xd = asdouble (m52 | UINT64_C (0x3ff) << 52); + z = xd * x0[j] - 1; // z is exact for x<0x1.0cp+30 const double ln2 = 0x1.62e42fefa39efp-1; double z2 = z * z, - r = (ln2 * e + lixb[j]) - + z * ((c[1] + z * c[2]) + z2 * (c[3] + z * c[4])); - float ub = r; - float lb = r + 2.2e-11; + r = (ln2 * e + lix[j]) + + z * ((c[0] + z * c[1]) + z2 * (c[2] + z * c[3])); + const double eps = 2.1555e-11; + float ub = r, lb = r - eps; if (__glibc_unlikely (ub != lb)) {
double z4 = z2 * z2, - f = z - * ((b[0] + z * b[1]) + z2 * (b[2] + z * b[3]) - + z4 * ((b[4] + z * b[5]) + z2 * (b[6] + z * b[7]))); + f = z2 + * ((b[1] + z * b[2]) + z2 * (b[3] + z * b[4]) + + z4 * (b[5] + z * (b[6] + z * b[7]))); + double lj = lix[j] - 0x1.7654p-37; // subtract the offset const double ln2l = 0x1.7f7d1cf79abcap-20, ln2h = 0x1.62e4p-1; - double Lh = ln2h * e; - double Ll = ln2l * e; - double rl = f + Ll + lix[j]; - double tr = rl + Lh; - if (__glibc_unlikely ((asuint64 (tr) & 0xfffffffll) == 0)) - { - if (x == -0x1.247ab0p-6f) - return -0x1.271f0ep-6f - 0x1p-31f; - if (x == -0x1.3a415ep-5f) - return -0x1.407112p-5f + 0x1p-30f; - if (x == 0x1.fb035ap-2f) - return 0x1.9bddc2p-2f + 0x1p-27f; - tr += 64 * (rl + (Lh - tr)); - } - else if (rl + (Lh - tr) == 0.0) - { - if (x == 0x1.b7fd86p-4f) - return 0x1.a1ece2p-4f + 0x1p-29f; - if (x == -0x1.3a415ep-5f) - return -0x1.407112p-5f + 0x1p-30f; - if (x == 0x1.43c7e2p-6f) - return 0x1.409f80p-6f + 0x1p-31f; - } - ub = tr; + double Lh = ln2h * e, Ll = ln2l * e; + Ll += z; + double rh = Lh + lj, rl = ((Lh - rh) + lj) + (Ll + f); + float fh = rh + rl; + double Fl = (rh - (double) fh) + rl; + float fl = Fl, tfl = fl * 2.0f; + if ((fh + tfl) - fh == tfl) + fl += copysignf (0.5f, (float) (Fl - (double) fl)) * fabsf (fl); + ub = fh + fl; } return ub; }

From patchwork Tue Jan 27 19:56:42 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 129079
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Paul Zimmermann, Joseph Myers, DJ Delorie
Subject: [PATCH 2/5] math: Sync log2p1f with CORE-MATH
Date: Tue, 27 Jan 2026 19:56:42 +0000
Message-ID: <20260127195741.2513011-3-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260127195741.2513011-1-adhemerval.zanella@linaro.org>
References: <20260127195741.2513011-1-adhemerval.zanella@linaro.org>

The new code performs better overall; the only regression is the
powerpc64le reciprocal throughput:

latency                patched     sync  improvement
x86_64                 48.5909  33.3368       31.39%
x86_64v2               49.1357  33.9981       30.81%
x86_64v3               39.2397  28.0957       28.40%
aarch64                16.5372  12.8133       22.52%
armhf-vfpv4            18.1434  14.5273       19.93%
powerpc64le             9.0999  7.49235       17.67%

reciprocal-throughput  patched     sync  improvement
x86_64                 14.5197  10.9726       24.43%
x86_64v2               14.7640  11.1358       24.57%
x86_64v3               11.5523  9.83253       14.89%
aarch64                 8.2854   7.8479        5.28%
armhf-vfpv4             8.8586   8.5245        3.77%
powerpc64le             3.8995   4.0069       -2.75%

x86_64 / i686: gcc version 15.2.1 20260112, Ryzen 5900X
aarch64: gcc version 15.2.1 20251105, Neoverse-N1
armv7a-vfpv4: gcc version 15.2.1 20251105, Neoverse-N1
powerpc64le: gcc version 14.2.1 20241230, POWER10

The sync also reduces the internal table size; the 'size' output for
s_log2p1f.os shows:

size                    master     sync  improvement
x86_64                    3417     2089       38.86%
x86_64v2                  3417     2089       38.86%
x86_64v3                  3228     2001       38.01%
i686                      3490     2151       38.37%
aarch64                   3200     1888       41.00%
armhf-vfpv4               3080     1804       41.43%
powerpc64le               3408     2148       36.97%

Checked on aarch64-linux-gnu, arm-linux-gnueabihf,
powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu.
---
 SHARED-FILES                       |   2 +-
 math/auto-libm-test-in             |   1 +
 math/auto-libm-test-out-log2p1     |  25 +++
 sysdeps/ieee754/flt-32/s_log2p1f.c | 327 +++++++++++------------------
 4 files changed, 145 insertions(+), 210 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES index ffc2d6ec99..42215acedd 100644 --- a/SHARED-FILES +++ b/SHARED-FILES @@ -298,7 +298,7 @@ core-math: sysdeps/ieee754/flt-32/s_log10p1f.c # src/binary32/log1p/log1pf.c revision 24ef43a1 sysdeps/ieee754/flt-32/s_log1pf.c - # src/binary32/log2p1/log2p1f.c revision bc385c2 + # src/binary32/log2p1/log2p1f.c revision 3fbe16be sysdeps/ieee754/flt-32/s_log2p1f.c # src/binary32/sinpi/sinpif.c, revision bbfabd99d sysdeps/ieee754/flt-32/s_sinpif.c diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in index 4fd72b3ab6..a36ce9a081 100644 --- a/math/auto-libm-test-in +++ b/math/auto-libm-test-in @@ -7721,6 +7721,7 @@ log2p1 0x1p100 log2p1 0x1p1000 log2p1 0x6.a0cf42befce9ed4085ef59254b48p-4 log2p1 max +log2p1 0x1.62e42cp-127 # the following inputs yield large errors on x86_64 for binary32 log2p1 0x1.a69b4ap-2 log2p1 -0x1.2516d6p-2 diff --git a/math/auto-libm-test-out-log2p1 b/math/auto-libm-test-out-log2p1 index 3902600a34..343b1c5500 100644 --- a/math/auto-libm-test-out-log2p1 +++ b/math/auto-libm-test-out-log2p1 @@ -1439,6 +1439,31 @@ log2p1 max = log2p1 tonearest ibm128 0xf.ffffffffffffbffffffffffffcp+1020 : 0x3.fffffffffffffffa3aae26b51fp+8 :
inexact-ok = log2p1 towardzero ibm128 0xf.ffffffffffffbffffffffffffcp+1020 : 0x3.fffffffffffffffa3aae26b51fp+8 : inexact-ok = log2p1 upward ibm128 0xf.ffffffffffffbffffffffffffcp+1020 : 0x3.fffffffffffffffa3aae26b52p+8 : inexact-ok +log2p1 0x1.62e42cp-127 += log2p1 downward binary32 0x2.c5c858p-128 : 0x3.fffffp-128 : inexact-ok underflow errno-erange-ok += log2p1 tonearest binary32 0x2.c5c858p-128 : 0x3.fffff8p-128 : inexact-ok underflow errno-erange-ok += log2p1 towardzero binary32 0x2.c5c858p-128 : 0x3.fffffp-128 : inexact-ok underflow errno-erange-ok += log2p1 upward binary32 0x2.c5c858p-128 : 0x3.fffff8p-128 : inexact-ok underflow errno-erange-ok += log2p1 downward binary64 0x2.c5c858p-128 : 0x3.fffff4a49168ep-128 : inexact-ok += log2p1 tonearest binary64 0x2.c5c858p-128 : 0x3.fffff4a49169p-128 : inexact-ok += log2p1 towardzero binary64 0x2.c5c858p-128 : 0x3.fffff4a49168ep-128 : inexact-ok += log2p1 upward binary64 0x2.c5c858p-128 : 0x3.fffff4a49169p-128 : inexact-ok += log2p1 downward intel96 0x2.c5c858p-128 : 0x3.fffff4a49168f028p-128 : inexact-ok += log2p1 tonearest intel96 0x2.c5c858p-128 : 0x3.fffff4a49168f028p-128 : inexact-ok += log2p1 towardzero intel96 0x2.c5c858p-128 : 0x3.fffff4a49168f028p-128 : inexact-ok += log2p1 upward intel96 0x2.c5c858p-128 : 0x3.fffff4a49168f02cp-128 : inexact-ok += log2p1 downward m68k96 0x2.c5c858p-128 : 0x3.fffff4a49168f028p-128 : inexact-ok += log2p1 tonearest m68k96 0x2.c5c858p-128 : 0x3.fffff4a49168f028p-128 : inexact-ok += log2p1 towardzero m68k96 0x2.c5c858p-128 : 0x3.fffff4a49168f028p-128 : inexact-ok += log2p1 upward m68k96 0x2.c5c858p-128 : 0x3.fffff4a49168f02cp-128 : inexact-ok += log2p1 downward binary128 0x2.c5c858p-128 : 0x3.fffff4a49168f028883810ea0b0ep-128 : inexact-ok += log2p1 tonearest binary128 0x2.c5c858p-128 : 0x3.fffff4a49168f028883810ea0b1p-128 : inexact-ok += log2p1 towardzero binary128 0x2.c5c858p-128 : 0x3.fffff4a49168f028883810ea0b0ep-128 : inexact-ok += log2p1 upward binary128 0x2.c5c858p-128 : 
0x3.fffff4a49168f028883810ea0b1p-128 : inexact-ok += log2p1 downward ibm128 0x2.c5c858p-128 : 0x3.fffff4a49168f028883810ea0bp-128 : inexact-ok += log2p1 tonearest ibm128 0x2.c5c858p-128 : 0x3.fffff4a49168f028883810ea0bp-128 : inexact-ok += log2p1 towardzero ibm128 0x2.c5c858p-128 : 0x3.fffff4a49168f028883810ea0bp-128 : inexact-ok += log2p1 upward ibm128 0x2.c5c858p-128 : 0x3.fffff4a49168f028883810ea0cp-128 : inexact-ok log2p1 0x1.a69b4ap-2 = log2p1 downward binary32 0x6.9a6d28p-4 : 0x7.f9adfp-4 : inexact-ok = log2p1 tonearest binary32 0x6.9a6d28p-4 : 0x7.f9adf8p-4 : inexact-ok diff --git a/sysdeps/ieee754/flt-32/s_log2p1f.c b/sysdeps/ieee754/flt-32/s_log2p1f.c index d270db6375..e0bf3f4cca 100644 --- a/sysdeps/ieee754/flt-32/s_log2p1f.c +++ b/sysdeps/ieee754/flt-32/s_log2p1f.c @@ -1,10 +1,9 @@ -/* Correctly-rounded biased argument natural logarithm function for binary32 - value. +/* Correctly-rounded log2(1+x) function for binary32 value. -Copyright (c) 2022-2024 Alexei Sibidanov. +Copyright (c) 2022-2026 Alexei Sibidanov. This file is part of the CORE-MATH project -project (file src/binary32/log2p1/log2p1f.c revision bc385c2). +project (file src/binary32/log2p1/log2p1f.c revision 3fbe16be). Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal @@ -31,225 +30,135 @@ SOFTWARE. 
#include #include "math_config.h" +static __attribute__((noinline)) float +as_special (float x) +{ + uint32_t t = asuint (x); + if (t == 0xbf800000u) + return __math_divzerof (1); + if (t == 0x7f800000u) + return x; /* +inf */ + uint32_t ax = t << 1; + if (ax > 0xff000000u) + return x + x; /* nan */ + return __math_invalidf (0.0f); +} + float __log2p1f (float x) { - static const double ix[] = - { - 0x1p+0, 0x1.fc07f01fcp-1, 0x1.f81f81f82p-1, - 0x1.f44659e4ap-1, 0x1.f07c1f07cp-1, 0x1.ecc07b302p-1, - 0x1.e9131abfp-1, 0x1.e573ac902p-1, 0x1.e1e1e1e1ep-1, - 0x1.de5d6e3f8p-1, 0x1.dae6076bap-1, 0x1.d77b654b8p-1, - 0x1.d41d41d42p-1, 0x1.d0cb58f6ep-1, 0x1.cd8568904p-1, - 0x1.ca4b3055ep-1, 0x1.c71c71c72p-1, 0x1.c3f8f01c4p-1, - 0x1.c0e070382p-1, 0x1.bdd2b8994p-1, 0x1.bacf914c2p-1, - 0x1.b7d6c3ddap-1, 0x1.b4e81b4e8p-1, 0x1.b2036406cp-1, - 0x1.af286bca2p-1, 0x1.ac5701ac6p-1, 0x1.a98ef606ap-1, - 0x1.a6d01a6dp-1, 0x1.a41a41a42p-1, 0x1.a16d3f97ap-1, - 0x1.9ec8e951p-1, 0x1.9c2d14ee4p-1, 0x1.99999999ap-1, - 0x1.970e4f80cp-1, 0x1.948b0fcd6p-1, 0x1.920fb49dp-1, - 0x1.8f9c18f9cp-1, 0x1.8d3018d3p-1, 0x1.8acb90f6cp-1, - 0x1.886e5f0acp-1, 0x1.861861862p-1, 0x1.83c977ab2p-1, - 0x1.818181818p-1, 0x1.7f405fd02p-1, 0x1.7d05f417ep-1, - 0x1.7ad2208ep-1, 0x1.78a4c8178p-1, 0x1.767dce434p-1, - 0x1.745d1745ep-1, 0x1.724287f46p-1, 0x1.702e05c0cp-1, - 0x1.6e1f76b44p-1, 0x1.6c16c16c2p-1, 0x1.6a13cd154p-1, - 0x1.681681682p-1, 0x1.661ec6a52p-1, 0x1.642c8590cp-1, - 0x1.623fa7702p-1, 0x1.605816058p-1, 0x1.5e75bb8dp-1, - 0x1.5c9882b94p-1, 0x1.5ac056b02p-1, 0x1.58ed23082p-1, - 0x1.571ed3c5p-1, 0x1.555555556p-1, 0x1.5390948f4p-1, - 0x1.51d07eae2p-1, 0x1.501501502p-1, 0x1.4e5e0a73p-1, - 0x1.4cab88726p-1, 0x1.4afd6a052p-1, 0x1.49539e3b2p-1, - 0x1.47ae147aep-1, 0x1.460cbc7f6p-1, 0x1.446f86562p-1, - 0x1.42d6625d6p-1, 0x1.414141414p-1, 0x1.3fb013fbp-1, - 0x1.3e22cbce4p-1, 0x1.3c995a47cp-1, 0x1.3b13b13b2p-1, - 0x1.3991c2c18p-1, 0x1.381381382p-1, 0x1.3698df3dep-1, - 0x1.3521cfb2cp-1, 0x1.33ae45b58p-1, 
0x1.323e34a2cp-1, - 0x1.30d19013p-1, 0x1.2f684bda2p-1, 0x1.2e025c04cp-1, - 0x1.2c9fb4d82p-1, 0x1.2b404ad02p-1, 0x1.29e4129e4p-1, - 0x1.288b01288p-1, 0x1.27350b882p-1, 0x1.25e22708p-1, - 0x1.24924924ap-1, 0x1.23456789ap-1, 0x1.21fb78122p-1, - 0x1.20b470c68p-1, 0x1.1f7047dc2p-1, 0x1.1e2ef3b4p-1, - 0x1.1cf06ada2p-1, 0x1.1bb4a4046p-1, 0x1.1a7b9611ap-1, - 0x1.19453808cp-1, 0x1.181181182p-1, 0x1.16e068942p-1, - 0x1.15b1e5f76p-1, 0x1.1485f0e0ap-1, 0x1.135c81136p-1, - 0x1.12358e75ep-1, 0x1.111111112p-1, 0x1.0fef010fep-1, - 0x1.0ecf56be6p-1, 0x1.0db20a89p-1, 0x1.0c9714fbcp-1, - 0x1.0b7e6ec26p-1, 0x1.0a6810a68p-1, 0x1.0953f3902p-1, - 0x1.084210842p-1, 0x1.073260a48p-1, 0x1.0624dd2f2p-1, - 0x1.05197f7d8p-1, 0x1.041041042p-1, 0x1.03091b52p-1, - 0x1.020408102p-1, 0x1.01010101p-1, 0x1p-1 - }; + static const struct + { + float x; + float f, df; + } tb[] = { + { 0x1.7a13c6p+30, 0x1.e90026p+4, 0x1p-21 }, + { -0x1.da285cp-5, -0x1.60549p-4, 0x1p-29 }, + }; + // the reciprocal 1/(1+j/64) is rounded to 24 bits + static const double ix[] = { + 0x1p+0, 0x1.f81f82p-1, 0x1.f07c2p-1, 0x1.e9131ap-1, 0x1.e1e1e2p-1, + 0x1.dae608p-1, 0x1.d41d42p-1, 0x1.cd8568p-1, 0x1.c71c72p-1, 0x1.c0e07p-1, + 0x1.bacf92p-1, 0x1.b4e81cp-1, 0x1.af286cp-1, 0x1.a98ef6p-1, 0x1.a41a42p-1, + 0x1.9ec8eap-1, 0x1.99999ap-1, 0x1.948b1p-1, 0x1.8f9c18p-1, 0x1.8acb9p-1, + 0x1.861862p-1, 0x1.818182p-1, 0x1.7d05f4p-1, 0x1.78a4c8p-1, 0x1.745d18p-1, + 0x1.702e06p-1, 0x1.6c16c2p-1, 0x1.681682p-1, 0x1.642c86p-1, 0x1.605816p-1, + 0x1.5c9882p-1, 0x1.58ed24p-1, 0x1.555556p-1, 0x1.51d07ep-1, 0x1.4e5e0ap-1, + 0x1.4afd6ap-1, 0x1.47ae14p-1, 0x1.446f86p-1, 0x1.414142p-1, 0x1.3e22ccp-1, + 0x1.3b13b2p-1, 0x1.381382p-1, 0x1.3521dp-1, 0x1.323e34p-1, 0x1.2f684cp-1, + 0x1.2c9fb4p-1, 0x1.29e412p-1, 0x1.27350cp-1, 0x1.24924ap-1, 0x1.21fb78p-1, + 0x1.1f7048p-1, 0x1.1cf06ap-1, 0x1.1a7b96p-1, 0x1.181182p-1, 0x1.15b1e6p-1, + 0x1.135c82p-1, 0x1.111112p-1, 0x1.0ecf56p-1, 0x1.0c9714p-1, 0x1.0a681p-1, + 0x1.08421p-1, 0x1.0624dep-1, 0x1.041042p-1, 
0x1.020408p-1, 0x1p-1 + }; + + // the logarithm of the reciprocal is biased by 0x1.dp-45 so log2p1_fast(x) - + // log2p1(x) < 0 static const double lix[] = { - 0x0p+0, -0x1.6fe50b6f1eafap-7, -0x1.6e79685c160d5p-6, - -0x1.11cd1d51955bap-5, -0x1.6bad37591e03p-5, -0x1.c4dfab908ddb5p-5, - -0x1.0eb389fab4795p-4, -0x1.3aa2fdd26ae99p-4, -0x1.663f6faca846bp-4, - -0x1.918a16e4cb157p-4, -0x1.bc84240a78a13p-4, -0x1.e72ec1181cfb1p-4, - -0x1.08c588cd964e4p-3, -0x1.1dcd19759f2e3p-3, -0x1.32ae9e27627c6p-3, - -0x1.476a9f989a58ap-3, -0x1.5c01a39fa6533p-3, -0x1.70742d4eed455p-3, - -0x1.84c2bd02d6434p-3, -0x1.98edd077e9f0ap-3, -0x1.acf5e2db31eeap-3, - -0x1.c0db6cddaa82dp-3, -0x1.d49ee4c33121ap-3, -0x1.e840be751d775p-3, - -0x1.fbc16b9003e0bp-3, -0x1.0790adbae3fcp-2, -0x1.11307dad465b5p-2, - -0x1.1ac05b2924cc5p-2, -0x1.24407ab0cc41p-2, -0x1.2db10fc4ea424p-2, - -0x1.37124cea58697p-2, -0x1.406463b1d455dp-2, -0x1.49a784bcbaa37p-2, - -0x1.52dbdfc4f341dp-2, -0x1.5c01a39ff2c9bp-2, -0x1.6518fe46abaa5p-2, - -0x1.6e221cd9d6933p-2, -0x1.771d2ba7f5791p-2, -0x1.800a56315ee2ap-2, - -0x1.88e9c72df8611p-2, -0x1.91bba891d495fp-2, -0x1.9a8023920fa4dp-2, - -0x1.a33760a7fbca6p-2, -0x1.abe18797d2effp-2, -0x1.b47ebf734b923p-2, - -0x1.bd0f2e9eb2b84p-2, -0x1.c592fad2be1aap-2, -0x1.ce0a4923cf5e6p-2, - -0x1.d6753e02f4ebcp-2, -0x1.ded3fd445afp-2, -0x1.e726aa1e558fep-2, - -0x1.ef6d67325ba38p-2, -0x1.f7a8568c8aea6p-2, -0x1.ffd799a81be87p-2, - 0x1.f804ae8d33c4p-2, 0x1.efec61b04af4ep-2, 0x1.e7df5fe572606p-2, - 0x1.dfdd89d5b0009p-2, 0x1.d7e6c0abbd924p-2, 0x1.cffae611a74d6p-2, - 0x1.c819dc2d8578cp-2, 0x1.c043859e5bdbcp-2, 0x1.b877c57b47c04p-2, - 0x1.b0b67f4f29a66p-2, 0x1.a8ff97183ed07p-2, 0x1.a152f14293c74p-2, - 0x1.99b072a9289cap-2, 0x1.921800927e284p-2, 0x1.8a8980ac4113p-2, - 0x1.8304d90c2859dp-2, 0x1.7b89f02cbd49ap-2, 0x1.7418aceb84ab1p-2, - 0x1.6cb0f68656c95p-2, 0x1.6552b49993dc2p-2, 0x1.5dfdcf1eacd7bp-2, - 0x1.56b22e6b97c18p-2, 0x1.4f6fbb2ce6943p-2, 0x1.48365e6957b42p-2, - 0x1.4106017c0dbcfp-2, 
0x1.39de8e15727d9p-2, 0x1.32bfee37489bcp-2, - 0x1.2baa0c34989c3p-2, 0x1.249cd2b177fd5p-2, 0x1.1d982c9d50468p-2, - 0x1.169c0536677acp-2, 0x1.0fa848045f67bp-2, 0x1.08bce0d9a7c6p-2, - 0x1.01d9bbcf66a2cp-2, 0x1.f5fd8a90e2d85p-3, 0x1.e857d3d3af1e5p-3, - 0x1.dac22d3ec5f4ep-3, 0x1.cd3c712db459ap-3, 0x1.bfc67a7ff3c22p-3, - 0x1.b2602497678f4p-3, 0x1.a5094b555a1f8p-3, 0x1.97c1cb136b96fp-3, - 0x1.8a8980ac8652dp-3, 0x1.7d60496c83f66p-3, 0x1.7046031c7cdafp-3, - 0x1.633a8bf460335p-3, 0x1.563dc2a08b102p-3, 0x1.494f863bbc1dep-3, - 0x1.3c6fb6507a37ep-3, 0x1.2f9e32d5257ecp-3, 0x1.22dadc2a627efp-3, - 0x1.1625931802e49p-3, 0x1.097e38cef9519p-3, 0x1.f9c95dc138295p-4, - 0x1.e0b1ae90505f6p-4, 0x1.c7b528b5fcffap-4, 0x1.aed391abb17a1p-4, - 0x1.960caf9bd35eap-4, 0x1.7d60496e3edebp-4, 0x1.64ce26bf2108ep-4, - 0x1.4c560fe5b573bp-4, 0x1.33f7cde24adfbp-4, 0x1.1bb32a5ed9353p-4, - 0x1.0387efbd3006ep-4, 0x1.d6ebd1f1d0955p-5, 0x1.a6f9c37a8beabp-5, - 0x1.77394c9d6762cp-5, 0x1.47aa07358e1a4p-5, 0x1.184b8e4d490efp-5, - 0x1.d23afc4d95c78p-6, 0x1.743ee8678a7cbp-6, 0x1.16a21e243bf78p-6, - 0x1.72c7ba20c907ep-7, 0x1.720d9c0536e17p-8, 0x0p+0 + 0x1.dp-45, -0x1.6e7966ead50c5p-6, -0x1.6bad2043a6a91p-5, + -0x1.0eb392fe78f6fp-4, -0x1.663f6e3b3bd32p-4, -0x1.bc841cd433853p-4, + -0x1.08c587b8a7d19p-3, -0x1.32aea1c2dd96p-3, -0x1.5c01a22e687e4p-3, + -0x1.84c2be74443dap-3, -0x1.acf5de2afbd5ap-3, -0x1.d49ee012d2a36p-3, + -0x1.fbc16a1ed1966p-3, -0x1.11307dc445c4cp-2, -0x1.2440796db6523p-2, + -0x1.37124a7b0e1dap-2, -0x1.49a7834b7d089p-2, -0x1.5c01a2e712f36p-2, + -0x1.6e22207523bcdp-2, -0x1.800a59ccb4b43p-2, -0x1.91bba6c447a2fp-2, + -0x1.a3375ec336f01p-2, -0x1.b47ebfcfdd0dap-2, -0x1.c592fb2eea99p-2, + -0x1.d6753b20857bp-2, -0x1.e726a9208b01ep-2, -0x1.f7a8543486f32p-2, + -0x1.03fda781da376p-1, -0x1.0c104f268ec39p-1, -0x1.140c9fb5a8dafp-1, + -0x1.1bf31371c6a2p-1, -0x1.23c41b2f88f63p-1, -0x1.2b803302a31a2p-1, + -0x1.3327c82828c7dp-1, -0x1.3abb40a7ec0afp-1, -0x1.423b07f511315p-1, + -0x1.49a785d1d0f4ap-1, -0x1.51011934bf518p-1, 
-0x1.584820b2f56a4p-1, + -0x1.5f7cfece7619p-1, -0x1.66a00716cedc6p-1, -0x1.6db194ce2d40ap-1, + -0x1.74b1fcac361d3p-1, -0x1.7ba1911bb9cf6p-1, -0x1.82809cff91ccap-1, + -0x1.894f76c358469p-1, -0x1.900e62e869cdfp-1, -0x1.96bdabfeb6a28p-1, + -0x1.9d5d9dab023d9p-1, -0x1.a3ee7f670bf3cp-1, -0x1.aa708efbac12bp-1, + -0x1.b0e414a155bfcp-1, -0x1.b74949237d9f7p-1, -0x1.bda06f68b3e6ep-1, + -0x1.c3e9ca1704a0fp-1, -0x1.ca258b4fc9ea1p-1, -0x1.d053f44c0c9e7p-1, + -0x1.d675400a8d4b1p-1, -0x1.dc899d687d98ep-1, -0x1.e29144ae898b8p-1, + -0x1.e88c6ca77b00ep-1, -0x1.ee7b44ce9bdc6p-1, -0x1.f45e05f15ca47p-1, + -0x1.fa34e145a695p-1, -0x1.ffffffffffe3p-1 + }; + static const double b[] = { + 0x1.7154765bab3edp+0, -0x1.71574d692522fp-1, 0x1.ec60b55c8f05p-2 + }; + static const double c[] = { + 0x1.71547652b8314p+0, -0x1.71547652b7f67p-1, 0x1.ec709db872c6dp-2, + -0x1.715476b06590ep-2, 0x1.277c72c128c69p-2, -0x1.ec4ff30af701bp-3 + }; + static const double g[] = { + 0x1.4ae0bf64f73a1p-26, -0x1.71547652b82fap-1, 0x1.ec709dc3bd7dep-2, + -0x1.71547652e6faap-2, 0x1.2776c0ff5c16ep-2, -0x1.ec70942dfbb5bp-3, + 0x1.a673c6b6e2fa3p-3, -0x1.71b0db8113c46p-3 }; double z = x; uint32_t ux = asuint (x); + if (__glibc_unlikely (ux >= 0xbf800000u)) + return as_special (x); // x<=-1, x=-inf, x=-nan uint32_t ax = ux & (~0u >> 1); - if (__glibc_unlikely (ux >= 0x17fu << 23)) - { /* x <= -1 */ - if (ux == (0x17fu << 23)) - return __math_divzerof (1); - if (ux > (0x1ffu << 23)) - return x + x; /* nan */ - return __math_invalidf (x); - } - else if (__glibc_unlikely (ax >= (0xff << 23))) - { /* +inf, nan */ - if (ax > (0xff << 23)) - return x + x; /* nan */ - return INFINITY; - } - else if (__glibc_likely (ax < 0x3cb7aa26u)) - { /* |x| < 0x1.6f544cp-6 */ - double z2 = z * z, z4 = z2 * z2; - if ( __glibc_likely (ax < 0x3b9d9d34u)) - { /* |x| < 0x1.3b3a68p-8 */ - if (__glibc_likely (ax < 0x39638a7eu)) - { /* |x| < 0x1.c714fcp-13 */ - if (__glibc_likely (ax < 0x329c5639u)) - { /* |x| < 0x1.38ac72p-26 */ - static const double 
c[] = - { - 0x1.71547652b82fep+0, -0x1.71547652b82ffp-1 - }; - return z * (c[0] + z * c[1]); - } - else - { - if (__glibc_unlikely (ux == 0x32ff7045u)) - return 0x1.70851ap-25f - 0x1.8p-80f; - if (__glibc_unlikely (ux == 0xb395efbbu)) - return -0x1.b0a00ap-24f + 0x1p-76f; - if (__glibc_unlikely (ux == 0x35a14df7u)) - return 0x1.d16d2p-20f + 0x1p-72f; - if (__glibc_unlikely (ux == 0x3841cb81u)) - return 0x1.17949ep-14f + 0x1p-67f; - static const double c[] = - { - 0x1.71547652b82fep+0, -0x1.71547652b82fdp-1, - 0x1.ec709ead0c9a7p-2, -0x1.7154773c1cb29p-2 - }; - return z * ((c[0] + z * c[1]) + z2 * (c[2] + z * c[3])); - } - } - else - { - if (__glibc_unlikely (ux == 0xbac9363du)) - return -0x1.2282aap-9f + 0x1p-61f; - static const double c[] = - { - 0x1.71547652b82fep+0, -0x1.71547652b83p-1, - 0x1.ec709dc28f51bp-2, -0x1.7154765157748p-2, - 0x1.2778a510a3682p-2, -0x1.ec745df1551fcp-3 - }; - return z - * ((c[0] + z * c[1]) + z2 * (c[2] + z * c[3]) - + z4 * ((c[4] + z * c[5]))); - } + if (__glibc_unlikely (ax >= 0x7f800000u)) + return as_special (x); // x=+inf, x=+nan + if (__glibc_unlikely (ax < 0x3cc00000u)) + { // |x|<0.0234375 + if (__glibc_unlikely (ax <= 0x58b90bu)) + { // |x|<=0x1-126*ln(2) + if (ax == 0) + return x; // log2p1(-0.0) = -0.0 and log2p1(+0.0) = +0.0 + return z * 0x1.71547652b82fep+0; } else { - static const double c[] = - { - 0x1.71547652b82fep+0, -0x1.71547652b82fbp-1, - 0x1.ec709dc3b6a73p-2, -0x1.71547652dc09p-2, - 0x1.2776c1a88901p-2, -0x1.ec7095bd4d208p-3, - 0x1.a66bec7fc8f7p-3, -0x1.71a900fc3f3f9p-3 - }; - return z - * ((c[0] + z * c[1]) + z2 * (c[2] + z * c[3]) - + z4 * ((c[4] + z * c[5]) + z2 * (c[6] + z * c[7]))); - } - } - else - { /* |x| >= 0x1.6f544cp-6 */ - float h, l; - /* With gcc 6.3.0, if we return 0x1.e90026p+4f + 0x1.fp-21 - in the second exceptional case, with rounding up it yields - 0x1.e90026p+4 which is incorrect, thus we use this workaround. See - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112367. 
*/ - if (__glibc_unlikely (ux == 0x52928e33u)) - { - h = 0x1.318ffap+5f; - l = 0x1.fp-20f; - return h + l; + double z2 = z * z, z4 = z2 * z2; + double f = z + * ((g[0] + z * g[1]) + z2 * (g[2] + z * g[3]) + + z4 * ((g[4] + z * g[5]) + z2 * (g[6] + z * g[7]))); + f += z * 0x1.715476p+0; // the product is exact + return f; } - if (__glibc_unlikely (ux == 0x4ebd09e3u)) - { - h = 0x1.e90026p+4f; - l = 0x1.fp-21; - return h + l; - } - uint64_t tp = asuint64 (z + 1.0); - uint64_t m = tp & (~(uint64_t) 0 >> 12); - int e = (tp >> 52) - 0x3ff; - int j = (m + ((int64_t) 1 << (52 - 8))) >> (52 - 7), k = j > 53; - e += k; - double xd = asdouble (m | (uint64_t) 0x3ff << 52); -#ifndef __FP_FAST_FMA - /* The fma is required only for x == -0x1.da285cp-5f in FE_TONEAREST - to provide correctly rounded results. */ - if (__glibc_likely (x != -0x1.da285cp-5f)) - z = xd * ix[j] - 1.0; - else -#endif - z = fma (xd, ix[j], -1.0); - static const double c[] = - { - 0x1.71547652b82fep+0, -0x1.71547652b82ffp-1, 0x1.ec709dc32988bp-2, - -0x1.715476521ec2bp-2, 0x1.277801a1ad904p-2, -0x1.ec731704d6a88p-3 - }; - double z2 = z * z; - double c0 = c[0] + z * c[1]; - double c2 = c[2] + z * c[3]; - double c4 = c[4] + z * c[5]; - c0 += z2 * (c2 + z2 * c4); - return (z * c0 - lix[j]) + e; } + uint64_t tp = asuint64 (z + 1.0); + int e = (tp >> 52) - UINT64_C(0x3ff); + uint64_t m = tp & (~0ul >> 12); + if (__glibc_unlikely (!m)) + return e; // do not raise the inexact exception for 1+x = 2^n + int32_t j = (m + (1ull << (52 - 7))) >> (52 - 6); + double xd = asdouble (m | UINT64_C(0x3ff) << 52); + double d = xd * ix[j] - 1.0, d2 = d * d, + el = e - lix[j]; // d is exact for x < 0x1.04p+29 + double f = (el + d * b[0]) + d2 * (b[1] + d * b[2]); + float lb = f, ub = f + 0x1.661p-32; + if (__glibc_likely (lb == ub)) + return lb; + for (int i = 0; i < 2; i++) + if (__glibc_unlikely (ux == asuint (tb[i].x))) + return tb[i].f + tb[i].df; + double c0 = c[0] + d * c[1]; + double c2 = c[2] + d * c[3]; + double c4 = 
c[4] + d * c[5];
+  c0 += d2 * (c2 + d2 * c4);
+  f = e + (0x1.dp-45 - lix[j]) + d * c0;
+  lb = f;
+  return lb;
 }
 libm_alias_float (__log2p1, log2p1)

From patchwork Tue Jan 27 19:56:43 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 129077
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Paul Zimmermann, Joseph Myers, DJ Delorie
Subject: [PATCH 3/5] math: Sync log10f with CORE-MATH
Date: Tue, 27 Jan 2026 19:56:43 +0000
Message-ID: <20260127195741.2513011-4-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260127195741.2513011-1-adhemerval.zanella@linaro.org>
References: <20260127195741.2513011-1-adhemerval.zanella@linaro.org>

The performance is similar:

latency                     master       sync  improvement
x86_64                     34.6851    32.7977        5.44%
x86_64v2                   34.0921    32.4295        4.88%
x86_64v3                   27.8292    27.6070        0.80%
aarch64                    11.7246    11.1351        5.03%
armhf-vfpv4                13.3748    12.9055        3.51%
powerpc64le                 6.4036     6.5825       -2.79%

reciprocal-throughput       master       sync  improvement
x86_64                     10.2653    10.0437        2.16%
x86_64v2                   10.8432    10.7040        1.28%
x86_64v3                   10.9006    11.0765       -1.61%
aarch64                     6.6447     6.2743        5.57%
armhf-vfpv4                 6.8916     6.7538        2.00%
powerpc64le                 2.9494     2.7661        6.21%

x86_64 / i686: gcc version 15.2.1 20260112, Ryzen 5900X
aarch64: gcc version 15.2.1 20251105, Neoverse-N1
armv7a-vfpv4: gcc version 15.2.1 20251105, Neoverse-N1
powerpc64le: gcc version 14.2.1 20241230, POWER10

The code size is also similar.

Checked on aarch64-linux-gnu, arm-linux-gnueabihf,
powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu.
---
 SHARED-FILES                      |   2 +-
 sysdeps/ieee754/flt-32/e_log10f.c | 125 +++++++++++++++++-------------
 2 files changed, 70 insertions(+), 57 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES
index 42215acedd..24b301e3d8 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -268,7 +268,7 @@ core-math:
   sysdeps/ieee754/flt-32/e_gammaf_r.c
   # src/binary32/lgamma/lgammaf.c, revision bc385c2
   sysdeps/ieee754/flt-32/e_lgammaf_r.c
-  # src/binary32/log10/log10f.c, revision bc385c2
+  # src/binary32/log10/log10f.c, revision ebff4c43
   sysdeps/ieee754/flt-32/e_log10f.c
   # src/binary32/sinh/sinhf.c, revision bbfabd99
   sysdeps/ieee754/flt-32/e_sinhf.c
diff --git a/sysdeps/ieee754/flt-32/e_log10f.c b/sysdeps/ieee754/flt-32/e_log10f.c
index e9210de136..e731500ef5 100644
--- a/sysdeps/ieee754/flt-32/e_log10f.c
+++ b/sysdeps/ieee754/flt-32/e_log10f.c
@@ -1,9 +1,9 @@
 /* Correctly-rounded radix-10 logarithm function for binary32 value.

-Copyright (c) 2022-2023 Alexei Sibidanov.
+Copyright (c) 2022-2026 Alexei Sibidanov.

 This file is part of the CORE-MATH project
-project (file src/binary32/log10/log10f.c, revision bc385c2).
+project (file src/binary32/log10/log10f.c, revision ebff4c43).
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal @@ -31,24 +31,29 @@ SOFTWARE. #include #include "math_config.h" +#include +#include + static __attribute__ ((noinline)) float as_special (float x) { - uint32_t ux = asuint (x); - if (ux == 0x7f800000u) - return x; /* +inf */ - uint32_t ax = ux << 1; - if (ax == 0u) + uint32_t ux = asuint (x), ax = ux<<1; + if (ax == 0u) // x = +/-0.0 /* -0.0 */ - return __math_divzerof (1); + return __math_divzerof (1); + if (ux == 0x7f800000u) + return x; // x=+inf if (ax > 0xff000000u) - return x + x; /* nan */ + return x + x; // x=nan return __math_invalidf (x); } +typedef union {double f; uint64_t u;} b64u64_u; + float __log10f (float x) { + // reciprocal of 1+i/64 (i=0..64) rounded to 29 bits static const double tr[] = { 0x1p+0, 0x1.f81f82p-1, 0x1.f07c1fp-1, 0x1.e9131acp-1, @@ -69,45 +74,47 @@ __log10f (float x) 0x1.0842108p-1, 0x1.0624dd3p-1, 0x1.041041p-1, 0x1.0204081p-1, 0.5 }; + // logarithms of the reciprocals with offset static const double tl[] = { - -0x1.d45fd6237ebe3p-47, 0x1.b947689311b6ep-8, 0x1.b5e909c96d7d5p-7, - 0x1.45f4f59ed2165p-6, 0x1.af5f92cbd8f1ep-6, 0x1.0ba01a606de8cp-5, - 0x1.3ed119b9a2b7bp-5, 0x1.714834298eec2p-5, 0x1.a30a9d98357fbp-5, - 0x1.d41d512670813p-5, 0x1.02428c0f65519p-4, 0x1.1a23444eecc3ep-4, - 0x1.31b30543f4cb4p-4, 0x1.48f3ed39bfd04p-4, 0x1.5fe8049a0e423p-4, - 0x1.769140a6aa008p-4, 0x1.8cf1836c98cb3p-4, 0x1.a30a9d55541a1p-4, - 0x1.b8de4d1ee823ep-4, 0x1.ce6e4202ca2e6p-4, 0x1.e3bc1accace07p-4, - 0x1.f8c9683b5abd4p-4, 0x1.06cbd68ca9a6ep-3, 0x1.11142f19df73p-3, - 0x1.1b3e71fa7a97fp-3, 0x1.254b4d37a46e3p-3, 0x1.2f3b6912cbf07p-3, - 0x1.390f683115886p-3, 0x1.42c7e7fffc5a8p-3, 0x1.4c65808c78d3cp-3, - 0x1.55e8c50751c55p-3, 0x1.5f52445dec3d8p-3, 0x1.68a288c3f12p-3, - 0x1.71da17bdf0d19p-3, 0x1.7af973608afd9p-3, 0x1.84011952a2579p-3, - 0x1.8cf1837a7ea6p-3, 0x1.95cb2891e43d6p-3, 0x1.9e8e7b0f869ep-3, - 
0x1.a73beaa5db18dp-3, 0x1.afd3e394558d3p-3, 0x1.b856cf060d9f1p-3, - 0x1.c0c5134de1ffcp-3, 0x1.c91f1371bc99fp-3, 0x1.d1652ffcd3f53p-3, - 0x1.d997c6f635e75p-3, 0x1.e1b733ab90f3bp-3, 0x1.e9c3ceadac856p-3, - 0x1.f1bdeec43a305p-3, 0x1.f9a5e7a5fa3fep-3, 0x1.00be05ac02f2bp-2, - 0x1.04a054d81a2d4p-2, 0x1.087a0835957fbp-2, 0x1.0c4b457099517p-2, - 0x1.101431aa1fe51p-2, 0x1.13d4f08b98dd8p-2, 0x1.178da53edb892p-2, - 0x1.1b3e71e9f9d58p-2, 0x1.1ee777defdeedp-2, 0x1.2288d7b48e23bp-2, - 0x1.2622b0f52e49fp-2, 0x1.29b522a4c6314p-2, 0x1.2d404b0e30f8p-2, - 0x1.30c4478f3fbe5p-2, 0x1.34413509f7915p-2 + -0x1.2p-46, 0x1.b947689310dfap-8, 0x1.b5e909c96d11bp-7, + 0x1.45f4f59ed1e08p-6, 0x1.af5f92cbd8bc1p-6, 0x1.0ba01a606dcdep-5, + 0x1.3ed119b9a29cdp-5, 0x1.714834298ed14p-5, 0x1.a30a9d983564cp-5, + 0x1.d41d512670665p-5, 0x1.02428c0f65442p-4, 0x1.1a23444eecb67p-4, + 0x1.31b30543f4bddp-4, 0x1.48f3ed39bfc2dp-4, 0x1.5fe8049a0e34bp-4, + 0x1.769140a6a9f3p-4, 0x1.8cf1836c98bdbp-4, 0x1.a30a9d55540cap-4, + 0x1.b8de4d1ee8167p-4, 0x1.ce6e4202ca20fp-4, 0x1.e3bc1accacd3p-4, + 0x1.f8c9683b5aafdp-4, 0x1.06cbd68ca9a03p-3, 0x1.11142f19df6c5p-3, + 0x1.1b3e71fa7a913p-3, 0x1.254b4d37a4677p-3, 0x1.2f3b6912cbe9cp-3, + 0x1.390f68311581ap-3, 0x1.42c7e7fffc53dp-3, 0x1.4c65808c78cd1p-3, + 0x1.55e8c50751beap-3, 0x1.5f52445dec36cp-3, 0x1.68a288c3f1195p-3, + 0x1.71da17bdf0cadp-3, 0x1.7af973608af6ep-3, 0x1.84011952a250ep-3, + 0x1.8cf1837a7e9f4p-3, 0x1.95cb2891e436ap-3, 0x1.9e8e7b0f86974p-3, + 0x1.a73beaa5db121p-3, 0x1.afd3e39455867p-3, 0x1.b856cf060d985p-3, + 0x1.c0c5134de1f9p-3, 0x1.c91f1371bc934p-3, 0x1.d1652ffcd3ee8p-3, + 0x1.d997c6f635e09p-3, 0x1.e1b733ab90edp-3, 0x1.e9c3ceadac7ebp-3, + 0x1.f1bdeec43a29ap-3, 0x1.f9a5e7a5fa392p-3, 0x1.00be05ac02ef5p-2, + 0x1.04a054d81a29ep-2, 0x1.087a0835957c5p-2, 0x1.0c4b4570994e1p-2, + 0x1.101431aa1fe1bp-2, 0x1.13d4f08b98da2p-2, 0x1.178da53edb85cp-2, + 0x1.1b3e71e9f9d22p-2, 0x1.1ee777defdeb7p-2, 0x1.2288d7b48e205p-2, + 0x1.2622b0f52e469p-2, 0x1.29b522a4c62dep-2, 0x1.2d404b0e30f4ap-2, 
+ 0x1.30c4478f3fbafp-2, 0x1.34413509f78dfp-2 }; + // 10^n static const union { float f; uint32_t u; } st[] = { - { 0x1p+0 }, { 0x1.4p+3 }, { 0x1.9p+6 }, { 0x1.f4p+9 }, - { 0x1.388p+13 }, { 0x1.86ap+16 }, { 0x1.e848p+19 }, { 0x1.312dp+23 }, - { 0x1.7d784p+26 }, { 0x1.dcd65p+29 }, { 0x1.2a05f2p+33 }, { 0 }, - { 0 }, { 0 }, { 0 }, { 0 } + { 0x1.2a05f2p+33 }, { 0x1.4p+3 }, { 0x1.9p+6 }, { 0 }, + { 0x1.f4p+9 }, { 0 }, { 0x1.388p+13 }, { 0x1.86ap+16 }, + { 0 }, { 0x1.e848p+19 }, { 0 }, { 0x1.312dp+23 }, + { 0x1.7d784p+26 }, { 0 }, { 0x1.dcd65p+29 }, { 0x1p+0 } }; static const double b[] = { - 0x1.bcb7b15c5a2f8p-2, -0x1.bcbb1dbb88ebap-3, 0x1.2871c39d521c6p-3 + 0x1.bcb7b15d35067p-2, -0x1.bcbb1cd29cbafp-3, 0x1.2870e2624ce4ep-3 }; static const double c[] = { @@ -115,29 +122,35 @@ __log10f (float x) -0x1.bcb7b146a14b3p-4, 0x1.63c627d5219cbp-4, -0x1.2880736c8762dp-4, 0x1.fc1ecf913961ap-5 }; + + // ln(2)/ln(10) + const double ln10 = 0x1.34413509f79ffp-2, + ln10h = 0x1.34413509f7ap-2, + ln10l = -0x1.0cee0ed4ca7e9p-54; uint32_t ux = asuint (x); - if (__glibc_unlikely (ux < (1 << 23) || ux >= 0x7f800000u)) + if (__glibc_unlikely (ux >= 0x7f800000u)) + return as_special (x); // <=-0, nan, inf + if (__glibc_unlikely (ux == st[(ux >> 24) & 0xf].u)) + { // x = 10^n + unsigned je = ((int) ux >> 23) - 126; + je = (je * 0x4d104d4) >> 28; + return je; + } + if (__glibc_unlikely (ux < 0x00800000u)) { - if (ux == 0 || ux >= 0x7f800000u) - return as_special (x); - /* subnormal */ + if (__glibc_unlikely (ux == 0u)) + return as_special (x); // x=+0 + // subnormal int n = __builtin_clz (ux) - 8; ux <<= n; ux -= n << 23; } - unsigned m = ux & ((1 << 23) - 1), j = (m + (1 << (23 - 7))) >> (23 - 6); - double ix = tr[j], l = tl[j]; int e = ((int) ux >> 23) - 127; - unsigned je = e + 1; - je = (je * 0x4d104d4) >> 28; - if (__glibc_unlikely (ux == st[je].u)) - return je; - - double tz = asdouble (((int64_t) m | ((int64_t) 1023 << 23)) << (52 - 23)); - double z = tz * ix - 1, z2 = z * z; - double r - 
= ((e * 0x1.34413509f79ffp-2 + l) + z * b[0]) + z2 * (b[1] + z * b[2]);
-  float ub = r, lb = r + 0x1.b008p-34;
+  int64_t m = ux & ((1 << 23) - 1), j = (m + (1 << (23 - 7))) >> (23 - 6);
+  double tz = asdouble ((m << (52 - 23)) | (UINT64_C(0x3ff) << 52));
+  double z = tz * tr[j] - 1, z2 = z * z;
+  double r = ((e * ln10 + tl[j]) + z * b[0]) + z2 * (b[1] + z * b[2]);
+  float ub = r, lb = r + 0x1.af23fp-34;
   if (__glibc_unlikely (ub != lb))
     {
       double f = z
@@ -145,13 +158,13 @@ __log10f (float x)
                    + z2
                        * ((c[2] + z * c[3])
                           + z2 * (c[4] + z * c[5] + z2 * c[6])));
-      f -= 0x1.0cee0ed4ca7e9p-54 * e;
-      f += l - tl[0];
-      double el = e * 0x1.34413509f7ap-2;
+      f += ln10l * e;
+      f += tl[j] - tl[0];
+      double el = e * ln10h;
       r = el + f;
       ub = r;
       tz = r;
-      if (__glibc_unlikely (!((asuint64 (tz) & ((1 << 28) - 1)))))
+      if (__glibc_unlikely (!(asuint64 (tz) & ((1 << 28) - 1))))
         {
           double dr = (el - r) + f;
           r += dr * 32;

From patchwork Tue Jan 27 19:56:44 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 129078
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Paul Zimmermann, Joseph Myers, DJ Delorie
Subject: [PATCH 4/5] math: Sync log10p1f with CORE-MATH
Date: Tue, 27 Jan 2026 19:56:44 +0000
Message-ID: <20260127195741.2513011-5-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260127195741.2513011-1-adhemerval.zanella@linaro.org>
References: <20260127195741.2513011-1-adhemerval.zanella@linaro.org>

The new code shows a small performance increase on x86_64:

latency                    patched       sync  improvement
x86_64                     41.8873    40.2864        3.82%
x86_64v2                   40.5859    39.2079        3.40%
x86_64v3                   34.6393    33.5018        3.28%
aarch64                    15.2731    14.5953        4.44%
armhf-vfpv4                17.0373    17.0186        0.11%
powerpc64le                 8.3341     8.3298        0.05%

reciprocal-throughput      patched       sync  improvement
x86_64                     15.6516    13.6373       12.87%
x86_64v2                   15.0551    13.2769       11.81%
x86_64v3                   12.8994    11.0628       14.24%
aarch64                     8.8306     9.1898       -4.07%
armhf-vfpv4                 9.5855    10.0199       -4.53%
powerpc64le                 4.0074     4.4466      -10.96%

x86_64 / i686: gcc version 15.2.1 20260112, Ryzen 5900X
aarch64: gcc version 15.2.1 20251105, Neoverse-N1
armv7a-vfpv4: gcc version 15.2.1 20251105, Neoverse-N1
powerpc64le: gcc version 14.2.1 20241230, POWER10

The code size also shows a slight improvement; the s_log10p1f.os
'size' output shows:

size                       patched       sync  improvement
x86_64                        2345       2243        4.35%
x86_64v2                      2345       2243        4.35%
x86_64v3                      2226       2162        2.88%
aarch64                       2104       2112       -0.38%
armhf-vfpv4                   2016       2012        0.20%
powerpc64le                   2324       2340       -0.69%

Checked on aarch64-linux-gnu, arm-linux-gnueabihf,
powerpc64le-linux-gnu, i686-linux-gnu, and x86_64-linux-gnu.
---
 SHARED-FILES                        |   2 +-
 sysdeps/ieee754/flt-32/s_log10p1f.c | 231 +++++++++++++++-------------
 2 files changed, 124 insertions(+), 109 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES
index 24b301e3d8..875b1fea28 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -294,7 +294,7 @@ core-math:
   sysdeps/ieee754/flt-32/s_exp2m1f.c
   # src/binary32/expm1/expm1f.c, revision bc385c2
   sysdeps/ieee754/flt-32/s_expm1f.c
-  # src/binary32/log10p1/log10p1f.c revision bc385c2
+  # src/binary32/log10p1/log10p1f.c revision eb28456b
   sysdeps/ieee754/flt-32/s_log10p1f.c
   # src/binary32/log1p/log1pf.c revision 24ef43a1
   sysdeps/ieee754/flt-32/s_log1pf.c
diff --git a/sysdeps/ieee754/flt-32/s_log10p1f.c b/sysdeps/ieee754/flt-32/s_log10p1f.c
index d9e1149201..ab881d505f 100644
--- a/sysdeps/ieee754/flt-32/s_log10p1f.c
+++ b/sysdeps/ieee754/flt-32/s_log10p1f.c
@@ -3,7 +3,7 @@
 Copyright (c) 2022-2023 Alexei Sibidanov.

 This file is part of the CORE-MATH project
-project (file src/binary32/log10p1/log10p1f.c revision bc385c2).
+project (file src/binary32/log10p1/log10p1f.c revision eb28456b).
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal @@ -48,50 +48,62 @@ as_special (float x) float __log10p1f (float x) { + static const struct + { + float x; + float f, df; + } tb[] = { + { 0x1.34f8p-12, 0x1.0c53cap-13, 0x1p-38}, + { -0x1.a3c2e6p-31, -0x1.6c999ep-32, 0x1p-57}, + { -0x1.1d4db2p-9, -0x1.f029c4p-11, 0x1p-36}, + { -0x1.0dff72p-4, -0x1.e53536p-6, -0x1p-31}, + }; + + // reciprocal of 1+(j+0.5)/64 rounded to 24 bits static const double tr[] = { - 0x1p+0, 0x1.f81f82p-1, 0x1.f07c1fp-1, 0x1.e9131acp-1, - 0x1.e1e1e1ep-1, 0x1.dae6077p-1, 0x1.d41d41dp-1, 0x1.cd85689p-1, - 0x1.c71c71cp-1, 0x1.c0e0704p-1, 0x1.bacf915p-1, 0x1.b4e81b5p-1, - 0x1.af286bdp-1, 0x1.a98ef6p-1, 0x1.a41a41ap-1, 0x1.9ec8e95p-1, - 0x1.999999ap-1, 0x1.948b0fdp-1, 0x1.8f9c19p-1, 0x1.8acb90fp-1, - 0x1.8618618p-1, 0x1.8181818p-1, 0x1.7d05f41p-1, 0x1.78a4c81p-1, - 0x1.745d174p-1, 0x1.702e05cp-1, 0x1.6c16c17p-1, 0x1.6816817p-1, - 0x1.642c859p-1, 0x1.605816p-1, 0x1.5c9882cp-1, 0x1.58ed231p-1, - 0x1.5555555p-1, 0x1.51d07ebp-1, 0x1.4e5e0a7p-1, 0x1.4afd6ap-1, - 0x1.47ae148p-1, 0x1.446f865p-1, 0x1.4141414p-1, 0x1.3e22cbdp-1, - 0x1.3b13b14p-1, 0x1.3813814p-1, 0x1.3521cfbp-1, 0x1.323e34ap-1, - 0x1.2f684bep-1, 0x1.2c9fb4ep-1, 0x1.29e412ap-1, 0x1.27350b9p-1, - 0x1.2492492p-1, 0x1.21fb781p-1, 0x1.1f7047ep-1, 0x1.1cf06aep-1, - 0x1.1a7b961p-1, 0x1.1811812p-1, 0x1.15b1e5fp-1, 0x1.135c811p-1, - 0x1.1111111p-1, 0x1.0ecf56cp-1, 0x1.0c9715p-1, 0x1.0a6810ap-1, - 0x1.0842108p-1, 0x1.0624dd3p-1, 0x1.041041p-1, 0x1.0204081p-1, - 0.5 + 0x1.fc07fp-1, 0x1.f4465ap-1, 0x1.ecc07cp-1, 0x1.e573acp-1, + 0x1.de5d6ep-1, 0x1.d77b66p-1, 0x1.d0cb58p-1, 0x1.ca4b3p-1, + 0x1.c3f8fp-1, 0x1.bdd2b8p-1, 0x1.b7d6c4p-1, 0x1.b20364p-1, + 0x1.ac5702p-1, 0x1.a6d01ap-1, 0x1.a16d4p-1, 0x1.9c2d14p-1, + 0x1.970e5p-1, 0x1.920fb4p-1, 0x1.8d3018p-1, 0x1.886e6p-1, + 0x1.83c978p-1, 0x1.7f406p-1, 0x1.7ad22p-1, 0x1.767dcep-1, + 0x1.724288p-1, 
0x1.6e1f76p-1, 0x1.6a13cep-1, 0x1.661ec6p-1, + 0x1.623fa8p-1, 0x1.5e75bcp-1, 0x1.5ac056p-1, 0x1.571ed4p-1, + 0x1.539094p-1, 0x1.501502p-1, 0x1.4cab88p-1, 0x1.49539ep-1, + 0x1.460cbcp-1, 0x1.42d662p-1, 0x1.3fb014p-1, 0x1.3c995ap-1, + 0x1.3991c2p-1, 0x1.3698ep-1, 0x1.33ae46p-1, 0x1.30d19p-1, + 0x1.2e025cp-1, 0x1.2b404ap-1, 0x1.288b02p-1, 0x1.25e228p-1, + 0x1.234568p-1, 0x1.20b47p-1, 0x1.1e2ef4p-1, 0x1.1bb4a4p-1, + 0x1.194538p-1, 0x1.16e068p-1, 0x1.1485fp-1, 0x1.12358ep-1, + 0x1.0fef02p-1, 0x1.0db20ap-1, 0x1.0b7e6ep-1, 0x1.0953f4p-1, + 0x1.07326p-1, 0x1.05198p-1, 0x1.03091cp-1, 0x1.010102p-1 }; + // logarithm of the reciprocals biased by 0x1.58ep-43 static const double tl[] = { - -0x1.562ec497ef351p-43, 0x1.b9476892ea99cp-8, 0x1.b5e909c959eecp-7, - 0x1.45f4f59ec84fp-6, 0x1.af5f92cbcf2aap-6, 0x1.0ba01a6069052p-5, - 0x1.3ed119b99dd41p-5, 0x1.714834298a088p-5, 0x1.a30a9d98309c1p-5, - 0x1.d41d51266b9d9p-5, 0x1.02428c0f62dfcp-4, 0x1.1a23444eea521p-4, - 0x1.31b30543f2597p-4, 0x1.48f3ed39bd5e7p-4, 0x1.5fe8049a0bd06p-4, - 0x1.769140a6a78eap-4, 0x1.8cf1836c96595p-4, 0x1.a30a9d5551a84p-4, - 0x1.b8de4d1ee5b21p-4, 0x1.ce6e4202c7bc9p-4, 0x1.e3bc1accaa6eap-4, - 0x1.f8c9683b584b7p-4, 0x1.06cbd68ca86ep-3, 0x1.11142f19de3a2p-3, - 0x1.1b3e71fa795fp-3, 0x1.254b4d37a3354p-3, 0x1.2f3b6912cab79p-3, - 0x1.390f6831144f7p-3, 0x1.42c7e7fffb21ap-3, 0x1.4c65808c779aep-3, - 0x1.55e8c507508c7p-3, 0x1.5f52445deb049p-3, 0x1.68a288c3efe72p-3, - 0x1.71da17bdef98bp-3, 0x1.7af9736089c4bp-3, 0x1.84011952a11ebp-3, - 0x1.8cf1837a7d6d1p-3, 0x1.95cb2891e3048p-3, 0x1.9e8e7b0f85651p-3, - 0x1.a73beaa5d9dfep-3, 0x1.afd3e39454544p-3, 0x1.b856cf060c662p-3, - 0x1.c0c5134de0c6dp-3, 0x1.c91f1371bb611p-3, 0x1.d1652ffcd2bc5p-3, - 0x1.d997c6f634ae6p-3, 0x1.e1b733ab8fbadp-3, 0x1.e9c3ceadab4c8p-3, - 0x1.f1bdeec438f77p-3, 0x1.f9a5e7a5f906fp-3, 0x1.00be05ac02564p-2, - 0x1.04a054d81990cp-2, 0x1.087a083594e33p-2, 0x1.0c4b457098b4fp-2, - 0x1.101431aa1f48ap-2, 0x1.13d4f08b98411p-2, 0x1.178da53edaecbp-2, - 0x1.1b3e71e9f9391p-2, 
0x1.1ee777defd526p-2, 0x1.2288d7b48d874p-2, - 0x1.2622b0f52dad8p-2, 0x1.29b522a4c594cp-2, 0x1.2d404b0e305b9p-2, - 0x1.30c4478f3f21dp-2, 0x1.34413509f6f4dp-2 + 0x1.bafd550786257p-9, 0x1.49b08209ec64p-7, 0x1.10a82eca6416p-6, + 0x1.7adc46340fc4p-6, 0x1.e3806e7ccbebp-6, 0x1.255026bdcb233p-5, + 0x1.5823964d3c8dcp-5, 0x1.8a3fb08692a5fp-5, 0x1.bba9a137364d7p-5, + 0x1.ec664cb24c458p-5, 0x1.0e3d294d154ccp-4, 0x1.25f5217a7e5dfp-4, + 0x1.3d5d3200e0e36p-4, 0x1.547774e40ea5ap-4, 0x1.6b45ddb283c9cp-4, + 0x1.81ca67d4c05e1p-4, 0x1.9806d71561d18p-4, 0x1.adfd09f848345p-4, + 0x1.c3aea856ca97fp-4, 0x1.d91d540edcaep-4, 0x1.ee4ab8dbebb39p-4, + 0x1.019c29971c034p-3, 0x1.0bf3d1e104ae2p-3, 0x1.162d08ca9a7bep-3, + 0x1.204881c31ba38p-3, 0x1.2a46ea803f1cbp-3, 0x1.3428e0134104bp-3, + 0x1.3def1007f5f74p-3, 0x1.479a066a5fa87p-3, 0x1.512a631ebcdecp-3, + 0x1.5aa0b67441c74p-3, 0x1.63fd84e6e41dep-3, 0x1.6d41602642abbp-3, + 0x1.766cc236f8424p-3, 0x1.7f80367f521b3p-3, 0x1.887c2f002ebc7p-3, + 0x1.916128c710c17p-3, 0x1.9a2f9609731f8p-3, 0x1.a2e7e853a516cp-3, + 0x1.ab8a901fe9635p-3, 0x1.b417f6bfa9e71p-3, 0x1.bc907da9eace3p-3, + 0x1.c4f494e895d78p-3, 0x1.cd44987634191p-3, 0x1.d580e68cc0a87p-3, + 0x1.dda9df79b84d2p-3, 0x1.e5bfd37200ee8p-3, 0x1.edc325e4a0f81p-3, + 0x1.f5b4297735a7cp-3, 0x1.fd93318156892p-3, 0x1.02b042b675879p-2, + 0x1.068e3fa975162p-2, 0x1.0a63b3535192bp-2, 0x1.0e30c45ab291fp-2, + 0x1.11f595eceef01p-2, 0x1.15b24abf1aeb1p-2, 0x1.196704f31e753p-2, + 0x1.1d13ec95021b6p-2, 0x1.20b91bd192db3p-2, 0x1.2456b26c914d5p-2, + 0x1.27ecd5ea10d93p-2, 0x1.2b7b9d4731bd4p-2, 0x1.2f032bca9e44bp-2, + 0x1.32839caee0d8ap-2 }; static const union { @@ -99,83 +111,86 @@ __log10p1f (float x) uint32_t u; } st[] = { - { 0x0p+0 }, { 0x1.2p+3 }, { 0x1.8cp+6 }, - { 0x1.f38p+9 }, { 0x1.3878p+13 }, { 0x1.869fp+16 }, - { 0x1.e847ep+19 }, { 0x1.312cfep+23 } + { 0x125p+0 }, { 0x1.2p+3 }, { 0x1.8cp+6 }, { 0} , + { 0x1.f38p+9 }, { 0 }, { 0x1.3878p+13 }, { 0x1.869fp+16 }, + { 0 }, { 0x1.e847ep+19 }, { 0 }, { 0x1.312cfep+23 }, + 
{ 0 }, { 0 }, { 0 }, { 0 } }; + static const double b[] = + { + 0x1.bcb7b150bf33dp-2, -0x1.bcb7b14b2164ep-3, 0x1.287de1f406bedp-3, + -0x1.bcbfad32135bdp-4 + }; + static const double c[] = + { + 0x1.bcb7b1526e50ep-2, -0x1.bcb7b1526e48ep-3, 0x1.287a7636f422fp-3, + -0x1.bcb7b15514181p-4, 0x1.63c62778ff0d1p-4, -0x1.287a581961505p-4, + 0x1.fc3f60b6c20a5p-5, -0x1.bdb55f5990c49p-5, 0x1.8c4ba9c7c0692p-5 + }; + const double ln10 = 0x1.34413509f79ffp-2, + ln10h = 0x1.34413509f8p-2, + ln10l = -0x1.80433b83b532ap-44; double z = x; uint32_t ux = asuint (x); - if (__glibc_unlikely (ux >= 0x17fu << 23)) /* x <= -1 */ - return as_special (x); - uint32_t ax = ux & (~0u >> 1); - if (__glibc_unlikely (ax == 0)) - return copysign (0, x); - if (__glibc_unlikely (ax >= (0xff << 23))) /* +inf, nan */ - return as_special (x); - int ie = ux; - ie >>= 23; - unsigned int je = ie - 126; - je = (je * 0x9a209a8) >> 29; - if (__glibc_unlikely (ux == st[je].u)) - return je; - + if (__glibc_unlikely (ux >= 0xbf800000u)) + return as_special (x); // x <= -1, -inf, -nan + if (__glibc_unlikely ((int32_t) ux >= 0x7f800000)) + return as_special (x); // +inf, +nan + if (__glibc_unlikely (ux == st[(ux >> 24) & 0xf].u)) + { + int ie = ux; + ie >>= 23; + unsigned je = ie - 126; + je = (je * 0x9a209a8) >> 29; + return je; + } uint64_t tz = asuint64 (z + 1.0); - uint64_t m = tz & (~(uint64_t) 0 >> 12); - int32_t e = (tz >> 52) - 1023, j = ((m + ((int64_t) 1 << 45)) >> 46); - tz = m | ((uint64_t) 0x3ff << 52); - double ix = tr[j], l = tl[j]; - double off = e * 0x1.34413509f79ffp-2 + l; - double v = asdouble (tz) * ix - 1; - - static const double h[] = + uint64_t m = tz & (UINT64_C(~0) >> 12); + int32_t e = (tz >> 52) - 1023, j = m >> 46; + tz = m | (UINT64_C(0x3ff) << 52); + if (__glibc_unlikely (m == 0)) { - 0x1.bcb7b150bf6d8p-2, -0x1.bcb7b1738c07ep-3, - 0x1.287de19e795c5p-3, -0x1.bca44edc44bc4p-4 - }; - double v2 = v * v; - double f = (h[0] + v * h[1]) + v2 * (h[2] + v * h[3]); - double r = off + v * f; - 
float ub = r; - float lb = r + 0x1.5cp-42; + if (__glibc_unlikely (ux == 0 || ux == 0x80000000)) + return x; // return signed zero + } + double v = asdouble (tz) * tr[j] - 1, v2 = v * v; + double f + = (e * ln10 + tl[j]) + v * ((b[0] + v * b[1]) + v2 * (b[2] + v * b[3])); + float ub = f, lb = f + 0x1.56ap-42; if (__glibc_unlikely (ub != lb)) { - if (__glibc_unlikely (ax < 0x3d32743eu)) - { /* 0x1.64e87cp-5f */ - if (__glibc_unlikely (ux == 0xa6aba8afu)) - return -0x1.2a33bcp-51f + 0x1p-76f; - if (__glibc_unlikely (ux == 0xaf39b9a7u)) - return -0x1.42a342p-34f + 0x1p-59f; - if (__glibc_unlikely (ux == 0x399a7c00u)) - return 0x1.0c53cap-13f + 0x1p-38f; - z /= 2.0 + z; - double z2 = z * z, z4 = z2 * z2; - static const double c[] = - { - 0x1.bcb7b1526e50fp-1, 0x1.287a76370129dp-2, - 0x1.63c62378fa3dbp-3, 0x1.fca4139a42374p-4 - }; - float ret = z * ((c[0] + z2 * c[1]) + z4 * (c[2] + z2 * c[3])); - if (x != 0.0f && ret == 0.0f) - __set_errno (ERANGE); - return ret; + for (int i = 0; i < 4; i++) + if (__glibc_unlikely (ux == asuint (tb[i].x))) + return tb[i].f + tb[i].df; + double lj = tl[j] + 0x1.58ep-43; + uint32_t ax = ux & (~0u >> 1); + if (ax < 0x3d100000) + { // |x| < 0x1.2p-5 + if (__glibc_unlikely (ax < 0x33000000u)) + { // |x| < 0x1p-25 + static const double c0h = 0x1.bcb7b15p-2, + c0l = 0x1.37287195355bbp-33; + float r = z * c0h + z * (c0l + z * c[1]); + // |x|<=0x1-126*ln(10) + if (__glibc_unlikely (ax <= 0x01135d8du)) + __set_errno (ERANGE); // underflow + return r; + } + e = 0; + v = x; + v2 = v * v; + lj = 0.0; } - if (__glibc_unlikely (ux == 0x7956ba5eu)) - return 0x1.16bebap+5f + 0x1p-20f; - if (__glibc_unlikely (ux == 0xbd86ffb9u)) - return -0x1.e53536p-6f + 0x1p-31f; - static const double c[] = - { - 0x1.bcb7b1526e50ep-2, -0x1.bcb7b1526e53dp-3, 0x1.287a7636f3fa2p-3, - -0x1.bcb7b146a14b3p-4, 0x1.63c627d5219cbp-4, -0x1.2880736c8762dp-4, - 0x1.fc1ecf913961ap-5 - }; + double v4 = v2 * v2; f = v - * ((c[0] + v * c[1]) - + v2 * ((c[2] + v * c[3]) + v2 * (c[4] 
+ v * c[5] + v2 * c[6]))); - f += l - tl[0]; - double el = e * 0x1.34413509f79ffp-2; - r = el + f; - ub = r; + * (((c[0] + v * c[1]) + v2 * (c[2] + v * c[3])) + + v4 + * ((c[4] + v * c[5]) + + v2 * ((c[6] + v * c[7]) + v2 * c[8]))); + f += e * ln10l; + f += lj; + f += e * ln10h; + ub = f; } return ub; } From patchwork Tue Jan 27 19:56:45 2026 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 129076 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from vm01.sourceware.org (localhost [127.0.0.1]) by sourceware.org (Postfix) with ESMTP id 299AC4BA23DA for ; Tue, 27 Jan 2026 19:59:25 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 299AC4BA23DA Authentication-Results: sourceware.org; dkim=pass (2048-bit key, unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=MfUwvxxO X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x32d.google.com (mail-wm1-x32d.google.com [IPv6:2a00:1450:4864:20::32d]) by sourceware.org (Postfix) with ESMTPS id E64AF4BA2E36 for ; Tue, 27 Jan 2026 19:57:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E64AF4BA2E36 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E64AF4BA2E36 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::32d ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1769543873; cv=none; b=GI/pewBBtgOqeA8QA/UC3uNiyD/118Mjvi7OWbPy7/nGf0cPSiV2auB4ZWHB3MjaAHqU6pVBK4WKPaR+dT5zGSDOMhZsQE3noUnOS+QpW8TC7iAegRvgwLvBan4+U27e17qlY5288psbh//TdKXNbIS2ayxzUTK1ZibGtdjY7nw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1769543873; 
c=relaxed/simple; bh=SFhOVkjfeRAHzNTjBjOS853yMEIhMVtHmCM8ceDGRBY=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=sxk7BNaLdzqrb4wBBEite0mT6pkVwX0ghHRZ6HAa99v+ZWK/6onNfYOtM3jkjRWg/GPg2Qx8djMdfr0cCPcXPwy//9XQKubnW2JE+txAsGJe7KLgyH71ZgjOQQOPLoV/ZK8Wfvv66Fvz0B6WahztA3ZZYeVPVZ5joHvNmUi4kzY= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E64AF4BA2E36 Received: by mail-wm1-x32d.google.com with SMTP id 5b1f17b1804b1-47ee974e230so53093505e9.2 for ; Tue, 27 Jan 2026 11:57:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1769543871; x=1770148671; darn=sourceware.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZsfG6hiGuzAgxWiBNDsVfEKSsFxk0vE1W1X76YibWog=; b=MfUwvxxOqpUohVKj0CloPiFGrLHZFdZWwsMAg1d7k5op0PrDN4DXnlmPpGAyrFPmr0 DiujUnlEg5lrU9k4l9+Q0wKN86N4CVYtIbr93mbsmAX/oIWYQ2NyIWxC8dEzl1ZcdxOB Ue7SK2QJR5ZiJS1gg03IlHT4/S64+iqcIKW13vdcQkBEQmIhEYmFEq7os//aFCUaBhvK pOocxhr5Je92aCu896LBna+W/+DRbVK0DCsJATcw4Nr0V9HTVDRuVAhb7swCXDIBKdvX p9QnkWAiLZKUtCzpO/Jl2BL4y6LZrkSle+2u12i+kg3uAbbdy8gL9h6GxhaoPg10RDzo dKMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1769543871; x=1770148671; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=ZsfG6hiGuzAgxWiBNDsVfEKSsFxk0vE1W1X76YibWog=; b=U2lz0ScRBX+EtR/RhsY8CpbweDE2dvZ5QeULVnQleHCvxOLOfJBRA60IU0iexGsNAf UT/V7orwYfnOHT3DoQXr22vhH/pJRJAPjcuox7bVLSJDJHpuxnUiG0LbS+iH85BmmyUr ix6vlQG0z0zZkNsGldSEHsKABD+jAvOad1l65zBoX29BgAGvlLMZw4zQM0Boq7LCI9De IglT7yL8gUkkuBYrAqorMIaDsHKIAlyEk8xB/Hm7BXcfbpNX4fOzlsaFAOC04GND7hN2 c9JKSxQSxc+j2MJcE6VnhPOMYVtevou6b7FBGYVogVMaJCgRnBgmvla3OmWUGMbxygXE HaiQ== X-Gm-Message-State: 
AOJu0Yx+Zdf+Dl5CMAIM7lJzPJIdE72hhGlDDLl5dKzbwsMoRERuW1ag fSyv4E2+/sYmZFDUSPzasycaJcDm3RNpl9AStprNbqcObVCdik7g1PjlzNxhoyBOXBsujmGSITb 1fIyt/Nw= X-Gm-Gg: AZuq6aIEksPgnvv2JxYb2xojWjWKnnaHu+BstBGfxtshdwUsUSJKjgpZYnIBTpSf8Bb iZ6e1Q75SE0AJZ5VBQDLyamFZ6bZME4KTy6YlkN8G55edQjRyGbYPLYzwHGFSiXDYGil65u5h6/ 6tZYG7Ugu9EMJSuY6YzYpPYiucdDHKLLkqWbDccl534faQUupJIDULzFKSiGewZDE3L0CvVzUtv CqUC08/AX0KeCC+l2EML4fni6d4JwkwVC/X0JhidlcXwnRmiXkn5Nw9KmUQWEwBiYrt14gvUuVO UfNuxQskNjzciijGFJmdNTCR7U/TStXot8JlPit6TcnhMGJ+oSuDXfw81vc9fKcpJJGE3EU0DdN FRqytBW8vdZ6wEDDZzW3HBYyJ0g6Rb7e7FOzRyWYPTF+9ltHa4G+8VWygL7VQ2TU/pmz8AAmw7H MtrKiRSfJXrC6Fn0VGnxVO+w4LkgA0kbNoX8K8uxl4ZduhSwh1RONpXNgKsw== X-Received: by 2002:a05:600c:1c13:b0:477:7925:f7fb with SMTP id 5b1f17b1804b1-48069c26ad2mr46849825e9.10.1769543871342; Tue, 27 Jan 2026 11:57:51 -0800 (PST) Received: from ubuntu-vm.. (51-148-40-15.dsl.zen.co.uk. [51.148.40.15]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-48066bfb58esm81363035e9.8.2026.01.27.11.57.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 27 Jan 2026 11:57:50 -0800 (PST) From: Adhemerval Zanella To: libc-alpha@sourceware.org Cc: Paul Zimmermann , Joseph Myers , DJ Delorie Subject: [PATCH 5/5] math: Sync atanh with CORE-MATH Date: Tue, 27 Jan 2026 19:56:45 +0000 Message-ID: <20260127195741.2513011-6-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20260127195741.2513011-1-adhemerval.zanella@linaro.org> References: <20260127195741.2513011-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , 
Errors-To: libc-alpha-bounces~patchwork=sourceware.org@sourceware.org

The sync speeds up the fast path for |x| < 0.25. CORE-MATH's muldd is
the same as muldd2 from glibc's ddcoremath.h, so the glibc version calls
muldd2.

Checked on aarch64-linux-gnu, arm-linux-gnueabihf, powerpc64le-linux-gnu,
i686-linux-gnu, and x86_64-linux-gnu.
---
 SHARED-FILES                     | 2 +-
 sysdeps/ieee754/dbl-64/e_atanh.c | 7 +++++--
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/SHARED-FILES b/SHARED-FILES
index 875b1fea28..8ecb9a094b 100644
--- a/SHARED-FILES
+++ b/SHARED-FILES
@@ -246,7 +246,7 @@ tzdata:
 core-math:
  # src/binary64/acosh/acosh.c, revision 1bd85b89
  sysdeps/ieee754/dbl-64/e_acosh.c
- # src/binary64/atanh/atanh.c, revision c423b9a3
+ # src/binary64/atanh/atanh.c, revision 532e37dc
  sysdeps/ieee754/dbl-64/e_atanh.c
  # src/binary64/tgamma/tgamma.c, revision 0f185e23
  sysdeps/ieee754/dbl-64/e_gamma_r.c
diff --git a/sysdeps/ieee754/dbl-64/e_atanh.c b/sysdeps/ieee754/dbl-64/e_atanh.c
index e017afc12a..46821b307b 100644
--- a/sysdeps/ieee754/dbl-64/e_atanh.c
+++ b/sysdeps/ieee754/dbl-64/e_atanh.c
@@ -3,7 +3,7 @@
 Copyright (c) 2023-2026 Alexei Sibidanov.
 The original version of this file was copied from the CORE-MATH
-project (file src/binary64/atanh/atanh.c, revision c423b9a3).
+project (file src/binary64/atanh/atanh.c, revision 532e37dc).
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
@@ -95,6 +95,9 @@ __ieee754_atanh (double x)
  return __math_check_uflow_zero_lt (x, 0x1p-1022, fma (x, 0x1p-55, x));
  }
+ /* checked exhaustively this branch (with and without FMA):
+ * for 0x1.d12ed0af1a27fp-27 <= x < 2^-24
+ */
  double x2 = x * x;
  static const double c[] = {
  0x1.999999999999ap-3, 0x1.2492492492244p-3, 0x1.c71c71c79715fp-4, 0x1.745d16f777723p-4,
@@ -109,7 +112,7 @@ __ieee754_atanh (double x)
  + x8 * ((c[4] + x2 * c[5]) + x4 * (c[6] + x2 * c[7]) + x8 * c[8]);
  double t = fma (x2, p, 0x1.5555555555555p-56);
  double pl, ph = fasttwosum (0x1.5555555555555p-2, t, &pl);
- ph = muldd_acc (ph, pl, x3, dx3, &pl);
+ ph = muldd2 (ph, pl, x3, dx3, &pl);
  double tl;
  ph = fasttwosum (x, ph, &tl);
  pl += tl;
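
Note on the double-double helpers in the e_atanh.c hunk: the fasttwosum calls return the rounded sum and store the rounding error through their last argument, and the code then folds that error into the low word (pl += tl). The following is a minimal sketch of that idea, assuming |a| >= |b|, round-to-nearest, and plain double evaluation with no excess precision. The names fast_two_sum_sketch and dd_sketch_selfcheck are hypothetical and chosen for illustration; this is not the glibc or CORE-MATH definition.

```c
#include <assert.h>

/* Illustrative only: not the glibc/CORE-MATH helper.
   Returns s = fl(a + b) and stores the rounding error in *err, so that
   s + *err equals a + b exactly when |a| >= |b| (round-to-nearest,
   no overflow, no excess precision).  */
static inline double
fast_two_sum_sketch (double a, double b, double *err)
{
  double s = a + b;   /* rounded sum */
  double t = s - a;   /* the part of b that made it into s */
  *err = b - t;       /* the part of b that was rounded away */
  return s;
}

/* Small self-check: 2^-60 is far below the last bit of 1.0, so it is
   dropped from the rounded sum and recovered in the error term.  */
static inline void
dd_sketch_selfcheck (void)
{
  double lo;
  double hi = fast_two_sum_sketch (1.0, 0x1p-60, &lo);
  assert (hi == 1.0);
  assert (lo == 0x1p-60);
}
```

The same pattern is visible in the last context lines of the hunk, where the error term tl from fasttwosum (x, ph, &tl) is accumulated into pl so that the pair (ph, pl) carries more precision than a single double.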