From patchwork Fri Sep 19 14:05:46 2025
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 120523
From: Uros Bizjak
To: libc-alpha@sourceware.org, hjl.tools@gmail.com
Cc: Uros Bizjak
Subject: [PATCH] x86: Clean up and improve FPU inline assembly
Date: Fri, 19 Sep 2025 16:05:46 +0200
Message-ID: <20250919140717.110329-1-ubizjak@gmail.com>

Clean up and improve x86 FPU inline assembly:

* Remove obsolete "*&" GCC asm memory operand workaround
* Use %v prefix to emit VEX prefixed insns for AVX targets
* Use \t and \n\t separators consistently

Signed-off-by: Uros Bizjak
Reviewed-by: Florian Weimer
---
sysdeps/i386/fpu/e_acosl.c | 18 +++++----- sysdeps/i386/fpu/e_fmodl.c | 10 +++--- sysdeps/i386/fpu/fclrexcpt.c | 8 ++--- sysdeps/i386/fpu/fedisblxcpt.c | 8 ++--- sysdeps/i386/fpu/feenablxcpt.c | 8 ++---
sysdeps/i386/fpu/fegetenv.c | 6 ++-- sysdeps/i386/fpu/fegetexcept.c | 2 +- sysdeps/i386/fpu/fegetmode.c | 2 +- sysdeps/i386/fpu/fegetround.c | 2 +- sysdeps/i386/fpu/feholdexcpt.c | 7 ++-- sysdeps/i386/fpu/fesetenv.c | 8 ++--- sysdeps/i386/fpu/fesetexcept.c | 10 +++--- sysdeps/i386/fpu/fesetmode.c | 4 +-- sysdeps/i386/fpu/fesetround.c | 8 ++--- sysdeps/i386/fpu/feupdateenv.c | 4 +-- sysdeps/i386/fpu/fgetexcptflg.c | 4 +-- sysdeps/i386/fpu/fraiseexcpt.c | 21 +++++++----- sysdeps/i386/fpu/fsetexcptflg.c | 12 +++---- sysdeps/i386/fpu/ftestexcept.c | 4 +-- sysdeps/i386/fpu/s_atanl.c | 2 +- sysdeps/i386/fpu/s_logbl.c | 4 +-- sysdeps/i386/fpu/s_significandl.c | 4 +-- sysdeps/i386/setfpucw.c | 8 ++--- sysdeps/x86/fpu/fenv_private.h | 55 ++++++++++++++----------------- sysdeps/x86/fpu/sfp-machine.h | 8 +---- sysdeps/x86_64/fpu/fclrexcpt.c | 8 ++--- sysdeps/x86_64/fpu/fedisblxcpt.c | 8 ++--- sysdeps/x86_64/fpu/feenablxcpt.c | 8 ++--- sysdeps/x86_64/fpu/fegetenv.c | 6 ++-- sysdeps/x86_64/fpu/fegetexcept.c | 2 +- sysdeps/x86_64/fpu/fegetmode.c | 2 +- sysdeps/x86_64/fpu/fegetround.c | 2 +- sysdeps/x86_64/fpu/feholdexcpt.c | 6 ++-- sysdeps/x86_64/fpu/fesetenv.c | 8 ++--- sysdeps/x86_64/fpu/fesetexcept.c | 4 +-- sysdeps/x86_64/fpu/fesetmode.c | 4 +-- sysdeps/x86_64/fpu/fesetround.c | 8 ++--- sysdeps/x86_64/fpu/feupdateenv.c | 3 +- sysdeps/x86_64/fpu/fgetexcptflg.c | 4 +-- sysdeps/x86_64/fpu/fraiseexcpt.c | 16 ++++----- sysdeps/x86_64/fpu/fsetexcptflg.c | 8 ++--- sysdeps/x86_64/fpu/ftestexcept.c | 4 +-- 42 files changed, 161 insertions(+), 167 deletions(-) diff --git a/sysdeps/i386/fpu/e_acosl.c b/sysdeps/i386/fpu/e_acosl.c index 5e81b29153..4405bb1d81 100644 --- a/sysdeps/i386/fpu/e_acosl.c +++ b/sysdeps/i386/fpu/e_acosl.c @@ -12,15 +12,15 @@ __ieee754_acosl (long double x) long double res; /* acosl = atanl (sqrtl((1-x) (1+x)) / x) */ - asm ( "fld %%st\n" - "fld1\n" - "fsubp\n" - "fld1\n" - "fadd %%st(2)\n" - "fmulp\n" /* 1 - x^2 */ - "fsqrt\n" /* sqrtl (1 - x^2) */ - "fabs\n" - 
"fxch %%st(1)\n" + asm ( "fld\t%%st\n\t" + "fld1\n\t" + "fsubp\n\t" + "fld1\n\t" + "fadd\t%%st(2)\n\t" + "fmulp\n\t" /* 1 - x^2 */ + "fsqrt\n\t" /* sqrtl (1 - x^2) */ + "fabs\n\t" + "fxch\t%%st(1)\n\t" "fpatan" : "=t" (res) : "0" (x) : "st(1)"); return res; diff --git a/sysdeps/i386/fpu/e_fmodl.c b/sysdeps/i386/fpu/e_fmodl.c index a5761c8b64..2c5e092f7c 100644 --- a/sysdeps/i386/fpu/e_fmodl.c +++ b/sysdeps/i386/fpu/e_fmodl.c @@ -11,11 +11,11 @@ __ieee754_fmodl (long double x, long double y) { long double res; - asm ("1:\tfprem\n" - "fstsw %%ax\n" - "sahf\n" - "jp 1b\n" - "fstp %%st(1)" + asm ("1:\tfprem\n\t" + "fstsw\t%%ax\n\t" + "sahf\n\t" + "jp\t1b\n\t" + "fstp\t%%st(1)" : "=t" (res) : "0" (x), "u" (y) : "ax", "st(1)"); return res; } diff --git a/sysdeps/i386/fpu/fclrexcpt.c b/sysdeps/i386/fpu/fclrexcpt.c index 713bc03669..174c50d874 100644 --- a/sysdeps/i386/fpu/fclrexcpt.c +++ b/sysdeps/i386/fpu/fclrexcpt.c @@ -30,13 +30,13 @@ __feclearexcept (int excepts) /* Bah, we have to clear selected exceptions. Since there is no `fldsw' instruction we have to do it the hard way. */ - __asm__ ("fnstenv %0" : "=m" (*&temp)); + __asm__ ("fnstenv\t%0" : "=m" (temp)); /* Clear the relevant bits. */ temp.__status_word &= excepts ^ FE_ALL_EXCEPT; /* Put the new data in effect. */ - __asm__ ("fldenv %0" : : "m" (*&temp)); + __asm__ ("fldenv\t%0" : : "m" (temp)); /* If the CPU supports SSE, we clear the MXCSR as well. */ if (CPU_FEATURE_USABLE (SSE)) @@ -44,13 +44,13 @@ __feclearexcept (int excepts) unsigned int xnew_exc; /* Get the current MXCSR. */ - __asm__ ("stmxcsr %0" : "=m" (*&xnew_exc)); + __asm__ ("%vstmxcsr\t%0" : "=m" (xnew_exc)); /* Clear the relevant bits. */ xnew_exc &= ~excepts; /* Put the new data in effect. */ - __asm__ ("ldmxcsr %0" : : "m" (*&xnew_exc)); + __asm__ ("%vldmxcsr\t%0" : : "m" (xnew_exc)); } /* Success. 
*/ diff --git a/sysdeps/i386/fpu/fedisblxcpt.c b/sysdeps/i386/fpu/fedisblxcpt.c index b23fd8e869..c029f1d44c 100644 --- a/sysdeps/i386/fpu/fedisblxcpt.c +++ b/sysdeps/i386/fpu/fedisblxcpt.c @@ -26,14 +26,14 @@ fedisableexcept (int excepts) unsigned short int new_exc, old_exc; /* Get the current control word. */ - __asm__ ("fstcw %0" : "=m" (*&new_exc)); + __asm__ ("fstcw\t%0" : "=m" (new_exc)); old_exc = (~new_exc) & FE_ALL_EXCEPT; excepts &= FE_ALL_EXCEPT; new_exc |= excepts; - __asm__ ("fldcw %0" : : "m" (*&new_exc)); + __asm__ ("fldcw\t%0" : : "m" (new_exc)); /* If the CPU supports SSE we set the MXCSR as well. */ if (CPU_FEATURE_USABLE (SSE)) @@ -41,11 +41,11 @@ fedisableexcept (int excepts) unsigned int xnew_exc; /* Get the current control word. */ - __asm__ ("stmxcsr %0" : "=m" (*&xnew_exc)); + __asm__ ("%vstmxcsr\t%0" : "=m" (xnew_exc)); xnew_exc |= excepts << 7; - __asm__ ("ldmxcsr %0" : : "m" (*&xnew_exc)); + __asm__ ("%vldmxcsr\t%0" : : "m" (xnew_exc)); } return old_exc; diff --git a/sysdeps/i386/fpu/feenablxcpt.c b/sysdeps/i386/fpu/feenablxcpt.c index bc4a4ce32f..5f67115dac 100644 --- a/sysdeps/i386/fpu/feenablxcpt.c +++ b/sysdeps/i386/fpu/feenablxcpt.c @@ -27,13 +27,13 @@ feenableexcept (int excepts) unsigned short int old_exc; /* Get the current control word. */ - __asm__ ("fstcw %0" : "=m" (*&new_exc)); + __asm__ ("fstcw\t%0" : "=m" (new_exc)); excepts &= FE_ALL_EXCEPT; old_exc = (~new_exc) & FE_ALL_EXCEPT; new_exc &= ~excepts; - __asm__ ("fldcw %0" : : "m" (*&new_exc)); + __asm__ ("fldcw\t%0" : : "m" (new_exc)); /* If the CPU supports SSE we set the MXCSR as well. */ if (CPU_FEATURE_USABLE (SSE)) @@ -41,11 +41,11 @@ feenableexcept (int excepts) unsigned int xnew_exc; /* Get the current control word. 
*/ - __asm__ ("stmxcsr %0" : "=m" (*&xnew_exc)); + __asm__ ("%vstmxcsr\t%0" : "=m" (xnew_exc)); xnew_exc &= ~(excepts << 7); - __asm__ ("ldmxcsr %0" : : "m" (*&xnew_exc)); + __asm__ ("%vldmxcsr\t%0" : : "m" (xnew_exc)); } return old_exc; diff --git a/sysdeps/i386/fpu/fegetenv.c b/sysdeps/i386/fpu/fegetenv.c index 0d2b87db93..0f0a545ba1 100644 --- a/sysdeps/i386/fpu/fegetenv.c +++ b/sysdeps/i386/fpu/fegetenv.c @@ -23,14 +23,14 @@ int __fegetenv (fenv_t *envp) { - __asm__ ("fnstenv %0" : "=m" (*envp)); + __asm__ ("fnstenv\t%0" : "=m" (*envp)); /* And load it right back since the processor changes the mask. Intel thought this opcode to be used in interrupt handlers which would block all exceptions. */ - __asm__ ("fldenv %0" : : "m" (*envp)); + __asm__ ("fldenv\t%0" : : "m" (*envp)); if (CPU_FEATURE_USABLE (SSE)) - __asm__ ("stmxcsr %0" : "=m" (envp->__eip)); + __asm__ ("%vstmxcsr\t%0" : "=m" (envp->__eip)); /* Success. */ return 0; diff --git a/sysdeps/i386/fpu/fegetexcept.c b/sysdeps/i386/fpu/fegetexcept.c index 00ff7c4cdb..29eea2225e 100644 --- a/sysdeps/i386/fpu/fegetexcept.c +++ b/sysdeps/i386/fpu/fegetexcept.c @@ -24,7 +24,7 @@ fegetexcept (void) unsigned short int exc; /* Get the current control word. 
*/ - __asm__ ("fstcw %0" : "=m" (*&exc)); + __asm__ ("fstcw\t%0" : "=m" (exc)); return (~exc) & FE_ALL_EXCEPT; } diff --git a/sysdeps/i386/fpu/fegetmode.c b/sysdeps/i386/fpu/fegetmode.c index 41275e1036..9f0fde7446 100644 --- a/sysdeps/i386/fpu/fegetmode.c +++ b/sysdeps/i386/fpu/fegetmode.c @@ -26,6 +26,6 @@ fegetmode (femode_t *modep) { _FPU_GETCW (modep->__control_word); if (CPU_FEATURE_USABLE (SSE)) - __asm__ ("stmxcsr %0" : "=m" (modep->__mxcsr)); + __asm__ ("%vstmxcsr\t%0" : "=m" (modep->__mxcsr)); return 0; } diff --git a/sysdeps/i386/fpu/fegetround.c b/sysdeps/i386/fpu/fegetround.c index 297894d5a5..dc988295f6 100644 --- a/sysdeps/i386/fpu/fegetround.c +++ b/sysdeps/i386/fpu/fegetround.c @@ -23,7 +23,7 @@ __fegetround (void) { int cw; - __asm__ ("fnstcw %0" : "=m" (*&cw)); + __asm__ ("fnstcw\t%0" : "=m" (cw)); return cw & 0xc00; } diff --git a/sysdeps/i386/fpu/feholdexcpt.c b/sysdeps/i386/fpu/feholdexcpt.c index a323a04f27..72adf18358 100644 --- a/sysdeps/i386/fpu/feholdexcpt.c +++ b/sysdeps/i386/fpu/feholdexcpt.c @@ -25,7 +25,8 @@ __feholdexcept (fenv_t *envp) { /* Store the environment. Recall that fnstenv has a side effect of masking all exceptions. Then clear all exceptions. */ - __asm__ volatile ("fnstenv %0; fnclex" : "=m" (*envp)); + __asm__ volatile ("fnstenv\t%0\n\t" + "fnclex" : "=m" (*envp)); /* If the CPU supports SSE we set the MXCSR as well. */ if (CPU_FEATURE_USABLE (SSE)) @@ -33,12 +34,12 @@ __feholdexcept (fenv_t *envp) unsigned int xwork; /* Get the current control word. */ - __asm__ ("stmxcsr %0" : "=m" (envp->__eip)); + __asm__ ("%vstmxcsr\t%0" : "=m" (envp->__eip)); /* Set all exceptions to non-stop and clear them. 
*/ xwork = (envp->__eip | 0x1f80) & ~0x3f; - __asm__ ("ldmxcsr %0" : : "m" (*&xwork)); + __asm__ ("%vldmxcsr\t%0" : : "m" (xwork)); } return 0; diff --git a/sysdeps/i386/fpu/fesetenv.c b/sysdeps/i386/fpu/fesetenv.c index 66d7002edd..ae8d065d07 100644 --- a/sysdeps/i386/fpu/fesetenv.c +++ b/sysdeps/i386/fpu/fesetenv.c @@ -40,7 +40,7 @@ __fesetenv (const fenv_t *envp) values which we do not want to come from the saved environment. Therefore, we get the current environment and replace the values we want to use from the environment specified by the parameter. */ - __asm__ ("fnstenv %0" : "=m" (*&temp)); + __asm__ ("fnstenv\t%0" : "=m" (temp)); if (envp == FE_DFL_ENV) { @@ -75,12 +75,12 @@ __fesetenv (const fenv_t *envp) temp.__data_offset = 0; temp.__data_selector = 0; - __asm__ ("fldenv %0" : : "m" (temp)); + __asm__ ("fldenv\t%0" : : "m" (temp)); if (CPU_FEATURE_USABLE (SSE)) { unsigned int mxcsr; - __asm__ ("stmxcsr %0" : "=m" (mxcsr)); + __asm__ ("%vstmxcsr\t%0" : "=m" (mxcsr)); if (envp == FE_DFL_ENV) { @@ -111,7 +111,7 @@ __fesetenv (const fenv_t *envp) else mxcsr = envp->__eip; - __asm__ ("ldmxcsr %0" : : "m" (mxcsr)); + __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr)); } /* Success. */ diff --git a/sysdeps/i386/fpu/fesetexcept.c b/sysdeps/i386/fpu/fesetexcept.c index e483b46678..648b468311 100644 --- a/sysdeps/i386/fpu/fesetexcept.c +++ b/sysdeps/i386/fpu/fesetexcept.c @@ -33,13 +33,13 @@ fesetexcept (int excepts) { /* Get the control word of the SSE unit. */ unsigned int mxcsr; - __asm__ ("stmxcsr %0" : "=m" (*&mxcsr)); + __asm__ ("%vstmxcsr\t%0" : "=m" (mxcsr)); /* Set relevant flags. */ mxcsr |= excepts; /* Put the new data in effect. */ - __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr)); + __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr)); } else { @@ -47,7 +47,7 @@ fesetexcept (int excepts) /* Note: fnstenv masks all floating-point exceptions until the fldenv or fldcw below. 
*/ - __asm__ ("fnstenv %0" : "=m" (*&temp)); + __asm__ ("fnstenv\t%0" : "=m" (temp)); /* Set relevant flags. */ temp.__status_word |= excepts; @@ -57,12 +57,12 @@ fesetexcept (int excepts) /* Setting the exception flags may trigger a trap (at the next floating-point instruction, but that does not matter). ISO C23 (7.6.4.4) does not allow it. */ - __asm__ volatile ("fldcw %0" : : "m" (*&temp.__control_word)); + __asm__ volatile ("fldcw\t%0" : : "m" (temp.__control_word)); return -1; } /* Store the new status word (along with the rest of the environment). */ - __asm__ ("fldenv %0" : : "m" (*&temp)); + __asm__ ("fldenv\t%0" : : "m" (temp)); } return 0; diff --git a/sysdeps/i386/fpu/fesetmode.c b/sysdeps/i386/fpu/fesetmode.c index eab0a5d683..b5915d18e0 100644 --- a/sysdeps/i386/fpu/fesetmode.c +++ b/sysdeps/i386/fpu/fesetmode.c @@ -37,7 +37,7 @@ fesetmode (const femode_t *modep) if (CPU_FEATURE_USABLE (SSE)) { unsigned int mxcsr; - __asm__ ("stmxcsr %0" : "=m" (mxcsr)); + __asm__ ("%vstmxcsr\t%0" : "=m" (mxcsr)); /* Preserve SSE exception flags but restore other state in MXCSR. */ mxcsr &= FE_ALL_EXCEPT_X86; @@ -47,7 +47,7 @@ fesetmode (const femode_t *modep) mxcsr |= FE_ALL_EXCEPT_X86 << 7; else mxcsr |= modep->__mxcsr & ~FE_ALL_EXCEPT_X86; - __asm__ ("ldmxcsr %0" : : "m" (mxcsr)); + __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr)); } return 0; } diff --git a/sysdeps/i386/fpu/fesetround.c b/sysdeps/i386/fpu/fesetround.c index ea1f9096b5..10cbeabe7b 100644 --- a/sysdeps/i386/fpu/fesetround.c +++ b/sysdeps/i386/fpu/fesetround.c @@ -29,20 +29,20 @@ __fesetround (int round) /* ROUND is no valid rounding mode. */ return 1; - __asm__ ("fnstcw %0" : "=m" (*&cw)); + __asm__ ("fnstcw\t%0" : "=m" (cw)); cw &= ~0xc00; cw |= round; - __asm__ ("fldcw %0" : : "m" (*&cw)); + __asm__ ("fldcw\t%0" : : "m" (cw)); /* If the CPU supports SSE we set the MXCSR as well. 
*/ if (CPU_FEATURE_USABLE (SSE)) { unsigned int xcw; - __asm__ ("stmxcsr %0" : "=m" (*&xcw)); + __asm__ ("%vstmxcsr\t%0" : "=m" (xcw)); xcw &= ~0x6000; xcw |= round << 3; - __asm__ ("ldmxcsr %0" : : "m" (*&xcw)); + __asm__ ("%vldmxcsr\t%0" : : "m" (xcw)); } return 0; diff --git a/sysdeps/i386/fpu/feupdateenv.c b/sysdeps/i386/fpu/feupdateenv.c index 89b000953a..c1bbf1f76c 100644 --- a/sysdeps/i386/fpu/feupdateenv.c +++ b/sysdeps/i386/fpu/feupdateenv.c @@ -27,11 +27,11 @@ __feupdateenv (const fenv_t *envp) unsigned int xtemp = 0; /* Save current exceptions. */ - __asm__ ("fnstsw %0" : "=m" (*&temp)); + __asm__ ("fnstsw\t%0" : "=m" (temp)); /* If the CPU supports SSE we test the MXCSR as well. */ if (CPU_FEATURE_USABLE (SSE)) - __asm__ ("stmxcsr %0" : "=m" (*&xtemp)); + __asm__ ("%vstmxcsr\t%0" : "=m" (xtemp)); temp = (temp | xtemp) & FE_ALL_EXCEPT; diff --git a/sysdeps/i386/fpu/fgetexcptflg.c b/sysdeps/i386/fpu/fgetexcptflg.c index be181af162..a825ba0ade 100644 --- a/sysdeps/i386/fpu/fgetexcptflg.c +++ b/sysdeps/i386/fpu/fgetexcptflg.c @@ -27,7 +27,7 @@ __fegetexceptflag (fexcept_t *flagp, int excepts) fexcept_t temp; /* Get the current exceptions. */ - __asm__ ("fnstsw %0" : "=m" (*&temp)); + __asm__ ("fnstsw\t%0" : "=m" (temp)); *flagp = temp & excepts & FE_ALL_EXCEPT; @@ -37,7 +37,7 @@ __fegetexceptflag (fexcept_t *flagp, int excepts) unsigned int sse_exc; /* Get the current MXCSR. */ - __asm__ ("stmxcsr %0" : "=m" (*&sse_exc)); + __asm__ ("%vstmxcsr\t%0" : "=m" (sse_exc)); *flagp |= sse_exc & excepts & FE_ALL_EXCEPT; } diff --git a/sysdeps/i386/fpu/fraiseexcpt.c b/sysdeps/i386/fpu/fraiseexcpt.c index 65fba2e2d1..0ef9259d50 100644 --- a/sysdeps/i386/fpu/fraiseexcpt.c +++ b/sysdeps/i386/fpu/fraiseexcpt.c @@ -32,7 +32,9 @@ __feraiseexcept (int excepts) { /* One example of an invalid operation is 0.0 / 0.0. 
*/ double d; - __asm__ __volatile__ ("fldz; fdiv %%st, %%st(0); fwait" : "=t" (d)); + __asm__ __volatile__ ("fldz\n\t" + "fdiv\t%%st,%%st(0)\n\t" + "fwait" : "=t" (d)); (void) &d; } @@ -40,7 +42,10 @@ __feraiseexcept (int excepts) if ((FE_DIVBYZERO & excepts) != 0) { double d; - __asm__ __volatile__ ("fldz; fld1; fdivp %%st, %%st(1); fwait" + __asm__ __volatile__ ("fldz\n\t" + "fld1\n\t" + "fdivp\t%%st,%%st(1)\n\t" + "fwait" : "=t" (d)); (void) &d; } @@ -54,13 +59,13 @@ __feraiseexcept (int excepts) /* Bah, we have to clear selected exceptions. Since there is no `fldsw' instruction we have to do it the hard way. */ - __asm__ __volatile__ ("fnstenv %0" : "=m" (*&temp)); + __asm__ __volatile__ ("fnstenv\t%0" : "=m" (temp)); /* Set the relevant bits. */ temp.__status_word |= FE_OVERFLOW; /* Put the new data in effect. */ - __asm__ __volatile__ ("fldenv %0" : : "m" (*&temp)); + __asm__ __volatile__ ("fldenv\t%0" : : "m" (temp)); /* And raise the exception. */ __asm__ __volatile__ ("fwait"); @@ -75,13 +80,13 @@ __feraiseexcept (int excepts) /* Bah, we have to clear selected exceptions. Since there is no `fldsw' instruction we have to do it the hard way. */ - __asm__ __volatile__ ("fnstenv %0" : "=m" (*&temp)); + __asm__ __volatile__ ("fnstenv\t%0" : "=m" (temp)); /* Set the relevant bits. */ temp.__status_word |= FE_UNDERFLOW; /* Put the new data in effect. */ - __asm__ __volatile__ ("fldenv %0" : : "m" (*&temp)); + __asm__ __volatile__ ("fldenv\t%0" : : "m" (temp)); /* And raise the exception. */ __asm__ __volatile__ ("fwait"); @@ -96,13 +101,13 @@ __feraiseexcept (int excepts) /* Bah, we have to clear selected exceptions. Since there is no `fldsw' instruction we have to do it the hard way. */ - __asm__ __volatile__ ("fnstenv %0" : "=m" (*&temp)); + __asm__ __volatile__ ("fnstenv\t%0" : "=m" (temp)); /* Set the relevant bits. */ temp.__status_word |= FE_INEXACT; /* Put the new data in effect. 
*/ - __asm__ __volatile__ ("fldenv %0" : : "m" (*&temp)); + __asm__ __volatile__ ("fldenv\t%0" : : "m" (temp)); /* And raise the exception. */ __asm__ __volatile__ ("fwait"); diff --git a/sysdeps/i386/fpu/fsetexcptflg.c b/sysdeps/i386/fpu/fsetexcptflg.c index 78736e0ac6..74b6f3536f 100644 --- a/sysdeps/i386/fpu/fsetexcptflg.c +++ b/sysdeps/i386/fpu/fsetexcptflg.c @@ -37,7 +37,7 @@ __fesetexceptflag (const fexcept_t *flagp, int excepts) cannot separately set the status word. Note: fnstenv masks all floating-point exceptions until the fldenv or fldcw below. */ - __asm__ ("fnstenv %0" : "=m" (*&temp)); + __asm__ ("fnstenv\t%0" : "=m" (temp)); if (CPU_FEATURE_USABLE (SSE)) { @@ -47,16 +47,16 @@ __fesetexceptflag (const fexcept_t *flagp, int excepts) temp.__status_word &= ~(excepts & ~ *flagp); /* Store the new status word (along with the rest of the environment). */ - __asm__ ("fldenv %0" : : "m" (*&temp)); + __asm__ ("fldenv\t%0" : : "m" (temp)); /* And now similarly for SSE. */ - __asm__ ("stmxcsr %0" : "=m" (*&mxcsr)); + __asm__ ("%vstmxcsr\t%0" : "=m" (mxcsr)); /* Clear or set relevant flags. */ mxcsr ^= (mxcsr ^ *flagp) & excepts; /* Put the new data in effect. */ - __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr)); + __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr)); } else { @@ -68,12 +68,12 @@ __fesetexceptflag (const fexcept_t *flagp, int excepts) /* Setting the exception flags may trigger a trap (at the next floating-point instruction, but that does not matter). ISO C 23 § 7.6.4.5 does not allow it. */ - __asm__ volatile ("fldcw %0" : : "m" (*&temp.__control_word)); + __asm__ volatile ("fldcw\t%0" : : "m" (temp.__control_word)); return -1; } /* Store the new status word (along with the rest of the environment). */ - __asm__ ("fldenv %0" : : "m" (*&temp)); + __asm__ ("fldenv\t%0" : : "m" (temp)); } /* Success. 
*/ diff --git a/sysdeps/i386/fpu/ftestexcept.c b/sysdeps/i386/fpu/ftestexcept.c index 09a673e1ab..9ed5cc8048 100644 --- a/sysdeps/i386/fpu/ftestexcept.c +++ b/sysdeps/i386/fpu/ftestexcept.c @@ -27,11 +27,11 @@ __fetestexcept (int excepts) int xtemp = 0; /* Get current exceptions. */ - __asm__ ("fnstsw %0" : "=a" (temp)); + __asm__ ("fnstsw\t%0" : "=a" (temp)); /* If the CPU supports SSE we test the MXCSR as well. */ if (CPU_FEATURE_USABLE (SSE)) - __asm__ ("stmxcsr %0" : "=m" (*&xtemp)); + __asm__ ("%vstmxcsr\t%0" : "=m" (xtemp)); return (temp | xtemp) & excepts & FE_ALL_EXCEPT; } diff --git a/sysdeps/i386/fpu/s_atanl.c b/sysdeps/i386/fpu/s_atanl.c index 91b34498a4..3e5272276a 100644 --- a/sysdeps/i386/fpu/s_atanl.c +++ b/sysdeps/i386/fpu/s_atanl.c @@ -10,7 +10,7 @@ __atanl (long double x) { long double res; - asm ("fld1\n" + asm ("fld1\n\t" "fpatan" : "=t" (res) : "0" (x)); diff --git a/sysdeps/i386/fpu/s_logbl.c b/sysdeps/i386/fpu/s_logbl.c index ec867de010..1398de9ad8 100644 --- a/sysdeps/i386/fpu/s_logbl.c +++ b/sysdeps/i386/fpu/s_logbl.c @@ -9,8 +9,8 @@ __logbl (long double x) { long double res; - asm ("fxtract\n" - "fstp %%st" : "=t" (res) : "0" (x)); + asm ("fxtract\n\t" + "fstp\t%%st" : "=t" (res) : "0" (x)); return res; } diff --git a/sysdeps/i386/fpu/s_significandl.c b/sysdeps/i386/fpu/s_significandl.c index a11c981b54..7b6343dd0d 100644 --- a/sysdeps/i386/fpu/s_significandl.c +++ b/sysdeps/i386/fpu/s_significandl.c @@ -8,8 +8,8 @@ __significandl (long double x) { long double res; - asm ("fxtract\n" - "fstp %%st(1)" : "=t" (res) : "0" (x)); + asm ("fxtract\n\t" + "fstp\t%%st(1)" : "=t" (res) : "0" (x)); return res; } diff --git a/sysdeps/i386/setfpucw.c b/sysdeps/i386/setfpucw.c index 1edfd5be0a..6c9ca02e3d 100644 --- a/sysdeps/i386/setfpucw.c +++ b/sysdeps/i386/setfpucw.c @@ -28,14 +28,14 @@ __setfpucw (fpu_control_t set) fpu_control_t cw; /* Fetch the current control word. 
*/ - __asm__ ("fnstcw %0" : "=m" (*&cw)); + __asm__ ("fnstcw\t%0" : "=m" (cw)); /* Preserve the reserved bits, and set the rest as the user specified (or the default, if the user gave zero). */ cw &= _FPU_RESERVED; cw |= set & ~_FPU_RESERVED; - __asm__ ("fldcw %0" : : "m" (*&cw)); + __asm__ ("fldcw\t%0" : : "m" (cw)); /* If the CPU supports SSE, we set the MXCSR as well. */ if (CPU_FEATURE_USABLE (SSE)) @@ -43,11 +43,11 @@ __setfpucw (fpu_control_t set) unsigned int xnew_exc; /* Get the current MXCSR. */ - __asm__ ("stmxcsr %0" : "=m" (*&xnew_exc)); + __asm__ ("%vstmxcsr\t%0" : "=m" (xnew_exc)); xnew_exc &= ~((0xc00 << 3) | (FE_ALL_EXCEPT << 7)); xnew_exc |= ((set & 0xc00) << 3) | ((set & FE_ALL_EXCEPT) << 7); - __asm__ ("ldmxcsr %0" : : "m" (*&xnew_exc)); + __asm__ ("%vldmxcsr\t%0" : : "m" (xnew_exc)); } } diff --git a/sysdeps/x86/fpu/fenv_private.h b/sysdeps/x86/fpu/fenv_private.h index 4b081e015b..19cc7c1322 100644 --- a/sysdeps/x86/fpu/fenv_private.h +++ b/sysdeps/x86/fpu/fenv_private.h @@ -18,22 +18,14 @@ need not care for both the 387 and the sse unit, only the one we're actually using. */ -#if defined __AVX__ || defined SSE2AVX -# define STMXCSR "vstmxcsr" -# define LDMXCSR "vldmxcsr" -#else -# define STMXCSR "stmxcsr" -# define LDMXCSR "ldmxcsr" -#endif - static __always_inline void libc_feholdexcept_sse (fenv_t *e) { unsigned int mxcsr; - asm (STMXCSR " %0" : "=m" (*&mxcsr)); + asm ("%vstmxcsr\t%0" : "=m" (mxcsr)); e->__mxcsr = mxcsr; mxcsr = (mxcsr | 0x1f80) & ~0x3f; - asm volatile (LDMXCSR " %0" : : "m" (*&mxcsr)); + asm volatile ("%vldmxcsr\t%0" : : "m" (mxcsr)); } static __always_inline void @@ -41,8 +33,9 @@ libc_feholdexcept_387 (fenv_t *e) { /* Recall that fnstenv has a side-effect of masking exceptions. Clobber all of the fp registers so that the TOS field is 0. 
*/ - asm volatile ("fnstenv %0; fnclex" - : "=m"(*e) + asm volatile ("fnstenv\t%0\n\t" + "fnclex" + : "=m" (*e) : : "st", "st(1)", "st(2)", "st(3)", "st(4)", "st(5)", "st(6)", "st(7)"); } @@ -51,9 +44,9 @@ static __always_inline void libc_fesetround_sse (int r) { unsigned int mxcsr; - asm (STMXCSR " %0" : "=m" (*&mxcsr)); + asm ("%vstmxcsr\t%0" : "=m" (mxcsr)); mxcsr = (mxcsr & ~0x6000) | (r << 3); - asm volatile (LDMXCSR " %0" : : "m" (*&mxcsr)); + asm volatile ("%vldmxcsr\t%0" : : "m" (mxcsr)); } static __always_inline void @@ -69,10 +62,10 @@ static __always_inline void libc_feholdexcept_setround_sse (fenv_t *e, int r) { unsigned int mxcsr; - asm (STMXCSR " %0" : "=m" (*&mxcsr)); + asm ("%vstmxcsr\t%0" : "=m" (mxcsr)); e->__mxcsr = mxcsr; mxcsr = ((mxcsr | 0x1f80) & ~0x603f) | (r << 3); - asm volatile (LDMXCSR " %0" : : "m" (*&mxcsr)); + asm volatile ("%vldmxcsr\t%0" : : "m" (mxcsr)); } /* Set both rounding mode and precision. A convenience function for use @@ -104,7 +97,7 @@ static __always_inline int libc_fetestexcept_sse (int e) { unsigned int mxcsr; - asm volatile (STMXCSR " %0" : "=m" (*&mxcsr)); + asm volatile ("%vstmxcsr\t%0" : "=m" (mxcsr)); return mxcsr & e & FE_ALL_EXCEPT; } @@ -112,14 +105,14 @@ static __always_inline int libc_fetestexcept_387 (int ex) { fexcept_t temp; - asm volatile ("fnstsw %0" : "=a" (temp)); + asm volatile ("fnstsw\t%0" : "=a" (temp)); return temp & ex & FE_ALL_EXCEPT; } static __always_inline void libc_fesetenv_sse (fenv_t *e) { - asm volatile (LDMXCSR " %0" : : "m" (e->__mxcsr)); + asm volatile ("%vldmxcsr\t%0" : : "m" (e->__mxcsr)); } static __always_inline void @@ -127,7 +120,7 @@ libc_fesetenv_387 (fenv_t *e) { /* Clobber all fp registers so that the TOS value we saved earlier is compatible with the current state of the compiler. 
*/ - asm volatile ("fldenv %0" + asm volatile ("fldenv\t%0" : : "m" (*e) : "st", "st(1)", "st(2)", "st(3)", "st(4)", "st(5)", "st(6)", "st(7)"); @@ -137,13 +130,13 @@ static __always_inline int libc_feupdateenv_test_sse (fenv_t *e, int ex) { unsigned int mxcsr, old_mxcsr, cur_ex; - asm volatile (STMXCSR " %0" : "=m" (*&mxcsr)); + asm volatile ("%vstmxcsr\t%0" : "=m" (mxcsr)); cur_ex = mxcsr & FE_ALL_EXCEPT; /* Merge current exceptions with the old environment. */ old_mxcsr = e->__mxcsr; mxcsr = old_mxcsr | cur_ex; - asm volatile (LDMXCSR " %0" : : "m" (*&mxcsr)); + asm volatile ("%vldmxcsr\t%0" : : "m" (mxcsr)); /* Raise SIGFPE for any new exceptions since the hold. Expect that the normal environment has all exceptions masked. */ @@ -160,7 +153,7 @@ libc_feupdateenv_test_387 (fenv_t *e, int ex) fexcept_t cur_ex; /* Save current exceptions. */ - asm volatile ("fnstsw %0" : "=a" (cur_ex)); + asm volatile ("fnstsw\t%0" : "=a" (cur_ex)); cur_ex &= FE_ALL_EXCEPT; /* Reload original environment. 
*/
@@ -189,10 +182,10 @@ static __always_inline void
 libc_feholdsetround_sse (fenv_t *e, int r)
 {
   unsigned int mxcsr;
-  asm (STMXCSR " %0" : "=m" (*&mxcsr));
+  asm ("%vstmxcsr\t%0" : "=m" (mxcsr));
   e->__mxcsr = mxcsr;
   mxcsr = (mxcsr & ~0x6000) | (r << 3);
-  asm volatile (LDMXCSR " %0" : : "m" (*&mxcsr));
+  asm volatile ("%vldmxcsr\t%0" : : "m" (mxcsr));
 }

 static __always_inline void
@@ -223,9 +216,9 @@ static __always_inline void
 libc_feresetround_sse (fenv_t *e)
 {
   unsigned int mxcsr;
-  asm (STMXCSR " %0" : "=m" (*&mxcsr));
+  asm ("%vstmxcsr\t%0" : "=m" (mxcsr));
   mxcsr = (mxcsr & ~0x6000) | (e->__mxcsr & 0x6000);
-  asm volatile (LDMXCSR " %0" : : "m" (*&mxcsr));
+  asm volatile ("%vldmxcsr\t%0" : : "m" (mxcsr));
 }

 static __always_inline void
@@ -315,13 +308,13 @@ static __always_inline void
 libc_feholdexcept_setround_sse_ctx (struct rm_ctx *ctx, int r)
 {
   unsigned int mxcsr, new_mxcsr;
-  asm (STMXCSR " %0" : "=m" (*&mxcsr));
+  asm ("%vstmxcsr\t%0" : "=m" (mxcsr));
   new_mxcsr = ((mxcsr | 0x1f80) & ~0x603f) | (r << 3);

   ctx->env.__mxcsr = mxcsr;
   if (__glibc_unlikely (mxcsr != new_mxcsr))
     {
-      asm volatile (LDMXCSR " %0" : : "m" (*&new_mxcsr));
+      asm volatile ("%vldmxcsr\t%0" : : "m" (new_mxcsr));
       ctx->updated_status = true;
     }
   else
@@ -412,13 +405,13 @@ libc_feholdsetround_sse_ctx (struct rm_ctx *ctx, int r)
 {
   unsigned int mxcsr, new_mxcsr;

-  asm (STMXCSR " %0" : "=m" (*&mxcsr));
+  asm ("%vstmxcsr\t%0" : "=m" (mxcsr));
   new_mxcsr = (mxcsr & ~0x6000) | (r << 3);

   ctx->env.__mxcsr = mxcsr;
   if (__glibc_unlikely (new_mxcsr != mxcsr))
     {
-      asm volatile (LDMXCSR " %0" : : "m" (*&new_mxcsr));
+      asm volatile ("%vldmxcsr\t%0" : : "m" (new_mxcsr));
       ctx->updated_status = true;
     }
   else
diff --git a/sysdeps/x86/fpu/sfp-machine.h b/sysdeps/x86/fpu/sfp-machine.h
index bc3fe332df..1f08c71673 100644
--- a/sysdeps/x86/fpu/sfp-machine.h
+++ b/sysdeps/x86/fpu/sfp-machine.h
@@ -39,15 +39,9 @@ typedef unsigned int UTItype __attribute__ ((mode (TI)));

 # define FP_RND_MASK 0x6000

-# ifdef __AVX__
-#  define AVX_INSN_PREFIX "v"
-# else
-#  define AVX_INSN_PREFIX ""
-# endif
-
 # define FP_INIT_ROUNDMODE \
   do { \
-    __asm__ __volatile__ (AVX_INSN_PREFIX "stmxcsr\t%0" : "=m" (_fcw)); \
+    __asm__ __volatile__ ("%vstmxcsr\t%0" : "=m" (_fcw)); \
   } while (0)
 #else
 # define _FP_W_TYPE_SIZE 32
diff --git a/sysdeps/x86_64/fpu/fclrexcpt.c b/sysdeps/x86_64/fpu/fclrexcpt.c
index 3bbb5a2b48..822b2bb9a4 100644
--- a/sysdeps/x86_64/fpu/fclrexcpt.c
+++ b/sysdeps/x86_64/fpu/fclrexcpt.c
@@ -29,22 +29,22 @@ __feclearexcept (int excepts)

   /* Bah, we have to clear selected exceptions. Since there is no
      `fldsw' instruction we have to do it the hard way. */
-  __asm__ ("fnstenv %0" : "=m" (*&temp));
+  __asm__ ("fnstenv\t%0" : "=m" (temp));

   /* Clear the relevant bits. */
   temp.__status_word &= excepts ^ FE_ALL_EXCEPT;

   /* Put the new data in effect. */
-  __asm__ ("fldenv %0" : : "m" (*&temp));
+  __asm__ ("fldenv\t%0" : : "m" (temp));

   /* And the same procedure for SSE. */
-  __asm__ ("stmxcsr %0" : "=m" (*&mxcsr));
+  __asm__ ("%vstmxcsr\t%0" : "=m" (mxcsr));

   /* Clear the relevant bits. */
   mxcsr &= ~excepts;

   /* And put them into effect. */
-  __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr));
+  __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr));

   /* Success. */
   return 0;
diff --git a/sysdeps/x86_64/fpu/fedisblxcpt.c b/sysdeps/x86_64/fpu/fedisblxcpt.c
index 6d87dfe71e..04c509dafb 100644
--- a/sysdeps/x86_64/fpu/fedisblxcpt.c
+++ b/sysdeps/x86_64/fpu/fedisblxcpt.c
@@ -27,19 +27,19 @@ fedisableexcept (int excepts)
   excepts &= FE_ALL_EXCEPT;

   /* Get the current control word of the x87 FPU. */
-  __asm__ ("fstcw %0" : "=m" (*&new_exc));
+  __asm__ ("fstcw\t%0" : "=m" (new_exc));

   old_exc = (~new_exc) & FE_ALL_EXCEPT;

   new_exc |= excepts;
-  __asm__ ("fldcw %0" : : "m" (*&new_exc));
+  __asm__ ("fldcw\t%0" : : "m" (new_exc));

   /* And now the same for the SSE MXCSR register. */
-  __asm__ ("stmxcsr %0" : "=m" (*&new));
+  __asm__ ("%vstmxcsr\t%0" : "=m" (new));

   /* The SSE exception masks are shifted by 7 bits. */
   new |= excepts << 7;
-  __asm__ ("ldmxcsr %0" : : "m" (*&new));
+  __asm__ ("%vldmxcsr\t%0" : : "m" (new));

   return old_exc;
 }
diff --git a/sysdeps/x86_64/fpu/feenablxcpt.c b/sysdeps/x86_64/fpu/feenablxcpt.c
index 36a9bcd50f..3c36f0c19f 100644
--- a/sysdeps/x86_64/fpu/feenablxcpt.c
+++ b/sysdeps/x86_64/fpu/feenablxcpt.c
@@ -27,19 +27,19 @@ feenableexcept (int excepts)
   excepts &= FE_ALL_EXCEPT;

   /* Get the current control word of the x87 FPU. */
-  __asm__ ("fstcw %0" : "=m" (*&new_exc));
+  __asm__ ("fstcw\t%0" : "=m" (new_exc));

   old_exc = (~new_exc) & FE_ALL_EXCEPT;

   new_exc &= ~excepts;
-  __asm__ ("fldcw %0" : : "m" (*&new_exc));
+  __asm__ ("fldcw\t%0" : : "m" (new_exc));

   /* And now the same for the SSE MXCSR register. */
-  __asm__ ("stmxcsr %0" : "=m" (*&new));
+  __asm__ ("%vstmxcsr\t%0" : "=m" (new));

   /* The SSE exception masks are shifted by 7 bits. */
   new &= ~(excepts << 7);
-  __asm__ ("ldmxcsr %0" : : "m" (*&new));
+  __asm__ ("%vldmxcsr\t%0" : : "m" (new));

   return old_exc;
 }
diff --git a/sysdeps/x86_64/fpu/fegetenv.c b/sysdeps/x86_64/fpu/fegetenv.c
index 7c89583c0d..ed729d4c30 100644
--- a/sysdeps/x86_64/fpu/fegetenv.c
+++ b/sysdeps/x86_64/fpu/fegetenv.c
@@ -21,11 +21,11 @@
 int
 __fegetenv (fenv_t *envp)
 {
-  __asm__ ("fnstenv %0\n"
+  __asm__ ("fnstenv\t%0\n\t"
            /* fnstenv changes the exception mask, so load back the
               stored environment. */
-           "fldenv %0\n"
-           "stmxcsr %1" : "=m" (*envp), "=m" (envp->__mxcsr));
+           "fldenv\t%0\n\t"
+           "%vstmxcsr\t%1" : "=m" (*envp), "=m" (envp->__mxcsr));

   /* Success. */
   return 0;
diff --git a/sysdeps/x86_64/fpu/fegetexcept.c b/sysdeps/x86_64/fpu/fegetexcept.c
index a34745eabb..bc32b7ef07 100644
--- a/sysdeps/x86_64/fpu/fegetexcept.c
+++ b/sysdeps/x86_64/fpu/fegetexcept.c
@@ -24,7 +24,7 @@ fegetexcept (void)
   unsigned short int exc;

   /* Get the current control word. */
-  __asm__ ("fstcw %0" : "=m" (*&exc));
+  __asm__ ("fstcw\t%0" : "=m" (exc));

   return (~exc) & FE_ALL_EXCEPT;
 }
diff --git a/sysdeps/x86_64/fpu/fegetmode.c b/sysdeps/x86_64/fpu/fegetmode.c
index 8830a161d6..9b51a91068 100644
--- a/sysdeps/x86_64/fpu/fegetmode.c
+++ b/sysdeps/x86_64/fpu/fegetmode.c
@@ -23,6 +23,6 @@ int
 fegetmode (femode_t *modep)
 {
   _FPU_GETCW (modep->__control_word);
-  __asm__ ("stmxcsr %0" : "=m" (modep->__mxcsr));
+  __asm__ ("%vstmxcsr\t%0" : "=m" (modep->__mxcsr));
   return 0;
 }
diff --git a/sysdeps/x86_64/fpu/fegetround.c b/sysdeps/x86_64/fpu/fegetround.c
index 6c01346a5b..05bf719c70 100644
--- a/sysdeps/x86_64/fpu/fegetround.c
+++ b/sysdeps/x86_64/fpu/fegetround.c
@@ -25,7 +25,7 @@ __fegetround (void)
   /* We only check the x87 FPU unit. The SSE unit should be the same
      - and if it's not the same there's no way to signal it. */

-  __asm__ ("fnstcw %0" : "=m" (*&cw));
+  __asm__ ("fnstcw\t%0" : "=m" (cw));

   return cw & 0xc00;
 }
diff --git a/sysdeps/x86_64/fpu/feholdexcpt.c b/sysdeps/x86_64/fpu/feholdexcpt.c
index 958aa3668e..782d01346c 100644
--- a/sysdeps/x86_64/fpu/feholdexcpt.c
+++ b/sysdeps/x86_64/fpu/feholdexcpt.c
@@ -25,14 +25,14 @@ __feholdexcept (fenv_t *envp)

   /* Store the environment. Recall that fnstenv has a side effect of
      masking all exceptions. Then clear all exceptions. */
-  __asm__ ("fnstenv %0\n\t"
-           "stmxcsr %1\n\t"
+  __asm__ ("fnstenv\t%0\n\t"
+           "%vstmxcsr\t%1\n\t"
            "fnclex"
            : "=m" (*envp), "=m" (envp->__mxcsr));

   /* Set the SSE MXCSR register. */
   mxcsr = (envp->__mxcsr | 0x1f80) & ~0x3f;
-  __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr));
+  __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr));

   return 0;
 }
diff --git a/sysdeps/x86_64/fpu/fesetenv.c b/sysdeps/x86_64/fpu/fesetenv.c
index a50c704a5f..01130ab7dd 100644
--- a/sysdeps/x86_64/fpu/fesetenv.c
+++ b/sysdeps/x86_64/fpu/fesetenv.c
@@ -35,8 +35,8 @@ __fesetenv (const fenv_t *envp)
      values which we do not want to come from the saved environment.
      Therefore, we get the current environment and replace the values
      we want to use from the environment specified by the parameter. */
-  __asm__ ("fnstenv %0\n"
-           "stmxcsr %1" : "=m" (*&temp), "=m" (*&temp.__mxcsr));
+  __asm__ ("fnstenv\t%0\n\t"
+           "%vstmxcsr\t%1" : "=m" (temp), "=m" (temp.__mxcsr));

   if (envp == FE_DFL_ENV)
     {
@@ -103,8 +103,8 @@ __fesetenv (const fenv_t *envp)
       temp.__mxcsr = envp->__mxcsr;
     }

-  __asm__ ("fldenv %0\n"
-           "ldmxcsr %1" : : "m" (temp), "m" (temp.__mxcsr));
+  __asm__ ("fldenv\t%0\n\t"
+           "%vldmxcsr\t%1" : : "m" (temp), "m" (temp.__mxcsr));

   /* Success. */
   return 0;
diff --git a/sysdeps/x86_64/fpu/fesetexcept.c b/sysdeps/x86_64/fpu/fesetexcept.c
index 15de76d544..c902cef646 100644
--- a/sysdeps/x86_64/fpu/fesetexcept.c
+++ b/sysdeps/x86_64/fpu/fesetexcept.c
@@ -23,9 +23,9 @@ fesetexcept (int excepts)
 {
   unsigned int mxcsr;

-  __asm__ ("stmxcsr %0" : "=m" (*&mxcsr));
+  __asm__ ("%vstmxcsr\t%0" : "=m" (mxcsr));
   mxcsr |= excepts & FE_ALL_EXCEPT;
-  __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr));
+  __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr));

   return 0;
 }
diff --git a/sysdeps/x86_64/fpu/fesetmode.c b/sysdeps/x86_64/fpu/fesetmode.c
index 3bd728e599..8ac20a065d 100644
--- a/sysdeps/x86_64/fpu/fesetmode.c
+++ b/sysdeps/x86_64/fpu/fesetmode.c
@@ -28,7 +28,7 @@ fesetmode (const femode_t *modep)
 {
   fpu_control_t cw;
   unsigned int mxcsr;
-  __asm__ ("stmxcsr %0" : "=m" (mxcsr));
+  __asm__ ("%vstmxcsr\t%0" : "=m" (mxcsr));
   /* Preserve SSE exception flags but restore other state in
      MXCSR. */
   mxcsr &= FE_ALL_EXCEPT_X86;
@@ -45,6 +45,6 @@ fesetmode (const femode_t *modep)
       mxcsr |= modep->__mxcsr & ~FE_ALL_EXCEPT_X86;
     }
   _FPU_SETCW (cw);
-  __asm__ ("ldmxcsr %0" : : "m" (mxcsr));
+  __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr));
   return 0;
 }
diff --git a/sysdeps/x86_64/fpu/fesetround.c b/sysdeps/x86_64/fpu/fesetround.c
index 59665e2443..9d5aa31ef9 100644
--- a/sysdeps/x86_64/fpu/fesetround.c
+++ b/sysdeps/x86_64/fpu/fesetround.c
@@ -29,17 +29,17 @@ __fesetround (int round)
     return 1;

   /* First set the x87 FPU. */
-  asm ("fnstcw %0" : "=m" (*&cw));
+  asm ("fnstcw\t%0" : "=m" (cw));
   cw &= ~0xc00;
   cw |= round;
-  asm ("fldcw %0" : : "m" (*&cw));
+  asm ("fldcw\t%0" : : "m" (cw));

   /* And now the MSCSR register for SSE, the precision is at different bit
      positions in the different units, we need to shift it 3 bits. */
-  asm ("stmxcsr %0" : "=m" (*&mxcsr));
+  asm ("%vstmxcsr\t%0" : "=m" (mxcsr));
   mxcsr &= ~ 0x6000;
   mxcsr |= round << 3;
-  asm ("ldmxcsr %0" : : "m" (*&mxcsr));
+  asm ("%vldmxcsr\t%0" : : "m" (mxcsr));

   return 0;
 }
diff --git a/sysdeps/x86_64/fpu/feupdateenv.c b/sysdeps/x86_64/fpu/feupdateenv.c
index 79a3b5dc43..e0d71cc53c 100644
--- a/sysdeps/x86_64/fpu/feupdateenv.c
+++ b/sysdeps/x86_64/fpu/feupdateenv.c
@@ -25,7 +25,8 @@ __feupdateenv (const fenv_t *envp)
   unsigned int xtemp;

   /* Save current exceptions. */
-  __asm__ ("fnstsw %0\n\tstmxcsr %1" : "=m" (*&temp), "=m" (xtemp));
+  __asm__ ("fnstsw\t%0\n\t"
+           "%vstmxcsr\t%1" : "=m" (temp), "=m" (xtemp));
   temp = (temp | xtemp) & FE_ALL_EXCEPT;

   /* Install new environment. */
diff --git a/sysdeps/x86_64/fpu/fgetexcptflg.c b/sysdeps/x86_64/fpu/fgetexcptflg.c
index fc4d9b5e0a..ec2f7857dc 100644
--- a/sysdeps/x86_64/fpu/fgetexcptflg.c
+++ b/sysdeps/x86_64/fpu/fgetexcptflg.c
@@ -25,8 +25,8 @@ fegetexceptflag (fexcept_t *flagp, int excepts)
   unsigned int mxscr;

   /* Get the current exceptions for the x87 FPU and SSE unit. */
-  __asm__ ("fnstsw %0\n"
-           "stmxcsr %1" : "=m" (*&temp), "=m" (*&mxscr));
+  __asm__ ("fnstsw\t%0\n\t"
+           "%vstmxcsr\t%1" : "=m" (temp), "=m" (mxscr));

   *flagp = (temp | mxscr) & FE_ALL_EXCEPT & excepts;

diff --git a/sysdeps/x86_64/fpu/fraiseexcpt.c b/sysdeps/x86_64/fpu/fraiseexcpt.c
index 05631b94ce..024e815e8b 100644
--- a/sysdeps/x86_64/fpu/fraiseexcpt.c
+++ b/sysdeps/x86_64/fpu/fraiseexcpt.c
@@ -33,7 +33,7 @@ __feraiseexcept (int excepts)
       /* One example of an invalid operation is 0.0 / 0.0. */
       float f = 0.0;

-      __asm__ __volatile__ ("divss %0, %0 " : "+x" (f));
+      __asm__ __volatile__ ("%vdivss\t%0,%0" : "+x" (f));
       (void) &f;
     }

@@ -43,7 +43,7 @@ __feraiseexcept (int excepts)
       float f = 1.0;
       float g = 0.0;

-      __asm__ __volatile__ ("divss %1, %0" : "+x" (f) : "x" (g));
+      __asm__ __volatile__ ("%vdivss\t%1,%0" : "+x" (f) : "x" (g));
       (void) &f;
     }

@@ -57,13 +57,13 @@ __feraiseexcept (int excepts)

       /* Bah, we have to clear selected exceptions. Since there is no
          `fldsw' instruction we have to do it the hard way. */
-      __asm__ __volatile__ ("fnstenv %0" : "=m" (*&temp));
+      __asm__ __volatile__ ("fnstenv\t%0" : "=m" (temp));

       /* Set the relevant bits. */
       temp.__status_word |= FE_OVERFLOW;

       /* Put the new data in effect. */
-      __asm__ __volatile__ ("fldenv %0" : : "m" (*&temp));
+      __asm__ __volatile__ ("fldenv\t%0" : : "m" (temp));

       /* And raise the exception. */
       __asm__ __volatile__ ("fwait");
@@ -79,13 +79,13 @@ __feraiseexcept (int excepts)

       /* Bah, we have to clear selected exceptions. Since there is no
          `fldsw' instruction we have to do it the hard way. */
-      __asm__ __volatile__ ("fnstenv %0" : "=m" (*&temp));
+      __asm__ __volatile__ ("fnstenv\t%0" : "=m" (temp));

       /* Set the relevant bits. */
       temp.__status_word |= FE_UNDERFLOW;

       /* Put the new data in effect. */
-      __asm__ __volatile__ ("fldenv %0" : : "m" (*&temp));
+      __asm__ __volatile__ ("fldenv\t%0" : : "m" (temp));

       /* And raise the exception. */
       __asm__ __volatile__ ("fwait");
@@ -101,13 +101,13 @@ __feraiseexcept (int excepts)

       /* Bah, we have to clear selected exceptions. Since there is no
          `fldsw' instruction we have to do it the hard way. */
-      __asm__ __volatile__ ("fnstenv %0" : "=m" (*&temp));
+      __asm__ __volatile__ ("fnstenv\t%0" : "=m" (temp));

       /* Set the relevant bits. */
       temp.__status_word |= FE_INEXACT;

       /* Put the new data in effect. */
-      __asm__ __volatile__ ("fldenv %0" : : "m" (*&temp));
+      __asm__ __volatile__ ("fldenv\t%0" : : "m" (temp));

       /* And raise the exception. */
       __asm__ __volatile__ ("fwait");
diff --git a/sysdeps/x86_64/fpu/fsetexcptflg.c b/sysdeps/x86_64/fpu/fsetexcptflg.c
index adb8d77316..9448fcab42 100644
--- a/sysdeps/x86_64/fpu/fsetexcptflg.c
+++ b/sysdeps/x86_64/fpu/fsetexcptflg.c
@@ -35,22 +35,22 @@ fesetexceptflag (const fexcept_t *flagp, int excepts)

   /* Get the current x87 FPU environment. We have to do this since we
      cannot separately set the status word. */
-  __asm__ ("fnstenv %0" : "=m" (*&temp));
+  __asm__ ("fnstenv\t%0" : "=m" (temp));

   /* Clear relevant flags. */
   temp.__status_word &= ~(excepts & ~ *flagp);

   /* Store the new status word (along with the rest of the environment). */
-  __asm__ ("fldenv %0" : : "m" (*&temp));
+  __asm__ ("fldenv\t%0" : : "m" (temp));

   /* And now similarly for SSE. */
-  __asm__ ("stmxcsr %0" : "=m" (*&mxcsr));
+  __asm__ ("%vstmxcsr\t%0" : "=m" (mxcsr));

   /* Clear or set relevant flags. */
   mxcsr ^= (mxcsr ^ *flagp) & excepts;

   /* Put the new data in effect. */
-  __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr));
+  __asm__ ("%vldmxcsr\t%0" : : "m" (mxcsr));

   /* Success. */
   return 0;
diff --git a/sysdeps/x86_64/fpu/ftestexcept.c b/sysdeps/x86_64/fpu/ftestexcept.c
index 87a851d4b4..81658c86c1 100644
--- a/sysdeps/x86_64/fpu/ftestexcept.c
+++ b/sysdeps/x86_64/fpu/ftestexcept.c
@@ -25,8 +25,8 @@ __fetestexcept (int excepts)
   unsigned int mxscr;

   /* Get current exceptions. */
-  __asm__ ("fnstsw %0\n"
-           "stmxcsr %1" : "=m" (*&temp), "=m" (*&mxscr));
+  __asm__ ("fnstsw\t%0\n\t"
+           "%vstmxcsr\t%1" : "=m" (temp), "=m" (mxscr));

   return (temp | mxscr) & excepts & FE_ALL_EXCEPT;
 }