From patchwork Thu May 22 11:48:28 2025
X-Patchwork-Submitter: Yury Khrustalev
X-Patchwork-Id: 112829
From: Yury Khrustalev
To: libc-alpha@sourceware.org
Cc: wilco.dijkstra@arm.com, carlos@redhat.com, fweimer@redhat.com,
 adhemerval.zanella@linaro.org
Subject: [PATCH 1/1] aarch64: clear ZA state of SME before clone syscall
Date: Thu, 22 May 2025 12:48:28 +0100
Message-Id: <20250522114828.2291047-2-yury.khrustalev@arm.com>
In-Reply-To: <20250522114828.2291047-1-yury.khrustalev@arm.com>
References: <20250522114828.2291047-1-yury.khrustalev@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Call __libc_arm_za_disable to clear the ZA state of SME immediately
before the clone and clone3 syscalls. As indicated in [1], not clearing
this state may cause a variety of issues. Clearing it right before the
syscall is safe to do in any case and prevents those problems from
happening. The __libc_arm_za_disable function does nothing if SME is
not available.

Also add tests for the clone and clone3 use cases. While the former is
trivial, the latter is more complex since the clone3 symbol is not
public. To avoid having to check all possible ways clone3() may be
called via other public functions (e.g.
fork(), vfork(), pthread_create()), we put together a somewhat synthetic
test that links directly with clone3.o. The existing functions that
contain calls to clone3() may not actually use it, in which case the
outcome of this test would be unexpected. Having a direct call to the
clone3 symbol in the test allows us to check precisely what we need to
test: that the __arm_za_disable() function is called as per the SME
ABI [2].

Since we use an unusual approach to link the test for the clone3 use
case, we keep things simple and call the __arm_za_disable() provided by
libgcc since GCC 14 instead of the internal Glibc implementation. This
is OK for the purposes of this test since both functions do the same
thing, and the one from libgcc has no extra link-time dependencies.

[1]: https://lore.kernel.org/linux-arm-kernel/20250508132644.1395904-14-mark.rutland@arm.com/
[2]: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst
---
 sysdeps/aarch64/Makefile                 |   7 ++
 sysdeps/aarch64/tst-sme-clone.c          |  93 +++++++++++++++
 sysdeps/aarch64/tst-sme-clone3.c         | 143 +++++++++++++++++++++++
 sysdeps/aarch64/tst-sme-helper.h         |  32 +++++
 sysdeps/aarch64/tst-sme-za-state.c       |  38 +-----
 sysdeps/unix/sysv/linux/aarch64/clone.S  |  11 ++
 sysdeps/unix/sysv/linux/aarch64/clone3.S |  11 ++
 7 files changed, 301 insertions(+), 34 deletions(-)
 create mode 100644 sysdeps/aarch64/tst-sme-clone.c
 create mode 100644 sysdeps/aarch64/tst-sme-clone3.c

diff --git a/sysdeps/aarch64/Makefile b/sysdeps/aarch64/Makefile
index 0fc6cf1693..51eb294b74 100644
--- a/sysdeps/aarch64/Makefile
+++ b/sysdeps/aarch64/Makefile
@@ -76,6 +76,13 @@ tests += \
   tst-sme-jmp \
   tst-sme-za-state \
   # tests
+tests-internal += \
+  tst-sme-clone \
+  tst-sme-clone3 \
+  # tests-internal
+
+$(objpfx)tst-sme-clone3: $(objpfx)clone3.o
+
 endif
 
 ifeq ($(subdir),malloc)
diff --git a/sysdeps/aarch64/tst-sme-clone.c b/sysdeps/aarch64/tst-sme-clone.c
new file mode 100644
index 0000000000..2e801ba17b
--- /dev/null
+++ b/sysdeps/aarch64/tst-sme-clone.c
@@ -0,0 +1,93 @@
+/* Test that ZA state of SME is cleared in both parent and child
+   when clone() syscall is used.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+#include
+#include
+
+#include "tst-sme-helper.h"
+
+static int
+fun (void * const arg)
+{
+  printf ("in child: %s\n", (const char *)arg);
+  /* Check that ZA state of SME was disabled in child.  */
+  check_sme_za_state ("after clone in child", /* Clear.  */ true);
+  return 0;
+}
+
+static char __attribute__((aligned(16)))
+stack[1024 * 1024];
+
+static void
+run (struct blk *ptr)
+{
+  char *syscall_name = (char *)"clone";
+  printf ("in parent: before %s\n", syscall_name);
+
+  /* Enable ZA state so that the effect of disabling is observable.  */
+  enable_sme_za_state (ptr);
+  check_sme_za_state ("before clone", /* Clear.  */ false);
+
+  pid_t pid = xclone (fun, syscall_name, stack, sizeof (stack),
+                      CLONE_NEWUSER | CLONE_NEWNS | SIGCHLD);
+
+  /* Check that ZA state of SME was disabled in parent.  */
+  check_sme_za_state ("after clone in parent", /* Clear.  */ true);
+
+  TEST_VERIFY (xwaitpid (pid, NULL, 0) == pid);
+}
+
+static int
+do_test (void)
+{
+  unsigned long hwcap2 = getauxval (AT_HWCAP2);
+  if ((hwcap2 & HWCAP2_SME) == 0)
+    return EXIT_UNSUPPORTED;
+
+  /* Get current streaming SVE vector register size.  */
+  svl = get_svl ();
+  printf ("svl: %lu\n", svl);
+  TEST_VERIFY_EXIT (!(svl < 16 || svl % 16 != 0 || svl >= (1 << 16)));
+
+  /* Initialise buffer for ZA state of SME.  */
+  sme_state = xmalloc (svl * svl);
+  memset (sme_state, 1, svl * svl);
+  struct blk blk = {
+    .za_save_buffer = sme_state,
+    .num_za_save_slices = svl,
+    .__reserved = {0},
+  };
+
+  run (&blk);
+
+  return 0;
+}
+
+#include
+
diff --git a/sysdeps/aarch64/tst-sme-clone3.c b/sysdeps/aarch64/tst-sme-clone3.c
new file mode 100644
index 0000000000..8aca369500
--- /dev/null
+++ b/sysdeps/aarch64/tst-sme-clone3.c
@@ -0,0 +1,143 @@
+/* Test that ZA state of SME is cleared in both parent and child
+   when clone3() syscall is used.
+   Copyright (C) 2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+#include "tst-sme-helper.h"
+
+/* Since clone3 is not a public symbol, we link this test explicitly
+   with clone3.o and have to provide this declaration.  */
+int __clone3 (struct clone_args *cl_args, size_t size,
+              int (*func)(void *arg), void *arg);
+
+static int
+fun (void * const arg)
+{
+  printf ("in child: %s\n", (const char *)arg);
+  /* Check that ZA state of SME was disabled in child.  */
+  check_sme_za_state ("after clone3 in child", /* Clear.  */ true);
+  return 0;
+}
+
+static char __attribute__((aligned(16)))
+stack[1024 * 1024];
+
+static void
+run (struct blk *ptr)
+{
+  char *syscall_name = (char *)"clone3";
+  struct clone_args args = {
+    .flags = CLONE_VM | CLONE_VFORK,
+    .exit_signal = SIGCHLD,
+    .stack = (uintptr_t) stack,
+    .stack_size = sizeof (stack),
+  };
+  printf ("in parent: before %s\n", syscall_name);
+
+  /* Enable ZA state so that the effect of disabling is observable.  */
+  enable_sme_za_state (ptr);
+  check_sme_za_state ("before clone", /* Clear.  */ false);
+
+  pid_t pid = __clone3 (&args, sizeof (args), fun, syscall_name);
+
+  /* Check that ZA state of SME was disabled in parent.  */
+  check_sme_za_state ("after clone in parent", /* Clear.  */ true);
+
+  printf ("%s child pid: %d\n", syscall_name, pid);
+  if (pid == -1)
+    {
+      if (errno == ENOSYS)
+        {
+          puts ("clone3 syscall is not supported");
+          exit (EXIT_UNSUPPORTED);
+        }
+      perror ("clone3");
+      TEST_VERIFY_EXIT (0);
+    }
+  if (waitid (P_PID, pid, NULL, WEXITED))
+    {
+      perror ("waitid");
+      TEST_VERIFY_EXIT (0);
+    }
+  printf ("in parent: after %s\n", syscall_name);
+}
+
+static int
+do_test (void)
+{
+  unsigned long hwcap2 = getauxval (AT_HWCAP2);
+  if ((hwcap2 & HWCAP2_SME) == 0)
+    return EXIT_UNSUPPORTED;
+
+  /* Get current streaming SVE vector register size.  */
+  svl = get_svl ();
+  printf ("svl: %lu\n", svl);
+  TEST_VERIFY_EXIT (!(svl < 16 || svl % 16 != 0 || svl >= (1 << 16)));
+
+  /* Initialise buffer for ZA state of SME.  */
+  sme_state = xmalloc (svl * svl);
+  memset (sme_state, 1, svl * svl);
+  struct blk blk = {
+    .za_save_buffer = sme_state,
+    .num_za_save_slices = svl,
+    .__reserved = {0},
+  };
+
+  run (&blk);
+
+  return 0;
+}
+
+#include
+
+/* Workaround to simplify linking with clone3.o.  */
+void __syscall_error(int code)
+{
+  int err = -code;
+  fprintf (stderr, "syscall error %d (%s)\n", err, strerror (err));
+  exit (err);
+}
+
+/* Provided by libgcc since GCC 14.  */
+extern void __arm_za_disable (void);
+
+/* We don't want to pull in all the dependencies of Glibc's implementation
+   of __arm_za_disable, so we use the symbol from libgcc because it is
+   available for this test executable.  */
+void __libc_arm_za_disable (void)
+{
+#if defined __GNUC__ && __GNUC__ >= 14
+  __arm_za_disable ();
+#else
+  puts ("compiler doesn't support SME");
+  exit (EXIT_UNSUPPORTED);
+#endif
+}
diff --git a/sysdeps/aarch64/tst-sme-helper.h b/sysdeps/aarch64/tst-sme-helper.h
index f049416c2b..46477a64bd 100644
--- a/sysdeps/aarch64/tst-sme-helper.h
+++ b/sysdeps/aarch64/tst-sme-helper.h
@@ -95,3 +95,35 @@ set_tpidr2 (struct blk *blk)
     ".inst 0xd51bd0a0 /* msr tpidr2_el0, x0 */\n"
     :: "r"(x0) : "memory");
 }
+
+/* Check if SME state is disabled (when CLEAR is true) or
+   enabled (when CLEAR is false).  */
+static void __attribute__ ((unused))
+check_sme_za_state (const char msg[], bool clear)
+{
+  unsigned long svcr = get_svcr ();
+  void *tpidr2 = get_tpidr2 ();
+  printf ("[%s]\n", msg);
+  printf ("svcr = %016lx\n", svcr);
+  printf ("tpidr2 = %016lx\n", (unsigned long)tpidr2);
+  if (clear)
+    {
+      TEST_VERIFY (svcr == 0);
+      TEST_VERIFY (tpidr2 == NULL);
+    }
+  else
+    {
+      TEST_VERIFY (svcr != 0);
+      TEST_VERIFY (tpidr2 != NULL);
+    }
+}
+
+static uint8_t *sme_state;
+
+static void __attribute__ ((unused))
+enable_sme_za_state (struct blk *ptr)
+{
+  set_tpidr2 (ptr);
+  start_za ();
+  load_za (sme_state);
+}
diff --git a/sysdeps/aarch64/tst-sme-za-state.c b/sysdeps/aarch64/tst-sme-za-state.c
index 63f6eebeb4..2ed1a6b70b 100644
--- a/sysdeps/aarch64/tst-sme-za-state.c
+++ b/sysdeps/aarch64/tst-sme-za-state.c
@@ -28,36 +28,6 @@
 
 #include "tst-sme-helper.h"
 
-static uint8_t *state;
-
-static void
-enable_sme_za_state (struct blk *ptr)
-{
-  set_tpidr2 (ptr);
-  start_za ();
-  load_za (state);
-}
-
-static void
-check_sme_za_state (const char msg[], bool clear)
-{
-  unsigned long svcr = get_svcr ();
-  void *tpidr2 = get_tpidr2 ();
-  printf ("[%s]\n", msg);
-  printf ("svcr = %016lx\n", svcr);
-  printf ("tpidr2 = %016lx\n", (unsigned long)tpidr2);
-  if (clear)
-    {
-      TEST_VERIFY (svcr == 0);
-      TEST_VERIFY (tpidr2 == NULL);
-    }
-  else
-    {
-      TEST_VERIFY (svcr != 0);
-      TEST_VERIFY (tpidr2 != NULL);
-    }
-}
-
 static void
 run (struct blk *ptr)
 {
@@ -102,17 +72,17 @@ do_test (void)
   TEST_VERIFY_EXIT (!(svl < 16 || svl % 16 != 0 || svl >= (1 << 16)));
 
   /* Initialise buffer for ZA state of SME.  */
-  state = xmalloc (svl * svl);
-  memset (state, 1, svl * svl);
+  sme_state = xmalloc (svl * svl);
+  memset (sme_state, 1, svl * svl);
   struct blk blk = {
-    .za_save_buffer = state,
+    .za_save_buffer = sme_state,
     .num_za_save_slices = svl,
     .__reserved = {0},
   };
 
   run (&blk);
 
-  free (state);
+  free (sme_state);
   return 0;
 }
 
diff --git a/sysdeps/unix/sysv/linux/aarch64/clone.S b/sysdeps/unix/sysv/linux/aarch64/clone.S
index 40015c6933..66812e7efd 100644
--- a/sysdeps/unix/sysv/linux/aarch64/clone.S
+++ b/sysdeps/unix/sysv/linux/aarch64/clone.S
@@ -45,6 +45,17 @@ ENTRY(__clone)
 	and	x1, x1, -16
 	cbz	x1, .Lsyscall_error
 
+	/* Clear ZA state of SME.  */
+	/* The calling convention of __libc_arm_za_disable allows us to do
+	   this, which avoids saving to and reading from the stack.
+	   As a result we also don't need to sign the return address and
+	   check it after returning because it is not stored to the stack.  */
+	mov	x13, x30
+	cfi_register (x30, x13)
+	bl	__libc_arm_za_disable
+	mov	x30, x13
+	cfi_register (x13, x30)
+
 	/* Do the system call.  */
 	/* X0:flags, x1:newsp, x2:parenttidptr, x3:newtls, x4:childtid.  */
 	mov	x0, x2			/* flags  */
diff --git a/sysdeps/unix/sysv/linux/aarch64/clone3.S b/sysdeps/unix/sysv/linux/aarch64/clone3.S
index c9ca845ef2..b2e0d10eb2 100644
--- a/sysdeps/unix/sysv/linux/aarch64/clone3.S
+++ b/sysdeps/unix/sysv/linux/aarch64/clone3.S
@@ -46,6 +46,17 @@ ENTRY(__clone3)
 	cbz	x10, .Lsyscall_error	/* No NULL cl_args pointer.  */
 	cbz	x2, .Lsyscall_error	/* No NULL function pointer.  */
 
+	/* Clear ZA state of SME.  */
+	/* The calling convention of __libc_arm_za_disable allows us to do
+	   this, which avoids saving to and reading from the stack.
+	   As a result we also don't need to sign the return address and
+	   check it after returning because it is not stored to the stack.  */
+	mov	x13, x30
+	cfi_register (x30, x13)
+	bl	__libc_arm_za_disable
+	mov	x30, x13
+	cfi_register (x13, x30)
+
 	/* Do the system call, the kernel expects:
 	   x8: system call number
 	   x0: cl_args