From patchwork Thu Mar 13 08:43:05 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: daichengrong
X-Patchwork-Id: 107818
From: daichengrong@iscas.ac.cn
To: libc-alpha@sourceware.org
Subject: [PATCH] RISC-V: add RVV support for memrchr using IFUNC
Date: Thu, 13 Mar 2025 16:43:05 +0800
Message-Id: <20250313084305.3494161-1-daichengrong@iscas.ac.cn>
X-Mailer: git-send-email 2.25.1
List-Id: Libc-alpha mailing list

From: daichengrong

This patch adds a configure check for assembler support for RVV
instructions and introduces an RVV implementation of memrchr, selected
through IFUNC. The RVV variant is chosen at run time via dl_hwcap.

On BPI_F3, bench-memrchr from the glibc benchtests shows an average
improvement of 114%. On K230, the average speedup is 99%.
---
 config.h.in                                   |  3 +
 sysdeps/riscv/configure                       | 35 +++++++
 sysdeps/riscv/configure.ac                    | 25 +++++
 sysdeps/riscv/multiarch/memrchr_generic.c     | 35 +++++++
 sysdeps/riscv/multiarch/memrchr_rvv.S         | 96 +++++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  8 ++
 .../linux/riscv/multiarch/ifunc-impl-list.c   | 17 ++++
 .../unix/sysv/linux/riscv/multiarch/memrchr.c | 70 ++++++++++++++
 8 files changed, 289 insertions(+)
 create mode 100644 sysdeps/riscv/multiarch/memrchr_generic.c
 create mode 100644 sysdeps/riscv/multiarch/memrchr_rvv.S
 create mode 100644 sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c

diff --git a/config.h.in b/config.h.in
index cdbd555366..7802e8f9c4 100644
--- a/config.h.in
+++ b/config.h.in
@@ -139,6 +139,9 @@
 /* RISC-V floating-point ABI for ld.so. */
 #undef RISCV_ABI_FLEN
 
+/* Define if assembler supports vector instructions on RISC-V. */
+#undef HAVE_RISCV_ASM_VECTOR_SUPPORT
+
 /* LOONGARCH integer ABI for ld.so. */
 #undef LOONGARCH_ABI_GRLEN
 
diff --git a/sysdeps/riscv/configure b/sysdeps/riscv/configure
index 3ae4ae3bdb..bbda6a0d4a 100644
--- a/sysdeps/riscv/configure
+++ b/sysdeps/riscv/configure
@@ -83,3 +83,38 @@ if test "$libc_cv_static_pie_on_riscv" = yes; then
 
 fi
 
+# Check if assembler supports attribute riscv vector macro.
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for gcc attribute riscv vector support" >&5
+printf %s "checking for gcc attribute riscv vector support... " >&6; }
+if test ${libc_cv_gcc_rvv+y}
+then :
+  printf %s "(cached) " >&6
+else case e in #(
+  e) cat > conftest.S <&5 \
+  2>&5 ; then
+  libc_cv_gcc_rvv=yes
+fi
+rm -f conftest* ;;
+esac
+fi
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_cv_gcc_rvv" >&5
+printf "%s\n" "$libc_cv_gcc_rvv" >&6; }
+
+if test x"$libc_cv_gcc_rvv" = xyes; then
+  printf "%s\n" "#define HAVE_RISCV_ASM_VECTOR_SUPPORT 1" >>confdefs.h
+
+fi
+
+config_vars="$config_vars
+have-gcc-riscv-rvv = $libc_cv_gcc_rvv"
+
+
diff --git a/sysdeps/riscv/configure.ac b/sysdeps/riscv/configure.ac
index ee3d1ed014..27e0e51b1c 100644
--- a/sysdeps/riscv/configure.ac
+++ b/sysdeps/riscv/configure.ac
@@ -43,3 +43,28 @@ EOF
 if test "$libc_cv_static_pie_on_riscv" = yes; then
   AC_DEFINE(SUPPORT_STATIC_PIE)
 fi
+
+# Check if assembler supports attribute riscv vector macro.
+AC_CACHE_CHECK([for gcc attribute riscv vector support],
+               libc_cv_gcc_rvv, [dnl
+cat > conftest.S <&AS_MESSAGE_LOG_FD \
+  2>&AS_MESSAGE_LOG_FD ; then
+  libc_cv_gcc_rvv=yes
+fi
+rm -f conftest*])
+
+if test x"$libc_cv_gcc_rvv" = xyes; then
+  AC_DEFINE(HAVE_RISCV_ASM_VECTOR_SUPPORT)
+fi
+
+LIBC_CONFIG_VAR([have-gcc-riscv-rvv], [$libc_cv_gcc_rvv])
+
diff --git a/sysdeps/riscv/multiarch/memrchr_generic.c b/sysdeps/riscv/multiarch/memrchr_generic.c
new file mode 100644
index 0000000000..c0a146eb62
--- /dev/null
+++ b/sysdeps/riscv/multiarch/memrchr_generic.c
@@ -0,0 +1,35 @@
+/* Re-include the default memrchr implementation.
+   Copyright (C) 2018-2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#include
+
+#include
+#if IS_IN (libc)
+
+# define MEMRCHR __memrchr_generic
+
+/* Do not hide the generic version of memrchr, we use it internally. */
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(name)
+
+# undef weak_alias
+# define weak_alias(a, b)
+
+#endif
+
+#include
diff --git a/sysdeps/riscv/multiarch/memrchr_rvv.S b/sysdeps/riscv/multiarch/memrchr_rvv.S
new file mode 100644
index 0000000000..6fa801023f
--- /dev/null
+++ b/sysdeps/riscv/multiarch/memrchr_rvv.S
@@ -0,0 +1,96 @@
+/* Optimized memrchr implementation using RVV.
+   Copyright (C) 2018-2025 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library. If not, see
+   <https://www.gnu.org/licenses/>. */
+
+#include
+
+#define ELEM_LMUL_1_SETTING m1
+#define ELEM_LMUL_8_SETTING m8
+#define ELEM_SEW_8_SETTING e8
+
+#define v_seq_set_data v0
+#define v_loaded_data v8
+#define v_reverse_loaded_data v16
+#define v_index v16
+#define v_reverse_index v24
+
+#define srcin a0
+#define chrin a1
+#define cntin a2
+#define result a0
+
+#define tmp_end t1
+#define loaded_data_start t1
+
+#define VL a4
+
+#define cntrem t2
+#define set_num t3
+#define first_set_index t4
+#define loaded_data_max_index t5
+#define one t6
+
+ENTRY (__memrchr_rvv)
+.option push
+.option arch, +v
+
+    mv      cntrem, cntin
+    add     tmp_end, srcin, cntrem
+
+L(memrchr_loop):
+    blez    cntrem, L(memrchr_nomatch)
+    vsetvli VL, cntrem, ELEM_SEW_8_SETTING, ELEM_LMUL_8_SETTING, ta, ma
+    sub     loaded_data_start, tmp_end, VL
+    vle8.v  v_loaded_data, (loaded_data_start)
+    sub     cntrem, cntrem, VL
+    /* Set v0[i] where v8[i] = a1 */
+    vmseq.vx v_seq_set_data, v_loaded_data, chrin
+    /* count the number of equal elements */
+    vcpop.m set_num, v_seq_set_data
+    beqz    set_num, L(memrchr_loop)
+
+L(memrchr_found):
+    li      one, 1
+    bgt     set_num, one, L(memrchr_multi_found)
+    /* get the first equal element index */
+    vfirst.m first_set_index, v_seq_set_data
+    add     result, loaded_data_start, first_set_index
+    ret
+
+L(memrchr_multi_found):
+    /* index [0, 1, 2, 3, ...]*/
+    vid.v   v_index
+    addi    loaded_data_max_index, VL, -1
+    /* index [VL-1, VL-2, ..., 0] */
+    vrsub.vx v_reverse_index, v_index, loaded_data_max_index
+    /* reverse loaded data */
+    vrgather.vv v_reverse_loaded_data, v_loaded_data, v_reverse_index
+    /* Set v0[i] where v16[i] = a1 (v16 holds the reversed data) */
+    vmseq.vx v_seq_set_data, v_reverse_loaded_data, chrin
+    /* get the first equal element index of reverse data*/
+    vfirst.m first_set_index, v_seq_set_data
+    /* calc the true index of data*/
+    sub     first_set_index, loaded_data_max_index, first_set_index
+    add     result, loaded_data_start, first_set_index
+    ret
+
+L(memrchr_nomatch):
+    mv      result, zero
+    ret
+.option pop
+END (__memrchr_rvv)
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index fcef5659d4..64b20d7074 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -1,9 +1,17 @@
 ifeq ($(subdir),string)
 sysdep_routines += \
+  memrchr \
+  memrchr_generic \
   memcpy \
   memcpy-generic \
   memcpy_noalignment \
   # sysdep_routines
 
+ifeq ($(have-gcc-riscv-rvv),yes)
+sysdep_routines += \
+  memrchr_rvv \
+  # rvv sysdep_routines
+endif
+
 CFLAGS-memcpy_noalignment.c += -mno-strict-align
 endif
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
index 1c1deca8f6..deb787a116 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/ifunc-impl-list.c
@@ -19,6 +19,8 @@
 #include
 #include
 #include
+#include
+#include
 
 size_t
 __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
@@ -27,6 +29,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
   size_t i = max;
 
   bool fast_unaligned = false;
+#if defined(HAVE_RISCV_ASM_VECTOR_SUPPORT)
+  bool rvv_ext = false;
+#endif
 
   struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };
   if (__riscv_hwprobe (&pair, 1, 0, NULL, 0) == 0
@@ -34,6 +39,18 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
       == RISCV_HWPROBE_MISALIGNED_FAST)
     fast_unaligned = true;
 
+#if defined(HAVE_RISCV_ASM_VECTOR_SUPPORT)
+  if (GLRO(dl_hwcap) & COMPAT_HWCAP_ISA_V)
+    rvv_ext = true;
+#endif
+
+IFUNC_IMPL (i, name, memrchr,
+  #if defined(HAVE_RISCV_ASM_VECTOR_SUPPORT)
+              IFUNC_IMPL_ADD (array, i, memrchr, rvv_ext,
+                              __memrchr_rvv)
+  #endif
+              IFUNC_IMPL_ADD (array, i, memrchr, 1, __memrchr_generic))
+
   IFUNC_IMPL (i, name, memcpy,
               IFUNC_IMPL_ADD (array, i, memcpy, fast_unaligned,
                               __memcpy_noalignment)
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c
new file mode 100644
index 0000000000..3f3a33deab
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memrchr.c
@@ -0,0 +1,70 @@
+/* Multiple versions of memrchr.
+   Copyright (C) 2018-2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+/* Define multiple versions only for the definition in libc. */
+
+#if IS_IN (libc)
+# undef memrchr
+# define memrchr __redirect_memrchr
+# define __memrchr __redirect___memrchr
+
+# include
+# include
+# include
+# include
+# include
+# include
+
+extern __typeof (__redirect_memrchr) ___memrchr;
+
+extern __typeof (__redirect_memrchr) __memrchr_generic attribute_hidden;
+extern __typeof (__redirect_memrchr) __memrchr_rvv attribute_hidden;
+static inline __typeof (__redirect_memrchr) *
+select_memrchr_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
+{
+
+#if defined(HAVE_RISCV_ASM_VECTOR_SUPPORT)
+  if (dl_hwcap & COMPAT_HWCAP_ISA_V)
+    {
+      return __memrchr_rvv;
+    }
+#endif
+
+
+  return __memrchr_generic;
+}
+
+riscv_libc_ifunc (___memrchr, select_memrchr_ifunc);
+
+
+# undef memrchr
+# undef __memrchr
+strong_alias (___memrchr, memrchr);
+strong_alias (___memrchr, __memrchr);
+
+# ifdef SHARED
+__hidden_ver1 (memrchr, __GI_memrchr, __redirect_memrchr)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memrchr);
+
+__hidden_ver1 (memrchr, __GI___memrchr, __redirect___memrchr)
+  __attribute__ ((visibility ("hidden"))) __attribute_copy__ (memrchr);
+# endif
+
+#else
+# include
+#endif
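
Reviewer note (not part of the patch): the loop in __memrchr_rvv walks the
buffer backwards in chunks of VL bytes and resolves the last match inside
the first chunk that contains one. A minimal scalar sketch of that control
flow follows, for reading alongside the assembly. The names memrchr_model
and CHUNK are hypothetical; CHUNK only stands in for the vector length that
vsetvli would return, and the inner byte loop replaces the vmseq.vx,
vcpop.m and vfirst.m sequence.

```c
#include <stddef.h>

/* Hypothetical stand-in for the hardware vector length (VL).  */
#define CHUNK 16

/* Scalar model of the chunked backward scan: returns a pointer to the
   last byte equal to C in the first N bytes of S, or NULL.  */
static void *
memrchr_model (const void *s, int c, size_t n)
{
  const unsigned char *base = s;
  unsigned char ch = (unsigned char) c;
  size_t rem = n;                       /* plays the role of cntrem */

  while (rem > 0)
    {
      size_t vl = rem < CHUNK ? rem : CHUNK;
      const unsigned char *start = base + rem - vl;  /* loaded_data_start */
      size_t matches = 0, first = 0, last = 0;

      /* Compare every byte of the chunk and count the matches.  */
      for (size_t i = 0; i < vl; i++)
        if (start[i] == ch)
          {
            if (matches == 0)
              first = i;
            last = i;
            matches++;
          }

      rem -= vl;
      if (matches == 0)
        continue;                       /* no match: move to the previous chunk */

      /* One match: the first index is also the last one.  Several
         matches: take the highest index, which the assembly obtains
         by reversing the chunk and searching again.  */
      return (void *) (start + (matches == 1 ? first : last));
    }

  return NULL;                          /* memrchr_nomatch */
}
```

The model keeps the same single-match and multi-match split as the assembly
only to make the correspondence visible; in scalar code the `last` index
alone would be enough.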