From: daichengrong@iscas.ac.cn
Date: Fri, 17 Jan 2025 16:01:16 +0800 (GMT+08:00)
To: "Andrew Waterman"
Cc: "Palmer Dabbelt", libc-alpha
Subject: [PATCH v2] RISC-V: add multiarch RVV support for memcpy using FMV IFUNC
Message-ID: <59438991.41f84.1947347d8ee.Coremail.daichengrong@iscas.ac.cn>
References: <7fa4ac68.3f903.1946e72cef3.Coremail.daichengrong@iscas.ac.cn>
X-Patchwork-Id: 104967

> On Thu, Jan 16, 2025 at 1:31 AM 戴成荣 wrote:
> >
> > With RISC-V GCC function multi-versioning IFUNC support, glibc can use
> > extensions such as Vector on newer RISC-V CPUs to reimplement or speed
> > up functions like memcpy.
> > +
> > +ENTRY (__memcpy_vector)
> > +	beq a2, zero, L(ret)
>
> This branch should be deleted; the strip-mine loop below does the
> correct thing for a2=0.  Including the branch is an anti-optimization:
> it speeds up the uncommon case of size-0 memcpy at the expense of the
> common case.
>
Thanks!
This branch has been deleted.

> >
> > +	mv a6, a0
> > +L(loop):
> > +	vsetvli a3,a2,e8,m8,ta,mu
> > +	vle8.v v8,(a1)
> > +	vse8.v v8,(a6)
> > +	add a1,a1,a3
> > +	sub a2,a2,a3
> > +	add a6,a6,a3
> > +	bnez a2,L(loop)
> > +L(ret):
> > +	ret
> > +END (__memcpy_vector)

Signed-off-by: daichengrong
---
 sysdeps/riscv/multiarch/memcpy_vector.S       | 35 +++++++++++++++++++
 .../unix/sysv/linux/riscv/multiarch/Makefile  |  2 ++
 .../unix/sysv/linux/riscv/multiarch/memcpy.c  |  5 +++
 3 files changed, 42 insertions(+)
 create mode 100644 sysdeps/riscv/multiarch/memcpy_vector.S

diff --git a/sysdeps/riscv/multiarch/memcpy_vector.S b/sysdeps/riscv/multiarch/memcpy_vector.S
new file mode 100644
index 0000000000..8fddab8432
--- /dev/null
+++ b/sysdeps/riscv/multiarch/memcpy_vector.S
@@ -0,0 +1,35 @@
+/* memcpy for RISC-V Vector.
+   Copyright (C) 2024-2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+
+#include
+#include
+
+ENTRY (__memcpy_vector)
+	mv a6, a0
+L(loop):
+	vsetvli a3,a2,e8,m8,ta,mu
+	vle8.v v8,(a1)
+	vse8.v v8,(a6)
+	add a1,a1,a3
+	sub a2,a2,a3
+	add a6,a6,a3
+	bnez a2,L(loop)
+L(ret):
+	ret
+END (__memcpy_vector)
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
index fcef5659d4..a8b6c22af1 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/Makefile
@@ -3,7 +3,9 @@ sysdep_routines += \
   memcpy \
   memcpy-generic \
   memcpy_noalignment \
+  memcpy_vector \
   # sysdep_routines
 
 CFLAGS-memcpy_noalignment.c += -mno-strict-align
+ASFLAGS-memcpy_vector.S += -march=rv64gcv
 endif
diff --git a/sysdeps/unix/sysv/linux/riscv/multiarch/memcpy.c b/sysdeps/unix/sysv/linux/riscv/multiarch/memcpy.c
index 8544f5402a..c9879762f6 100644
--- a/sysdeps/unix/sysv/linux/riscv/multiarch/memcpy.c
+++ b/sysdeps/unix/sysv/linux/riscv/multiarch/memcpy.c
@@ -32,11 +32,16 @@ extern __typeof (__redirect_memcpy) __libc_memcpy;
 
 extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden;
 extern __typeof (__redirect_memcpy) __memcpy_noalignment attribute_hidden;
+extern __typeof (__redirect_memcpy) __memcpy_vector attribute_hidden;
 
 static inline __typeof (__redirect_memcpy) *
 select_memcpy_ifunc (uint64_t dl_hwcap, __riscv_hwprobe_t hwprobe_func)
 {
   unsigned long long int v;
+  if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_IMA_EXT_0, &v) == 0
+      && (v & RISCV_HWPROBE_IMA_V) == RISCV_HWPROBE_IMA_V)
+    return __memcpy_vector;
+
   if (__riscv_hwprobe_one (hwprobe_func, RISCV_HWPROBE_KEY_CPUPERF_0, &v) == 0
       && (v & RISCV_HWPROBE_MISALIGNED_MASK) == RISCV_HWPROBE_MISALIGNED_FAST)
     return __memcpy_noalignment;
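
For reference, here is a rough scalar C model of the strip-mined loop in
__memcpy_vector, to illustrate why no size-0 branch is needed.  It is only a
sketch and not part of the patch: MODEL_VLMAX, model_vsetvli and
memcpy_stripmine_model are hypothetical names, the 128-byte limit is an
arbitrary stand-in for the hardware VLMAX at e8/m8, and vl is simplified to
min (remaining, VLMAX), although real hardware may pick a smaller vl when the
remaining length is between VLMAX and 2*VLMAX.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for VLMAX at e8/m8; the real value depends on VLEN.  */
#define MODEL_VLMAX 128

/* Simplified model of "vsetvli a3,a2,e8,m8,ta,mu": vl = min (avl, VLMAX).  */
static size_t
model_vsetvli (size_t avl)
{
  return avl < MODEL_VLMAX ? avl : MODEL_VLMAX;
}

/* Scalar model of the loop body; the comments name the instruction each
   statement stands for.  */
static void *
memcpy_stripmine_model (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;               /* mv a6, a0 */
  const unsigned char *s = src;

  do
    {
      size_t vl = model_vsetvli (n);    /* vsetvli a3,a2,... */
      assert (vl <= n);                 /* never copies past the request */
      for (size_t i = 0; i < vl; i++)   /* vle8.v v8,(a1); vse8.v v8,(a6) */
        d[i] = s[i];
      s += vl;                          /* add a1,a1,a3 */
      n -= vl;                          /* sub a2,a2,a3 */
      d += vl;                          /* add a6,a6,a3 */
    }
  while (n != 0);                       /* bnez a2,L(loop) */

  return dst;                           /* a0 is left untouched */
}
```

With n == 0 the first pass gets vl == 0, so the copy touches no bytes, the
pointers do not move, and the loop exits at once.  That is the behaviour the
removed beq was guarding against, at the cost of an extra branch on every
non-empty copy.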