From patchwork Wed Mar 4 11:27:40 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Trevor Gross
X-Patchwork-Id: 131072
Date: Wed, 04 Mar 2026 06:27:40 -0500
Cc: "Joseph Myers", "Pengfei Wang", "Jacob Lifshay", "Folkert de Vries", "Trevor Gross"
Subject: [RFC PATCH] Adjust the _Float16 ABI to return in a GPR
From: "Trevor Gross"
To: "IA32 System V Application Binary Interface", "H.J. Lu"
List-Id: Libc-alpha mailing list

Hello all,

I am interested in revisiting the return ABI of _Float16 on i386.
Currently it is returned in xmm0, meaning SSE is required for the type.
This is rather inconvenient when _Float16 is otherwise quite well
supported. Compilers need to pick between hacking together a custom ABI
that works on the baseline, or passing the burden on to users to gate
everything.

Is there any interest in adjusting the specification such that _Float16
is returned in a GPR rather than in an SSE register?

This was brought up before in the thread at [1], with the concern about
efficient 16-bit moves between GPRs or memory and XMM. This doesn't seem
to be relevant, however, given there isn't any reason to have a _Float16
in XMM unless F16C is available, implying SSE2 and SSE4.1 for PINSRW and
PEXTRW to/from memory (unless I am missing something?).

A sample patch to the psABI is below. Needless to say, there are
compatibility concerns that come with such a change, but given that
workarounds already exist (e.g. in LLVM), it seems worth considering
whether something should be codified to make this simpler for everyone.

Best regards,
Trevor

[1]: https://inbox.sourceware.org/gcc-patches/20210701210537.51272-1-hjl.tools@gmail.com/

(some CCs added from the linked discussion)

--- patch follows ---

From 1af72db89f9a10b93569fa0b9f64f65f2dd73334 Mon Sep 17 00:00:00 2001
From: Trevor Gross
Date: Fri, 23 Jan 2026 21:11:43 +0000
Subject: [PATCH] Return _Float16 and _Complex _Float16 in GPRs

Currently the ABI specifies that _Float16 is to be passed on the stack
and returned in xmm0, meaning SSE is required to support the type.
Adjust both _Float16 and _Complex _Float16 to return in eax, dropping
the SSE requirement.

This has the benefit of making _Float16 ABI-compatible with `short`.
---
 low-level-sys-info.tex | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

-- 
2.50.1 (Apple Git-155)

diff --git a/low-level-sys-info.tex b/low-level-sys-info.tex
index 0015c8c..a2d8d6d 100644
--- a/low-level-sys-info.tex
+++ b/low-level-sys-info.tex
@@ -384,8 +384,7 @@ of some 64bit return types & No \\
 \ESI & callee-saved register & yes \\
 \EDI & callee-saved register & yes \\
 \reg{xmm0} & scratch register; also used to pass the first \code{__m128}
-  parameter and return \code{__m128}, \code{_Float16},
-  \code{_Complex _Float16} & No \\
+  parameter and return \code{__m128} & No \\
 \reg{ymm0} & scratch register; also used to pass the first \code{__m256}
   parameter and return \code{__m256} & No \\
 \reg{zmm0} & scratch register; also used to pass the first \code{__m512}
@@ -472,7 +471,11 @@ and \texttt{unions}) are always returned in memory.
 & \texttt{\textit{any-type} *} & \EAX \\
 & \texttt{\textit{any-type} (*)()} & \\
 \hline
- & \texttt{_Float16} & \reg{xmm0} \\
+ & \texttt{_Float16} & \reg{ax} \\
+ & & The upper 16 bits of \EAX are undefined.
+      The caller must not \\
+ & & rely on these being set in a predefined
+      way by the called function. \\
 \cline{2-3}
 & \texttt{float} & \reg{st0} \\
 \cline{2-3}
@@ -484,7 +487,7 @@ and \texttt{unions}) are always returned in memory.
 \cline{2-3}
 & \texttt{__float128} & memory \\
 \hline
- & \texttt{_Complex _Float16} & \reg{xmm0} \\
+ & \texttt{_Complex _Float16} & \reg{eax} \\
 & & The real part is returned in bits 0..15. The imaginary part is returned \\
 & & in bits 16..31.\\