From patchwork Fri Jun 10 07:47:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt <hongtao.liu@intel.com>
X-Patchwork-Id: 55003
Return-Path: <libc-alpha-bounces+patchwork=sourceware.org@sourceware.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 5C1E6383EC7B
	for <patchwork@sourceware.org>; Fri, 10 Jun 2022 07:47:33 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 5C1E6383EC7B
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org;
	s=default; t=1654847253;
	bh=QX86t9lJDWmwDyWRzImwBjvuWJGqEvc9zJQdixDWBrw=;
	h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post:
	 List-Help:List-Subscribe:From:Reply-To:Cc:From;
	b=yu/76kZDIjCcF6Y/gcckAeVTp8ZL/Z20cosIAtY8Buxz1wDUL1fHppucUXgpHWKhB
	 wkR5s05/YFKUWeNCAlbiStvPYRmGWIqFCmFJfraAMUCfLQcmOZoik19Logsr4iyzF8
	 QRgHUHnIvxosjUpfBd7iBrpNUjq5f3TaQgOw3jiQ=
X-Original-To: libc-alpha@sourceware.org
Delivered-To: libc-alpha@sourceware.org
Received: from mga12.intel.com (mga12.intel.com [192.55.52.136])
 by sourceware.org (Postfix) with ESMTPS id CDE1A383EC41
 for <libc-alpha@sourceware.org>; Fri, 10 Jun 2022 07:47:12 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CDE1A383EC41
X-IronPort-AV: E=McAfee;i="6400,9594,10373"; a="257381743"
X-IronPort-AV: E=Sophos;i="5.91,288,1647327600"; d="scan'208";a="257381743"
Received: from orsmga005.jf.intel.com ([10.7.209.41])
 by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 10 Jun 2022 00:47:11 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.91,288,1647327600"; d="scan'208";a="760406115"
Received: from scymds01.sc.intel.com ([10.148.94.138])
 by orsmga005.jf.intel.com with ESMTP; 10 Jun 2022 00:47:07 -0700
Received: from shliclel045.sh.intel.com (shliclel045.sh.intel.com
 [10.239.236.45]) by scymds01.sc.intel.com
 with ESMTP id 25A7l54W027885; Fri, 10 Jun 2022 00:47:05 -0700
To: x86-64-abi@googlegroups.com
Subject: [PATCH] Add optional __Bfloat16 support
Date: Fri, 10 Jun 2022 15:47:04 +0800
Message-Id: <20220610074704.7673-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.18.1
X-Spam-Status: No, score=-14.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 SPF_HELO_PASS, SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/libc-alpha>,
 <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/libc-alpha>,
 <mailto:libc-alpha-request@sourceware.org?subject=subscribe>
X-Patchwork-Original-From: liuhongt via Libc-alpha <libc-alpha@sourceware.org>
From: liuhongt <hongtao.liu@intel.com>
Reply-To: liuhongt <hongtao.liu@intel.com>
Cc: llvm-dev@lists.llvm.org, libc-alpha@sourceware.org,
 gcc-patches@gcc.gnu.org
Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org
Sender: "Libc-alpha"
 <libc-alpha-bounces+patchwork=sourceware.org@sourceware.org>

Pass and return __Bfloat16 values in XMM registers.

Background:
__Bfloat16 (BF16) is a new floating-point format that can accelerate machine learning algorithms, deep learning training in particular.
It was first introduced by the Intel AVX-512 extension called AVX-512_BF16. __Bfloat16 has 8 exponent bits and 7 mantissa bits, which makes it different from _Float16 (5 exponent bits and 10 mantissa bits).
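
For illustration only, the layout above (1 sign bit, 8 exponent bits, 7 mantissa bits) matches the upper 16 bits of an IEEE binary32 float, so a conversion can be sketched in plain C. The helper names below are hypothetical and are not part of this patch or of any compiler header. Round-to-nearest-even is an assumption made for the sketch; the psABI change itself does not specify a conversion.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not defined by the psABI. */

/* Narrow a float to a BF16 bit pattern (assumed: round-to-nearest-even). */
static uint16_t float_to_bf16_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);               /* read the binary32 bit pattern */

    if ((u & 0x7fffffffu) > 0x7f800000u)    /* NaN: keep it a quiet NaN */
        return (uint16_t)((u >> 16) | 0x0040u);

    u += 0x7fffu + ((u >> 16) & 1u);        /* round to nearest, ties to even */
    return (uint16_t)(u >> 16);             /* keep sign, exponent, 7 mantissa bits */
}

/* Widen a BF16 bit pattern back to float; this direction is exact. */
static float bf16_bits_to_float(uint16_t h)
{
    uint32_t u = (uint32_t)h << 16;         /* low 16 mantissa bits become zero */
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

With these helpers, 1.0f maps to the bit pattern 0x3f80, and 0x4049 widens to 3.140625, the nearest BF16 value to pi.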

Motivation:
Currently __bfloat16 is a typedef of short. As a result, the compiler raises no diagnostic when such a value is added, subtracted, multiplied or divided, even though the result of the calculation is meaningless.
To solve this problem, a real scalar type __Bfloat16 needs to be introduced. It is mainly used with intrinsics and is not available for the standard C operators. __Bfloat16 will also be used for data movement such as parameter passing, loads and stores, vector initialization, vector shuffles, etc. This creates the need for a corresponding psABI.
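
A minimal sketch of the problem described above, using a hypothetical stand-in typedef rather than the real header definition: because the type is just short, the addition below compiles without any diagnostic and operates on the raw bit patterns.

```c
#include <stdint.h>

/* Hypothetical stand-in for a BF16 type defined as a typedef of short. */
typedef short bf16_as_short;

/* Accepted silently by the compiler: this is an integer add on the
   encodings, not a floating-point add on the encoded values. */
static bf16_as_short add_as_integers(bf16_as_short a, bf16_as_short b)
{
    return (bf16_as_short)(a + b);
}
```

For example, 0x3f80 encodes 1.0 in BF16. add_as_integers(0x3f80, 0x3f80) yields 0x7f00, which decodes to 2^127 rather than to 2.0 (0x4000).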
---
 x86-64-ABI/low-level-sys-info.tex | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/x86-64-ABI/low-level-sys-info.tex b/x86-64-ABI/low-level-sys-info.tex
index a8b69db..ba8db0d 100644
--- a/x86-64-ABI/low-level-sys-info.tex
+++ b/x86-64-ABI/low-level-sys-info.tex
@@ -302,6 +302,12 @@ be used to represent the type, is a family of integer types.
 This permits the use of these types in allocated arrays using the common
 sizeof(Array)/sizeof(ElementType) pattern.
 
+\subsubsection{Special Types}
+
+The \code{__Bfloat16} type uses an 8-bit exponent and a 7-bit mantissa.
+It is used for \code{BF16}-related intrinsics and cannot be
+used with standard C operators.
+
 \subsubsection{Aggregates and Unions}
 
 Structures and unions assume the alignment of their most strictly
@@ -563,8 +569,8 @@ The basic types are assigned their natural classes:
 \item Arguments of types (signed and unsigned) \code{_Bool}, \code{char},
   \code{short}, \code{int}, \code{long}, \code{long long}, and
   pointers are in the INTEGER class.
-\item Arguments of types \code{_Float16}, \code{float}, \code{double},
-  \code{_Decimal32},
+\item Arguments of types \code{_Float16}, \code{__Bfloat16}, \code{float},
+  \code{double}, \code{_Decimal32},
   \code{_Decimal64} and \code{__m64} are in class SSE.
 \item Arguments of types \code{__float128}, \code{_Decimal128}
   and \code{__m128} are split into two halves.  The least significant