From patchwork Sun Nov 9 07:29:01 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dev Jain
X-Patchwork-Id: 123824
From: Dev Jain
To: libc-alpha@sourceware.org
Cc: Wilco.Dijkstra@arm.com, cupertino.miranda@oracle.com,
 fweimer@redhat.com, adhemerval.zanella@linaro.org, dj@redhat.com, Dev Jain
Subject: [PATCH] malloc: Optimize the madvise behaviour on the main heap
Date: Sun, 9 Nov 2025 12:59:01 +0530
Message-Id: <20251109072901.13853-1-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)

Linux handles virtual memory in Virtual Memory Areas (VMAs). The
madvise(MADV_HUGEPAGE) call works at VMA granularity: it sets the
VM_HUGEPAGE flag on the VMA. Therefore, if we can guarantee that a VMA has
already been marked with VM_HUGEPAGE, we do not need to call madvise() on
that VMA again.

For mp_.thp_pagesize != 0, we currently align the new brk to the thp size.
This means that after the first extension, every such brk extension is
guaranteed to have a size >= thp size. madvise_thp() invokes the madvise()
syscall only if size >= thp size; its other condition depends on the sysctl
setting, and mp_.thp_mode stays the same throughout the lifetime of the
process.
Therefore, we currently invoke the madvise() syscall on the heap on each
extension, which is unnecessary.

First, pass the total heap size, instead of the extension size, to
madvise_thp: Linux does not care about the size passed when the madvise()
syscall is invoked with the MADV_HUGEPAGE flag, because the flag will be
set on the entire VMA, regardless of which portion of the VMA the syscall
is invoked on. This enables us to do the following: if the old heap size is
at least the thp size, we can guarantee that madvise() was invoked on one
of the previous extensions of the heap. So, avoid making the syscall in
this case.

The tricky part is computing the size of the heap, i.e. the current program
break minus the initial program break. In case the first ever attempt at
extending the break fails, mp_.sbrk_base will be set to an mmapped address.
Therefore, we need some other way of remembering the initial location of
the program break. We can reuse some code for this: MORECORE (0), when
invoked for the first time ever, will give us the initial program break.
---
The patch applies on 259adb087dd9. Built on AArch64, all malloc tests pass.

 malloc/malloc.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 0b21bdf1bd..277fb9e9ec 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -1938,6 +1938,12 @@ struct malloc_par
   /* First address handed out by MORECORE/sbrk.  */
   char *sbrk_base;
 
+  /* The initial location of program break. This will most likely be equal
+     to sbrk_base; in case the first ever extension attempt of brk fails,
+     sbrk_base will point to an mmapped address (see sysmalloc_mmap_fallback),
+     in which case these two values will not be equal. */
+  char *init_sbrk_base;
+
 #if USE_TCACHE
   /* Maximum number of small buckets to use.  */
   size_t tcache_small_bins;
@@ -2667,6 +2673,9 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
       if (__glibc_unlikely (mp_.thp_pagesize != 0))
         {
           uintptr_t lastbrk = (uintptr_t) MORECORE (0);
+          if (mp_.init_sbrk_base == NULL)
+            mp_.init_sbrk_base = (char *) lastbrk;
+
           uintptr_t top = ALIGN_UP (lastbrk + size, mp_.thp_pagesize);
           size = top - lastbrk;
         }
@@ -2682,8 +2691,18 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
       if ((ssize_t) size > 0)
         {
           brk = (char *) (MORECORE ((long) size));
-          if (brk != (char *) (MORECORE_FAILURE))
-            madvise_thp (brk, size);
+          if (brk != (char *) (MORECORE_FAILURE)) {
+            size_t old_size = (size_t) (brk - mp_.init_sbrk_base);
+
+            /*
+              If heap already marked with MADV_HUGEPAGE, skip madvise(). Note
+              that, we don't need to check mp_.init_sbrk_base != NULL; if it
+              is NULL, it implies that mp_.thp_pagesize == 0, in which case
+              madvise_thp() will not invoke madvise().
+            */
+            if (old_size < mp_.thp_pagesize)
+              madvise_thp (brk, old_size + size);
+          }
           LIBC_PROBE (memory_sbrk_more, 2, brk, size);
         }
 
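
For illustration only, the skip logic described in the commit message can
be modelled in user space without touching the real program break. The
sketch below is not glibc code: struct heap_model, model_madvise_thp,
model_extend, model_run and the 2 MiB THP_SIZE are hypothetical names and
values chosen for the example, and the modelled MORECORE is assumed to
always succeed. It only counts how often the madvise() path would be taken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for glibc internals; none of this is glibc code.  */
#define THP_SIZE ((uintptr_t) 2 * 1024 * 1024)  /* assumed 2 MiB THP size */

struct heap_model
{
  uintptr_t init_brk;      /* models mp_.init_sbrk_base */
  uintptr_t cur_brk;       /* models the current program break */
  unsigned madvise_calls;  /* how often the modelled syscall would run */
};

/* Models the size check in madvise_thp: only sizes >= THP size count.  */
static void
model_madvise_thp (struct heap_model *h, size_t size)
{
  if (size >= THP_SIZE)
    h->madvise_calls++;
}

/* Models one brk extension as the patched sysmalloc would perform it.  */
static void
model_extend (struct heap_model *h, size_t request)
{
  assert (request > 0);

  uintptr_t lastbrk = h->cur_brk;
  /* Align the new top up to the THP size, as the existing code does.  */
  uintptr_t top = (lastbrk + request + THP_SIZE - 1) & ~(THP_SIZE - 1);
  size_t size = (size_t) (top - lastbrk);
  size_t old_size = (size_t) (lastbrk - h->init_brk);

  h->cur_brk = top;  /* assumption: the modelled MORECORE never fails */

  /* Skip the call once an earlier extension has covered >= THP size.  */
  if (old_size < THP_SIZE)
    model_madvise_thp (h, old_size + size);
}

/* Runs n extensions of `request` bytes and returns the call count.  */
static unsigned
model_run (uintptr_t init_brk, size_t request, unsigned n)
{
  struct heap_model h = { init_brk, init_brk, 0 };
  for (unsigned i = 0; i < n; i++)
    model_extend (&h, request);
  return h.madvise_calls;
}
```

In this model, model_run (0x10001000u, 4096, 10) counts a single madvise
call: the first extension from an unaligned break is shorter than the THP
size, the second one crosses the threshold, and every later extension is
skipped because old_size is already >= THP_SIZE.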