From: Eric Wong
To: libc-alpha@sourceware.org
Date: Mon, 1 Apr 2024 19:19:25 +0000
Subject: status of dj/malloc branch?
Message-ID: <20240401191925.M515362@dcvr>
X-Patchwork-Id: 87895

I'm interested in the tracing features described at
https://sourceware.org/glibc/wiki/MallocTracing to test and validate
memory fragmentation avoidance in a long-lived single-threaded
Perl C10K HTTP/IMAP/NNTP/POP3 daemon.

The branch appears to have been stalled for years, however, and the
current glibc malloc doesn't have the trace + replay features.

I'm currently dogfooding the below patch on an old glibc
(Debian oldstable :x) on my "production" home server.

My theory is the jemalloc idea of having fewer possible sizes is
good for avoiding fragmentation in long-lived processes.

This is because sizes for string processing are highly variable
and lifetimes are mixed for event-driven C10K servers where some
clients live only for a single request and others for many.

Clients end up sharing allocations due to caching and
deduplication, so a short-lived client can end up allocating
something that lives a long time.
Perl does lazy loading and internal caching+memoization all over
the place, too.

The downside is 0-20% waste in initial fits, but I expect it to
get better fits over time...

Not a serious patch against Debian glibc 2.31-13+deb11u8:

diff --git a/malloc/malloc.c b/malloc/malloc.c
index f7cd29bc..6e0b066d 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -3018,6 +3018,31 @@ tcache_thread_shutdown (void)

 #endif /* !USE_TCACHE */

+static inline size_t
+size_class_pad (size_t bytes)
+{
+  if (bytes <= MAX_FAST_SIZE || bytes >= DEFAULT_MMAP_THRESHOLD_MAX)
+    return bytes;
+  /*
+   * Use jemalloc-inspired size classes for mid-size allocations to
+   * minimize fragmentation.  This means we pay a 0-20% overhead on
+   * the initial allocations to improve the likelihood of reuse.
+   */
+  size_t max = sizeof(void *) << 4;
+  size_t nxt;
+
+  do {
+    if (bytes <= max) {
+      size_t sc_bytes = ALIGN_UP (bytes, max >> 3);
+
+      return sc_bytes <= DEFAULT_MMAP_THRESHOLD_MAX ? sc_bytes : bytes;
+    }
+    nxt = max << 1;
+  } while (nxt > max && nxt < DEFAULT_MMAP_THRESHOLD_MAX && (max = nxt));
+
+  return bytes;
+}
+
 void *
 __libc_malloc (size_t bytes)
 {
@@ -3031,6 +3056,7 @@ __libc_malloc (size_t bytes)
     = atomic_forced_read (__malloc_hook);
   if (__builtin_expect (hook != NULL, 0))
     return (*hook)(bytes, RETURN_ADDRESS (0));
+  bytes = size_class_pad (bytes);
 #if USE_TCACHE
   /* int_free also calls request2size, be careful to not pad twice.  */
   size_t tbytes;
@@ -3150,6 +3176,8 @@ __libc_realloc (void *oldmem, size_t bytes)
   if (oldmem == 0)
     return __libc_malloc (bytes);

+  bytes = size_class_pad (bytes);
+
   /* chunk corresponding to oldmem */
   const mchunkptr oldp = mem2chunk (oldmem);
   /* its size */
@@ -3391,6 +3419,7 @@ __libc_calloc (size_t n, size_t elem_size)
       return memset (mem, 0, sz);
     }

+  sz = size_class_pad (sz);
   MAYBE_INIT_TCACHE ();

   if (SINGLE_THREAD_P)