From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: "H . J . Lu"
Subject: [PATCH] elf: Batch program-header reads in _dl_map_segments (oversight fix)
Date: Mon, 11 May 2026 15:55:46 -0300
Message-ID: <20260511185606.3750011-1-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0

The fix for BZ 26577 ("Fix stack overflow in _dl_map_object_from_fd with
large e_phnum") removed the alloca for the program-header table and
introduced a streaming iterator (dl_pt_load_iterator) so segments could
be walked without staging the entire table on the stack.
That patch batched reads correctly in _dl_map_object_scan_phdrs (the
first walk, which collects PT_DYNAMIC/PT_TLS/PT_GNU_* metadata), but
overlooked the second walk in _dl_map_segments:
_dl_pt_load_iterator_next issued one pread64 per program header to find
the next PT_LOAD entry. For an object with N program headers this added
N redundant per-phdr syscalls on every dlopen / loader startup --
regardless of whether the table had already been read by open_verify
into struct filebuf.

Unify both walks behind a single batched helper,
_dl_pt_load_iterator_phdr_at:

  - When the program header table fits in the bytes already read by
    open_verify into fbp->buf (the common case for nearly all shared
    objects), all phdr accesses are served from that buffer with no
    syscall at all.

  - Otherwise, up to FILEBUF_SIZE / sizeof(ElfW(Phdr)) program headers
    are read into fbp->buf with a single pread64; subsequent indices in
    the same window hit the buffer.

Both _dl_map_object_scan_phdrs and _dl_pt_load_iterator_next now go
through this helper, eliminating the separate batching logic in
_dl_map_object_scan_phdrs.

struct filebuf moves from dl-load.c to dl-load.h so the inline iterator
in dl-map-segments.h can reach fbp->buf.

The filebuf size is also bumped to ensure the cached fast path triggers
for all observed binaries. A survey of an Ubuntu 24.04 installation
(scanning /usr) gives:

  Candidate files       : 465834
  ELF files inspected   : 11624
  glibc-linked binaries : 10164
  Minimum e_phnum       : 5
  Maximum e_phnum       : 14
  Average e_phnum       : 11.37
  Median e_phnum        : 11.0

That is, e_phnum is capped at 14 (for instance gcc's cc1, lto1, perl,
and gdb). The previous FILEBUF_SIZE of 832 on 64-bit fit only 13
program headers after the ELF header (64 + 13*56 = 792, whereas
64 + 14*56 = 848 exceeds 832), so 64-bit binaries with 14 phdrs missed
the cached path. FILEBUF_SIZE is bumped from 512/832 to 640/1024
(32-bit / 64-bit) -- enough for at least 16 program headers on either
ABI, leaving headroom over the observed maximum.
For a typical shared library where open_verify's initial read covers the
program header table, this reduces _dl_map_segments from N preads to 0.
For a worst-case e_phnum that does not fit in fbp->buf, reads drop from N
to ceil(N / phdrs_per_buf) -- the same cost _dl_map_object_scan_phdrs
already pays.

No functional change. Tested on x86_64-linux-gnu, aarch64-linux-gnu,
and i686-linux-gnu.
---
 elf/dl-load.c | 285 +++++++++++++++++++++-----------------------------
 elf/dl-load.h | 100 ++++++++++++++----
 2 files changed, 201 insertions(+), 184 deletions(-)

diff --git a/elf/dl-load.c b/elf/dl-load.c index f6e391a4689..d20e49d526a 100644 --- a/elf/dl-load.c +++ b/elf/dl-load.c @@ -34,31 +34,6 @@ #include #include -/* Type for the buffer we put the ELF header and hopefully the program - header. This buffer does not really have to be too large. In most - cases the program header follows the ELF header directly. If this - is not the case all bets are off and we can make the header - arbitrarily large and still won't get it read. This means the only - question is how large are the ELF and program header combined. The - ELF header 32-bit files is 52 bytes long and in 64-bit files is 64 - bytes long. Each program header entry is again 32 and 56 bytes - long respectively. I.e., even with a file which has 10 program - header entries we only have to read 372B/624B respectively. Add to - this a bit of margin for program notes and reading 512B and 832B - for 32-bit and 64-bit files respectively is enough. If this - heuristic should really fail for some file the code in - `_dl_map_object_from_fd' knows how to recover. 
*/ -struct filebuf -{ - ssize_t len; -#if __WORDSIZE == 32 -# define FILEBUF_SIZE 512 -#else -# define FILEBUF_SIZE 832 -#endif - char buf[FILEBUF_SIZE] __attribute__ ((aligned (__alignof (ElfW(Ehdr))))); -}; - #include "dynamic-link.h" #include "get-dynamic-info.h" #include @@ -936,13 +911,18 @@ _dl_notify_new_object (int mode, Lmid_t nsid, struct link_map *l) } /* Initialize the PT_LOAD iterator IT for reading program headers from FD - at file offset PHOFF with PHNUM entries. Zeros all precomputed fields - so the caller's scan loop can fill them in. */ + at file offset PHOFF with PHNUM entries. FBP is used as scratch space + for batched program-header reads; if open_verify's initial read into + FBP->buf already covers the whole phdr table, the iterator runs + entirely from that buffer without any further pread. Zeros all + precomputed fields so the caller's scan loop can fill them in. */ static void -_dl_pt_load_iterator_init (struct dl_pt_load_iterator *it, - int fd, ElfW(Off) phoff, uint16_t phnum) +_dl_pt_load_iterator_init (struct dl_pt_load_iterator *it, int fd, + struct filebuf *fbp, ElfW(Off) phoff, + uint16_t phnum) { it->fd = fd; + it->fbp = fbp; it->phoff = phoff; it->phnum = phnum; it->idx = 0; @@ -952,161 +932,134 @@ _dl_pt_load_iterator_init (struct dl_pt_load_iterator *it, it->first_mapstart = 0; it->last_mapstart = 0; it->last_allocend = 0; + it->cached = (phoff + (ElfW(Off)) phnum * sizeof (ElfW(Phdr)) + <= (ElfW(Off)) fbp->len); + it->buf_base = 0; + it->buf_count = it->cached ? phnum : 0; } -/* Scan all program headers from IT->fd in chunks, using FBP->buf as a - scratch buffer. Fills in IT's precomputed PT_LOAD metadata and collects - segment attributes into L. Returns NULL on success, or an error message - string on failure; sets *ERRVALP to errno for I/O errors, 0 otherwise. 
*/ +/* Scan all program headers from IT->fd, using the iterator's filebuf as a + scratch buffer for batched reads (skipped entirely if open_verify + already read the whole table). Fills in IT's precomputed PT_LOAD + metadata and collects segment attributes into L. Returns NULL on + success, or an error message string on failure; sets *ERRVALP to errno + for I/O errors, 0 otherwise. */ static const char * _dl_map_object_scan_phdrs (struct dl_pt_load_iterator *it, - struct filebuf *fbp, struct link_map *l, int mode, + struct link_map *l, int mode, unsigned int *stack_flagsp, bool *has_holesp, bool *empty_dynamicp, int *errvalp) { ElfW(Addr) prev_mapend = 0; - const ElfW(Half) phdrs_per_buf = sizeof (fbp->buf) / sizeof (ElfW(Phdr)); - ElfW(Phdr) *chunk = (ElfW(Phdr) *) fbp->buf; struct dl_machine_phdr_info minfo; elf_machine_phdr_info_init (&minfo); - /* Fast path: if all program headers fit within the bytes already read - into fbp->buf by open_verify, iterate them directly without any - additional pread syscalls. The slow path falls through to pread - in chunks (which overwrites fbp->buf, but the caller has already - saved the ELF header to a local copy). 
*/ - const bool cached - = (it->phoff + (ElfW(Off)) it->phnum * sizeof (ElfW(Phdr)) - <= (ElfW(Off)) fbp->len); - - for (ElfW(Half) base = 0; base < it->phnum; ) + for (ElfW(Half) i = 0; i < it->phnum; i++) { - ElfW(Half) batch; - const ElfW(Phdr) *batch_ptr; - - if (__glibc_likely (cached)) + const ElfW(Phdr) *ph = _dl_pt_load_iterator_phdr_at (it, i); + if (__glibc_unlikely (ph == NULL)) { - batch = it->phnum; - batch_ptr = (const ElfW(Phdr) *) (fbp->buf + it->phoff); + *errvalp = errno; + return N_("cannot read file data"); } - else + elf_machine_phdr_collect (&minfo, ph); + switch (ph->p_type) { - batch = it->phnum - base; - if (batch > phdrs_per_buf) - batch = phdrs_per_buf; - size_t bytes = (size_t) batch * sizeof (ElfW(Phdr)); - ElfW(Off) off = it->phoff + (ElfW(Off)) base * sizeof (ElfW(Phdr)); - if (__pread64_nocancel (it->fd, chunk, bytes, off) != bytes) - { - *errvalp = errno; - return N_("cannot read file data"); - } - batch_ptr = chunk; - } - - for (ElfW(Half) i = 0; i < batch; i++) - { - const ElfW(Phdr) *ph = &batch_ptr[i]; - elf_machine_phdr_collect (&minfo, ph); - switch (ph->p_type) - { - case PT_LOAD: + case PT_LOAD: + { + if (__glibc_unlikely (((ph->p_vaddr - ph->p_offset) + & (it->pagesize - 1)) != 0)) { - if (__glibc_unlikely (((ph->p_vaddr - ph->p_offset) - & (it->pagesize - 1)) != 0)) - { - *errvalp = 0; - return N_("ELF load command address/offset not page-aligned"); - } - ElfW(Addr) mapstart = ALIGN_DOWN (ph->p_vaddr, it->pagesize); - ElfW(Addr) mapend = ALIGN_UP (ph->p_vaddr + ph->p_filesz, - it->pagesize); - ElfW(Off) mapoff = ALIGN_DOWN (ph->p_offset, it->pagesize); - int prot = pf_to_prot (ph->p_flags); - if (powerof2 (ph->p_align) && ph->p_align > it->p_align_max) - it->p_align_max = ph->p_align; - it->p_align_max = _dl_map_segment_align (&(struct loadcmd) { - .mapstart = mapstart, - .mapend = mapend, - .mapoff = mapoff, - .prot = prot }, - it->p_align_max); - if (it->nloadcmds > 0 && prev_mapend != mapstart) - *has_holesp = true; - 
prev_mapend = mapend; - if (it->nloadcmds == 0) - it->first_mapstart = mapstart; - it->last_mapstart = mapstart; - it->last_allocend = ph->p_vaddr + ph->p_memsz; - it->nloadcmds++; + *errvalp = 0; + return N_("ELF load command address/offset not page-aligned"); } - break; + ElfW(Addr) mapstart = ALIGN_DOWN (ph->p_vaddr, it->pagesize); + ElfW(Addr) mapend = ALIGN_UP (ph->p_vaddr + ph->p_filesz, + it->pagesize); + ElfW(Off) mapoff = ALIGN_DOWN (ph->p_offset, it->pagesize); + int prot = pf_to_prot (ph->p_flags); + if (powerof2 (ph->p_align) && ph->p_align > it->p_align_max) + it->p_align_max = ph->p_align; + it->p_align_max = _dl_map_segment_align (&(struct loadcmd) { + .mapstart = mapstart, + .mapend = mapend, + .mapoff = mapoff, + .prot = prot }, + it->p_align_max); + if (it->nloadcmds > 0 && prev_mapend != mapstart) + *has_holesp = true; + prev_mapend = mapend; + if (it->nloadcmds == 0) + it->first_mapstart = mapstart; + it->last_mapstart = mapstart; + it->last_allocend = ph->p_vaddr + ph->p_memsz; + it->nloadcmds++; + } + break; - /* These entries tell us where to find things once the file's - segments are mapped in. We record the addresses it says - verbatim, and later correct for the run-time load address. */ - case PT_DYNAMIC: - if (ph->p_filesz == 0) - *empty_dynamicp = true; /* Usually separate debuginfo. */ - else - { - /* Debuginfo only files from "objcopy --only-keep-debug" - contain a PT_DYNAMIC segment with p_filesz == 0. Skip - such a segment to avoid a crash later. */ - l->l_ld = (void *) ph->p_vaddr; - l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn)); - l->l_ld_readonly = (ph->p_flags & PF_W) == 0; - } - break; - - case PT_PHDR: - l->l_phdr = (void *) ph->p_vaddr; - break; - - case PT_TLS: - if (ph->p_memsz == 0) - /* Nothing to do for an empty segment. 
*/ - break; - - l->l_tls_blocksize = ph->p_memsz; - l->l_tls_align = ph->p_align; - if (ph->p_align == 0) - l->l_tls_firstbyte_offset = 0; - else - l->l_tls_firstbyte_offset = ph->p_vaddr & (ph->p_align - 1); - l->l_tls_initimage_size = ph->p_filesz; - /* Since we don't know the load address yet only store the - offset. We will adjust it later. */ - l->l_tls_initimage = (void *) ph->p_vaddr; - - /* l->l_tls_modid is assigned below, once there is no - possibility for failure. */ - - if (l->l_type != lt_library - && GL(dl_tls_dtv_slotinfo_list) == NULL) - { -#ifdef SHARED - /* We are loading the executable itself when the dynamic - linker was executed directly. The setup will happen - later. */ - assert (l->l_prev == NULL || (mode & __RTLD_AUDIT) != 0); -#else - assert (false && "TLS not initialized in static application"); -#endif - } - break; - - case PT_GNU_STACK: - *stack_flagsp = pf_to_prot (ph->p_flags); - break; - - case PT_GNU_RELRO: - l->l_relro_addr = ph->p_vaddr; - l->l_relro_size = ph->p_memsz; - break; + /* These entries tell us where to find things once the file's + segments are mapped in. We record the addresses it says + verbatim, and later correct for the run-time load address. */ + case PT_DYNAMIC: + if (ph->p_filesz == 0) + *empty_dynamicp = true; /* Usually separate debuginfo. */ + else + { + /* Debuginfo only files from "objcopy --only-keep-debug" + contain a PT_DYNAMIC segment with p_filesz == 0. Skip + such a segment to avoid a crash later. */ + l->l_ld = (void *) ph->p_vaddr; + l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn)); + l->l_ld_readonly = (ph->p_flags & PF_W) == 0; } + break; + + case PT_PHDR: + l->l_phdr = (void *) ph->p_vaddr; + break; + + case PT_TLS: + if (ph->p_memsz == 0) + /* Nothing to do for an empty segment. 
*/ + break; + + l->l_tls_blocksize = ph->p_memsz; + l->l_tls_align = ph->p_align; + if (ph->p_align == 0) + l->l_tls_firstbyte_offset = 0; + else + l->l_tls_firstbyte_offset = ph->p_vaddr & (ph->p_align - 1); + l->l_tls_initimage_size = ph->p_filesz; + /* Since we don't know the load address yet only store the + offset. We will adjust it later. */ + l->l_tls_initimage = (void *) ph->p_vaddr; + + /* l->l_tls_modid is assigned below, once there is no + possibility for failure. */ + + if (l->l_type != lt_library + && GL(dl_tls_dtv_slotinfo_list) == NULL) + { +#ifdef SHARED + /* We are loading the executable itself when the dynamic + linker was executed directly. The setup will happen + later. */ + assert (l->l_prev == NULL || (mode & __RTLD_AUDIT) != 0); +#else + assert (false && "TLS not initialized in static application"); +#endif + } + break; + + case PT_GNU_STACK: + *stack_flagsp = pf_to_prot (ph->p_flags); + break; + + case PT_GNU_RELRO: + l->l_relro_addr = ph->p_vaddr; + l->l_relro_size = ph->p_memsz; + break; } - base += batch; } if (__glibc_unlikely (elf_machine_reject_phdr_p (&minfo, l, it->fd))) @@ -1275,10 +1228,10 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd, bool has_holes; bool empty_dynamic = false; - _dl_pt_load_iterator_init (&it, fd, header.e_phoff, l->l_phnum); + _dl_pt_load_iterator_init (&it, fd, fbp, header.e_phoff, l->l_phnum); has_holes = false; - errstring = _dl_map_object_scan_phdrs (&it, fbp, l, mode, &stack_flags, + errstring = _dl_map_object_scan_phdrs (&it, l, mode, &stack_flags, &has_holes, &empty_dynamic, &errval); if (__glibc_unlikely (errstring != NULL)) goto lose; diff --git a/elf/dl-load.h b/elf/dl-load.h index e58028038c9..80ae5db4b3d 100644 --- a/elf/dl-load.h +++ b/elf/dl-load.h @@ -21,12 +21,41 @@ #define _DL_LOAD_H 1 #include +#include #include +#include #include #include #include +/* Type for the buffer we put the ELF header and hopefully the program + header. 
This buffer does not really have to be too large. In most + cases the program header follows the ELF header directly. If this + is not the case all bets are off and we can make the header + arbitrarily large and still won't get it read. This means the only + question is how large are the ELF and program header combined. The + ELF header for 32-bit files is 52 bytes long and for 64-bit files + 64 bytes long. Each program header entry is 32 and 56 bytes long + respectively. + + Size for at least 16 entries (with a little margin for program notes) + needs 52 + 16*32 = 564 bytes on 32-bit and 64 + 16*56 = 960 bytes on + 64-bit; round up to 640 and 1024 respectively. If this heuristic + should still fail for some file the code in + `_dl_map_object_from_fd' knows how to recover. */ +struct filebuf +{ + ssize_t len; +#if __WORDSIZE == 32 +# define FILEBUF_SIZE 640 +#else +# define FILEBUF_SIZE 1024 +#endif + char buf[FILEBUF_SIZE] __attribute__ ((aligned (__alignof (ElfW(Ehdr))))); +}; + + /* On some systems, no flag bits are given to specify file mapping. */ #ifndef MAP_FILE # define MAP_FILE 0 @@ -84,28 +113,65 @@ struct loadcmd }; -/* Iterator for PT_LOAD program header segments. It should be initialized - by _dl_pt_load_iterator_init once, then _dl_pt_load_iterator_next - repeatedly to walk each PT_LOAD segment without storing them all. - Segments are re-read one at a time via pread so that no large stack - buffer is needed for the program header table. */ +/* Iterator for program header segments. Initialize with + _dl_pt_load_iterator_init, then either walk PT_LOAD segments via + _dl_pt_load_iterator_next or do random access via + _dl_pt_load_iterator_phdr_at. A scratch buffer (fbp->buf) is used to + batch-read program headers; if the entire program header table was + already loaded by open_verify's initial read no pread is issued. */ struct dl_pt_load_iterator { int fd; /* File descriptor for pread. */ + struct filebuf *fbp; /* Scratch buffer for batched phdr reads. 
*/ ElfW(Off) phoff; /* Program header table file offset. */ ElfW(Half) phnum; /* Total number of program headers. */ ElfW(Half) idx; /* Index of next header to read. */ + ElfW(Half) buf_base; /* Index of phdr at start of fbp->buf + (chunked mode only). */ + ElfW(Half) buf_count; /* Number of phdrs currently in fbp->buf. */ + bool cached; /* True iff entire phdr table is already + resident in fbp->buf from open_verify. */ ElfW(Addr) p_align_max; /* Maximum p_align over all PT_LOAD segments. */ ElfW(Addr) pagesize; /* System page size (GLRO(dl_pagesize)). */ - /* Fields below are precomputed by _dl_pt_load_iterator_init and - are intended for use by _dl_map_segments. */ + /* Fields below are precomputed by _dl_map_object_scan_phdrs and are + intended for use by _dl_map_segments. */ ElfW(Addr) first_mapstart; /* mapstart of the first PT_LOAD segment. */ ElfW(Addr) last_mapstart; /* mapstart of the last PT_LOAD segment. */ ElfW(Addr) last_allocend; /* allocend of the last PT_LOAD segment. */ size_t nloadcmds; /* Number of PT_LOAD segments found. */ }; +/* Return a pointer to the program header at INDEX. If the entire phdr + table is already cached in fbp->buf (from open_verify), it is served + directly with no syscall; otherwise a batch of up to FILEBUF_SIZE / + sizeof(ElfW(Phdr)) entries is read into fbp->buf via a single pread. + Subsequent calls within the same batch hit the buffer. Returns NULL on + read failure (errno set by pread). 
*/ +static __always_inline const ElfW(Phdr) * +_dl_pt_load_iterator_phdr_at (struct dl_pt_load_iterator *it, ElfW(Half) idx) +{ + if (__glibc_likely (it->cached)) + return (const ElfW(Phdr) *) (it->fbp->buf + it->phoff) + idx; + + if (idx < it->buf_base || idx >= it->buf_base + it->buf_count) + { + const ElfW(Half) phdrs_per_buf + = sizeof (it->fbp->buf) / sizeof (ElfW(Phdr)); + ElfW(Half) batch = it->phnum - idx; + if (batch > phdrs_per_buf) + batch = phdrs_per_buf; + size_t bytes = (size_t) batch * sizeof (ElfW(Phdr)); + ElfW(Off) off = it->phoff + (ElfW(Off)) idx * sizeof (ElfW(Phdr)); + if (__pread64_nocancel (it->fd, it->fbp->buf, bytes, off) + != (ssize_t) bytes) + return NULL; + it->buf_base = idx; + it->buf_count = batch; + } + return (const ElfW(Phdr) *) it->fbp->buf + (idx - it->buf_base); +} + /* Advance iterator IT to the next PT_LOAD segment and fill C with its decoded load command. Returns true when a segment was found, false when the end of the program header table has been reached or a read @@ -115,21 +181,19 @@ _dl_pt_load_iterator_next (struct dl_pt_load_iterator *it, struct loadcmd *c) { while (it->idx < it->phnum) { - ElfW(Phdr) ph; - ElfW(Off) off = it->phoff + (ElfW(Off)) it->idx * sizeof ph; + const ElfW(Phdr) *ph = _dl_pt_load_iterator_phdr_at (it, it->idx); it->idx++; - if (__pread64_nocancel (it->fd, &ph, sizeof ph, off) - != (ssize_t) sizeof ph) + if (__glibc_unlikely (ph == NULL)) return false; - if (ph.p_type != PT_LOAD) + if (ph->p_type != PT_LOAD) continue; - c->mapstart = ALIGN_DOWN (ph.p_vaddr, it->pagesize); - c->mapend = ALIGN_UP (ph.p_vaddr + ph.p_filesz, it->pagesize); - c->dataend = ph.p_vaddr + ph.p_filesz; - c->allocend = ph.p_vaddr + ph.p_memsz; - c->mapoff = ALIGN_DOWN (ph.p_offset, it->pagesize); - c->prot = pf_to_prot (ph.p_flags); + c->mapstart = ALIGN_DOWN (ph->p_vaddr, it->pagesize); + c->mapend = ALIGN_UP (ph->p_vaddr + ph->p_filesz, it->pagesize); + c->dataend = ph->p_vaddr + ph->p_filesz; + c->allocend = ph->p_vaddr 
+ ph->p_memsz; + c->mapoff = ALIGN_DOWN (ph->p_offset, it->pagesize); + c->prot = pf_to_prot (ph->p_flags); c->mapalign = it->p_align_max; return true; }