From patchwork Fri Sep 3 17:11:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 44842 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9F6543847809 for ; Fri, 3 Sep 2021 17:13:35 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 9F6543847809 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1630689215; bh=5E3EKbk/yeAdBoqnarMTi1klpjz/HA8rTd0fYOpGqxE=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=Mst6OGTpOMo98aEculfMVPuXGsUsUPAa87TxlcTONOmgeSR9rXXxez8puWwXPKbZy jz14+vBvwcscBs+zeY48fRJ9spNDwHp85TWcuGh+WB/WpHHNRUYMqCHNFwHqg2zIfg q9dx9HXobnnjQiGzhouP9V8iwHXMN8gbX94kppYA= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qt1-x829.google.com (mail-qt1-x829.google.com [IPv6:2607:f8b0:4864:20::829]) by sourceware.org (Postfix) with ESMTPS id C20F23848402 for ; Fri, 3 Sep 2021 17:11:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C20F23848402 Received: by mail-qt1-x829.google.com with SMTP id r21so5080954qtw.11 for ; Fri, 03 Sep 2021 10:11:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5E3EKbk/yeAdBoqnarMTi1klpjz/HA8rTd0fYOpGqxE=; b=Xwjn1yTbE881dAouVr3XdZrCj0T6Yo0AhpPFKfQxJuHKc5XiokZMvdR+sUKA1YtgVS nlDhm+inNkgFVYods8Z56veev9yzjoOkRxv6NbYLnxmpVo6ofl/sl83jFN5mc0zx9Z+m 4Ij/3KkiFcPqdUDm1lkMc8Pd1M+i+azoH2up5HQljURN8TzZVcppwmLW487fv56Kw1PL GVqJcTFqxgUEBWj3b9Y1Wjth2zS5I3ruAvof3YinS0Bd5jMCBpL/gJUy/m5Mg5WdwI5J Yt9LD338aEueLmgA8CdCT5qRvENqcVx7sn95QCYQ/6Bn0fm9TILfCutkPgfBkH4F5zsB AtwQ== X-Gm-Message-State: AOAM53292f8eqDmE6mL82k16APq8hTOLxNpmUABqf2dkzEU8xsYPeYlS iusfrdkLK9tQUCAbmgtBc/iPQfHK6/9pXg== X-Google-Smtp-Source: ABdhPJxGvrLUTqeXQU6Cxbexxas6wwk4GAkU3d/4MR+izCVdTJnE6MnRXswqMF+FoXO7u/a00Li9xw== X-Received: by 2002:ac8:734c:: with SMTP id q12mr9115qtp.192.1630689110037; Fri, 03 Sep 2021 10:11:50 -0700 (PDT) Received: from birita.. ([2804:431:c7cb:733d:fff8:7487:556e:e293]) by smtp.gmail.com with ESMTPSA id r4sm3207071qtw.5.2021.09.03.10.11.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Sep 2021 10:11:49 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v3 1/7] benchtests: Add bench-qsort Date: Fri, 3 Sep 2021 14:11:38 -0300 Message-Id: <20210903171144.952737-2-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210903171144.952737-1-adhemerval.zanella@linaro.org> References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" This patch adds a qsort benchmark. I tried to model the benchmark taking in consideration the possible input variation in both internal element size, element numbers, and internal state for 1. real word cases and 2. possible scenarios based on hardware characteristics. For 1. I tracked qsort usage (using a simple preload library to dump its usage and a script to pos-process it) on both GCC bootstrap and Firefox. GCC 8 bootstrap build shows 51786641 call to qsort with the following characterics: Key: number of elements: key=2 : 39.74 key=3 : 19.23 key=4 : 9.77 key=1 : 8.44 key=0 : 6.60 key=5 : 4.40 key=7 : 2.37 key=6 : 2.25 key=9 : 1.48 key=8 : 0.97 Key: element size in bytes: key=8 : 91.74 key=32 : 3.62 key=4 : 2.42 key=40 : 1.20 key=16 : 0.67 key=24 : 0.30 key=48 : 0.05 key=56 : 0.00 key=1 : 0.00 Key: total size (number of elements * element size): key=16 : 35.98 key=24 : 18.67 key=32 : 9.79 key=8 : 8.28 key=0 : 6.60 key=40 : 4.21 key=64 : 3.15 key=48 : 2.24 key=56 : 2.15 key=80 : 1.45 So for GCC: - 80% of total qsort usage are done with 10 elements of less. - All usages are done element size of maximum of 56 bytes. - 90% of calls are done with array of maximum size of 80 bytes or less. (GCC on recent version now uses a internal qsort implementation instead of the libc one). The Firefox usage was done with 2 hours of usage, with first 10 minutes activelly openning and closing different types of sites. It resulted in 21042 calls with following characteristics: Key: number of elements: key=7 : 24.40 key=1 : 10.44 key=3 : 6.33 key=4 : 5.81 key=2 : 5.46 key=6 : 4.80 key=17 : 4.54 key=0 : 3.07 key=5 : 3.05 key=9 : 2.51 key=12 : 2.06 Key: element size in bytes: key=8 : 94.49 key=28 : 4.40 key=2 : 0.70 key=16 : 0.19 key=36 : 0.07 key=12 : 0.07 key=40 : 0.07 key=24 : 0.03 Key: total size (number of elements * element size): key=56 : 24.20 key=8 : 10.27 key=24 : 6.36 key=32 : 5.86 key=16 : 5.46 key=48 : 4.80 key=476 : 3.75 key=0 : 3.07 key=40 : 3.05 key=72 : 2.50 So for Firefox: - 72% of total qsort usage are done with 18 elements of less. - All usages are done element size of maximum of 40 bytes. - 70% of calls are done with array of maximum size of 476 bytes or less. For 2. I used the idea of a machine with 3 levels of cache with sizes L1 32kb, L2 256kb, and L3 4096Kb. It resulted in a benchmark with following traits: * It checks four types of input arrays: sorted, mostly sorted, random, repeated, and bitonic: - 'sorted': the array is already sorted. - 'mostly sorted': the array will have a certain number of random position with random values (current ratio used is 20%). - 'random' the array will contain random elements. - 'repeated' the array will have random elements with a certain number (current ratio is 20%) of a repeated element distributed randomly. - 'bitonic': elements will be strictly increasing to up middle of the array then strictly decreasing. This is to trigger the pattern-defeting input for the median of 3 pivot selection used in quicksort implementation. * Three elements sizes are checked: uint32_t, uint64_t, and an element with 32 bytes (but using the uint32_t comparison checks). These element sizes are used to 1. to avoid include the comparison function itself and/or memory copy in sort benchmark itself, and 2. because key of size_t are the most used for both GCC and Firefox. * Five different element numbers: 64 (which cover mostly of used elementsizes for both GCC and Firefox), 4096/8192 (which cover 32 KB of L1 for 32 and 64 bits), 32768/65536 (L2 with 256 KB), and 24288/1048576 (L3 with 4 MB). The sizes are configurable by --nelem option. Checked on x86_64-linux-gnu --- benchtests/Makefile | 2 +- benchtests/bench-qsort.c | 195 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 196 insertions(+), 1 deletion(-) create mode 100644 benchtests/bench-qsort.c diff --git a/benchtests/Makefile b/benchtests/Makefile index 1530939a8c..781cbca8af 100644 --- a/benchtests/Makefile +++ b/benchtests/Makefile @@ -76,7 +76,7 @@ LOCALES := en_US.UTF-8 tr_TR.UTF-8 cs_CZ.UTF-8 fa_IR.UTF-8 fr_FR.UTF-8 \ include ../gen-locales.mk endif -stdlib-benchset := strtod +stdlib-benchset := strtod qsort stdio-common-benchset := sprintf diff --git a/benchtests/bench-qsort.c b/benchtests/bench-qsort.c new file mode 100644 index 0000000000..905aa6d7b6 --- /dev/null +++ b/benchtests/bench-qsort.c @@ -0,0 +1,195 @@ +/* Measure qsort function. + Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include + +#include + +/* Number of elements of determined type to be measured. */ +static const size_t default_nelems[] = +{ + 256/sizeof(size_t), /* 64/128 which covers mostly used element number + on GCC build. */ + 32768/sizeof(size_t), /* 4096/8192 to fit on a L1 with 32 KB. */ + 262144/sizeof(size_t), /* 32768/65536 to fit on a L2 with 256 KB. */ + 4194304/sizeof(size_t), /* 524288/1048576 to fix on a L3 with 4 MB. */ +}; + +#define OPT_NELEM 10000 +#define OPT_SEED 10001 +#define CMDLINE_OPTIONS \ + { "nelem", required_argument, NULL, OPT_NELEM }, \ + { "seed", required_argument, NULL, OPT_SEED }, + +static const size_t max_nelem = 16; +static size_t *elems = NULL; +static size_t nelem = 0; +static uint64_t seed = 0; +static bool seed_set = false; + +static void +cmdline_process_function (int c) +{ + switch (c) + { + /* Handle the --nelem option to run different sizes than DEFAULT_ELEM. + The elements sizes as passed with a ':' as the delimiter, for + instance --nelem 32:128:1024 will ran 32, 128, and 1024 elements. */ + case OPT_NELEM: + { + elems = xmalloc (max_nelem * sizeof (size_t)); + nelem = 0; + + char *saveptr; + char *token; + token = strtok_r (optarg, ":", &saveptr); + if (token == NULL) + { + printf ("error: invalid --nelem value\n"); + exit (EXIT_FAILURE); + } + do + { + if (nelem == max_nelem) + { + printf ("error: invalid --nelem value (max elem)\n"); + exit (EXIT_FAILURE); + } + elems[nelem++] = atol (token); + token = strtok_r (saveptr, ":", &saveptr); + } while (token != NULL); + } + break; + + /* handle the --seed option to use a different seed than a random one. + The SEED used should be a uint64_t number. */ + case OPT_SEED: + { + uint64_t value = strtoull (optarg, NULL, 0); + if (errno == ERANGE || value > UINT64_MAX) + { + printf ("error: seed should be a value in range of " + "[0, UINT64_MAX]\n"); + exit (EXIT_FAILURE); + } + seed = value; + seed_set = true; + } + } +} + +#define CMDLINE_PROCESS cmdline_process_function + +static int +do_test (void) +{ + random_init (); + + + json_ctx_t json_ctx; + + json_init (&json_ctx, 0, stdout); + + json_document_begin (&json_ctx); + json_attr_string (&json_ctx, "timing_type", TIMING_TYPE); + + json_attr_object_begin (&json_ctx, "functions"); + json_attr_object_begin (&json_ctx, "qsort"); + json_attr_uint (&json_ctx, "seed", seed); + + json_array_begin (&json_ctx, "results"); + + const size_t *welem = elems == NULL ? default_nelems : elems; + const size_t wnelem = elems == NULL ? array_length (default_nelems) : nelem; + + struct test_t + { + size_t type_size; + cmpfunc_t cmpfunc; + }; + static const struct test_t tests[] = + { + { sizeof (uint32_t), uint32_t_cmp }, + { sizeof (uint64_t), uint64_t_cmp }, + /* Test swap with large elements. */ + { 32, uint32_t_cmp }, + }; + + const size_t inner_loop_iters = 16; + for (const struct test_t *test = tests; test < array_end (tests); ++test) + { + for (const struct array_t *arraytype = arraytypes; + arraytype < array_end (arraytypes); + ++arraytype) + { + for (int k = 0; k < wnelem; k++) + { + json_element_object_begin (&json_ctx); + + size_t arraysize = welem[k] * test->type_size; + + json_attr_uint (&json_ctx, "nmemb", welem[k]); + json_attr_uint (&json_ctx, "type_size", test->type_size); + json_attr_string (&json_ctx, "property", arraytype->name); + + void *base = create_array (welem[k], test->type_size, + arraytype->type); + void *work = xmalloc (arraysize); + + timing_t total = 0; + + for (int n = 0; n < inner_loop_iters; n++) + { + memcpy (work, base, arraysize); + + timing_t start, end, diff; + TIMING_NOW (start); + qsort (work, welem[k], test->type_size, test->cmpfunc); + TIMING_NOW (end); + + TIMING_DIFF (diff, start, end); + TIMING_ACCUM (total, diff); + } + + json_attr_uint (&json_ctx, "timings", + (double) total / (double) inner_loop_iters); + json_element_object_end (&json_ctx); + + free (base); + free (work); + } + } + } + + json_array_end (&json_ctx); + + json_attr_object_end (&json_ctx); + json_attr_object_end (&json_ctx); + json_document_end (&json_ctx); + + return 0; +} + +#define TIMEOUT 600 +#include From patchwork Fri Sep 3 17:11:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 44841 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 34D063847809 for ; Fri, 3 Sep 2021 17:12:53 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 34D063847809 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1630689173; bh=+YwBHuPhZiifuVE3FNufY3CquCFyiq5fkp9EfubBHBM=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=CoE0NXabYJENZ2LmO7l1PVteA1DOZIKLRbnwtrbd1d7WaXb8yqUB9O46VKL6Cm/nT koLUMuqReauhwvRwowI+mhqMiwIZ225Hhmg/L8BVWX1OlPGO9pDUBwlPYJQshaIipx I2NXcEdLFjP8okUP6tmnGv6NICCy3svDy1v2sTJs= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qt1-x82b.google.com (mail-qt1-x82b.google.com [IPv6:2607:f8b0:4864:20::82b]) by sourceware.org (Postfix) with ESMTPS id CAB73384B13A for ; Fri, 3 Sep 2021 17:11:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CAB73384B13A Received: by mail-qt1-x82b.google.com with SMTP id c19so5094226qte.7 for ; Fri, 03 Sep 2021 10:11:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+YwBHuPhZiifuVE3FNufY3CquCFyiq5fkp9EfubBHBM=; b=BplIb+VMyOYJc21j1dtisaBCrdDvbI3+WexwExBLXllJOy2FLUDcJVh7Hbs2V5Ek7n utR3QNzVfNfmlR6yQ2hRjV0oobpQfe4t02Of/iTfFQkONiBO4uxcxcsh+bpLCQ1Ryuyx jNaiuwWOj9TEI/O5nDcIddx5YDX5bLB31Q8nAp4WHuobbVEu1xPS0GVx9B9+HMPc0+eT eipP/YDTEshxRBLW/Qgbl9TogzmI1CxyQ89Jwe2/x9NaF5R2YEqErfH1IE+r5+FKx+Vf CCpy6YbtODRHx0jgs32vQwbZnTMX1eGAxx3MVOxW7h/kR5UJfvmcchZHnzyETqLNnyku oKoQ== X-Gm-Message-State: AOAM532L4sBg5rVCexwHvkgjzAa+rhjViwjABVf9zShAaeTx0RhTgoGf oQdNOopwa7WRTygkDI7db9Iz2HlG8lNreg== X-Google-Smtp-Source: ABdhPJy7cs1uNc1kIphLMSYIV/e9aLDc3/3hNlrfug8tvPwVqGyKKHLU/diEQ01pOgIEETNKnKjSAw== X-Received: by 2002:ac8:7010:: with SMTP id x16mr4747122qtm.293.1630689111291; Fri, 03 Sep 2021 10:11:51 -0700 (PDT) Received: from birita.. ([2804:431:c7cb:733d:fff8:7487:556e:e293]) by smtp.gmail.com with ESMTPSA id r4sm3207071qtw.5.2021.09.03.10.11.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Sep 2021 10:11:51 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v3 2/7] support: Fix getopt_long with CMDLINE_OPTIONS Date: Fri, 3 Sep 2021 14:11:39 -0300 Message-Id: <20210903171144.952737-3-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210903171144.952737-1-adhemerval.zanella@linaro.org> References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" --- support/support_test_main.c | 1 - support/test-driver.h | 1 + 2 files changed, 1 insertion(+), 1 deletion(-) diff --git a/support/support_test_main.c b/support/support_test_main.c index 07e3cdd173..ab1a475192 100644 --- a/support/support_test_main.c +++ b/support/support_test_main.c @@ -42,7 +42,6 @@ static const struct option default_options[] = { TEST_DEFAULT_OPTIONS - { NULL, 0, NULL, 0 } }; /* Show people how to run the program. */ diff --git a/support/test-driver.h b/support/test-driver.h index 8d4f38275d..3e54814a03 100644 --- a/support/test-driver.h +++ b/support/test-driver.h @@ -60,6 +60,7 @@ enum { "verbose", no_argument, NULL, 'v' }, \ { "direct", no_argument, NULL, OPT_DIRECT }, \ { "test-dir", required_argument, NULL, OPT_TESTDIR }, \ + { NULL, 0, NULL, 0 } /* The directory the test should use for temporary files. */ extern const char *test_dir; From patchwork Fri Sep 3 17:11:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 44843 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 62FA2384781C for ; Fri, 3 Sep 2021 17:14:17 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 62FA2384781C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1630689257; bh=CwMYT9DPegeFfin3Jj1L59oCD9VF3rvVt+SDb26UQtY=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=VBH1oBwzvYjZ8oENWkx4PDkYkzFoIAY7YhmPinq6U96eHyE6KcYXI924a5Qr0Ww6q VdzCZGNnT1As54l2fW5HLUSkxdUWty3eRi0m2hUu1E6fsa2ll2Qw/2CDO76ObTnztw /A+RRmJrcRfAvkwy3QZMAfbk7JGqfCaHLrZKzmyo= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qt1-x830.google.com (mail-qt1-x830.google.com [IPv6:2607:f8b0:4864:20::830]) by sourceware.org (Postfix) with ESMTPS id 1E28C3848431 for ; Fri, 3 Sep 2021 17:11:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1E28C3848431 Received: by mail-qt1-x830.google.com with SMTP id x5so5070324qtq.13 for ; Fri, 03 Sep 2021 10:11:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=CwMYT9DPegeFfin3Jj1L59oCD9VF3rvVt+SDb26UQtY=; b=AalMmo53wcsgiT3BBIKjz7sOTwI6r5K43d2rDNGNuUNb+15grkK42Kq/Olv+7wbhKU 8BOCpTNwNXBpWmFzEQYVuIN7QPrNunfhDvWqXzLRg4FLKiKN0Ut3WjB5vExJOauFyICT uEGqbRzwWRw7SrrmxlsKLyiS3TLEXYUnEWq6G1GL3+ygWe+etXUZf0Akiwi2AAT/Vodb GsIjTxlHjosyYcZ4RkovH8S8iOZq6iB2oF9dmJTE9Y/WFDGdATrpm1DlEbBW+Y1+0hDS nZbxKNU5nT3ck1u0LwX1bP3rcxC6oqZXE82/eydylJyeelA+Zhq5HSNRsZd99bBqZ4aD ZV/Q== X-Gm-Message-State: AOAM530FlAvbwhQ1fSb3jXdY9CBeEZxlk2ttkW0eq5VNRx4K/SjMdUeY dIrqeOL9as2fusYBW8vHh+6RVFzMoyUS6w== X-Google-Smtp-Source: ABdhPJwzgtm2AkHendV7QRp/g2M/TSFydkLkJjU9hF4c0Q9Q1JWWbq5Rzs7jQ5qDyw8Yygsok8FtqQ== X-Received: by 2002:ac8:7769:: with SMTP id h9mr14361qtu.144.1630689112501; Fri, 03 Sep 2021 10:11:52 -0700 (PDT) Received: from birita.. ([2804:431:c7cb:733d:fff8:7487:556e:e293]) by smtp.gmail.com with ESMTPSA id r4sm3207071qtw.5.2021.09.03.10.11.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Sep 2021 10:11:52 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v3 3/7] stdlib: Optimization qsort{_r} swap implementation (BZ #19305) Date: Fri, 3 Sep 2021 14:11:40 -0300 Message-Id: <20210903171144.952737-4-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210903171144.952737-1-adhemerval.zanella@linaro.org> References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" It optimizes take in consideration both the most common elements are either 32 or 64 bit in size [1] and inputs are aligned to the word boundary. This is similar to the optimization done on lib/sort.c from Linux. This patchs adds an optimized swap operation on qsort based in previous msort one. Instead of byte operation, three variants are provided: 1. Using uint32_t loads and stores. 2. Using uint64_t loads and stores. 3. Generic one with a temporary buffer and memcpy/mempcpy. The 1. and 2. options are selected only either if architecture defines _STRING_ARCH_unaligned or if base pointer is aligned to required type. It also fixes BZ#19305 by checking input size against number of elements 1 besides 0. Checked on x86_64-linux-gnu. [1] https://sourceware.org/pipermail/libc-alpha/2018-August/096984.html --- stdlib/qsort.c | 109 +++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 91 insertions(+), 18 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 23f2d28314..59458d151b 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -24,20 +24,85 @@ #include #include #include +#include -/* Byte-wise swap two items of size SIZE. */ -#define SWAP(a, b, size) \ - do \ - { \ - size_t __size = (size); \ - char *__a = (a), *__b = (b); \ - do \ - { \ - char __tmp = *__a; \ - *__a++ = *__b; \ - *__b++ = __tmp; \ - } while (--__size > 0); \ - } while (0) +/* Swap SIZE bytes between addresses A and B. These helpers are provided + along the generic one as an optimization. */ + +typedef void (*swap_func_t)(void * restrict, void * restrict, size_t); + +/* Return trues is elements can be copied used word load and sortes. + The size must be a multiple of the alignment, and the base address. */ +static inline bool +is_aligned_to_copy (const void *base, size_t size, size_t align) +{ + unsigned char lsbits = size; +#if !_STRING_ARCH_unaligned + lsbits |= (unsigned char)(uintptr_t) base; +#endif + return (lsbits & (align - 1)) == 0; +} + +#define SWAP_WORDS_64 (swap_func_t)0 +#define SWAP_WORDS_32 (swap_func_t)1 +#define SWAP_BYTES (swap_func_t)2 + +static void +swap_words_64 (void * restrict a, void * restrict b, size_t n) +{ + do + { + n -= 8; + uint64_t t = *(uint64_t *)(a + n); + *(uint64_t *)(a + n) = *(uint64_t *)(b + n); + *(uint64_t *)(b + n) = t; + } while (n); +} + +static void +swap_words_32 (void * restrict a, void * restrict b, size_t n) +{ + do + { + n -= 4; + uint32_t t = *(uint32_t *)(a + n); + *(uint32_t *)(a + n) = *(uint32_t *)(b + n); + *(uint32_t *)(b + n) = t; + } while (n); +} + +static void +swap_bytes (void * restrict a, void * restrict b, size_t n) +{ + /* Use multiple small memcpys with constant size to enable inlining + on most targets. */ + enum { SWAP_GENERIC_SIZE = 32 }; + unsigned char tmp[SWAP_GENERIC_SIZE]; + while (n > SWAP_GENERIC_SIZE) + { + memcpy (tmp, a, SWAP_GENERIC_SIZE); + a = memcpy (a, b, SWAP_GENERIC_SIZE) + SWAP_GENERIC_SIZE; + b = memcpy (b, tmp, SWAP_GENERIC_SIZE) + SWAP_GENERIC_SIZE; + n -= SWAP_GENERIC_SIZE; + } + memcpy (tmp, a, n); + memcpy (a, b, n); + memcpy (b, tmp, n); +} + +/* Replace the indirect call with a serie of if statements. It should help + the branch predictor. */ +static void +do_swap (void * restrict a, void * restrict b, size_t size, + swap_func_t swap_func) +{ + if (swap_func == SWAP_WORDS_64) + swap_words_64 (a, b, size); + else if (swap_func == SWAP_WORDS_32) + swap_words_32 (a, b, size); + else + swap_bytes (a, b, size); +} /* Discontinue quicksort algorithm when partition gets below this size. This particular magic number was chosen to work best on a Sun 4/260. */ @@ -97,6 +162,14 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, /* Avoid lossage with unsigned arithmetic below. */ return; + swap_func_t swap_func; + if (is_aligned_to_copy (pbase, size, 8)) + swap_func = SWAP_WORDS_64; + else if (is_aligned_to_copy (pbase, size, 4)) + swap_func = SWAP_WORDS_32; + else + swap_func = SWAP_BYTES; + if (total_elems > MAX_THRESH) { char *lo = base_ptr; @@ -120,13 +193,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, char *mid = lo + size * ((hi - lo) / size >> 1); if ((*cmp) ((void *) mid, (void *) lo, arg) < 0) - SWAP (mid, lo, size); + do_swap (mid, lo, size, swap_func); if ((*cmp) ((void *) hi, (void *) mid, arg) < 0) - SWAP (mid, hi, size); + do_swap (mid, hi, size, swap_func); else goto jump_over; if ((*cmp) ((void *) mid, (void *) lo, arg) < 0) - SWAP (mid, lo, size); + do_swap (mid, lo, size, swap_func); jump_over:; left_ptr = lo + size; @@ -145,7 +218,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, if (left_ptr < right_ptr) { - SWAP (left_ptr, right_ptr, size); + do_swap (left_ptr, right_ptr, size, swap_func); if (mid == left_ptr) mid = right_ptr; else if (mid == right_ptr) @@ -217,7 +290,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, tmp_ptr = run_ptr; if (tmp_ptr != base_ptr) - SWAP (tmp_ptr, base_ptr, size); + do_swap (tmp_ptr, base_ptr, size, swap_func); /* Insertion sort, running from left-hand-side up to right-hand-side. */ From patchwork Fri Sep 3 17:11:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 44844 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DEBE33847816 for ; Fri, 3 Sep 2021 17:15:04 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DEBE33847816 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1630689304; bh=2vu6qRs3omrH9PNdjhyh+WqjLUgNyWC23FMqcmAI5hQ=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=rlzCxIe8sOgZwm0auxuqP6pzGbAwWwjtdLHXWpWBPIk/Ski6H6yAegAi3Qmg6br3B u4OP24E4CkcjGMYsDLm+6xEZiAEl+8Ok8ZzXb6TlMjX5mjxSWoSVSV6dOq0wGY0NWI rPu7hbEIauaxjm+/ZamNEVzj98VdrsNECpH7fzyY= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qv1-xf31.google.com (mail-qv1-xf31.google.com [IPv6:2607:f8b0:4864:20::f31]) by sourceware.org (Postfix) with ESMTPS id 4033A3848402 for ; Fri, 3 Sep 2021 17:11:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4033A3848402 Received: by mail-qv1-xf31.google.com with SMTP id bn14so3547161qvb.12 for ; Fri, 03 Sep 2021 10:11:54 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2vu6qRs3omrH9PNdjhyh+WqjLUgNyWC23FMqcmAI5hQ=; b=fuRDyLbH4xD1nZ1kF52kdZdavTmpCkn2n0SbjsQGwX7EAVlp4hAvfhWR+M4iSUF+yK CyskMni0C11jdO8MAKJ+cLMr91E+NYPaF75hz/KhH4ineZHJAEl6y7YkfL4hIdzrOA0a 9NXn9qQavRgG71kgLDbXs4Qp+himWP54+c2uO1Jl2yUuKHZIaXMU46bffw5wvTS1TJr5 9W6JXFSI3BVYaVMQJtIbPUBbXbwfRzdIUE4dXF5RKRaIkiIA4RKltHxglVtX+evvzK2B qk7QC+GG2r0ifSs3hP5pUW7ua2lOtb+QKhENRLjYbaevVsuydJDxebusFp1pIXz4d8Fo vpHg== X-Gm-Message-State: AOAM532DxdA9wp1k8cnO7UFV/RN5WHHLmYNZLjCfM7WfHnJxY2P6OEVb AGSWx43hzd1dszOxCF1DwtOpj5xB6qxHsQ== X-Google-Smtp-Source: ABdhPJwMwyGHsKkqO39HtVx2xQUvLjfNMg1mPE5triFgQKyBGKqCenCCNil6siFYXZ8zme/LU0sKBg== X-Received: by 2002:a0c:f58c:: with SMTP id k12mr109881qvm.38.1630689113717; Fri, 03 Sep 2021 10:11:53 -0700 (PDT) Received: from birita.. ([2804:431:c7cb:733d:fff8:7487:556e:e293]) by smtp.gmail.com with ESMTPSA id r4sm3207071qtw.5.2021.09.03.10.11.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Sep 2021 10:11:53 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v3 4/7] stdlib: Move insertion sort out qsort Date: Fri, 3 Sep 2021 14:11:41 -0300 Message-Id: <20210903171144.952737-5-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210903171144.952737-1-adhemerval.zanella@linaro.org> References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" --- stdlib/qsort.c | 100 ++++++++++++++++++++++++++----------------------- 1 file changed, 53 insertions(+), 47 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 59458d151b..b69417dedd 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -150,6 +150,58 @@ typedef struct smaller partition. This *guarantees* no more than log (total_elems) stack size is needed (actually O(1) in this case)! */ +static void +insertion_sort (void *const pbase, size_t total_elems, size_t size, + swap_func_t swap_func, + __compar_d_fn_t cmp, void *arg) +{ + char *base_ptr = (char *) pbase; + char *const end_ptr = &base_ptr[size * (total_elems - 1)]; + char *tmp_ptr = base_ptr; +#define min(x, y) ((x) < (y) ? (x) : (y)) + const size_t max_thresh = MAX_THRESH * size; + char *thresh = min(end_ptr, base_ptr + max_thresh); + char *run_ptr; + + /* Find smallest element in first threshold and place it at the + array's beginning. This is the smallest array element, + and the operation speeds up insertion sort's inner loop. */ + + for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size) + if (cmp (run_ptr, tmp_ptr, arg) < 0) + tmp_ptr = run_ptr; + + if (tmp_ptr != base_ptr) + do_swap (tmp_ptr, base_ptr, size, swap_func); + + /* Insertion sort, running from left-hand-side up to right-hand-side. */ + + run_ptr = base_ptr + size; + while ((run_ptr += size) <= end_ptr) + { + tmp_ptr = run_ptr - size; + while (cmp (run_ptr, tmp_ptr, arg) < 0) + tmp_ptr -= size; + + tmp_ptr += size; + if (tmp_ptr != run_ptr) + { + char *trav; + + trav = run_ptr + size; + while (--trav >= run_ptr) + { + char c = *trav; + char *hi, *lo; + + for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo) + *hi = *lo; + *hi = c; + } + } + } +} + void _quicksort (void *const pbase, size_t total_elems, size_t size, __compar_d_fn_t cmp, void *arg) @@ -272,51 +324,5 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, for partitions below MAX_THRESH size. BASE_PTR points to the beginning of the array to sort, and END_PTR points at the very last element in the array (*not* one beyond it!). */ - -#define min(x, y) ((x) < (y) ? (x) : (y)) - - { - char *const end_ptr = &base_ptr[size * (total_elems - 1)]; - char *tmp_ptr = base_ptr; - char *thresh = min(end_ptr, base_ptr + max_thresh); - char *run_ptr; - - /* Find smallest element in first threshold and place it at the - array's beginning. This is the smallest array element, - and the operation speeds up insertion sort's inner loop. */ - - for (run_ptr = tmp_ptr + size; run_ptr <= thresh; run_ptr += size) - if ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, arg) < 0) - tmp_ptr = run_ptr; - - if (tmp_ptr != base_ptr) - do_swap (tmp_ptr, base_ptr, size, swap_func); - - /* Insertion sort, running from left-hand-side up to right-hand-side. */ - - run_ptr = base_ptr + size; - while ((run_ptr += size) <= end_ptr) - { - tmp_ptr = run_ptr - size; - while ((*cmp) ((void *) run_ptr, (void *) tmp_ptr, arg) < 0) - tmp_ptr -= size; - - tmp_ptr += size; - if (tmp_ptr != run_ptr) - { - char *trav; - - trav = run_ptr + size; - while (--trav >= run_ptr) - { - char c = *trav; - char *hi, *lo; - - for (hi = lo = trav; (lo -= size) >= tmp_ptr; hi = lo) - *hi = *lo; - *hi = c; - } - } - } - } + insertion_sort (pbase, total_elems, size, swap_func, cmp, arg); } From patchwork Fri Sep 3 17:11:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 44845 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E0A843847818 for ; Fri, 3 Sep 2021 17:15:46 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E0A843847818 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1630689346; bh=74mSgnP+ZRmSN7IKQEVD7zoiIgsEFZxmc9drKk3Mu6U=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=YWZbdtaOY+k8IbOfCzP/Tq2+mBVL3EVCoydBsLwZbfi9bLenrqdGzFfwZkpNwhiqm K4UrvJsgvRaI2d4rZrHPamq5V2dCCAHKbBCK3DodA0VLm+KzSLr7WAdiE2pth4S/OM u+KIyLzC3UApDcKSJShfgNj9TyoJI9QJbBH+ov8U= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qk1-x72f.google.com (mail-qk1-x72f.google.com [IPv6:2607:f8b0:4864:20::72f]) by sourceware.org (Postfix) with ESMTPS id C1098384780C for ; Fri, 3 Sep 2021 17:11:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C1098384780C Received: by mail-qk1-x72f.google.com with SMTP id t190so6487839qke.7 for ; Fri, 03 Sep 2021 10:11:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=74mSgnP+ZRmSN7IKQEVD7zoiIgsEFZxmc9drKk3Mu6U=; b=TaPsd+qBv58hvoo8/s2dUaQ5bolW3usd86Ya5+yKNUcG9mGgxKbiPdiFbSqMO4v6Xy y2/682+CFY2UZrMiQddme1mt/sKXdGHoyBPD1TGHRYuTc1b3zN5mgMN8qcTjc9Z0cB0v TsgiClDXKFHSu1GOgrS425Ba8ZUDFoJkpyWMvamgvXpiDlLFLU0epcxq5rb9gjaNdqbj 7t8MjUBiYSvi19joixN4qm2Xhw1PkXkFyqj9ifud/W4znE+f+mteQ8juAy1F/xhqR5Bo JKoNIPLJIGlEewEfLY481MQ0QHmV+Ghe55A+eI2SRdBFxd6usl4eH4LbwVcvj+SKNS6R c0TQ== X-Gm-Message-State: AOAM532xtRZSOJueEDD/z/CnxRXkn4ps2f6qu6MjIVvvv9pJQhUKOYEE 9GsKPzeJluQ6f2qyVZBCuQbbE1mCoixtFQ== X-Google-Smtp-Source: ABdhPJx9uCEQmZZwZtNLBrL2uQBiF2k0Y673zf7MnJcBq9Qh7JQgyuMtD6oLGpvQq7HZpKId/41NDg== X-Received: by 2002:a05:620a:318e:: with SMTP id bi14mr4317643qkb.274.1630689114913; Fri, 03 Sep 2021 10:11:54 -0700 (PDT) Received: from birita.. ([2804:431:c7cb:733d:fff8:7487:556e:e293]) by smtp.gmail.com with ESMTPSA id r4sm3207071qtw.5.2021.09.03.10.11.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Sep 2021 10:11:54 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v3 5/7] stdlib: qsort: Move some macros to inline function Date: Fri, 3 Sep 2021 14:11:42 -0300 Message-Id: <20210903171144.952737-6-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210903171144.952737-1-adhemerval.zanella@linaro.org> References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" Change the macro PUSH and POP and rmeove the macro STACK_NOT_EMPTY. --- stdlib/qsort.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index b69417dedd..5df640362d 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -115,15 +115,28 @@ typedef struct char *hi; } stack_node; -/* The next 4 #defines implement a very fast in-line stack abstraction. */ /* The stack needs log (total_elements) entries (we could even subtract log(MAX_THRESH)). Since total_elements has type size_t, we get as upper bound for log (total_elements): bits per byte (CHAR_BIT) * sizeof(size_t). */ -#define STACK_SIZE (CHAR_BIT * sizeof (size_t)) -#define PUSH(low, high) ((void) ((top->lo = (low)), (top->hi = (high)), ++top)) -#define POP(low, high) ((void) (--top, (low = top->lo), (high = top->hi))) -#define STACK_NOT_EMPTY (stack < top) +enum { STACK_SIZE = CHAR_BIT * sizeof (size_t) }; + +static inline stack_node * +push (stack_node *top, char *lo, char *hi) +{ + top->lo = lo; + top->hi = hi; + return ++top; +} + +static inline stack_node * +pop (stack_node *top, char **lo, char **hi) +{ + --top; + *lo = top->lo; + *hi = top->hi; + return top; +} /* Order size using quicksort. This implementation incorporates @@ -229,9 +242,9 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, stack_node stack[STACK_SIZE]; stack_node *top = stack; - PUSH (NULL, NULL); + top = push (top, NULL, NULL); - while (STACK_NOT_EMPTY) + while (stack < top) { char *left_ptr; char *right_ptr; @@ -296,7 +309,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, { if ((size_t) (hi - left_ptr) <= max_thresh) /* Ignore both small partitions. */ - POP (lo, hi); + top = pop (top, &lo, &hi); else /* Ignore small left partition. */ lo = left_ptr; @@ -307,13 +320,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else if ((right_ptr - lo) > (hi - left_ptr)) { /* Push larger left partition indices. */ - PUSH (lo, right_ptr); + top = push (top, lo, right_ptr); lo = left_ptr; } else { /* Push larger right partition indices. */ - PUSH (left_ptr, hi); + top = push (top, left_ptr, hi); hi = right_ptr; } } From patchwork Fri Sep 3 17:11:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 44846 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0167F3847804 for ; Fri, 3 Sep 2021 17:16:29 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0167F3847804 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1630689389; bh=t1ar05kcPamiSFdi5j1jT770694At8ev06FeYeR8mnU=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=lG2xddmikmBPYx1LbGM3krza0d/GCU6oItEHrZdsj57V6uocCWoc8v44lDiz6sW7v M9KjKBNlmFyigVOLIzLFPL0FA5LrXFSsoHJiS0/1iS9/NkT+Thd446K2vJxNoR/QNM PcpPXBjl54FyD1CC9qnkjD2plw2b8gbaNSt6l4a8= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qk1-x730.google.com (mail-qk1-x730.google.com [IPv6:2607:f8b0:4864:20::730]) by sourceware.org (Postfix) with ESMTPS id CC6A93847806 for ; Fri, 3 Sep 2021 17:11:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CC6A93847806 Received: by mail-qk1-x730.google.com with SMTP id a10so6473073qka.12 for ; Fri, 03 Sep 2021 10:11:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=t1ar05kcPamiSFdi5j1jT770694At8ev06FeYeR8mnU=; b=R9k4h97Ulf4S9xikq92Yrk7/IlK9tsMc43IHMcXagAfa6jAyrWgR7vkCCli3t3sdV7 2UZ9Z/vijV818BhyEqQEqrxTgexCaFS6FimASd0FGEa7CMUqwnJExr7/HEfR8UEC9KpA XAVes5NiaZcPByDtq8APux9xBV3qAz1xKjoGARVpOz9mlVvEuHsTskvPNakJy9LK5Zrq mdgg6vIfotDRzezI5GcZeyYKJAgHdReh/jY11r+bEGJJ1RDvcZtsQlgGkoqk5lGlVUXu ospiRUpIwhCUgLAUl0J1YLVEVOnjuRs5p+CWC3v54x9fvUemQ6jiwhnQF4v4sn1UaEit IxHw== X-Gm-Message-State: AOAM532aBFdwOfc6m5Ya9UQQ53zEQUc1oTqf71fVzINCj0oO0Xps+tG1 esTy0Tlvox/34+vIXHrCkSSZ0sAUbAwQxA== X-Google-Smtp-Source: ABdhPJzUeh9hWahFU0Kejv8+0RYr/OKheZn2zpVKR5vnxVRI5HBI+UA+3XmFogYO94+96BGLX4xTBw== X-Received: by 2002:a37:e306:: with SMTP id y6mr4372654qki.203.1630689116137; Fri, 03 Sep 2021 10:11:56 -0700 (PDT) Received: from birita.. ([2804:431:c7cb:733d:fff8:7487:556e:e293]) by smtp.gmail.com with ESMTPSA id r4sm3207071qtw.5.2021.09.03.10.11.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Sep 2021 10:11:55 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v3 6/7] stdlib: Implement introsort with qsort Date: Fri, 3 Sep 2021 14:11:43 -0300 Message-Id: <20210903171144.952737-7-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210903171144.952737-1-adhemerval.zanella@linaro.org> References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" This patch adds a introsort implementation on qsort to avoid worse-case performance of quicksort to O(nlog n). The heapsort fallback used is a heapsort based on Linux implementation (commit 22a241ccb2c19962a). As a side note the introsort implementation is similar the one used on libstdc++ for std::sort. Checked on x86_64-linux-gnu. --- stdlib/qsort.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 87 insertions(+), 7 deletions(-) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 5df640362d..8368576aae 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -113,6 +113,7 @@ typedef struct { char *lo; char *hi; + size_t depth; } stack_node; /* The stack needs log (total_elements) entries (we could even subtract @@ -122,23 +123,92 @@ typedef struct enum { STACK_SIZE = CHAR_BIT * sizeof (size_t) }; static inline stack_node * -push (stack_node *top, char *lo, char *hi) +push (stack_node *top, char *lo, char *hi, size_t depth) { top->lo = lo; top->hi = hi; + top->depth = depth; return ++top; } static inline stack_node * -pop (stack_node *top, char **lo, char **hi) +pop (stack_node *top, char **lo, char **hi, size_t *depth) { --top; *lo = top->lo; *hi = top->hi; + *depth = top->depth; return top; } +/* A fast, small, non-recursive O(nlog n) heapsort, adapted from Linux + lib/sort.c. Used on introsort implementation as a fallback routine with + worst-case performance of O(nlog n) and worst-case space complexity of + O(1). */ + +static inline size_t +parent (size_t i, unsigned int lsbit, size_t size) +{ + i -= size; + i -= size & -(i & lsbit); + return i / 2; +} + +static void +heapsort_r (void *base, void *end, size_t size, swap_func_t swap_func, + __compar_d_fn_t cmp, void *arg) +{ + size_t num = ((uintptr_t) end - (uintptr_t) base) / size; + size_t n = num * size, a = (num/2) * size; + /* Used to find parent */ + const unsigned int lsbit = size & -size; + + /* num < 2 || size == 0. */ + if (a == 0) + return; + + for (;;) + { + size_t b, c, d; + + if (a != 0) + /* Building heap: sift down --a */ + a -= size; + else if (n -= size) + /* Sorting: Extract root to --n */ + do_swap (base, base + n, size, swap_func); + else + break; + + /* Sift element at "a" down into heap. This is the "bottom-up" variant, + which significantly reduces calls to cmp_func(): we find the sift-down + path all the way to the leaves (one compare per level), then backtrack + to find where to insert the target element. + + Because elements tend to sift down close to the leaves, this uses fewer + compares than doing two per level on the way down. (A bit more than + half as many on average, 3/4 worst-case.). */ + for (b = a; c = 2 * b + size, (d = c + size) < n;) + b = cmp (base + c, base + d, arg) >= 0 ? c : d; + if (d == n) + /* Special case last leaf with no sibling. */ + b = c; + + /* Now backtrack from "b" to the correct location for "a". */ + while (b != a && cmp (base + a, base + b, arg) >= 0) + b = parent (b, lsbit, size); + /* Where "a" belongs. */ + c = b; + while (b != a) + { + /* Shift it into place. */ + b = parent (b, lsbit, size); + do_swap (base + b, base + c, size, swap_func); + } + } +} + /* Order size using quicksort. This implementation incorporates four optimizations discussed in Sedgewick: @@ -223,7 +293,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, const size_t max_thresh = MAX_THRESH * size; - if (total_elems == 0) + if (total_elems <= 1) /* Avoid lossage with unsigned arithmetic below. */ return; @@ -235,6 +305,9 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else swap_func = SWAP_BYTES; + /* Maximum depth before quicksort switches to heapsort. */ + size_t depth = 2 * (CHAR_BIT - 1 - __builtin_clzl (total_elems)); + if (total_elems > MAX_THRESH) { char *lo = base_ptr; @@ -242,10 +315,17 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, stack_node stack[STACK_SIZE]; stack_node *top = stack; - top = push (top, NULL, NULL); + top = push (top, NULL, NULL, depth); while (stack < top) { + if (depth == 0) + { + heapsort_r (lo, hi, size, swap_func, cmp, arg); + top = pop (top, &lo, &hi, &depth); + continue; + } + char *left_ptr; char *right_ptr; @@ -309,7 +389,7 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, { if ((size_t) (hi - left_ptr) <= max_thresh) /* Ignore both small partitions. */ - top = pop (top, &lo, &hi); + top = pop (top, &lo, &hi, &depth); else /* Ignore small left partition. */ lo = left_ptr; @@ -320,13 +400,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, else if ((right_ptr - lo) > (hi - left_ptr)) { /* Push larger left partition indices. */ - top = push (top, lo, right_ptr); + top = push (top, lo, right_ptr, depth - 1); lo = left_ptr; } else { /* Push larger right partition indices. */ - top = push (top, left_ptr, hi); + top = push (top, left_ptr, hi, depth - 1); hi = right_ptr; } } From patchwork Fri Sep 3 17:11:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 44847 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4EEFE384781B for ; Fri, 3 Sep 2021 17:17:11 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4EEFE384781B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1630689431; bh=mZasdWUPD4zIRDa+qrI6qFzJrg0zhkZFZ1LqIBBNK6Q=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=Myfybd5f8eluyZV88l+M3a82ea6c3mDryHN+yCl/LMUSfFMzsQnIa9SS54qRQriB8 wzlTGBTh+ZtNVtodeESpG80MUQVQtUk4dPg29nQwDNIt6y1ZyqMm3Gw/GF9nBncwLz 4JmwiH5HW186R730qyOyBlVC4VVEi80Bke/0iZUM= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qk1-x730.google.com (mail-qk1-x730.google.com [IPv6:2607:f8b0:4864:20::730]) by sourceware.org (Postfix) with ESMTPS id 3B5893847805 for ; Fri, 3 Sep 2021 17:11:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 3B5893847805 Received: by mail-qk1-x730.google.com with SMTP id a66so6508205qkc.1 for ; Fri, 03 Sep 2021 10:11:58 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mZasdWUPD4zIRDa+qrI6qFzJrg0zhkZFZ1LqIBBNK6Q=; b=UWPGjeHQlPTkc1mS0mwz0YAs4I51L+lsreW+v6O+q2bXUA7nqw9f/+jKscb4XhAHa1 c+uA0YFYUeaQmEUvLe5IxKxBgAbY7AKM0acZuFgx7eLQbBBf2fQTULHqYA1FkPD/BA/d 9KAfUIGOtvCTobwz9z3H29pRncqEAw4S5c04AbFJrWejxSxh2xlOMhG53Gc0IGMK8kCq z6YSBXZBqfdW4g6ACY0D2QWe3bIO4kBfbNAo7DbtQhnnHe88aspYQY2yiVaEEUMXma5h lm0YHuz/zUpv79yqlbR/1LXGLq4DvzuPzT9NsEt1rdjixuEmzGr1CD7+SgEv5F1iubN4 /ibQ== X-Gm-Message-State: AOAM531wNdcLtOPNWRpjMucnHaWmW05OobYF1aL0HgdugxNCYUVGnpmV EwFwWReYUO5fMaRugthBdpTZPudzingf1Q== X-Google-Smtp-Source: ABdhPJwqE7k1vcXdlCLglV7KZvb+CFonYx1Rx640p4N0qyXtfLUBXKFaCUMLoVwIlp44lpRN5CbAxA== X-Received: by 2002:ae9:ed17:: with SMTP id c23mr4330739qkg.462.1630689117473; Fri, 03 Sep 2021 10:11:57 -0700 (PDT) Received: from birita.. ([2804:431:c7cb:733d:fff8:7487:556e:e293]) by smtp.gmail.com with ESMTPSA id r4sm3207071qtw.5.2021.09.03.10.11.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 03 Sep 2021 10:11:57 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v3 7/7] stdlib: Remove use of mergesort on qsort (BZ #21719) Date: Fri, 3 Sep 2021 14:11:44 -0300 Message-Id: <20210903171144.952737-8-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210903171144.952737-1-adhemerval.zanella@linaro.org> References: <20210903171144.952737-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" This patch removes the mergesort optimization on qsort implementation and uses the quicksort instead. The mergesort implementation has some issues: - It is as-safe only for certain types sizes (if total size is less than 1 KB with large element sizes also forcing memory allocation) which contradicts the function documentation. Although not required by the C standard, it is preferable and doable to have an O(1) space implementation. - The malloc for certain element size and element number adds arbitrary latency (might even be worse if malloc is interposed). - To avoid trigger swap from memory allocation the implementation relies on system information that might be virtualized (for instance VMs with overcommit memory) which might lead to potentially use of swap even if system advertise more memory than actually has. The check also have the downside of issuing syscalls where none is expected (although only once per execution). - The mergesort is suboptimal on an already sorted array (BZ#21719). The quicksort implementation is already optimized to use constant extra space (due to the limit of total number of elements from maximum VM size) and thus can be used to avoid the malloc usage issues. Resulting performance is slower due the usage of qsort, specially in the worst-case scenario (partialy or sorted arrays) and due the fact mergesort uses a slight improved swap operations. This change also renders the BZ#21719 fix unrequired (since it is meant to fix the sorted input performance degradation for mergesort). The manual is also updated to indicate the function is now async-cancel safe. Checked on x86_64-linux-gnu. --- manual/argp.texi | 2 +- manual/locale.texi | 3 +- manual/search.texi | 7 +- stdlib/Makefile | 3 +- stdlib/msort.c | 310 --------------------------------------------- stdlib/qsort.c | 15 ++- 6 files changed, 18 insertions(+), 322 deletions(-) delete mode 100644 stdlib/msort.c diff --git a/manual/argp.texi b/manual/argp.texi index 0023441812..b77ad68285 100644 --- a/manual/argp.texi +++ b/manual/argp.texi @@ -735,7 +735,7 @@ for options, bad phase of the moon, etc. @c hol_set_group ok @c hol_find_entry ok @c hol_sort @mtslocale @acucorrupt -@c qsort dup @acucorrupt +@c qsort dup @c hol_entry_qcmp @mtslocale @c hol_entry_cmp @mtslocale @c group_cmp ok diff --git a/manual/locale.texi b/manual/locale.texi index 720e0ca952..f6afa5dc44 100644 --- a/manual/locale.texi +++ b/manual/locale.texi @@ -253,7 +253,7 @@ The symbols in this section are defined in the header file @file{locale.h}. @c calculate_head_size ok @c __munmap ok @c compute_hashval ok -@c qsort dup @acucorrupt +@c qsort dup @c rangecmp ok @c malloc @ascuheap @acsmem @c strdup @ascuheap @acsmem @@ -275,7 +275,6 @@ The symbols in this section are defined in the header file @file{locale.h}. @c realloc @ascuheap @acsmem @c realloc @ascuheap @acsmem @c fclose @ascuheap @asulock @acsmem @acsfd @aculock -@c qsort @ascuheap @acsmem @c alias_compare dup @c libc_lock_unlock @aculock @c _nl_explode_name @ascuheap @acsmem diff --git a/manual/search.texi b/manual/search.texi index 5691bf2f2b..a550858478 100644 --- a/manual/search.texi +++ b/manual/search.texi @@ -159,7 +159,7 @@ To sort an array using an arbitrary comparison function, use the @deftypefun void qsort (void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare}) @standards{ISO, stdlib.h} -@safety{@prelim{}@mtsafe{}@assafe{}@acunsafe{@acucorrupt{}}} +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} The @code{qsort} function sorts the array @var{array}. The array contains @var{count} elements, each of which is of size @var{size}. @@ -199,9 +199,8 @@ Functions}): The @code{qsort} function derives its name from the fact that it was originally implemented using the ``quick sort'' algorithm. -The implementation of @code{qsort} in this library might not be an -in-place sort and might thereby use an extra amount of memory to store -the array. +The implementation of @code{qsort} in this library is an in-place sort +and uses a constant extra space (allocated on the stack). @end deftypefun @node Search/Sort Example diff --git a/stdlib/Makefile b/stdlib/Makefile index ef8a9007d8..27aa36eaff 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -34,7 +34,7 @@ headers := stdlib.h bits/stdlib.h bits/stdlib-ldbl.h bits/stdlib-float.h \ routines := \ atof atoi atol atoll \ abort \ - bsearch qsort msort \ + bsearch qsort \ getenv putenv setenv secure-getenv \ exit on_exit atexit cxa_atexit cxa_finalize old_atexit \ quick_exit at_quick_exit cxa_at_quick_exit cxa_thread_atexit_impl \ @@ -143,7 +143,6 @@ extra-test-objs += tst-putenvmod.os generated += isomac isomac.out tst-putenvmod.so CFLAGS-bsearch.c += $(uses-callbacks) -CFLAGS-msort.c += $(uses-callbacks) CFLAGS-qsort.c += $(uses-callbacks) CFLAGS-system.c += -fexceptions CFLAGS-system.os = -fomit-frame-pointer diff --git a/stdlib/msort.c b/stdlib/msort.c deleted file mode 100644 index 8750cc59db..0000000000 --- a/stdlib/msort.c +++ /dev/null @@ -1,310 +0,0 @@ -/* An alternative to qsort, with an identical interface. - This file is part of the GNU C Library. - Copyright (C) 1992-2021 Free Software Foundation, Inc. - Written by Mike Haertel, September 1988. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include -#include -#include -#include -#include -#include -#include - -struct msort_param -{ - size_t s; - size_t var; - __compar_d_fn_t cmp; - void *arg; - char *t; -}; -static void msort_with_tmp (const struct msort_param *p, void *b, size_t n); - -static void -msort_with_tmp (const struct msort_param *p, void *b, size_t n) -{ - char *b1, *b2; - size_t n1, n2; - - if (n <= 1) - return; - - n1 = n / 2; - n2 = n - n1; - b1 = b; - b2 = (char *) b + (n1 * p->s); - - msort_with_tmp (p, b1, n1); - msort_with_tmp (p, b2, n2); - - char *tmp = p->t; - const size_t s = p->s; - __compar_d_fn_t cmp = p->cmp; - void *arg = p->arg; - switch (p->var) - { - case 0: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - *(uint32_t *) tmp = *(uint32_t *) b1; - b1 += sizeof (uint32_t); - --n1; - } - else - { - *(uint32_t *) tmp = *(uint32_t *) b2; - b2 += sizeof (uint32_t); - --n2; - } - tmp += sizeof (uint32_t); - } - break; - case 1: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - *(uint64_t *) tmp = *(uint64_t *) b1; - b1 += sizeof (uint64_t); - --n1; - } - else - { - *(uint64_t *) tmp = *(uint64_t *) b2; - b2 += sizeof (uint64_t); - --n2; - } - tmp += sizeof (uint64_t); - } - break; - case 2: - while (n1 > 0 && n2 > 0) - { - unsigned long *tmpl = (unsigned long *) tmp; - unsigned long *bl; - - tmp += s; - if ((*cmp) (b1, b2, arg) <= 0) - { - bl = (unsigned long *) b1; - b1 += s; - --n1; - } - else - { - bl = (unsigned long *) b2; - b2 += s; - --n2; - } - while (tmpl < (unsigned long *) tmp) - *tmpl++ = *bl++; - } - break; - case 3: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (*(const void **) b1, *(const void **) b2, arg) <= 0) - { - *(void **) tmp = *(void **) b1; - b1 += sizeof (void *); - --n1; - } - else - { - *(void **) tmp = *(void **) b2; - b2 += sizeof (void *); - --n2; - } - tmp += sizeof (void *); - } - break; - default: - while (n1 > 0 && n2 > 0) - { - if ((*cmp) (b1, b2, arg) <= 0) - { - tmp = (char *) __mempcpy (tmp, b1, s); - b1 += s; - --n1; - } - else - { - tmp = (char *) __mempcpy (tmp, b2, s); - b2 += s; - --n2; - } - } - break; - } - - if (n1 > 0) - memcpy (tmp, b1, n1 * s); - memcpy (b, p->t, (n - n2) * s); -} - - -void -__qsort_r (void *b, size_t n, size_t s, __compar_d_fn_t cmp, void *arg) -{ - size_t size = n * s; - char *tmp = NULL; - struct msort_param p; - - /* For large object sizes use indirect sorting. */ - if (s > 32) - size = 2 * n * sizeof (void *) + s; - - if (size < 1024) - /* The temporary array is small, so put it on the stack. */ - p.t = __alloca (size); - else - { - /* We should avoid allocating too much memory since this might - have to be backed up by swap space. */ - static long int phys_pages; - static int pagesize; - - if (pagesize == 0) - { - phys_pages = __sysconf (_SC_PHYS_PAGES); - - if (phys_pages == -1) - /* Error while determining the memory size. So let's - assume there is enough memory. Otherwise the - implementer should provide a complete implementation of - the `sysconf' function. */ - phys_pages = (long int) (~0ul >> 1); - - /* The following determines that we will never use more than - a quarter of the physical memory. */ - phys_pages /= 4; - - /* Make sure phys_pages is written to memory. */ - atomic_write_barrier (); - - pagesize = __sysconf (_SC_PAGESIZE); - } - - /* Just a comment here. We cannot compute - phys_pages * pagesize - and compare the needed amount of memory against this value. - The problem is that some systems might have more physical - memory then can be represented with a `size_t' value (when - measured in bytes. */ - - /* If the memory requirements are too high don't allocate memory. */ - if (size / pagesize > (size_t) phys_pages) - { - _quicksort (b, n, s, cmp, arg); - return; - } - - /* It's somewhat large, so malloc it. */ - int save = errno; - tmp = malloc (size); - __set_errno (save); - if (tmp == NULL) - { - /* Couldn't get space, so use the slower algorithm - that doesn't need a temporary array. */ - _quicksort (b, n, s, cmp, arg); - return; - } - p.t = tmp; - } - - p.s = s; - p.var = 4; - p.cmp = cmp; - p.arg = arg; - - if (s > 32) - { - /* Indirect sorting. */ - char *ip = (char *) b; - void **tp = (void **) (p.t + n * sizeof (void *)); - void **t = tp; - void *tmp_storage = (void *) (tp + n); - - while ((void *) t < tmp_storage) - { - *t++ = ip; - ip += s; - } - p.s = sizeof (void *); - p.var = 3; - msort_with_tmp (&p, p.t + n * sizeof (void *), n); - - /* tp[0] .. tp[n - 1] is now sorted, copy around entries of - the original array. Knuth vol. 3 (2nd ed.) exercise 5.2-10. */ - char *kp; - size_t i; - for (i = 0, ip = (char *) b; i < n; i++, ip += s) - if ((kp = tp[i]) != ip) - { - size_t j = i; - char *jp = ip; - memcpy (tmp_storage, ip, s); - - do - { - size_t k = (kp - (char *) b) / s; - tp[j] = jp; - memcpy (jp, kp, s); - j = k; - jp = kp; - kp = tp[k]; - } - while (kp != ip); - - tp[j] = jp; - memcpy (jp, tmp_storage, s); - } - } - else - { - if ((s & (sizeof (uint32_t) - 1)) == 0 - && ((char *) b - (char *) 0) % __alignof__ (uint32_t) == 0) - { - if (s == sizeof (uint32_t)) - p.var = 0; - else if (s == sizeof (uint64_t) - && ((char *) b - (char *) 0) % __alignof__ (uint64_t) == 0) - p.var = 1; - else if ((s & (sizeof (unsigned long) - 1)) == 0 - && ((char *) b - (char *) 0) - % __alignof__ (unsigned long) == 0) - p.var = 2; - } - msort_with_tmp (&p, b, n); - } - free (tmp); -} -libc_hidden_def (__qsort_r) -weak_alias (__qsort_r, qsort_r) - - -void -qsort (void *b, size_t n, size_t s, __compar_fn_t cmp) -{ - return __qsort_r (b, n, s, (__compar_d_fn_t) cmp, NULL); -} -libc_hidden_def (qsort) diff --git a/stdlib/qsort.c b/stdlib/qsort.c index 8368576aae..028210dc8f 100644 --- a/stdlib/qsort.c +++ b/stdlib/qsort.c @@ -20,7 +20,6 @@ Engineering a sort function; Jon Bentley and M. Douglas McIlroy; Software - Practice and Experience; Vol. 23 (11), 1249-1265, 1993. */ -#include #include #include #include @@ -286,8 +285,8 @@ insertion_sort (void *const pbase, size_t total_elems, size_t size, } void -_quicksort (void *const pbase, size_t total_elems, size_t size, - __compar_d_fn_t cmp, void *arg) +__qsort_r (void *const pbase, size_t total_elems, size_t size, + __compar_d_fn_t cmp, void *arg) { char *base_ptr = (char *) pbase; @@ -419,3 +418,13 @@ _quicksort (void *const pbase, size_t total_elems, size_t size, the array (*not* one beyond it!). */ insertion_sort (pbase, total_elems, size, swap_func, cmp, arg); } + +libc_hidden_def (__qsort_r) +weak_alias (__qsort_r, qsort_r) + +void +qsort (void *b, size_t n, size_t s, __compar_fn_t cmp) +{ + return __qsort_r (b, n, s, (__compar_d_fn_t) cmp, NULL); +} +libc_hidden_def (qsort)