From patchwork Fri Aug 4 18:16:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Florian Weimer
X-Patchwork-Id: 73634
To: libc-alpha@sourceware.org
Subject: [PATCH 3/3] elf: Check that --list-diagnostics output has the expected syntax
Message-ID: <30749c92b4926f909b21102765bba0676e009d04.1691172895.git.fweimer@redhat.com>
Date: Fri, 04 Aug 2023 20:16:28 +0200
From: Florian Weimer

Parts of elf/tst-rtld-list-diagnostics.py have been copied from
scripts/tst-ld-trace.py.

The abnf module is entirely optional and used to verify the
ABNF grammar as included in the manual.
---
 INSTALL                          |   5 +
 elf/Makefile                     |   9 +
 elf/tst-rtld-list-diagnostics.py | 311 +++++++++++++++++++++++++++++++
 manual/install.texi              |   6 +
 4 files changed, 331 insertions(+)
 create mode 100644 elf/tst-rtld-list-diagnostics.py

diff --git a/INSTALL b/INSTALL
index 268acadd75..3f662bf427 100644
--- a/INSTALL
+++ b/INSTALL
@@ -585,6 +585,11 @@ build the GNU C Library:
      in your system.  As of release time PExpect 4.8.0 is the newest
      verified to work to test the pretty printers.
 
+   • The Python ‘abnf’ module.
+
+     This module is used to verify some ABNF grammars in the manual.
+     Version 2.2.0 has been confirmed to work as expected.
+
    • GDB 7.8 or later with support for Python 2.7/3.4 or later
 
      GDB itself needs to be configured with Python support in order to
diff --git a/elf/Makefile b/elf/Makefile
index c00e2ccfc5..9176cbf1e3 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -1123,6 +1123,7 @@ tests-special += \
   $(objpfx)argv0test.out \
   $(objpfx)tst-pathopt.out \
   $(objpfx)tst-rtld-help.out \
+  $(objpfx)tst-rtld-list-diagnostics.out \
   $(objpfx)tst-rtld-load-self.out \
   $(objpfx)tst-rtld-preload.out \
   $(objpfx)tst-sprof-basic.out \
@@ -2799,6 +2800,14 @@ $(objpfx)tst-ro-dynamic-mod.so: $(objpfx)tst-ro-dynamic-mod.os \
 		-Wl,--script=tst-ro-dynamic-mod.map \
 		$(objpfx)tst-ro-dynamic-mod.os
 
+$(objpfx)tst-rtld-list-diagnostics.out: tst-rtld-list-diagnostics.py \
+  $(..)manual/dynlink.texi $(objpfx)$(rtld-installed-name)
+	$(PYTHON) tst-rtld-list-diagnostics.py \
+	    --manual=$(..)manual/dynlink.texi \
+	    "$(test-wrapper-env) $(objpfx)$(rtld-installed-name) --list-diagnostics" \
+	    > $@; \
+	$(evaluate-test)
+
 $(objpfx)tst-rtld-run-static.out: $(objpfx)/ldconfig
 
 $(objpfx)tst-dl_find_object.out: \
diff --git a/elf/tst-rtld-list-diagnostics.py b/elf/tst-rtld-list-diagnostics.py
new file mode 100644
index 0000000000..f4ff06fd86
--- /dev/null
+++ b/elf/tst-rtld-list-diagnostics.py
@@ -0,0 +1,311 @@
+#!/usr/bin/python3
+# Test that the ld.so --list-diagnostics output has the expected syntax.
+# Copyright (C) 2022-2023 Free Software Foundation, Inc.
+# Copyright The GNU Toolchain Authors.
+# This file is part of the GNU C Library.
+#
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# .
+
+import argparse
+import collections
+import subprocess
+import sys
+
+try:
+    subprocess.run
+except:
+    class _CompletedProcess:
+        def __init__(self, args, returncode, stdout=None, stderr=None):
+            self.args = args
+            self.returncode = returncode
+            self.stdout = stdout
+            self.stderr = stderr
+
+    def _run(*popenargs, input=None, timeout=None, check=False, **kwargs):
+        assert(timeout is None)
+        with subprocess.Popen(*popenargs, **kwargs) as process:
+            try:
+                stdout, stderr = process.communicate(input)
+            except:
+                process.kill()
+                process.wait()
+                raise
+            returncode = process.poll()
+            if check and returncode:
+                raise subprocess.CalledProcessError(returncode, popenargs)
+        return _CompletedProcess(popenargs, returncode, stdout, stderr)
+
+    subprocess.run = _run
+
+# Number of errors encountered.  Zero means no errors (test passes).
+errors = 0
+
+# PYTHON-START
+def parse_line(line):
+    """Parse a line of --list-diagnostics output.
+
+    This function returns a pair (SUBSCRIPTS, VALUE).  VALUE is either
+    a byte string or an integer.  SUBSCRIPT is a tuple of (LABEL,
+    INDEX) pairs, where LABEL is a field identifier (a string), and
+    INDEX is an integer or None, to indicate that this field is not
+    indexed.
+
+    """
+
+    # Extract the list of subscripts before the value.
+    idx = 0
+    subscripts = []
+    while line[idx] != '=':
+        start_idx = idx
+
+        # Extract the label.
+        while line[idx] not in '[.=':
+            idx += 1
+        label = line[start_idx:idx]
+
+        if line[idx] == '[':
+            # Subscript with a 0x index.
+            assert label
+            close_bracket = line.index(']', idx)
+            index = line[idx + 1:close_bracket]
+            assert index.startswith('0x')
+            index = int(index, 0)
+            subscripts.append((label, index))
+            idx = close_bracket + 1
+        else: # '.' or '='.
+            if label:
+                subscripts.append((label, None))
+            if line[idx] == '.':
+                idx += 1
+
+    # The value is either a string or a 0x number.
+    value = line[idx + 1:]
+    if value[0] == '"':
+        # Decode the escaped string into a byte string.
+        assert value[-1] == '"'
+        idx = 1
+        result = []
+        while True:
+            ch = value[idx]
+            if ch == '\\':
+                if value[idx + 1] in '"\\':
+                    result.append(ord(value[idx + 1]))
+                    idx += 2
+                else:
+                    result.append(int(value[idx + 1:idx + 4], 8))
+                    idx += 4
+            elif ch == '"':
+                assert idx == len(value) - 1
+                break
+            else:
+                result.append(ord(value[idx]))
+                idx += 1
+        value = bytes(result)
+    else:
+        # Convert the value into an integer.
+        assert value.startswith('0x')
+        value = int(value, 0)
+    return (tuple(subscripts), value)
+# PYTHON-END
+
+assert parse_line('a.b[0x1]=0x2') == ((('a', None), ('b', 1)), 2)
+assert parse_line(r'b[0x3]="four\040\"\\"') == ((('b', 3),), b'four \"\\')
+
+# ABNF for a line of --list-diagnostics output.
+diagnostics_abnf = r"""
+HEXDIG = %x30-39 / %x61-6f ; lowercase a-f only
+ALPHA = %x41-5a / %x61-7a / %x7f ; letters and underscore
+ALPHA-NUMERIC = ALPHA / %x30-39 / "_"
+DQUOTE = %x22 ; "
+
+; Numbers are always hexadecimal and use a 0x prefix.
+hex-value-prefix = %x30 %x78
+hex-value = hex-value-prefix 1*HEXDIG
+
+; Strings use octal escape sequences and \\, \".
+string-char = %x20-21 / %x23-5c / %x5d-7e ; printable but not "\
+string-quoted-octal = %x30-33 2*2%x30-37
+string-quoted = "\" ("\" / DQUOTE / string-quoted-octal)
+string-value = DQUOTE *(string-char / string-quoted) DQUOTE
+
+value = hex-value / string-value
+
+label = ALPHA *ALPHA-NUMERIC
+index = "[" hex-value "]"
+subscript = label [index]
+
+line = subscript *("." subscript) "=" value
+"""
+
+def check_consistency_with_manual(manual_path):
+    """Verify that the code fragments in the manual match this script.
+
+    The code fragments are duplicated to clarify the dual license.
+    """
+
+    global errors
+
+    def extract_lines(path, start_line, end_line, skip_lines=()):
+        result = []
+        with open(path) as inp:
+            capturing = False
+            for line in inp:
+                if line.strip() == start_line:
+                    capturing = True
+                elif not capturing or line.strip() in skip_lines:
+                    continue
+                elif line.strip() == end_line:
+                    capturing = False
+                else:
+                    result.append(line)
+        if not result:
+            raise ValueError('{!r} not found in {!r}'.format(start_line, path))
+        if capturing:
+            raise ValueError('{!r} not found in {!r}'.format(end_line, path))
+        return result
+
+    def check(name, manual, script):
+        global errors
+
+        if manual == script:
+            return
+        print('error: {} fragment in manual is different'.format(name))
+        import difflib
+        sys.stdout.writelines(difflib.unified_diff(
+            manual, script, fromfile='manual', tofile='script'))
+        errors += 1
+
+    manual_python = extract_lines(manual_path,
+                                  '@c PYTHON-START', '@end smallexample',
+                                  skip_lines=('@smallexample',))
+    script_python = extract_lines(__file__, '# PYTHON-START', '# PYTHON-END')
+    check('Python code', manual_python, script_python)
+
+    manual_abnf = extract_lines(manual_path,
+                                '@c ABNF-START', '@end smallexample',
+                                skip_lines=('@smallexample',))
+    check('ABNF', manual_abnf, diagnostics_abnf.splitlines(keepends=True)[1:])
+
+# If the abnf module can be imported, run an additional check that the
+# 'line' production from the ABNF grammar matches --list-diagnostics
+# output lines.
+try:
+    import abnf
+except ImportError:
+    abnf = None
+    print('info: skipping ABNF validation because the abnf module is missing')
+
+if abnf is not None:
+    class Grammar(abnf.Rule):
+        pass
+
+    Grammar.load_grammar(diagnostics_abnf)
+
+    def parse_abnf(line):
+        global errors
+
+        # Just verify that the line parses.
+        try:
+            Grammar('line').parse_all(line)
+        except abnf.ParseError:
+            print('error: ABNF parse error:', repr(line))
+            errors += 1
+else:
+    def parse_abnf(line):
+        pass
+
+
+def parse_diagnostics(cmd):
+    global errors
+    diag_out = subprocess.run(cmd, stdout=subprocess.PIPE, check=True,
+                              universal_newlines=True).stdout
+    if not diag_out.endswith('\n'):
+        print('error: ld.so output does not end in newline')
+        errors += 1
+
+    PathType = collections.namedtuple('PathType',
+                                      'has_index value_type original_line')
+    # Mapping tuples of labels to PathType values.
+    path_types = {}
+
+    seen_subscripts = {}
+
+    for line in diag_out.splitlines():
+        parse_abnf(line)
+        subscripts, value = parse_line(line)
+
+        # Check for duplicates.
+        if subscripts in seen_subscripts:
+            print('error: duplicate value assignment:', repr(line))
+            print(' previous line:', repr(seen_subscripts[subscripts]))
+            errors += 1
+        else:
+            seen_subscripts[subscripts] = line
+
+        # Compare types against the previously seen labels.
+        labels = tuple([label for label, index in subscripts])
+        has_index = tuple([index is not None for label, index in subscripts])
+        value_type = type(value)
+        if labels in path_types:
+            previous_type = path_types[labels]
+            if has_index != previous_type.has_index:
+                print('error: line has mismatch of indexing:', repr(line))
+                print(' index types:', has_index)
+                print(' previous: ', previous_type.has_index)
+                print(' previous line:', repr(previous_type.original_line))
+                errors += 1
+            if value_type != previous_type.value_type:
+                print('error: line has mismatch of value type:', repr(line))
+                print(' value type:', value_type.__name__)
+                print(' previous: ', previous_type.value_type.__name__)
+                print(' previous line:', repr(previous_type.original_line))
+                errors += 1
+        else:
+            path_types[labels] = PathType(has_index, value_type, line)
+
+        # Check that this line does not add indexing to a previous value.
+        for idx in range(1, len(subscripts) - 1):
+            if labels[:idx] in path_types:
+                print('error: line assigns to atomic value:', repr(line))
+                print(' previous line:', repr(path_types[labels[:idx]].original_line))
+                errors += 1
+
+    if errors:
+        sys.exit(1)
+
+def get_parser():
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument('--manual',
+                        help='path to .texi file for consistency checks')
+    parser.add_argument('command',
+                        help='command to run')
+    return parser
+
+
+def main(argv):
+    parser = get_parser()
+    opts = parser.parse_args(argv)
+
+    if opts.manual:
+        check_consistency_with_manual(opts.manual)
+
+    # Remove the initial 'env' command.
+    parse_diagnostics(opts.command.split()[1:])
+
+    if errors:
+        sys.exit(1)
+
+if __name__ == '__main__':
+    main(sys.argv[1:])
diff --git a/manual/install.texi b/manual/install.texi
index e8f36d5726..2107eb7268 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -632,6 +632,12 @@ GDB, and should be compatible with the Python version in your system.
 As of release time PExpect 4.8.0 is the newest verified to work to test
 the pretty printers.
 
+@item
+The Python @code{abnf} module.
+
+This module is used to verify some ABNF grammars in the manual.
+Version 2.2.0 has been confirmed to work as expected.
+
 @item
 GDB 7.8 or later with support for Python 2.7/3.4 or later
 
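
Since the abnf module is optional, the ABNF `line` production is only exercised where that module is installed. As a rough illustration (not part of this patch), the same syntax can be approximated with a regular expression. The names `LINE_RE` and `is_valid_line` are hypothetical, and the sketch assumes the grammar's intent as stated in its comments: lowercase hexadecimal digits only, labels built from letters, digits and underscores, and string values that escape `"` and `\` or use three-digit octal escapes.

```python
import re

# Hypothetical helper, not part of the patch: a regular expression that
# approximates the intent of the 'line' production in diagnostics_abnf.
_HEX = r'0x[0-9a-f]+'                      # hex-value: 0x prefix, lowercase digits
_LABEL = r'[A-Za-z_][A-Za-z0-9_]*'         # label: assumed letters/underscore first
_SUBSCRIPT = _LABEL + r'(?:\[' + _HEX + r'\])?'
# string-value: printable ASCII except '"' and '\', or an escape sequence
# (\\, \", or a three-digit octal number whose first digit is 0-3).
_STRING = r'"(?:[\x20-\x21\x23-\x5b\x5d-\x7e]|\\(?:["\\]|[0-3][0-7]{2}))*"'

LINE_RE = re.compile(
    _SUBSCRIPT + r'(?:\.' + _SUBSCRIPT + r')*=(?:' + _HEX + '|' + _STRING + ')')


def is_valid_line(line):
    """Return True if LINE looks like one line of --list-diagnostics output."""
    return LINE_RE.fullmatch(line) is not None


if __name__ == '__main__':
    # The two sample lines are taken from the self-checks in the test script.
    print(is_valid_line('a.b[0x1]=0x2'))
    print(is_valid_line(r'b[0x3]="four\040\"\\"'))
    print(is_valid_line('a.b[1]=0x2'))
```

Such a check is no substitute for the abnf-based validation, because it encodes an interpretation of the grammar rather than the grammar text that is compared against the manual.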