From patchwork Sat Dec 6 13:19:16 2025
X-Patchwork-Submitter: Vivien Kraus
X-Patchwork-Id: 126040
From: Vivien Kraus
To: libc-alpha@sourceware.org, adhemerval.zanella@linaro.org
Cc: Vivien Kraus
Subject: [PATCH v19 09/11] posix: Add a script for static validation of getopt_long PO files
Date: Sat, 6 Dec 2025 14:19:16 +0100
Message-Id: <2005141af05feed62f004bb5160b0dd9f5d4344a.1765017925.git.vivien@planete-kraus.eu>
References: <68a758ae45c064bad35bfec73c3d5ffd050398e3.1748369494.git.vivien@planete-kraus.eu>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

It is better to statically check the PO files on the developer’s side,
because there is a chance to detect the problem early and not
embarrass the translation team just before a release.

This is a Perl script that I made by adapting bits and pieces from
mtrace.pl.

On the test case, it should fail with the following output:

-----
Translation toto is used for more than one option:
 - bar
 - foo
bar is a translation of pub, but it is also a different option.
There were 2 failures.
-----
---
 manual/getopt.texi                            |  10 +
 posix/Makefile                                |  17 ++
 posix/check-getopt-translations.pl            | 199 ++++++++++++++++++
 .../standalone-multiple-getopt-collisions.po  |  45 ++++
 posix/tst-check-getopt-translations.sh        |  59 ++++++
 5 files changed, 330 insertions(+)
 create mode 100644 posix/check-getopt-translations.pl
 create mode 100644 posix/standalone-multiple-getopt-collisions.po
 create mode 100644 posix/tst-check-getopt-translations.sh

diff --git a/manual/getopt.texi b/manual/getopt.texi
index 81093b910f..7174317770 100644
--- a/manual/getopt.texi
+++ b/manual/getopt.texi
@@ -388,6 +388,16 @@ conflicts with the translation of an existing option name. Such a case
 would disrupt the workflow of users as the new option would replace the
 existing option. Before adding a new option to a program, the developer
 should check for collisions with all known translations.
+This can be done with the installed
+@command{check-getopt-translations} script, by calling for each PO
+file in the project:
+
+@smallexample
+check-getopt-translations "context used for translations" @file{file.po}
+@end smallexample
+
+Otherwise, you may repeatedly call the @command{getopt_long_collision}
+function after setting the locale, for each known locale.
 
 @deftypefun int getopt_long_collision (const struct option *@var{longopts}, const char *@var{context}, const char *@var{domain}, const struct option **@var{first_collision})
diff --git a/posix/Makefile b/posix/Makefile
index 357a24b749..a335a5446f 100644
--- a/posix/Makefile
+++ b/posix/Makefile
@@ -436,6 +436,9 @@ install-others-programs := \
   $(inst_libexecdir)/getconf \
   # install-others-programs
 
+install-bin-scripts = check-getopt-translations
+generated += check-getopt-translations
+
 before-compile += \
   $(objpfx)posix-conf-vars-def.h \
   # before-compile
@@ -448,6 +451,7 @@ generated += \
   getconf.speclist \
   ptestcases.h \
   testcases.h \
+  tst-check-getopt-translations.out \
   tst-getconf.out \
   wordexp-tst.out \
   # generated
@@ -510,6 +514,7 @@ tests-special += \
   $(objpfx)bug-regex31-mem.out \
   $(objpfx)bug-regex36-mem.out \
   $(objpfx)tst-boost-mem.out \
+  $(objpfx)tst-check-getopt-translations.out \
   $(objpfx)tst-fnmatch-mem.out \
   $(objpfx)tst-glob-tilde-mem.out \
   $(objpfx)tst-pcre-mem.out \
@@ -798,3 +803,15 @@ $(objpfx)posix-conf-vars-def.h: $(..)scripts/gen-posix-conf-vars.awk \
 	$(make-target-directory)
 	$(AWK) -f $(filter-out Makefile, $^) > $@.tmp
 	mv -f $@.tmp $@
+
+$(objpfx)check-getopt-translations: check-getopt-translations.pl
+	rm -f $@.new
+	sed -e 's|@XXX@|$(address-width)|' \
+	    -e 's|@VERSION@|$(version)|' \
+	    -e 's|@PKGVERSION@|$(PKGVERSION)|' \
+	    -e 's|@REPORT_BUGS_TO@|$(REPORT_BUGS_TO)|' $^ > $@.new \
+	&& rm -f $@ && mv $@.new $@ && chmod +x $@
+
+$(objpfx)tst-check-getopt-translations.out: tst-check-getopt-translations.sh $(objpfx)check-getopt-translations standalone-multiple-getopt-collisions.po
+	$(SHELL) $^ $(common-objpfx)posix/tst-check-getopt-translations.out
+	$(evaluate-test)
diff --git a/posix/check-getopt-translations.pl b/posix/check-getopt-translations.pl
new file mode 100644
index 0000000000..39e556ef8c
--- /dev/null
+++ b/posix/check-getopt-translations.pl
@@ -0,0 +1,199 @@
+#! /usr/bin/perl
+
+# Copyright (C) 2025 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+# Based on the mtrace.pl script.
+
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# .
+
+use strict;
+use warnings;
+use Data::Dumper;
+
+my $VERSION = "@VERSION@";
+
+my $PKGVERSION = "@PKGVERSION@";
+my $REPORT_BUGS_TO = '@REPORT_BUGS_TO@';
+my $progname = $0;
+
+sub usage {
+    print "Usage: check-getopt-translations [OPTION]... msgctxt lang.po\n";
+    print "  --help     print this help, then exit\n";
+    print "  --version  print version number, then exit\n";
+    print "\n";
+    print "For bug reporting instructions, please see:\n";
+    print "$REPORT_BUGS_TO.\n";
+    exit 0;
+}
+
+sub fatal {
+    print STDERR "$_[0]\n";
+    exit 1;
+}
+
+# This script takes two positional arguments: the context for
+# translated option names, and the PO file to check. Then, the PO
+# file is parsed, looking at three things:
+# 1. The msgctxt: it must be equal to the first positional argument, msgctxt;
+# 2. The msgid;
+# 3. The space-separated list msgstr.
+#
+# We are looking for two different problems:
+#
+# 1. Every translation element, current or obsolete, must be unique
+# across all option names.
+# 2. For every option name, for every translation, current or
+# deprecated, if it doesn’t match the untranslated name, then it
+# should not match any other untranslated option names.
+#
+# If we detect an example of the first case, it is a problem with the
+# translator only. They have to remove one use of the word,
+# preferably one that is deprecated.
+#
+# If we detect an example of the second case, then it is a problem
+# with the developer: they want to introduce an option name that is
+# already used for something else by users of this native language! If
+# nothing is done, these users will be surprised that the same word
+# now means another option, as the untranslated options have
+# precedence over the translations. If the translated name is already
+# deprecated, then the language team may agree to completely remove
+# it. Otherwise, it may be better to find a new untranslated name.
+
+# This script uses the same format as mtrace.pl.
+
+ arglist: while (@ARGV) {
+    if ($ARGV[0] eq "--v" || $ARGV[0] eq "--ve" || $ARGV[0] eq "--ver" ||
+        $ARGV[0] eq "--vers" || $ARGV[0] eq "--versi" ||
+        $ARGV[0] eq "--versio" || $ARGV[0] eq "--version") {
+        print "check-getopt-translations $PKGVERSION$VERSION\n";
+        print "Copyright (C) 2025 Free Software Foundation, Inc.\n";
+        print "This is free software; see the source for copying conditions. There is NO\n";
+        print "warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n";
+        print "Written by Vivien Kraus \n";
+
+        exit 0;
+    } elsif ($ARGV[0] eq "--h" || $ARGV[0] eq "--he" || $ARGV[0] eq "--hel" ||
+             $ARGV[0] eq "--help") {
+        &usage;
+    } elsif ($ARGV[0] =~ /^-/) {
+        print "$progname: unrecognized option `$ARGV[0]'\n";
+        print "Try `$progname --help' for more information.\n";
+        exit 1;
+    } else {
+        last arglist;
+    }
+}
+
+if ($#ARGV != 1) {
+    fatal "You must provide two arguments: the msgctxt for option names, and the name of the PO file.";
+}
+
+my $relevant_msgctxt = $ARGV[0];
+my $pofilename = $ARGV[1];
+my %translations;
+
+# %translations maps each untranslated option name to the list of its
+# translations, current and obsolete. It is filled in as we parse.
+
+my $entry_msgid;
+
+# The ad-hoc PO file parser has 3 states: 1. Waiting for msgctxt;
+# 2. Waiting for msgid; 3. Waiting for msgstr. At the start, the state
+# is 1. Then, if we find "msgctxt \"$relevant_msgctxt\"" in a single
+# line, we jump to 2. Otherwise, if this is the end of the file, stop
+# parsing. Otherwise, whatever the line, stay in 1. This includes:
+# the empty line, meaning we are considering a new entry; or a
+# comment, a #: location, or another relevant line.
+#
+# When we are in state 2., we are waiting for the msgid (untranslated
+# option name). If we find an empty line, we jump back to 1. If we
+# find a line starting with "msgid \"" and ending with a double quote,
+# we store what is in the middle in $entry_msgid and jump to 3.
+# Otherwise, we stay in state 2.
+#
+# When we are in state 3., we are waiting for msgstr (canonical and
+# obsolete translations). If we find an empty line, drop
+# $entry_msgid, and back to 1. If the line starts with "msgstr \"",
+# we add a record to %translations: the key is $entry_msgid, and the
+# value is the list of space-separated words between the quotes.
+# Then, back to state 1.
+
+my $parser_state = 1;
+
+open (my $pofile, "<", $pofilename) || fatal "PO file name ${pofilename} cannot be read.";
+
+while (my $line = <$pofile>) {
+    chomp $line;
+    if ($parser_state == 1 && $line =~ /^msgctxt\s*"\Q${relevant_msgctxt}\E"$/) {
+        $parser_state = 2;
+    } elsif ($parser_state == 2 && $line eq "") {
+        $parser_state = 1;
+    } elsif ($parser_state == 2 && $line =~ /^msgid\s*"([^"]+)"$/) {
+        $parser_state = 3;
+        $entry_msgid = $1;
+    } elsif ($parser_state == 3 && $line eq "") {
+        $parser_state = 1;
+    } elsif ($parser_state == 3 && $line =~ /^msgstr\s*"([^"]*)"$/) {
+        my @translations_for_this = split(/\s+/, $1);
+        $translations{$entry_msgid} = \@translations_for_this;
+        $parser_state = 1;
+    }
+}
+
+my $number_of_errors = 0;
+
+# Verify that every translation is used for only one option name.
+my %untranslated_name;
+for my $option_name (sort(keys %translations)) {
+    for my $translation (@{$translations{$option_name}}) {
+        my @existing;
+        if (exists $untranslated_name{$translation}) {
+            @existing = @{$untranslated_name{$translation}};
+        }
+        push(@existing, $option_name);
+        $untranslated_name{$translation} = \@existing;
+    }
+}
+for my $translation (sort(keys %untranslated_name)) {
+    my $names = $untranslated_name{$translation};
+    if (@{$names} > 1) {
+        print STDERR "Translation ${translation} is used for more than one option:\n";
+        for my $untranslated (@{$names}) {
+            print STDERR " - ${untranslated}\n";
+        }
+        ++$number_of_errors;
+    }
+}
+
+# Verify that every option translation does not match any other
+# untranslated name.
+for my $option_name (sort(keys %translations)) {
+    for my $other_option_name (sort(keys %translations)) {
+        if ($option_name ne $other_option_name) {
+            for my $translation (@{$translations{$option_name}}) {
+                if ($translation eq $other_option_name) {
+                    print STDERR "${translation} is a translation of ${option_name}, but it is also a different option.\n";
+                    ++$number_of_errors;
+                }
+            }
+        }
+    }
+}
+
+if ($number_of_errors == 0) {
+    exit 0
+}
+print STDERR "There were ${number_of_errors} failures.\n";
+exit 1
diff --git a/posix/standalone-multiple-getopt-collisions.po b/posix/standalone-multiple-getopt-collisions.po
new file mode 100644
index 0000000000..5088155583
--- /dev/null
+++ b/posix/standalone-multiple-getopt-collisions.po
@@ -0,0 +1,45 @@
+# French translations for the getopt static checker
+# Copyright (C) 2025 THE GNU C Library'S COPYRIGHT HOLDER
+# This file is distributed under the same license as the GNU C Library.
+#
+# This has two errors:
+# 1. "toto" is used both as a translation of "foo" and "bar";
+# 2. "bar" is used as a translation of "pub", but it is another option.
+msgid ""
+msgstr ""
+"Project-Id-Version: GNU C Library (see version.h)\n"
+"Report-Msgid-Bugs-To: \n"
+"POT-Creation-Date: 2025-06-06 22:37+0200\n"
+"PO-Revision-Date: 2025-06-06 22:38+0200\n"
+"Language-Team: French \n"
+"Language: fr\n"
+"MIME-Version: 1.0\n"
+"Content-Type: text/plain; charset=ASCII\n"
+"Content-Transfer-Encoding: 8bit\n"
+"Plural-Forms: nplurals=2; plural=(n > 1);\n"
+
+# This is not an option name, so it’s OK for it to clash with option
+# names.
+msgctxt "fish"
+msgid "bass"
+msgstr "bar"
+
+# This is the --foo option.
+msgctxt "command-line option"
+msgid "foo"
+msgstr "tata toto"
+
+# This is the --bar option. Oops, I translated with toto here too.
+msgctxt "command-line option"
+msgid "bar"
+msgstr "titi toto"
+
+# Let’s go to the --pub!
+msgctxt "command-line option"
+msgid "pub"
+msgstr "bar club"
+
+# Wait, it’s OK if baz is translated to baz though.
+msgctxt "command-line option"
+msgid "baz"
+msgstr "baz"
diff --git a/posix/tst-check-getopt-translations.sh b/posix/tst-check-getopt-translations.sh
new file mode 100644
index 0000000000..a9a905dbae
--- /dev/null
+++ b/posix/tst-check-getopt-translations.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+# Test for check-getopt-translations.
+# Copyright (C) 2025 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# .
+
+set -e
+
+check_getopt_translations_program=$1; shift
+po_file=$1; shift
+logfile=$1; shift
+
+rm -f "$logfile"
+result=0
+expected_output="\
+Translation toto is used for more than one option:
+ - bar
+ - foo
+bar is a translation of pub, but it is also a different option.
+There were 2 failures."
+
+if output=$("${check_getopt_translations_program}" "command-line option" "${po_file}" 2>&1) ; then
+    echo "the errors were not caught." >> "$logfile"
+    echo "*** check-getopt-translations FAILED" >> "$logfile"
+    result=1
+fi
+
+if test "$output" != "$expected_output"; then
+    echo "Expected:" >> "$logfile"
+    echo "$expected_output" >> "$logfile"
+    echo "Actual:" >> "$logfile"
+    echo "$output" >> "$logfile"
+    echo "*** check-getopt-translations FAILED" >> "$logfile"
+    result=1
+fi
+
+exit $result
+
+# Preserve executable bits for this shell script.
+Local Variables:
+eval:(defun frobme () (set-file-modes buffer-file-name file-mode))
+eval:(make-local-variable 'file-mode)
+eval:(setq file-mode (file-modes (buffer-file-name)))
+eval:(make-local-variable 'after-save-hook)
+eval:(add-hook 'after-save-hook 'frobme)
+End:
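
As a usage illustration for the manual text in this patch (a sketch only, not part of the patch itself): a project could wrap the installed script in a small shell helper that checks every PO file in a directory, as the manual suggests. The helper name check_all_po, the directory layout and the "command-line option" context are assumptions made for this example. Only the order of the two positional arguments (msgctxt, then PO file) and the non-zero exit status on collisions come from the script above.

```shell
#!/bin/sh
# Hypothetical helper: run a checker command on every *.po file in a
# directory and report failure if any single file fails.
#   $1: checker command (assumed: the installed check-getopt-translations)
#   $2: msgctxt used for translated option names
#   $3: directory holding the PO files (assumed project layout)
check_all_po () {
    checker=$1
    context=$2
    dir=$3
    status=0
    for po in "$dir"/*.po; do
        # When nothing matches, the glob is left unexpanded: skip it.
        test -e "$po" || continue
        # Keep going after a failure so every PO file gets reported.
        "$checker" "$context" "$po" || status=1
    done
    return "$status"
}
```

A project's test suite could then run `check_all_po check-getopt-translations "command-line option" po`, which returns a non-zero status as soon as one PO file contains a collision, while still printing the diagnostics for all of them.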