GNU gettext utilities

3.2 Preparing Translatable Strings

Before strings can be marked for translation, they sometimes need to be adjusted. Usually preparing a string for translation is done right before marking it, during the marking phase which is described in the next sections. What you have to keep in mind while doing that is the following: use good English style, write entire sentences, limit each message to one paragraph, and use format strings instead of string concatenation.

Let's look at some examples of these guidelines.

Translatable strings should be in good English style. If slang language with abbreviations and shortcuts is used, often translators will not understand the message and will produce very inappropriate translations.

 
"%s: is parameter\n"

This is nearly untranslatable: Is the displayed item a parameter or the parameter?

 
"No match"

The ambiguity in this message makes it unintelligible: Is the program attempting to set something on fire? Does it mean "The given object does not match the template"? Does it mean "The template does not fit any of the objects"?

In both cases, adding more words to the message will help both the translator and the English-speaking user.

Translatable strings should be entire sentences. It is often not possible to translate single verbs or adjectives in a substitutable way.

 
printf ("File %s is %s protected", filename, rw ? "write" : "read");

Most translators will not look at the source and will thus only see the string "File %s is %s protected", which is unintelligible. Change this to

 
printf (rw ? "File %s is write protected" : "File %s is read protected",
        filename);

This way the translator will not only understand the message, she will also be able to find the appropriate grammatical construction. The French translator, for example, translates "write protected" as "protected against writing".

Often sentences don't fit into a single line. If a sentence is output using two subsequent printf statements, like this

 
printf ("Locale charset \"%s\" is different from\n", lcharset);
printf ("input file charset \"%s\".\n", fcharset);

the translator would have to translate two half sentences, but nothing in the POT file would tell her that the two half sentences belong together. It is necessary to merge the two printf statements so that the translator can handle the entire sentence at once and decide at which place to insert a line break in the translation (if at all):

 
printf ("Locale charset \"%s\" is different from\n\
input file charset \"%s\".\n", lcharset, fcharset);
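
The merged message can also be written with adjacent string literals, which an ISO C compiler joins into one string and which xgettext is expected to join in the same way. The following is only a sketch: the function name `format_charset_warning` is made up for illustration, and `_` is defined as a no-op stand-in so that the example builds without libintl.

```c
#include <stdio.h>

/* Stand-in for the usual gettext shorthand (an assumption of this sketch);
   a real program would use  #define _(s) gettext (s)  instead.  */
#define _(s) (s)

/* Format the whole sentence from one translatable string.  The two
   adjacent literals are a single message as far as the compiler is
   concerned, so the translator sees the entire sentence at once.  */
int
format_charset_warning (char *buf, size_t size,
                        const char *lcharset, const char *fcharset)
{
  return snprintf (buf, size,
                   _("Locale charset \"%s\" is different from\n"
                     "input file charset \"%s\".\n"),
                   lcharset, fcharset);
}
```

Compared with the backslash-newline form, this layout lets the second line of the string keep its indentation in the source.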

You may now ask: how about two or more adjacent sentences? Like in this case:

 
puts ("Apollo 13 scenario: Stack overflow handling failed.");
puts ("On the next stack overflow we will crash!!!");

Should these two statements be merged into a single one? I would recommend merging them if the two sentences are related to each other, because then it is easier for the translator to understand and translate both. On the other hand, if one of the two messages is a stereotypical one, occurring in other places as well, you will do the translator a favour by not merging the two. (Identical messages occurring in several places are combined by xgettext, so the translator has to handle them only once.)

Translatable strings should be limited to one paragraph; don't let a single message be longer than ten lines. The reason is that when the translatable string changes, the translator is faced with the task of updating the entire translated string. Maybe only a single word will have changed in the English string, but the translator doesn't see that (with the current translation tools), therefore she has to proofread the entire message.

Many GNU programs have a `--help' output that extends over several screen pages. It is a courtesy towards the translators to split such a message into several ones of five to ten lines each. While doing that, you can also attempt to split the documented options into groups, such as the input options, the output options, and the informative output options. This will help every user to find the option he is looking for.
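
As a rough sketch of such a split, the help text below is emitted as three short messages, one per option group. The option names and the function `print_help` are invented for illustration, and `_` is a no-op stand-in for the usual gettext shorthand so that the example builds without libintl.

```c
#include <stdio.h>

/* Stand-in for the usual gettext shorthand (an assumption of this sketch);
   a real program would use  #define _(s) gettext (s)  instead.  */
#define _(s) (s)

/* Print the help text as several short messages instead of one long one.
   Each fputs call carries one translatable string covering one group of
   (hypothetical) options.  */
void
print_help (FILE *out)
{
  fputs (_("Input options:\n"
           "  -i, --input=FILE     read from FILE\n"
           "  -d, --directory=DIR  search DIR for input files\n"), out);
  fputs (_("Output options:\n"
           "  -o, --output=FILE    write to FILE\n"
           "      --force          write output even if it is empty\n"), out);
  fputs (_("Informative output:\n"
           "  -h, --help           display this help and exit\n"
           "  -V, --version        output version information and exit\n"), out);
}
```

When one option is added or reworded, only the message for its group changes, so the translator has a few lines to proofread instead of several screen pages.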

Hardcoded string concatenation is sometimes used to construct English strings:

 
strcpy (s, "Replace ");
strcat (s, object1);
strcat (s, " with ");
strcat (s, object2);
strcat (s, "?");

In order to present to the translator only entire sentences, and also because in some languages the translator might want to swap the order of object1 and object2, it is necessary to change this to use a format string:

 
sprintf (s, "Replace %s with %s?", object1, object2);
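
To illustrate the point about swapping the order, here is a minimal sketch. It assumes a printf family that accepts the POSIX numbered directives `%1$s` and `%2$s` (these are not part of ISO C). The function name `format_question` is hypothetical, and the reordered format string used in the note below is a made-up stand-in for a translation, not an entry from a real catalog.

```c
#include <stdio.h>

/* Format FMT, which is either the original format string or a translated
   one, with the two object names.  The arguments are always passed in
   the same order; a translation can still pick them in a different order
   by using numbered directives such as %2$s and %1$s (a POSIX extension,
   assumed to be available here).  */
int
format_question (char *buf, size_t size, const char *fmt,
                 const char *object1, const char *object2)
{
  return snprintf (buf, size, fmt, object1, object2);
}
```

Called with "Replace %s with %s?" and the names "foo" and "bar", this produces "Replace foo with bar?". Called with the stand-in "Use %2$s in place of %1$s?", it produces "Use bar in place of foo?" without any change to the calling code, which the strcat version cannot offer.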

A similar case is compile time concatenation of strings. The ISO C 99 include file <inttypes.h> contains a macro PRId64 that can be used as a formatting directive for outputting an `int64_t' integer through printf. It expands to a constant string, usually "d" or "ld" or "lld" or something like this, depending on the platform. Assume you have code like

 
printf ("The amount is %0" PRId64 "\n", number);

The gettext tools and library have special support for these <inttypes.h> macros. You can therefore simply write

 
printf (gettext ("The amount is %0" PRId64 "\n"), number);

The PO file will contain the string "The amount is %0<PRId64>\n". The translators will provide a translation containing "%0<PRId64>" as well, and at runtime the gettext function's result will contain the appropriate constant string, "d" or "ld" or "lld".

This works only for the predefined <inttypes.h> macros. If you have defined your own similar macros, let's say `MYPRId64', that are not known to xgettext, the solution for this problem is to change the code like this:

 
char buf1[100];
sprintf (buf1, "%0" MYPRId64, number);
printf (gettext ("The amount is %s\n"), buf1);

This means you put the platform dependent code in one statement, and the internationalization code in a different statement. Note that a buffer length of 100 is safe, because all available hardware integer types are limited to 128 bits, and to print a 128 bit integer one needs at most 54 characters, regardless of whether it is printed in decimal, octal or hexadecimal.
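
The same split can be written a little more defensively with snprintf, which truncates rather than overruns if the buffer assumption ever stops holding. This is only a sketch: `MYPRId64` is defined here as "lld" purely for illustration, the parameter is declared as `long long` to match that definition, `format_amount` is a made-up name, and `_` is a no-op stand-in for the gettext shorthand.

```c
#include <stdio.h>

/* Hypothetical user-defined macro, unknown to xgettext.  It is defined
   as "lld" here only so that the sketch compiles on its own.  */
#define MYPRId64 "lld"

/* Stand-in for the usual gettext shorthand (an assumption of this sketch);
   a real program would use  #define _(s) gettext (s)  instead.  */
#define _(s) (s)

/* Write the translated message for NUMBER into OUT.  The platform
   dependent formatting happens in the first statement; the translatable
   string in the second statement contains only a plain %s.  */
int
format_amount (char *out, size_t outsize, long long number)
{
  char buf1[100];

  snprintf (buf1, sizeof buf1, "%0" MYPRId64, number);
  return snprintf (out, outsize, _("The amount is %s\n"), buf1);
}
```

The translator still sees only "The amount is %s\n", exactly as in the sprintf version.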

All this applies to other programming languages as well. For example, in Java, string concatenation is very frequently used, because it is a compiler built-in operator. As in C, in Java you would change

 
System.out.println("Replace "+object1+" with "+object2+"?");

into a statement involving a format string:

 
System.out.println(
    MessageFormat.format("Replace {0} with {1}?",
                         new Object[] { object1, object2 }));


  Copyright 2003   by The Free Software Foundation     Updated Jun 2003