<!-- This HTML file has been created by texi2html 1.52a
from gettext.texi on 11 April 2005 -->
<TITLE>GNU gettext utilities -
13 Other Programming Languages
</TITLE>
<H1><A NAME="SEC221" HREF="gettext_toc.html#TOC221">13 Other Programming Languages</A></H1>
While the presentation of
<CODE>gettext</CODE> focuses mostly on C and
implicitly applies to C++ as well, its scope is far broader than that:
Many programming languages, scripting languages and other textual data
like GUI resources or package descriptions can make use of the gettext
<H2><A NAME="SEC222" HREF="gettext_toc.html#TOC222">13.1 The Language Implementor's View</A></H2>
<A NAME="IDX1072"></A>
<A NAME="IDX1073"></A>
All programming and scripting languages that have the notion of strings
are eligible for supporting
<CODE>gettext
</CODE>. Supporting
<CODE>gettext
</CODE>
You should add to the language a syntax for translatable strings. In
principle, a function call of
<CODE>gettext</CODE> would do, but a shorthand
syntax helps keep internationalized programs legible. For
example, in C we use the syntax
<CODE>_("string")</CODE>, and in GNU awk we use
the shorthand
<CODE>_"string"</CODE>.
You should arrange that evaluation of such a translatable string at
runtime calls the
<CODE>gettext
</CODE> function, or performs equivalent
Similarly, you should make the functions
<CODE>ngettext</CODE>,
<CODE>dcgettext</CODE>,
<CODE>dcngettext</CODE> available from within the language.
These functions are used less often, but are nevertheless necessary for
particular purposes:
<CODE>ngettext</CODE> for correct plural handling, and
<CODE>dcgettext</CODE> and
<CODE>dcngettext</CODE> for obeying locale
environment variables other than
<CODE>LC_MESSAGES</CODE>, such as
<CODE>LC_TIME</CODE> or
<CODE>LC_MONETARY</CODE>. For these latter functions, you need to make the
<CODE>LC_*</CODE> constants, available in the C header
<CODE>&lt;locale.h&gt;</CODE>,
referenceable from within the language, usually either as enumeration
You should allow the programmer to designate a message domain, either by
making the
<CODE>textdomain</CODE> function available from within the
language, or by introducing a magic variable called
<CODE>TEXTDOMAIN</CODE>.
Similarly, you should allow the programmer to designate where to search
for message catalogs, by providing access to the
<CODE>bindtextdomain</CODE>
You should either perform a
<CODE>setlocale (LC_ALL, "")</CODE> call during
the startup of your language runtime, or allow the programmer to do so.
Remember that gettext will act as a no-op if the
<CODE>LC_MESSAGES</CODE> and
<CODE>LC_CTYPE</CODE> locale facets are not both set.
A programmer should have a way to extract translatable strings from a
program into a PO file. The GNU
<CODE>xgettext</CODE> program is being
extended to support very different programming languages. Please
contact the GNU
<CODE>gettext</CODE> maintainers to help them do this. If
the string extractor is best integrated into your language's parser, GNU
<CODE>xgettext</CODE> can function as a front end to your string extractor.
The language's library should have a string formatting facility where
the arguments of a format string are denoted by a positional number or a
name. This is needed because for some languages and some messages with
more than one substitutable argument, the translation will need to
output the substituted arguments in a different order. See section
<A HREF="gettext_3.html#SEC18">3.5 Special Comments preceding Keywords</A>.
If the language has more than one implementation, and not all of the
implementations use
<CODE>gettext</CODE>, but the programs should be portable
across implementations, you should provide a no-i18n emulation that
makes the other implementations accept programs written for yours,
without actually translating the strings.
To help the programmer in the task of marking translatable strings,
which is usually performed using the Emacs PO mode, you are welcome to
contact the GNU
<CODE>gettext</CODE> maintainers, so they can add support for
your language to
<TT>`po-mode.el´</TT>.
On the implementation side, three approaches are possible, with
different effects on portability and copyright:
<UL>
<LI>
You may integrate the GNU
<CODE>gettext</CODE>'s
<TT>`intl/´</TT> directory in
your package, as described in section
<A HREF="gettext_12.html#SEC192">12 The Maintainer's View</A>. This allows you to
have internationalization on all kinds of platforms. Note that when you
then distribute your package, it legally falls under the GNU General
Public License, and the GNU project will be glad about your contribution
to the Free Software pool.
<LI>
You may link against GNU
<CODE>gettext</CODE> functions if they are found in
the C library. For example, an autoconf test for
<CODE>gettext()</CODE> and
<CODE>ngettext()</CODE> will detect this situation. For the moment, this test
will succeed on GNU systems and not on other platforms. No severe
copyright restrictions apply.
<LI>
You may emulate or reimplement the GNU
<CODE>gettext</CODE> functionality.
This has the advantage of full portability and no copyright
restrictions, but also the drawback that you have to reimplement the GNU
<CODE>gettext</CODE> features (such as the
<CODE>LANGUAGE</CODE> environment
variable, the locale aliases database, the automatic charset conversion,
and plural handling).
</UL>
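<P>
As one illustration of the reimplementation approach, Python ships its own
<CODE>gettext</CODE> module in its standard library. The sketch below is an
example under stated assumptions, not a recommended design: the domain name
<CODE>demo</CODE> and the empty temporary directory are made up, so no catalog
can be found and the fallback object returns the untranslated strings. This is
roughly what a no-i18n emulation looks like from the programmer's side.

```python
import gettext
import tempfile

def load_translations(domain, localedir, languages):
    """Return a translations object; fall back to untranslated output."""
    # fallback=True yields a NullTranslations object when no catalog is found.
    return gettext.translation(domain, localedir=localedir,
                               languages=languages, fallback=True)

# Hypothetical domain and an empty directory, so no catalog can be found.
empty_dir = tempfile.mkdtemp()
catalog = load_translations("demo", empty_dir, ["de"])
_ = catalog.gettext  # shorthand comparable to _("string") in C

greeting = _("Hello, world!")
one_file = catalog.ngettext("%d file", "%d files", 1) % 1
many_files = catalog.ngettext("%d file", "%d files", 3) % 3
print(greeting, one_file, many_files)
```

Without a catalog, the fallback picks the singular form only when the count is 1.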
<H2><A NAME="SEC223" HREF="gettext_toc.html#TOC223">13.2 The Programmer's View</A></H2>
For the programmer, the general procedure is the same as for the C
language. The Emacs PO mode supports other languages, and the GNU
<CODE>xgettext</CODE> string extractor recognizes other languages based on the
file extension or a command-line option. In some languages,
<CODE>setlocale</CODE> is not needed because it is already performed by the
underlying language runtime.
<H2><A NAME="SEC224" HREF="gettext_toc.html#TOC224">13.3 The Translator's View</A></H2>
The translator works exactly as in the C language case. The only
difference is that when translating format strings, she has to be aware
of the language's particular syntax for positional arguments in format
<H3><A NAME="SEC225" HREF="gettext_toc.html#TOC225">13.3.1 C Format Strings</A></H3>
C format strings are described in POSIX (IEEE P1003.1 2001), section
<A HREF="http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html">http://www.opengroup.org/onlinepubs/007904975/functions/fprintf.html</A>.
See also the fprintf(3) manual page,
<A HREF="http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php">http://www.linuxvalley.it/encyclopedia/ldp/manpage/man3/printf.3.php</A>,
<A HREF="http://informatik.fh-wuerzburg.de/student/i510/man/printf.html">http://informatik.fh-wuerzburg.de/student/i510/man/printf.html</A>.
Although format strings with positions that reorder arguments, such as
"Only %2$d bytes free on '%1$s'."
which is semantically equivalent to
"'%s' has only %d bytes free."
are a POSIX/XSI feature and not specified by ISO C 99, translators can rely
on this reordering ability: On the few platforms where
<CODE>printf()</CODE>,
<CODE>fprintf()</CODE> etc. don't support this feature natively,
<TT>`libintl.a´</TT>
or
<TT>`libintl.so´</TT> provides replacement functions, and GNU
<CODE>&lt;libintl.h&gt;</CODE>
activates these replacement functions automatically.
<A NAME="IDX1074"></A>
<A NAME="IDX1075"></A>
As a special feature for Farsi (Persian) and maybe Arabic, translators can
insert an
<SAMP>`I´</SAMP> flag into numeric format directives. For example, the
translation of
<CODE>"%d"</CODE> can be
<CODE>"%Id"</CODE>. The effect of this flag,
on systems with GNU
<CODE>libc</CODE>, is that in the output, the ASCII digits are
replaced with the
<SAMP>`outdigits´</SAMP> defined in the
<CODE>LC_CTYPE</CODE> locale
facet. On other systems, the
<CODE>gettext</CODE> function removes this flag,
so that it has no effect.
Note that the programmer should
<EM>not</EM> put this flag into the
untranslated string. (Putting the
<SAMP>`I´</SAMP> format directive flag into an
<VAR>msgid</VAR> string would lead to undefined behaviour on platforms without
glibc when NLS is disabled.)
<H3><A NAME="SEC226" HREF="gettext_toc.html#TOC226">13.3.2 Objective C Format Strings</A></H3>
Objective C format strings are like C format strings. They support an
additional format directive: "%@", which when executed consumes an argument
of type
<CODE>Object *</CODE>.
<H3><A NAME="SEC227" HREF="gettext_toc.html#TOC227">13.3.3 Shell Format Strings</A></H3>
Shell format strings, as supported by GNU gettext and the
<SAMP>`envsubst´</SAMP>
program, are strings with references to shell variables in the form
<CODE>$<VAR>variable</VAR></CODE> or
<CODE>${<VAR>variable</VAR>}</CODE>. References of the form
<CODE>${<VAR>variable</VAR>-<VAR>default</VAR>}</CODE>,
<CODE>${<VAR>variable</VAR>:-<VAR>default</VAR>}</CODE>,
<CODE>${<VAR>variable</VAR>=<VAR>default</VAR>}</CODE>,
<CODE>${<VAR>variable</VAR>:=<VAR>default</VAR>}</CODE>,
<CODE>${<VAR>variable</VAR>+<VAR>replacement</VAR>}</CODE>,
<CODE>${<VAR>variable</VAR>:+<VAR>replacement</VAR>}</CODE>,
<CODE>${<VAR>variable</VAR>?<VAR>ignored</VAR>}</CODE>,
<CODE>${<VAR>variable</VAR>:?<VAR>ignored</VAR>}</CODE>,
that would be valid inside shell scripts, are not supported. The
<VAR>variable</VAR> names must consist solely of alphanumeric or underscore
ASCII characters, not start with a digit and be nonempty; otherwise such
a variable reference is ignored.
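<P>
The rules above can be sketched with a small substitution routine. This is
only an illustration written in Python, not the code used by
<CODE>envsubst</CODE>; the function name <CODE>substitute</CODE> and the sample
variables are hypothetical, and treating an unset variable as the empty string
is an assumption.

```python
import re

# $NAME or ${NAME}; NAME is ASCII letters, digits, underscore, no leading digit.
_REFERENCE = re.compile(
    r"\$(?:([A-Za-z_][A-Za-z0-9_]*)|\{([A-Za-z_][A-Za-z0-9_]*)\})")

def substitute(text, variables):
    """Replace supported references; leave everything else untouched."""
    def replace(match):
        name = match.group(1) or match.group(2)
        # Assumption: an unset variable expands to the empty string.
        return variables.get(name, "")
    return _REFERENCE.sub(replace, text)

env = {"filecount": "3", "user": "alice"}
result = substitute("${user} has $filecount files, ${user-nobody}, $1", env)
print(result)
```

The unsupported forms, here a reference with a default and a positional
argument, pass through unchanged because the pattern never matches them.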
<H3><A NAME="SEC228" HREF="gettext_toc.html#TOC228">13.3.4 Python Format Strings</A></H3>
Python format strings are described in
Python Library reference /
2. Built-in Types, Exceptions and Functions /
2.2. Built-in Types /
2.2.6. Sequence Types /
2.2.6.2. String Formatting Operations.
<A HREF="http://www.python.org/doc/2.2.1/lib/typesseq-strings.html">http://www.python.org/doc/2.2.1/lib/typesseq-strings.html</A>.
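<P>
For translators, the relevant point is that Python's named directives let a
translation reorder its arguments. The following sketch assumes a made-up
one-entry catalog (a plain dictionary standing in for a real message catalog)
and a hypothetical helper <CODE>translate</CODE>.

```python
# Hypothetical one-entry catalog: msgid mapped to a reordered "translation".
CATALOG = {
    "'%(disk)s' has only %(free)d bytes free.":
        "Only %(free)d bytes free on '%(disk)s'.",
}

def translate(msgid):
    """Look up msgid; fall back to the untranslated string."""
    return CATALOG.get(msgid, msgid)

values = {"disk": "/home", "free": 512}
message = translate("'%(disk)s' has only %(free)d bytes free.") % values
print(message)
```

Because the arguments are looked up by name, the program does not need to know
in which order the translation uses them.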
<H3><A NAME="SEC229" HREF="gettext_toc.html#TOC229">13.3.5 Lisp Format Strings</A></H3>
Lisp format strings are described in the Common Lisp HyperSpec,
chapter 22.3 Formatted Output,
<A HREF="http://www.lisp.org/HyperSpec/Body/sec_22-3.html">http://www.lisp.org/HyperSpec/Body/sec_22-3.html</A>.
<H3><A NAME="SEC230" HREF="gettext_toc.html#TOC230">13.3.6 Emacs Lisp Format Strings</A></H3>
Emacs Lisp format strings are documented in the Emacs Lisp reference,
section Formatting Strings,
<A HREF="http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75">http://www.gnu.org/manual/elisp-manual-21-2.8/html_chapter/elisp_4.html#SEC75</A>.
Note that as of version 21, XEmacs supports numbered argument specifications
in format strings while FSF Emacs doesn't.
<H3><A NAME="SEC231" HREF="gettext_toc.html#TOC231">13.3.7 librep Format Strings</A></H3>
librep format strings are documented in the librep manual, section
<A HREF="http://librep.sourceforge.net/librep-manual.html#Formatted%20Output">http://librep.sourceforge.net/librep-manual.html#Formatted%20Output</A>,
<A HREF="http://www.gwinnup.org/research/docs/librep.html#SEC122">http://www.gwinnup.org/research/docs/librep.html#SEC122</A>.
<H3><A NAME="SEC232" HREF="gettext_toc.html#TOC232">13.3.8 Scheme Format Strings</A></H3>
Scheme format strings are documented in the SLIB manual, section
Format Specification.
<H3><A NAME="SEC233" HREF="gettext_toc.html#TOC233">13.3.9 Smalltalk Format Strings</A></H3>
Smalltalk format strings are described in the GNU Smalltalk documentation,
class
<CODE>CharArray</CODE>, methods
<SAMP>`bindWith:´</SAMP> and
<SAMP>`bindWithArguments:´</SAMP>.
<A HREF="http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238">http://www.gnu.org/software/smalltalk/gst-manual/gst_68.html#SEC238</A>.
In summary, a directive starts with
<SAMP>`%´</SAMP> and is followed by
<SAMP>`%´</SAMP>
or a nonzero digit (<SAMP>`1´</SAMP> to <SAMP>`9´</SAMP>).
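<P>
A translator who wants to check that a translation refers to the same
arguments as the original could use something like the sketch below. It is
written in Python purely for illustration, relies only on the directive syntax
summarized above, and is not the check performed by any GNU gettext tool; the
helper name <CODE>argument_numbers</CODE> and the sample strings are
hypothetical.

```python
import re

# "%" followed by "%" (literal percent) or a nonzero digit (argument number).
_DIRECTIVE = re.compile(r"%([%1-9])")

def argument_numbers(format_string):
    """Return the set of argument numbers referenced by the format string."""
    return {int(d) for d in _DIRECTIVE.findall(format_string) if d != "%"}

msgid = "%1 of %2 done (100%%)"
msgstr = "%2: %1 erledigt (100%%)"
same_arguments = argument_numbers(msgid) == argument_numbers(msgstr)
print(sorted(argument_numbers(msgid)), same_arguments)
```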
<H3><A NAME="SEC234" HREF="gettext_toc.html#TOC234">13.3.10 Java Format Strings</A></H3>
Java format strings are described in the JDK documentation for class
<CODE>java.text.MessageFormat</CODE>,
<A HREF="http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html">http://java.sun.com/j2se/1.4/docs/api/java/text/MessageFormat.html</A>.
See also the ICU documentation
<A HREF="http://oss.software.ibm.com/icu/apiref/classMessageFormat.html">http://oss.software.ibm.com/icu/apiref/classMessageFormat.html</A>.
<H3><A NAME="SEC235" HREF="gettext_toc.html#TOC235">13.3.11 C# Format Strings</A></H3>
C# format strings are described in the .NET documentation for class
<CODE>System.String</CODE> and in
<A HREF="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpConFormattingOverview.asp">http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpConFormattingOverview.asp</A>.
<H3><A NAME="SEC236" HREF="gettext_toc.html#TOC236">13.3.12 awk Format Strings</A></H3>
awk format strings are described in the gawk documentation, section
<A HREF="http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf">http://www.gnu.org/manual/gawk/html_node/Printf.html#Printf</A>.
<H3><A NAME="SEC237" HREF="gettext_toc.html#TOC237">13.3.13 Object Pascal Format Strings</A></H3>
Where is this documented?
<H3><A NAME="SEC238" HREF="gettext_toc.html#TOC238">13.3.14 YCP Format Strings</A></H3>
YCP sformat strings are described in the libycp documentation
<A HREF="file:/usr/share/doc/packages/libycp/YCP-builtins.html">file:/usr/share/doc/packages/libycp/YCP-builtins.html</A>.
In summary, a directive starts with
<SAMP>`%´</SAMP> and is followed by
<SAMP>`%´</SAMP>
or a nonzero digit (<SAMP>`1´</SAMP> to <SAMP>`9´</SAMP>).
<H3><A NAME="SEC239" HREF="gettext_toc.html#TOC239">13.3.15 Tcl Format Strings</A></H3>
Tcl format strings are described in the
<TT>`format.n´</TT> manual page,
<A HREF="http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm">http://www.scriptics.com/man/tcl8.3/TclCmd/format.htm</A>.
<H3><A NAME="SEC240" HREF="gettext_toc.html#TOC240">13.3.16 Perl Format Strings</A></H3>
There are two kinds of format strings in Perl: those acceptable to the
Perl built-in function
<CODE>printf</CODE>, labelled as
<SAMP>`perl-format´</SAMP>,
and those acceptable to the
<CODE>libintl-perl</CODE> function
<CODE>__x</CODE>,
labelled as
<SAMP>`perl-brace-format´</SAMP>.
<P>
Perl
<CODE>printf</CODE> format strings are described in the
<CODE>sprintf</CODE>
section of
<SAMP>`man perlfunc´</SAMP>.
<P>
Perl brace format strings are described in the
<TT>`Locale::TextDomain(3pm)´</TT> manual page of the CPAN package
libintl-perl. In brief, Perl brace format uses placeholders put between
braces (<SAMP>`{´</SAMP> and <SAMP>`}´</SAMP>). The placeholder must have the syntax
of simple identifiers.
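<P>
The brace placeholder syntax can be illustrated with a short sketch. It is
written in Python rather than Perl and does not reproduce the
<CODE>__x</CODE> implementation; the function name <CODE>expand</CODE> is
hypothetical, and leaving unknown placeholders as written is an assumption
made for this example.

```python
import re

# A placeholder is a simple identifier between braces, e.g. {count}.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")

def expand(template, **values):
    """Fill in known placeholders; leave unknown ones as written."""
    def replace(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)
    return _PLACEHOLDER.sub(replace, template)

text = expand("{count} files copied to {target} {unknown}",
              count=3, target="/tmp")
print(text)
```

Since placeholders are named, a translation may place them in any order.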
<H3><A NAME="SEC241" HREF="gettext_toc.html#TOC241">13.3.17 PHP Format Strings</A></H3>
PHP format strings are described in the documentation of the PHP function
<CODE>sprintf</CODE>, in
<TT>`phpdoc/manual/function.sprintf.html´</TT> or
<A HREF="http://www.php.net/manual/en/function.sprintf.php">http://www.php.net/manual/en/function.sprintf.php</A>.
<H3><A NAME="SEC242" HREF="gettext_toc.html#TOC242">13.3.18 GCC internal Format Strings</A></H3>
These format strings are used inside the GCC sources. In such a format
string, a directive starts with
<SAMP>`%´</SAMP>, is optionally followed by a
size specifier
<SAMP>`l´</SAMP>, an optional flag
<SAMP>`+´</SAMP>, another optional flag
<SAMP>`#´</SAMP>, and is finished by a specifier:
<SAMP>`%´</SAMP> denotes a literal percent sign,
<SAMP>`c´</SAMP> denotes a character,
<SAMP>`s´</SAMP> denotes a string,
<SAMP>`i´</SAMP> and
<SAMP>`d´</SAMP> denote an integer,
<SAMP>`o´</SAMP>,
<SAMP>`u´</SAMP>,
<SAMP>`x´</SAMP>
denote an unsigned integer,
<SAMP>`.*s´</SAMP> denotes a string preceded by a
width specification,
<SAMP>`H´</SAMP> denotes a
<SAMP>`location_t *´</SAMP> pointer,
<SAMP>`D´</SAMP> denotes a general declaration,
<SAMP>`F´</SAMP> denotes a function declaration,
<SAMP>`T´</SAMP> denotes a type,
<SAMP>`A´</SAMP> denotes a function argument,
<SAMP>`C´</SAMP> denotes a tree code,
<SAMP>`E´</SAMP> denotes an expression,
<SAMP>`L´</SAMP>
denotes a programming language,
<SAMP>`O´</SAMP> denotes a binary operator,
<SAMP>`P´</SAMP> denotes a function parameter,
<SAMP>`Q´</SAMP> denotes an assignment operator,
<SAMP>`V´</SAMP> denotes a const/volatile qualifier.
<H3><A NAME="SEC243" HREF="gettext_toc.html#TOC243">13.3.19 Qt Format Strings</A></H3>
Qt format strings are described in the documentation of the QString class
<A HREF="file:/usr/lib/qt-3.0.5/doc/html/qstring.html">file:/usr/lib/qt-3.0.5/doc/html/qstring.html</A>.
In summary, a directive consists of a
<SAMP>`%´</SAMP> followed by a digit. The same
directive cannot occur more than once in a format string.
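<P>
The uniqueness rule stated above lends itself to a quick check before a
translation is handed in. The sketch below is a Python illustration that
follows the rule as summarized here, not Qt's own parsing; the function
name <CODE>repeated_directives</CODE> is hypothetical.

```python
import re
from collections import Counter

# A directive is "%" followed by a single digit.
_DIRECTIVE = re.compile(r"%([0-9])")

def repeated_directives(format_string):
    """Return the directives that occur more than once, sorted."""
    counts = Counter(_DIRECTIVE.findall(format_string))
    return sorted("%" + digit for digit, n in counts.items() if n != 1)

ok = repeated_directives("Copying %1 to %2")
bad = repeated_directives("%1 and %1 again, then %2")
print(ok, bad)
```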
<H2><A NAME="SEC244" HREF="gettext_toc.html#TOC244">13.4 The Maintainer's View</A></H2>
For the maintainer, the general procedure differs from the C language
<P>
For those languages that don't use GNU gettext, the
<TT>`intl/´</TT> directory
is not needed and can be omitted. This means that the maintainer calls the
<CODE>gettextize</CODE> program without the
<SAMP>`--intl´</SAMP> option, and that he
invokes the
<CODE>AM_GNU_GETTEXT</CODE> autoconf macro via
<SAMP>`AM_GNU_GETTEXT([external])´</SAMP>.
<P>
If only a single programming language is used, the
<CODE>XGETTEXT_OPTIONS</CODE>
variable in
<TT>`po/Makevars´</TT> (see section
<A HREF="gettext_12.html#SEC199">12.4.3 <TT>`Makevars´</TT> in <TT>`po/´</TT></A>) should be adjusted to
match the
<CODE>xgettext</CODE> options for that particular programming language.
If the package uses more than one programming language with
<CODE>gettext</CODE>
support, it becomes necessary to change the POT file construction rule
in
<TT>`po/Makefile.in.in´</TT>. It is recommended to make one
<CODE>xgettext</CODE>
invocation per programming language, each with the options appropriate for
that language, and to combine the resulting files using
<CODE>msgcat</CODE>.
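<P>
The recommended sequence, one <CODE>xgettext</CODE> run per language followed
by <CODE>msgcat</CODE>, can be sketched as a list of command lines. The Python
helper below only builds the argument lists and does not run anything; the
file names, the intermediate <TT>`.pot´</TT> names and the function
<CODE>pot_commands</CODE> are hypothetical, and a real
<TT>`po/Makefile.in.in´</TT> rule would express the same steps in make syntax.

```python
def pot_commands(sources_by_language, output="combined.pot"):
    """Build one xgettext command per language plus a final msgcat command."""
    commands, partial_files = [], []
    for language in sorted(sources_by_language):
        partial = language.lower() + ".pot"  # hypothetical intermediate file
        partial_files.append(partial)
        commands.append(["xgettext", "--language=" + language, "-o", partial]
                        + list(sources_by_language[language]))
    # msgcat merges the per-language files into a single POT file.
    commands.append(["msgcat", "-o", output] + partial_files)
    return commands

# Hypothetical package with C and shell sources.
plan = pot_commands({"C": ["src/main.c"], "Shell": ["bin/tool.sh"]})
for argv in plan:
    print(" ".join(argv))
```

Language-specific options, such as a keyword option for the C shorthand, would
be added to the corresponding <CODE>xgettext</CODE> argument list.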
493 <H2><A NAME=
"SEC245" HREF=
"gettext_toc.html#TOC245">13.5 Individual Programming Languages
</A></H2>
497 <H3><A NAME=
"SEC246" HREF=
"gettext_toc.html#TOC246">13.5.1 C, C++, Objective C
</A></H3>
499 <A NAME=
"IDX1076"></A>
506 gcc, gpp, gobjc, glibc, gettext
510 For C:
<CODE>c
</CODE>,
<CODE>h
</CODE>.
511 <BR>For C++:
<CODE>C
</CODE>,
<CODE>c++
</CODE>,
<CODE>cc
</CODE>,
<CODE>cxx
</CODE>,
<CODE>cpp
</CODE>,
<CODE>hpp
</CODE>.
512 <BR>For Objective C:
<CODE>m
</CODE>.
518 <DT>gettext shorthand
520 <CODE>_(
"abc")
</CODE>
522 <DT>gettext/ngettext functions
524 <CODE>gettext
</CODE>,
<CODE>dgettext
</CODE>,
<CODE>dcgettext
</CODE>,
<CODE>ngettext
</CODE>,
525 <CODE>dngettext
</CODE>,
<CODE>dcngettext
</CODE>
529 <CODE>textdomain
</CODE> function
533 <CODE>bindtextdomain
</CODE> function
537 Programmer must call
<CODE>setlocale (LC_ALL,
"")
</CODE>
541 <CODE>#include
<libintl.h
></CODE>
542 <BR><CODE>#include
<locale.h
></CODE>
543 <BR><CODE>#define _(string) gettext (string)
</CODE>
545 <DT>Use or emulate GNU gettext
551 <CODE>xgettext -k_
</CODE>
553 <DT>Formatting with positions
555 <CODE>fprintf
"%2$d %1$d"</CODE>
556 <BR>In C++:
<CODE>autosprintf
"%2$d %1$d"</CODE>
557 (see section `Introduction' in
<CITE>GNU autosprintf
</CITE>)
561 autoconf (gettext.m4) and #if ENABLE_NLS
569 The following examples are available in the
<TT>`examples
´</TT> directory:
570 <CODE>hello-c
</CODE>,
<CODE>hello-c-gnome
</CODE>,
<CODE>hello-c++
</CODE>,
<CODE>hello-c++-qt
</CODE>,
571 <CODE>hello-c++-kde
</CODE>,
<CODE>hello-c++-gnome
</CODE>,
<CODE>hello-objc
</CODE>,
572 <CODE>hello-objc-gnustep
</CODE>,
<CODE>hello-objc-gnome
</CODE>.
577 <H3><A NAME=
"SEC247" HREF=
"gettext_toc.html#TOC247">13.5.2 sh - Shell Script
</A></H3>
579 <A NAME=
"IDX1077"></A>
594 <CODE>"abc"</CODE>,
<CODE>'abc'
</CODE>,
<CODE>abc
</CODE>
596 <DT>gettext shorthand
598 <CODE>"`gettext \"abc\
"`"</CODE>
600 <DT>gettext/ngettext functions
602 <A NAME=
"IDX1078"></A>
603 <A NAME=
"IDX1079"></A>
604 <CODE>gettext
</CODE>,
<CODE>ngettext
</CODE> programs
605 <BR><CODE>eval_gettext
</CODE>,
<CODE>eval_ngettext
</CODE> shell functions
609 <A NAME=
"IDX1080"></A>
610 environment variable
<CODE>TEXTDOMAIN
</CODE>
614 <A NAME=
"IDX1081"></A>
615 environment variable
<CODE>TEXTDOMAINDIR
</CODE>
623 <CODE>. gettext.sh
</CODE>
625 <DT>Use or emulate GNU gettext
631 <CODE>xgettext
</CODE>
633 <DT>Formatting with positions
647 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-sh
</CODE>.
653 <H4><A NAME=
"SEC248" HREF=
"gettext_toc.html#TOC248">13.5.2.1 Preparing Shell Scripts for Internationalization
</A></H4>
655 <A NAME=
"IDX1082"></A>
Preparing a shell script for internationalization is conceptually similar
to the steps described in section
<A HREF="gettext_3.html#SEC13">3 Preparing Program Sources</A>. The concrete steps for shell
scripts are as follows.
675 near the top of the script.
<CODE>gettext.sh
</CODE> is a shell function library
676 that provides the functions
677 <CODE>eval_gettext
</CODE> (see section
<A HREF=
"gettext_13.html#SEC253">13.5.2.6 Invoking the
<CODE>eval_gettext
</CODE> function
</A>) and
678 <CODE>eval_ngettext
</CODE> (see section
<A HREF=
"gettext_13.html#SEC254">13.5.2.7 Invoking the
<CODE>eval_ngettext
</CODE> function
</A>).
679 You have to ensure that
<CODE>gettext.sh
</CODE> can be found in the
<CODE>PATH
</CODE>.
683 Set and export the
<CODE>TEXTDOMAIN
</CODE> and
<CODE>TEXTDOMAINDIR
</CODE> environment
684 variables. Usually
<CODE>TEXTDOMAIN
</CODE> is the package or program name, and
685 <CODE>TEXTDOMAINDIR
</CODE> is the absolute pathname corresponding to
686 <CODE>$prefix/share/locale
</CODE>, where
<CODE>$prefix
</CODE> is the installation location.
692 TEXTDOMAINDIR=@LOCALEDIR@
698 Prepare the strings for translation, as described in section
<A HREF=
"gettext_3.html#SEC15">3.2 Preparing Translatable Strings
</A>.
702 Simplify translatable strings so that they don't contain command substitution
703 (
<CODE>"`...`"</CODE> or
<CODE>"$(...)"</CODE>), variable access with defaulting (like
704 <CODE>${
<VAR>variable
</VAR>-
<VAR>default
</VAR>}
</CODE>), access to positional arguments
705 (like
<CODE>$
0</CODE>,
<CODE>$
1</CODE>, ...) or highly volatile shell variables (like
706 <CODE>$?
</CODE>). This can always be done through simple local code restructuring.
711 echo
"Usage: $0 [OPTION] FILE..."
719 echo
"Usage: $program_name [OPTION] FILE..."
726 echo
"Remaining files: `ls | wc -l`"
733 filecount=
"`ls | wc -l`"
734 echo
"Remaining files: $filecount"
739 For each translatable string, change the output command
<SAMP>`echo
´</SAMP> or
740 <SAMP>`$echo
´</SAMP> to
<SAMP>`gettext
´</SAMP> (if the string contains no references to
741 shell variables) or to
<SAMP>`eval_gettext
´</SAMP> (if it refers to shell variables),
742 followed by a no-argument
<SAMP>`echo
´</SAMP> command (to account for the terminating
743 newline). Similarly, for cases with plural handling, replace a conditional
744 <SAMP>`echo
´</SAMP> command with an invocation of
<SAMP>`ngettext
´</SAMP> or
745 <SAMP>`eval_ngettext
´</SAMP>, followed by a no-argument
<SAMP>`echo
´</SAMP> command.
747 When doing this, you also need to add an extra backslash before the dollar
748 sign in references to shell variables, so that the
<SAMP>`eval_gettext
´</SAMP>
749 function receives the translatable string before the variable values are
750 substituted into it. For example,
754 echo
"Remaining files: $filecount"
761 eval_gettext
"Remaining files: \$filecount"; echo
764 If the output command is not
<SAMP>`echo
´</SAMP>, you can make it use
<SAMP>`echo
´</SAMP>
765 nevertheless, through the use of backquotes. However, note that inside
766 backquotes, backslashes must be doubled to be effective (because the
767 backquoting eats one level of backslashes). For example, assuming that
768 <SAMP>`error
´</SAMP> is a shell function that signals an error,
772 error
"file not found: $filename"
775 is first transformed into
779 error
"`echo \"file not found: \$filename\
"`"
786 error
"`eval_gettext \"file not found: \\\$filename\
"`"
793 <H4><A NAME=
"SEC249" HREF=
"gettext_toc.html#TOC249">13.5.2.2 Contents of
<CODE>gettext.sh
</CODE></A></H4>
796 <CODE>gettext.sh
</CODE>, contained in the run-time package of GNU gettext, provides
804 The variable
<CODE>echo
</CODE> is set to a command that outputs its first argument
805 and a newline, without interpreting backslashes in the argument string.
809 See section
<A HREF=
"gettext_13.html#SEC253">13.5.2.6 Invoking the
<CODE>eval_gettext
</CODE> function
</A>.
813 See section
<A HREF=
"gettext_13.html#SEC254">13.5.2.7 Invoking the
<CODE>eval_ngettext
</CODE> function
</A>.
818 <H4><A NAME=
"SEC250" HREF=
"gettext_toc.html#TOC250">13.5.2.3 Invoking the
<CODE>gettext
</CODE> program
</A></H4>
821 <A NAME=
"IDX1083"></A>
822 <A NAME=
"IDX1084"></A>
825 gettext [
<VAR>option
</VAR>] [[
<VAR>textdomain
</VAR>]
<VAR>msgid
</VAR>]
826 gettext [
<VAR>option
</VAR>] -s [
<VAR>msgid
</VAR>]...
830 <A NAME=
"IDX1085"></A>
831 The
<CODE>gettext
</CODE> program displays the native language translation of a
836 <STRONG>Arguments
</STRONG>
841 <DT><SAMP>`-d
<VAR>textdomain
</VAR>´</SAMP>
843 <DT><SAMP>`--domain=
<VAR>textdomain
</VAR>´</SAMP>
845 <A NAME=
"IDX1086"></A>
846 <A NAME=
"IDX1087"></A>
847 Retrieve translated messages from
<VAR>textdomain
</VAR>. Usually a
<VAR>textdomain
</VAR>
848 corresponds to a package, a program, or a module of a program.
850 <DT><SAMP>`-e
´</SAMP>
852 <A NAME=
"IDX1088"></A>
853 Enable expansion of some escape sequences. This option is for compatibility
854 with the
<SAMP>`echo
´</SAMP> program or shell built-in. The escape sequences
855 <SAMP>`\a
´</SAMP>,
<SAMP>`\b
´</SAMP>,
<SAMP>`\c
´</SAMP>,
<SAMP>`\f
´</SAMP>,
<SAMP>`\n
´</SAMP>,
<SAMP>`\r
´</SAMP>,
<SAMP>`\t
´</SAMP>,
856 <SAMP>`\v
´</SAMP>,
<SAMP>`\\
´</SAMP>, and
<SAMP>`\
´</SAMP> followed by one to three octal digits, are
857 interpreted like the SystemV
<SAMP>`echo
´</SAMP> program does.
859 <DT><SAMP>`-E
´</SAMP>
861 <A NAME=
"IDX1089"></A>
862 This option is only for compatibility with the
<SAMP>`echo
´</SAMP> program or shell
863 built-in. It has no effect.
865 <DT><SAMP>`-h
´</SAMP>
867 <DT><SAMP>`--help
´</SAMP>
869 <A NAME=
"IDX1090"></A>
870 <A NAME=
"IDX1091"></A>
871 Display this help and exit.
873 <DT><SAMP>`-n
´</SAMP>
875 <A NAME=
"IDX1092"></A>
876 Suppress trailing newline. By default,
<CODE>gettext
</CODE> adds a newline to
879 <DT><SAMP>`-V
´</SAMP>
881 <DT><SAMP>`--version
´</SAMP>
883 <A NAME=
"IDX1093"></A>
884 <A NAME=
"IDX1094"></A>
885 Output version information and exit.
887 <DT><SAMP>`[
<VAR>textdomain
</VAR>]
<VAR>msgid
</VAR>´</SAMP>
889 Retrieve translated message corresponding to
<VAR>msgid
</VAR> from
<VAR>textdomain
</VAR>.
894 If the
<VAR>textdomain
</VAR> parameter is not given, the domain is determined from
895 the environment variable
<CODE>TEXTDOMAIN
</CODE>. If the message catalog is not
896 found in the regular directory, another location can be specified with the
897 environment variable
<CODE>TEXTDOMAINDIR
</CODE>.
When used with the
<CODE>-s</CODE> option the program behaves like the
<SAMP>`echo´</SAMP>
command. But it does not simply copy its arguments to stdout. Instead those
messages found in the selected catalog are translated.
908 <H4><A NAME=
"SEC251" HREF=
"gettext_toc.html#TOC251">13.5.2.4 Invoking the
<CODE>ngettext
</CODE> program
</A></H4>
911 <A NAME=
"IDX1095"></A>
912 <A NAME=
"IDX1096"></A>
915 ngettext [
<VAR>option
</VAR>] [
<VAR>textdomain
</VAR>]
<VAR>msgid
</VAR> <VAR>msgid-plural
</VAR> <VAR>count
</VAR>
919 <A NAME=
"IDX1097"></A>
920 The
<CODE>ngettext
</CODE> program displays the native language translation of a
921 textual message whose grammatical form depends on a number.
925 <STRONG>Arguments
</STRONG>
930 <DT><SAMP>`-d
<VAR>textdomain
</VAR>´</SAMP>
932 <DT><SAMP>`--domain=
<VAR>textdomain
</VAR>´</SAMP>
934 <A NAME=
"IDX1098"></A>
935 <A NAME=
"IDX1099"></A>
936 Retrieve translated messages from
<VAR>textdomain
</VAR>. Usually a
<VAR>textdomain
</VAR>
937 corresponds to a package, a program, or a module of a program.
939 <DT><SAMP>`-e
´</SAMP>
941 <A NAME=
"IDX1100"></A>
942 Enable expansion of some escape sequences. This option is for compatibility
943 with the
<SAMP>`gettext
´</SAMP> program. The escape sequences
944 <SAMP>`\a
´</SAMP>,
<SAMP>`\b
´</SAMP>,
<SAMP>`\c
´</SAMP>,
<SAMP>`\f
´</SAMP>,
<SAMP>`\n
´</SAMP>,
<SAMP>`\r
´</SAMP>,
<SAMP>`\t
´</SAMP>,
945 <SAMP>`\v
´</SAMP>,
<SAMP>`\\
´</SAMP>, and
<SAMP>`\
´</SAMP> followed by one to three octal digits, are
946 interpreted like the SystemV
<SAMP>`echo
´</SAMP> program does.
948 <DT><SAMP>`-E
´</SAMP>
950 <A NAME=
"IDX1101"></A>
951 This option is only for compatibility with the
<SAMP>`gettext
´</SAMP> program. It has
954 <DT><SAMP>`-h
´</SAMP>
956 <DT><SAMP>`--help
´</SAMP>
958 <A NAME=
"IDX1102"></A>
959 <A NAME=
"IDX1103"></A>
960 Display this help and exit.
962 <DT><SAMP>`-V
´</SAMP>
964 <DT><SAMP>`--version
´</SAMP>
966 <A NAME=
"IDX1104"></A>
967 <A NAME=
"IDX1105"></A>
968 Output version information and exit.
970 <DT><SAMP>`
<VAR>textdomain
</VAR>´</SAMP>
972 Retrieve translated message from
<VAR>textdomain
</VAR>.
974 <DT><SAMP>`
<VAR>msgid
</VAR> <VAR>msgid-plural
</VAR>´</SAMP>
976 Translate
<VAR>msgid
</VAR> (English singular) /
<VAR>msgid-plural
</VAR> (English plural).
978 <DT><SAMP>`
<VAR>count
</VAR>´</SAMP>
980 Choose singular/plural form based on this value.
985 If the
<VAR>textdomain
</VAR> parameter is not given, the domain is determined from
986 the environment variable
<CODE>TEXTDOMAIN
</CODE>. If the message catalog is not
987 found in the regular directory, another location can be specified with the
988 environment variable
<CODE>TEXTDOMAINDIR
</CODE>.
993 <H4><A NAME=
"SEC252" HREF=
"gettext_toc.html#TOC252">13.5.2.5 Invoking the
<CODE>envsubst
</CODE> program
</A></H4>
996 <A NAME=
"IDX1106"></A>
997 <A NAME=
"IDX1107"></A>
1000 envsubst [
<VAR>option
</VAR>] [
<VAR>shell-format
</VAR>]
1004 <A NAME=
"IDX1108"></A>
1005 <A NAME=
"IDX1109"></A>
1006 <A NAME=
"IDX1110"></A>
1007 The
<CODE>envsubst
</CODE> program substitutes the values of environment variables.
1011 <STRONG>Operation mode
</STRONG>
1016 <DT><SAMP>`-v
´</SAMP>
1018 <DT><SAMP>`--variables
´</SAMP>
1020 <A NAME=
"IDX1111"></A>
1021 <A NAME=
"IDX1112"></A>
1022 Output the variables occurring in
<VAR>shell-format
</VAR>.
1027 <STRONG>Informative output
</STRONG>
1032 <DT><SAMP>`-h
´</SAMP>
1034 <DT><SAMP>`--help
´</SAMP>
1036 <A NAME=
"IDX1113"></A>
1037 <A NAME=
"IDX1114"></A>
1038 Display this help and exit.
1040 <DT><SAMP>`-V
´</SAMP>
1042 <DT><SAMP>`--version
´</SAMP>
1044 <A NAME=
"IDX1115"></A>
1045 <A NAME=
"IDX1116"></A>
1046 Output version information and exit.
1051 In normal operation mode, standard input is copied to standard output,
1052 with references to environment variables of the form
<CODE>$VARIABLE
</CODE> or
1053 <CODE>${VARIABLE}
</CODE> being replaced with the corresponding values. If a
1054 <VAR>shell-format
</VAR> is given, only those environment variables that are
1055 referenced in
<VAR>shell-format
</VAR> are substituted; otherwise all environment
variable references occurring in standard input are substituted.
1060 These substitutions are a subset of the substitutions that a shell performs
1061 on unquoted and double-quoted strings. Other kinds of substitutions done
1062 by a shell, such as
<CODE>${
<VAR>variable
</VAR>-
<VAR>default
</VAR>}
</CODE> or
1063 <CODE>$(
<VAR>command-list
</VAR>)
</CODE> or
<CODE>`
<VAR>command-list
</VAR>`
</CODE>, are not performed
1064 by the
<CODE>envsubst
</CODE> program, for security reasons.
1068 When
<CODE>--variables
</CODE> is used, standard input is ignored, and the output
1069 consists of the environment variables that are referenced in
1070 <VAR>shell-format
</VAR>, one per line.
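A minimal Python sketch of the two modes follows. It approximates the behaviour described above and is not the envsubst implementation: the function names are hypothetical, unset variables are assumed to expand to the empty string, and repeated names are listed only once.

```python
import re

# Matches $NAME or ${NAME}; anything fancier (${NAME-default}, $(cmd), `cmd`)
# is deliberately left alone.
_VAR_REF = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def referenced_variables(shell_format):
    """Names referenced in shell_format, in order of first appearance."""
    names = []
    for match in _VAR_REF.finditer(shell_format):
        name = match.group(1) or match.group(2)
        if name not in names:
            names.append(name)
    return names


def substitute(text, environ, shell_format=None):
    """Replace variable references in text, optionally restricted to
    the variables named in shell_format."""
    allowed = None
    if shell_format is not None:
        allowed = set(referenced_variables(shell_format))

    def replace(match):
        name = match.group(1) or match.group(2)
        if allowed is not None and name not in allowed:
            return match.group(0)  # not listed: keep the reference as text
        return environ.get(name, "")

    return _VAR_REF.sub(replace, text)
```

Passing a shell-format of "$USER" leaves every other reference, such as ${HOME}, untouched in the output.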
1075 <H4><A NAME=
"SEC253" HREF=
"gettext_toc.html#TOC253">13.5.2.6 Invoking the
<CODE>eval_gettext
</CODE> function
</A></H4>
1078 <A NAME=
"IDX1117"></A>
eval_gettext
<VAR>msgid
</VAR>
1085 <A NAME=
"IDX1118"></A>
1086 This function outputs the native language translation of a textual message,
1087 performing dollar-substitution on the result. Note that only shell variables
1088 mentioned in
<VAR>msgid
</VAR> will be dollar-substituted in the result.
1093 <H4><A NAME=
"SEC254" HREF=
"gettext_toc.html#TOC254">13.5.2.7 Invoking the
<CODE>eval_ngettext
</CODE> function
</A></H4>
1096 <A NAME=
"IDX1119"></A>
eval_ngettext
<VAR>msgid
</VAR> <VAR>msgid-plural
</VAR> <VAR>count
</VAR>
1103 <A NAME=
"IDX1120"></A>
1104 This function outputs the native language translation of a textual message
1105 whose grammatical form depends on a number, performing dollar-substitution
1106 on the result. Note that only shell variables mentioned in
<VAR>msgid
</VAR> or
1107 <VAR>msgid-plural
</VAR> will be dollar-substituted in the result.
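The restriction matters because the translation comes from a catalog that the script author does not control. The following Python sketch models the rule stated above; it is not the shell function itself. The names eval_ngettext, variables and the ngettext parameter are assumptions made for this illustration.

```python
import gettext
import re

_VAR_REF = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def _names(text):
    """Set of variable names referenced as $NAME or ${NAME} in text."""
    return {m.group(1) or m.group(2) for m in _VAR_REF.finditer(text)}


def eval_ngettext(msgid, msgid_plural, count, variables, ngettext=None):
    """Translate, then substitute only variables named in the msgids."""
    if ngettext is None:
        # Assumption: no catalog available, so the English forms are used.
        ngettext = gettext.NullTranslations().ngettext
    allowed = _names(msgid) | _names(msgid_plural)
    translated = ngettext(msgid, msgid_plural, count)

    def replace(match):
        name = match.group(1) or match.group(2)
        if name not in allowed:
            return match.group(0)  # introduced by the translation: ignore
        return str(variables.get(name, ""))

    return _VAR_REF.sub(replace, translated)
```

A translation that mentions a variable absent from both msgids gets that reference back verbatim, so a catalog cannot read variables the programmer did not intend to expose.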
1112 <H3><A NAME=
"SEC255" HREF=
"gettext_toc.html#TOC255">13.5.3 bash - Bourne-Again Shell Script
</A></H3>
1114 <A NAME=
"IDX1121"></A>
1118 GNU
<CODE>bash
</CODE> 2.0 or newer has a special shorthand for translating a
1119 string and substituting variable values in it:
<CODE>$"msgid"</CODE>. But
1120 the use of this construct is
<STRONG>discouraged
</STRONG>, due to the security
1121 holes it opens and due to its portability problems.
1125 The security holes of
<CODE>$"..."</CODE> come from the fact that after looking up
1126 the translation of the string,
<CODE>bash
</CODE> processes it like it processes
1127 any double-quoted string: dollar and backquote processing, like
<SAMP>`eval
´</SAMP>
1135 In a locale whose encoding is one of BIG5, BIG5-HKSCS, GBK, GB18030, SHIFT_JIS,
1136 JOHAB, some double-byte characters have a second byte whose value is
1137 <CODE>0x60</CODE>. For example, the byte sequence
<CODE>\xe0\x60
</CODE> is a single
1138 character in these locales. Many versions of
<CODE>bash
</CODE> (all versions
up to bash-2.05, and newer versions on platforms without
<CODE>mbsrtowcs()
</CODE>
1140 function) don't know about character boundaries and see a backquote character
1141 where there is only a particular Chinese character. Thus it can start
1142 executing part of the translation as a command list. This situation can occur
1143 even without the translator being aware of it: if the translator provides
translations in the UTF-8 encoding, it is the
<CODE>gettext()
</CODE> function which
1145 will, during its conversion from the translator's encoding to the user's
1146 locale's encoding, produce the dangerous
<CODE>\x60
</CODE> bytes.
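The effect of the encoding conversion can be observed with a few lines of Python. This sketch takes the byte pair from the example above and assumes Python's built-in big5 codec; it only inspects bytes and does not involve bash or gettext.

```python
# The two-byte sequence from the text; in Big5 it encodes a single character.
raw = b"\xe0\x60"
char = raw.decode("big5")

# The same character in the two encodings discussed above.
big5_bytes = char.encode("big5")
utf8_bytes = char.encode("utf-8")

BACKQUOTE = 0x60  # the ASCII code of the backquote character


def contains_backquote_byte(data):
    """True if a byte-oriented scanner would see a backquote in data."""
    return BACKQUOTE in data


print(len(char), contains_backquote_byte(big5_bytes), contains_backquote_byte(utf8_bytes))
```

The UTF-8 form supplied by the translator contains no 0x60 byte, while the Big5 form produced for the user's locale does, which is why the translator cannot notice the problem.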
A translator could, voluntarily or inadvertently, use backquotes
1151 <CODE>"`...`"</CODE> or dollar-parentheses
<CODE>"$(...)"</CODE> in her translations.
1152 The enclosed strings would be executed as command lists by the shell.
1156 The portability problem is that
<CODE>bash
</CODE> must be built with
1157 internationalization support; this is normally not the case on systems
1158 that don't have the
<CODE>gettext()
</CODE> function in libc.
1163 <H3><A NAME=
"SEC256" HREF=
"gettext_toc.html#TOC256">13.5.4 Python
</A></H3>
1165 <A NAME=
"IDX1122"></A>
<CODE>'abc'</CODE>, <CODE>u'abc'</CODE>, <CODE>r'abc'</CODE>, <CODE>ur'abc'</CODE>,
<BR><CODE>"abc"</CODE>, <CODE>u"abc"</CODE>, <CODE>r"abc"</CODE>, <CODE>ur"abc"</CODE>,
<BR><CODE>'''abc'''</CODE>, <CODE>u'''abc'''</CODE>, <CODE>r'''abc'''</CODE>, <CODE>ur'''abc'''</CODE>,
<BR><CODE>"""abc"""</CODE>, <CODE>u"""abc"""</CODE>, <CODE>r"""abc"""</CODE>, <CODE>ur"""abc"""</CODE>
1185 <DT>gettext shorthand
1187 <CODE>_('abc')
</CODE> etc.
1189 <DT>gettext/ngettext functions
1191 <CODE>gettext.gettext
</CODE>,
<CODE>gettext.dgettext
</CODE>,
1192 <CODE>gettext.ngettext
</CODE>,
<CODE>gettext.dngettext
</CODE>,
1193 also
<CODE>ugettext
</CODE>,
<CODE>ungettext
</CODE>
1197 <CODE>gettext.textdomain
</CODE> function, or
1198 <CODE>gettext.install(
<VAR>domain
</VAR>)
</CODE> function
1202 <CODE>gettext.bindtextdomain
</CODE> function, or
1203 <CODE>gettext.install(
<VAR>domain
</VAR>,
<VAR>localedir
</VAR>)
</CODE> function
1207 not used by the gettext emulation
1211 <CODE>import gettext
</CODE>
1213 <DT>Use or emulate GNU gettext
1219 <CODE>xgettext
</CODE>
1221 <DT>Formatting with positions
1223 <CODE>'...%(ident)d...' % { 'ident': value }
</CODE>
1235 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-python
</CODE>.
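As a small illustration of the shorthand and of formatting with named placeholders, the following sketch may help. The domain name hello-demo and the catalog directory are hypothetical; because no catalog exists there, fallback=True makes the lookup return the untranslated msgid.

```python
import gettext

# Hypothetical domain and directory, chosen so that no catalog is found.
translation = gettext.translation(
    "hello-demo", localedir="/nonexistent/locale", fallback=True
)
_ = translation.gettext  # the conventional shorthand


def describe(count, directory):
    """Format a message whose placeholders are named, not positional."""
    template = _("%(count)d files in %(directory)s")
    return template % {"count": count, "directory": directory}


print(describe(3, "/tmp"))
```

Named placeholders let a translator reorder the arguments freely, since the mapping is by key and not by position.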
1240 <H3><A NAME=
"SEC257" HREF=
"gettext_toc.html#TOC257">13.5.5 GNU clisp - Common Lisp
</A></H3>
1242 <A NAME=
"IDX1123"></A>
1243 <A NAME=
"IDX1124"></A>
1244 <A NAME=
"IDX1125"></A>
1261 <DT>gettext shorthand
1263 <CODE>(_
"abc")
</CODE>,
<CODE>(ENGLISH
"abc")
</CODE>
1265 <DT>gettext/ngettext functions
1267 <CODE>i18n:gettext
</CODE>,
<CODE>i18n:ngettext
</CODE>
1271 <CODE>i18n:textdomain
</CODE>
1275 <CODE>i18n:textdomaindir
</CODE>
1285 <DT>Use or emulate GNU gettext
1291 <CODE>xgettext -k_ -kENGLISH
</CODE>
1293 <DT>Formatting with positions
1295 <CODE>format
"~1@*~D ~0@*~D"</CODE>
1299 On platforms without gettext, no translation.
1307 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-clisp
</CODE>.
1312 <H3><A NAME=
"SEC258" HREF=
"gettext_toc.html#TOC258">13.5.6 GNU clisp C sources
</A></H3>
1314 <A NAME=
"IDX1126"></A>
1331 <DT>gettext shorthand
1333 <CODE>ENGLISH ?
"abc" :
""</CODE>
1334 <BR><CODE>GETTEXT(
"abc")
</CODE>
1335 <BR><CODE>GETTEXTL(
"abc")
</CODE>
1337 <DT>gettext/ngettext functions
1339 <CODE>clgettext
</CODE>,
<CODE>clgettextl
</CODE>
1355 <CODE>#include
"lispbibl.c"</CODE>
1357 <DT>Use or emulate GNU gettext
1363 <CODE>clisp-xgettext
</CODE>
1365 <DT>Formatting with positions
1367 <CODE>fprintf
"%2$d %1$d"</CODE>
1371 On platforms without gettext, no translation.
1380 <H3><A NAME=
"SEC259" HREF=
"gettext_toc.html#TOC259">13.5.7 Emacs Lisp
</A></H3>
1382 <A NAME=
"IDX1127"></A>
1399 <DT>gettext shorthand
1401 <CODE>(_
"abc")
</CODE>
1403 <DT>gettext/ngettext functions
1405 <CODE>gettext
</CODE>,
<CODE>dgettext
</CODE> (xemacs only)
1409 <CODE>domain
</CODE> special form (xemacs only)
1413 <CODE>bind-text-domain
</CODE> function (xemacs only)
1423 <DT>Use or emulate GNU gettext
1429 <CODE>xgettext
</CODE>
1431 <DT>Formatting with positions
1433 <CODE>format
"%2$d %1$d"</CODE>
1437 Only XEmacs. Without
<CODE>I18N3
</CODE> defined at build time, no translation.
1446 <H3><A NAME=
"SEC260" HREF=
"gettext_toc.html#TOC260">13.5.8 librep
</A></H3>
1448 <A NAME=
"IDX1128"></A>
1455 librep
0.15.3 or newer
1465 <DT>gettext shorthand
1467 <CODE>(_
"abc")
</CODE>
1469 <DT>gettext/ngettext functions
1471 <CODE>gettext
</CODE>
1475 <CODE>textdomain
</CODE> function
1479 <CODE>bindtextdomain
</CODE> function
1487 <CODE>(require 'rep.i18n.gettext)
</CODE>
1489 <DT>Use or emulate GNU gettext
1495 <CODE>xgettext
</CODE>
1497 <DT>Formatting with positions
1499 <CODE>format
"%2$d %1$d"</CODE>
1503 On platforms without gettext, no translation.
1511 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-librep
</CODE>.
1516 <H3><A NAME=
"SEC261" HREF=
"gettext_toc.html#TOC261">13.5.9 GNU guile - Scheme
</A></H3>
1518 <A NAME=
"IDX1129"></A>
1519 <A NAME=
"IDX1130"></A>
1536 <DT>gettext shorthand
1538 <CODE>(_
"abc")
</CODE>
1540 <DT>gettext/ngettext functions
1542 <CODE>gettext
</CODE>,
<CODE>ngettext
</CODE>
1546 <CODE>textdomain
</CODE>
1550 <CODE>bindtextdomain
</CODE>
1554 <CODE>(catch #t (lambda () (setlocale LC_ALL
"")) (lambda args #f))
</CODE>
<CODE>(use-modules (ice-9 format))
</CODE>
1560 <DT>Use or emulate GNU gettext
1566 <CODE>xgettext -k_
</CODE>
1568 <DT>Formatting with positions
1574 On platforms without gettext, no translation.
1582 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-guile
</CODE>.
1587 <H3><A NAME=
"SEC262" HREF=
"gettext_toc.html#TOC262">13.5.10 GNU Smalltalk
</A></H3>
1589 <A NAME=
"IDX1131"></A>
1606 <DT>gettext shorthand
1608 <CODE>NLS ? 'abc'
</CODE>
1610 <DT>gettext/ngettext functions
1612 <CODE>LcMessagesDomain
>>#at:
</CODE>,
<CODE>LcMessagesDomain
>>#at:plural:with:
</CODE>
1616 <CODE>LcMessages
>>#domain:localeDirectory:
</CODE> (returns a
<CODE>LcMessagesDomain
</CODE>
1618 Example:
<CODE>I18N Locale default messages domain: 'gettext' localeDirectory: '/usr/local/share/locale'
</CODE>
1622 <CODE>LcMessages
>>#domain:localeDirectory:
</CODE>, see above.
1626 Automatic if you use
<CODE>I18N Locale default
</CODE>.
1630 <CODE>PackageLoader fileInPackage: 'I18N'!
</CODE>
1632 <DT>Use or emulate GNU gettext
1638 <CODE>xgettext
</CODE>
1640 <DT>Formatting with positions
<CODE>'%1 %2' bindWith: 'Hello' with: 'world'
</CODE>
1654 An example is available in the
<TT>`examples
´</TT> directory:
1655 <CODE>hello-smalltalk
</CODE>.
1660 <H3><A NAME=
"SEC263" HREF=
"gettext_toc.html#TOC263">13.5.11 Java
</A></H3>
1662 <A NAME=
"IDX1132"></A>
1679 <DT>gettext shorthand
1683 <DT>gettext/ngettext functions
1685 <CODE>GettextResource.gettext
</CODE>,
<CODE>GettextResource.ngettext
</CODE>
1689 ---, use
<CODE>ResourceBundle.getBundle
</CODE> instead
1693 ---, use CLASSPATH instead
1703 <DT>Use or emulate GNU gettext
1705 ---, uses a Java specific message catalog format
1709 <CODE>xgettext -k_
</CODE>
1711 <DT>Formatting with positions
1713 <CODE>MessageFormat.format
"{1,number} {0,number}"</CODE>
1725 Before marking strings as internationalizable, uses of the string
1726 concatenation operator need to be converted to
<CODE>MessageFormat
</CODE>
1727 applications. For example,
<CODE>"file "+filename+
" not found"</CODE> becomes
1728 <CODE>MessageFormat.format(
"file {0} not found", new Object[] { filename })
</CODE>.
Only after this is done can the strings be marked and extracted.
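The reason for this conversion is that a translator needs one complete sentence with movable placeholders. The idea is sketched here in Python for brevity; it does not use the Java MessageFormat class, and the GERMAN dictionary is a hypothetical stand-in for a message catalog.

```python
# Hypothetical catalog mapping a msgid to its translation.
GERMAN = {"file {0} not found": "Datei {0} wurde nicht gefunden"}


def not_found_message(filename, catalog):
    """Look up the whole sentence, then fill in the placeholder."""
    msgid = "file {0} not found"
    template = catalog.get(msgid, msgid)  # untranslated msgid as fallback
    return template.format(filename)


def swapped(first, second):
    """Numbered placeholders may appear in any order in a translation."""
    return "{1} {0}".format(first, second)


print(not_found_message("a.txt", GERMAN))
```

With concatenation, the catalog would only ever see the fragments "file " and " not found", which cannot be translated into a language with a different word order.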
1733 GNU gettext uses the native Java internationalization mechanism, namely
1734 <CODE>ResourceBundle
</CODE>s. There are two formats of
<CODE>ResourceBundle
</CODE>s:
1735 <CODE>.properties
</CODE> files and
<CODE>.class
</CODE> files. The
<CODE>.properties
</CODE>
1736 format is a text file which the translators can directly edit, like PO
1737 files, but which doesn't support plural forms. Whereas the
<CODE>.class
</CODE>
1738 format is compiled from
<CODE>.java
</CODE> source code and can support plural
1739 forms (provided it is accessed through an appropriate API, see below).
1743 To convert a PO file to a
<CODE>.properties
</CODE> file, the
<CODE>msgcat
</CODE>
1744 program can be used with the option
<CODE>--properties-output
</CODE>. To convert
1745 a
<CODE>.properties
</CODE> file back to a PO file, the
<CODE>msgcat
</CODE> program
1746 can be used with the option
<CODE>--properties-input
</CODE>. All the tools
1747 that manipulate PO files can work with
<CODE>.properties
</CODE> files as well,
1748 if given the
<CODE>--properties-input
</CODE> and/or
<CODE>--properties-output
</CODE>
1753 To convert a PO file to a ResourceBundle class, the
<CODE>msgfmt
</CODE> program
1754 can be used with the option
<CODE>--java
</CODE> or
<CODE>--java2
</CODE>. To convert a
1755 ResourceBundle back to a PO file, the
<CODE>msgunfmt
</CODE> program can be used
1756 with the option
<CODE>--java
</CODE>.
1760 Two different programmatic APIs can be used to access ResourceBundles.
1761 Note that both APIs work with all kinds of ResourceBundles, whether
1762 GNU gettext generated classes, or other
<CODE>.class
</CODE> or
<CODE>.properties
</CODE>
1770 The
<CODE>java.util.ResourceBundle
</CODE> API.
1772 In particular, its
<CODE>getString
</CODE> function returns a string translation.
1773 Note that a missing translation yields a
<CODE>MissingResourceException
</CODE>.
1775 This has the advantage of being the standard API. And it does not require
1776 any additional libraries, only the
<CODE>msgcat
</CODE> generated
<CODE>.properties
</CODE>
1777 files or the
<CODE>msgfmt
</CODE> generated
<CODE>.class
</CODE> files. But it cannot do
1778 plural handling, even if the resource was generated by
<CODE>msgfmt
</CODE> from
1779 a PO file with plural handling.
1783 The
<CODE>gnu.gettext.GettextResource
</CODE> API.
1785 Reference documentation in Javadoc
1.1 style format
1786 is in the
<A HREF=
"javadoc1/tree.html">javadoc1 directory
</A> and
1787 in Javadoc
2 style format
1788 in the
<A HREF=
"javadoc2/index.html">javadoc2 directory
</A>.
1790 Its
<CODE>gettext
</CODE> function returns a string translation. Note that when
1791 a translation is missing, the
<VAR>msgid
</VAR> argument is returned unchanged.
1793 This has the advantage of having the
<CODE>ngettext
</CODE> function for plural
1796 <A NAME=
"IDX1133"></A>
1797 To use this API, one needs the
<CODE>libintl.jar
</CODE> file which is part of
1798 the GNU gettext package and distributed under the LGPL.
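The practical difference between the two APIs is how a missing translation is reported. The following Python sketch models that difference with a plain dictionary; it is not the Java API. The class name Catalog and its method names are hypothetical, and the plural rule assumes the two-form English convention.

```python
class Catalog:
    """Toy message catalog contrasting strict and lenient lookups."""

    def __init__(self, messages, plurals=None):
        self._messages = dict(messages)
        # Maps a singular msgid to a (singular, plural) pair of translations.
        self._plurals = dict(plurals or {})

    def get_string(self, key):
        """Strict lookup: a missing key raises KeyError, comparable to
        the MissingResourceException of the standard API."""
        return self._messages[key]

    def gettext(self, msgid):
        """Lenient lookup: a missing translation yields the msgid itself."""
        return self._messages.get(msgid, msgid)

    def ngettext(self, msgid, msgid_plural, n):
        """Plural-aware lookup with the same lenient fallback."""
        singular, plural = self._plurals.get(msgid, (msgid, msgid_plural))
        return singular if n == 1 else plural
```

Code written against the strict lookup has to catch the error for every untranslated string, whereas the lenient lookup degrades to English output.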
1802 Three examples, using the second API, are available in the
<TT>`examples
´</TT>
1803 directory:
<CODE>hello-java
</CODE>,
<CODE>hello-java-awt
</CODE>,
<CODE>hello-java-swing
</CODE>.
1807 Now, to make use of the API and define a shorthand for
<SAMP>`getString
´</SAMP>,
1808 there are two idioms that you can choose from:
1815 In a unique class of your project, say
<SAMP>`Util
´</SAMP>, define a static variable
1816 holding the
<CODE>ResourceBundle
</CODE> instance:
public static ResourceBundle myResources =
  ResourceBundle.getBundle("domain-name");
1824 All classes containing internationalized strings then contain
private static ResourceBundle res = Util.myResources;
private static String _(String s) { return res.getString(s); }
1832 and the shorthand is used like this:
System.out.println(_("Operation completed."));
1841 You add a class with a very short name, say
<SAMP>`S
´</SAMP>, containing just the
1842 definition of the resource bundle and of the shorthand:
import java.util.ResourceBundle;

public class S {
  public static ResourceBundle myResources =
    ResourceBundle.getBundle("domain-name");
  public static String _(String s) {
    return myResources.getString(s);
  }
}
1855 and the shorthand is used like this:
System.out.println(S._("Operation completed."));
Which of the two idioms you choose will depend on whether copying two lines
of code into every class is more acceptable in your project than a class
with a single-letter name.
1872 <H3><A NAME=
"SEC264" HREF=
"gettext_toc.html#TOC264">13.5.12 C#
</A></H3>
1874 <A NAME=
"IDX1134"></A>
1881 pnet, pnetlib
0.6.2 or newer, or mono
0.29 or newer
1889 <CODE>"abc"</CODE>,
<CODE>@
"abc"</CODE>
1891 <DT>gettext shorthand
1895 <DT>gettext/ngettext functions
1897 <CODE>GettextResourceManager.GetString
</CODE>,
1898 <CODE>GettextResourceManager.GetPluralString
</CODE>
1902 <CODE>new GettextResourceManager(domain)
</CODE>
1906 ---, compiled message catalogs are located in subdirectories of the directory
1907 containing the executable
1917 <DT>Use or emulate GNU gettext
1919 ---, uses a C# specific message catalog format
1923 <CODE>xgettext -k_
</CODE>
1925 <DT>Formatting with positions
1927 <CODE>String.Format
"{1} {0}"</CODE>
1939 Before marking strings as internationalizable, uses of the string
1940 concatenation operator need to be converted to
<CODE>String.Format
</CODE>
1941 invocations. For example,
<CODE>"file "+filename+
" not found"</CODE> becomes
1942 <CODE>String.Format(
"file {0} not found", filename)
</CODE>.
Only after this is done can the strings be marked and extracted.
1947 GNU gettext uses the native C#/.NET internationalization mechanism, namely
1948 the classes
<CODE>ResourceManager
</CODE> and
<CODE>ResourceSet
</CODE>. Applications
1949 use the
<CODE>ResourceManager
</CODE> methods to retrieve the native language
1950 translation of strings. An instance of
<CODE>ResourceSet
</CODE> is the in-memory
1951 representation of a message catalog file. The
<CODE>ResourceManager
</CODE> loads
1952 and accesses
<CODE>ResourceSet
</CODE> instances as needed to look up the
1957 There are two formats of
<CODE>ResourceSet
</CODE>s that can be directly loaded by
1958 the C# runtime:
<CODE>.resources
</CODE> files and
<CODE>.dll
</CODE> files.
1965 The
<CODE>.resources
</CODE> format is a binary file usually generated through the
1966 <CODE>resgen
</CODE> or
<CODE>monoresgen
</CODE> utility, but which doesn't support plural
1967 forms.
<CODE>.resources
</CODE> files can also be embedded in .NET
<CODE>.exe
</CODE> files.
1968 This only affects whether a file system access is performed to load the message
1969 catalog; it doesn't affect the contents of the message catalog.
1973 On the other hand, the
<CODE>.dll
</CODE> format is a binary file that is compiled
1974 from
<CODE>.cs
</CODE> source code and can support plural forms (provided it is
1975 accessed through the GNU gettext API, see below).
1979 Note that these .NET
<CODE>.dll
</CODE> and
<CODE>.exe
</CODE> files are not tied to a
1980 particular platform; their file format and GNU gettext for C# can be used
1985 To convert a PO file to a
<CODE>.resources
</CODE> file, the
<CODE>msgfmt
</CODE> program
1986 can be used with the option
<SAMP>`--csharp-resources
´</SAMP>. To convert a
1987 <CODE>.resources
</CODE> file back to a PO file, the
<CODE>msgunfmt
</CODE> program can be
1988 used with the option
<SAMP>`--csharp-resources
´</SAMP>. You can also, in some cases,
1989 use the
<CODE>resgen
</CODE> program (from the
<CODE>pnet
</CODE> package) or the
1990 <CODE>monoresgen
</CODE> program (from the
<CODE>mono
</CODE>/
<CODE>mcs
</CODE> package). These
1991 programs can also convert a
<CODE>.resources
</CODE> file back to a PO file. But
1992 beware: as of this writing (January
2004), the
<CODE>monoresgen
</CODE> converter is
1993 quite buggy and the
<CODE>resgen
</CODE> converter ignores the encoding of the PO
1998 To convert a PO file to a
<CODE>.dll
</CODE> file, the
<CODE>msgfmt
</CODE> program can be
1999 used with the option
<CODE>--csharp
</CODE>. The result will be a
<CODE>.dll
</CODE> file
2000 containing a subclass of
<CODE>GettextResourceSet
</CODE>, which itself is a subclass
2001 of
<CODE>ResourceSet
</CODE>. To convert a
<CODE>.dll
</CODE> file containing a
2002 <CODE>GettextResourceSet
</CODE> subclass back to a PO file, the
<CODE>msgunfmt
</CODE>
2003 program can be used with the option
<CODE>--csharp
</CODE>.
2007 The advantages of the
<CODE>.dll
</CODE> format over the
<CODE>.resources
</CODE> format
2015 Freedom to localize: Users can add their own translations to an application
2016 after it has been built and distributed. Whereas when the programmer uses
2017 a
<CODE>ResourceManager
</CODE> constructor provided by the system, the set of
2018 <CODE>.resources
</CODE> files for an application must be specified when the
2019 application is built and cannot be extended afterwards.
2023 Plural handling: A message catalog in
<CODE>.dll
</CODE> format supports the plural
2024 handling function
<CODE>GetPluralString
</CODE>. Whereas
<CODE>.resources
</CODE> files can
2025 only contain data and only support lookups that depend on a single string.
2029 The
<CODE>GettextResourceManager
</CODE> that loads the message catalogs in
2030 <CODE>.dll
</CODE> format also provides for inheritance on a per-message basis.
2031 For example, in Austrian (
<CODE>de_AT
</CODE>) locale, translations from the German
2032 (
<CODE>de
</CODE>) message catalog will be used for messages not found in the
2033 Austrian message catalog. This has the consequence that the Austrian
2034 translators need only translate those few messages for which the translation
2035 into Austrian differs from the German one. Whereas when working with
2036 <CODE>.resources
</CODE> files, each message catalog must provide the translations
2037 of all messages by itself.
2041 The
<CODE>GettextResourceManager
</CODE> that loads the message catalogs in
2042 <CODE>.dll
</CODE> format also provides for a fallback: The English
<VAR>msgid
</VAR> is
2043 returned when no translation can be found. Whereas when working with
2044 <CODE>.resources
</CODE> files, a language-neutral
<CODE>.resources
</CODE> file must
2045 explicitly be provided as a fallback.
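The per-message inheritance and the msgid fallback amount to a lookup chain. The following Python sketch models that chain with dictionaries; it is not the GettextResourceManager implementation. The function names and the sample catalogs are hypothetical, and the chain is assumed to consist of the full locale followed by its language part.

```python
def fallback_chain(locale):
    """Return the catalogs to consult, most specific first."""
    chain = [locale]
    if "_" in locale:
        chain.append(locale.split("_", 1)[0])  # "de_AT" also tries "de"
    return chain


def get_string(msgid, locale, catalogs):
    """Look up msgid message by message along the chain."""
    for name in fallback_chain(locale):
        translation = catalogs.get(name, {}).get(msgid)
        if translation is not None:
            return translation
    return msgid  # final fallback: the English msgid


# Hypothetical catalogs: the Austrian one overrides a single message.
CATALOGS = {
    "de": {"January": "Januar", "Cancel": "Abbrechen"},
    "de_AT": {"January": "Jänner"},
}

print(get_string("Cancel", "de_AT", CATALOGS))
```

The Austrian catalog holds only the one message that differs, and every other lookup falls through to the German catalog or, failing that, to the msgid.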
2049 On the side of the programmatic APIs, the programmer can use either the
2050 standard
<CODE>ResourceManager
</CODE> API or the GNU
<CODE>GettextResourceManager
</CODE>
2051 API. The latter is an extension of the former, because
2052 <CODE>GettextResourceManager
</CODE> is a subclass of
<CODE>ResourceManager
</CODE>.
2059 The
<CODE>System.Resources.ResourceManager
</CODE> API.
2061 This API works with resources in
<CODE>.resources
</CODE> format.
2063 The creation of the
<CODE>ResourceManager
</CODE> is done through
new ResourceManager(domainname, Assembly.GetExecutingAssembly())
2070 The
<CODE>GetString
</CODE> function returns a string's translation. Note that this
2071 function returns null when a translation is missing (i.e. not even found in
2072 the fallback resource file).
2076 The
<CODE>GNU.Gettext.GettextResourceManager
</CODE> API.
2078 This API works with resources in
<CODE>.dll
</CODE> format.
2080 Reference documentation is in the
2081 <A HREF=
"csharpdoc/index.html">csharpdoc directory
</A>.
2083 The creation of the
<CODE>ResourceManager
</CODE> is done through
new GettextResourceManager(domainname)
2089 The
<CODE>GetString
</CODE> function returns a string's translation. Note that when
2090 a translation is missing, the
<VAR>msgid
</VAR> argument is returned unchanged.
2092 The
<CODE>GetPluralString
</CODE> function returns a string translation with plural
2093 handling, like the
<CODE>ngettext
</CODE> function in C.
2095 <A NAME=
"IDX1135"></A>
2096 To use this API, one needs the
<CODE>GNU.Gettext.dll
</CODE> file which is part of
2097 the GNU gettext package and distributed under the LGPL.
2101 You can also mix both approaches: use the
2102 <CODE>GNU.Gettext.GettextResourceManager
</CODE> constructor, but otherwise use
2103 only the
<CODE>ResourceManager
</CODE> type and only the
<CODE>GetString
</CODE> method.
2104 This is appropriate when you want to profit from the tools for PO files,
but don't want to change existing source code that uses
2106 <CODE>ResourceManager
</CODE> and don't (yet) need the
<CODE>GetPluralString
</CODE> method.
2110 Two examples, using the second API, are available in the
<TT>`examples
´</TT>
2111 directory:
<CODE>hello-csharp
</CODE>,
<CODE>hello-csharp-forms
</CODE>.
2115 Now, to make use of the API and define a shorthand for
<SAMP>`GetString
´</SAMP>,
2116 there are two idioms that you can choose from:
2123 In a unique class of your project, say
<SAMP>`Util
´</SAMP>, define a static variable
2124 holding the
<CODE>ResourceManager
</CODE> instance:
public static GettextResourceManager MyResourceManager =
  new GettextResourceManager("domain-name");
2132 All classes containing internationalized strings then contain
private static GettextResourceManager Res = Util.MyResourceManager;
private static String _(String s) { return Res.GetString(s); }
2140 and the shorthand is used like this:
Console.WriteLine(_("Operation completed."));
2149 You add a class with a very short name, say
<SAMP>`S
´</SAMP>, containing just the
2150 definition of the resource manager and of the shorthand:
using GNU.Gettext;

public class S {
  public static GettextResourceManager MyResourceManager =
    new GettextResourceManager("domain-name");
  public static String _(String s) {
    return MyResourceManager.GetString(s);
  }
}
2163 and the shorthand is used like this:
Console.WriteLine(S._("Operation completed."));
Which of the two idioms you choose will depend on whether copying two lines
of code into every class is more acceptable in your project than a class
with a single-letter name.
2180 <H3><A NAME=
"SEC265" HREF=
"gettext_toc.html#TOC265">13.5.13 GNU awk
</A></H3>
2182 <A NAME=
"IDX1136"></A>
2183 <A NAME=
"IDX1137"></A>
2200 <DT>gettext shorthand
2204 <DT>gettext/ngettext functions
2206 <CODE>dcgettext
</CODE>, missing
<CODE>dcngettext
</CODE> in gawk-3.1.0
2210 <CODE>TEXTDOMAIN
</CODE> variable
2214 <CODE>bindtextdomain
</CODE> function
2218 automatic, but missing
<CODE>setlocale (LC_MESSAGES,
"")
</CODE> in gawk-3.1.0
2224 <DT>Use or emulate GNU gettext
2230 <CODE>xgettext
</CODE>
2232 <DT>Formatting with positions
2234 <CODE>printf
"%2$d %1$d"</CODE> (GNU awk only)
2238 On platforms without gettext, no translation. On non-GNU awks, you must
2239 define
<CODE>dcgettext
</CODE>,
<CODE>dcngettext
</CODE> and
<CODE>bindtextdomain
</CODE>
2248 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-gawk
</CODE>.
2253 <H3><A NAME=
"SEC266" HREF=
"gettext_toc.html#TOC266">13.5.14 Pascal - Free Pascal Compiler
</A></H3>
2255 <A NAME=
"IDX1138"></A>
2256 <A NAME=
"IDX1139"></A>
2257 <A NAME=
"IDX1140"></A>
2268 <CODE>pp
</CODE>,
<CODE>pas
</CODE>
2274 <DT>gettext shorthand
2278 <DT>gettext/ngettext functions
2280 ---, use
<CODE>ResourceString
</CODE> data type instead
2284 ---, use
<CODE>TranslateResourceStrings
</CODE> function instead
2288 ---, use
<CODE>TranslateResourceStrings
</CODE> function instead
2292 automatic, but uses only LANG, not LC_MESSAGES or LC_ALL
2296 <CODE>{$mode delphi}
</CODE> or
<CODE>{$mode objfpc}
</CODE><BR><CODE>uses gettext;
</CODE>
2298 <DT>Use or emulate GNU gettext
2304 <CODE>ppc386
</CODE> followed by
<CODE>xgettext
</CODE> or
<CODE>rstconv
</CODE>
2306 <DT>Formatting with positions
2308 <CODE>uses sysutils;
</CODE><BR><CODE>format
"%1:d %0:d"</CODE>
2320 The Pascal compiler has special support for the
<CODE>ResourceString
</CODE> data
2321 type. It generates a
<CODE>.rst
</CODE> file. This is then converted to a
2322 <CODE>.pot
</CODE> file by use of
<CODE>xgettext
</CODE> or
<CODE>rstconv
</CODE>. At runtime,
2323 a
<CODE>.mo
</CODE> file corresponding to translations of this
<CODE>.pot
</CODE> file
2324 can be loaded using the
<CODE>TranslateResourceStrings
</CODE> function in the
2325 <CODE>gettext
</CODE> unit.
2329 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-pascal
</CODE>.
2334 <H3><A NAME=
"SEC267" HREF=
"gettext_toc.html#TOC267">13.5.15 wxWindows library
</A></H3>
2336 <A NAME=
"IDX1141"></A>
2353 <DT>gettext shorthand
2355 <CODE>_(
"abc")
</CODE>
2357 <DT>gettext/ngettext functions
2359 <CODE>wxLocale::GetString
</CODE>,
<CODE>wxGetTranslation
</CODE>
2363 <CODE>wxLocale::AddCatalog
</CODE>
2367 <CODE>wxLocale::AddCatalogLookupPathPrefix
</CODE>
2371 <CODE>wxLocale::Init
</CODE>,
<CODE>wxSetLocale
</CODE>
2375 <CODE>#include
<wx/intl.h
></CODE>
2377 <DT>Use or emulate GNU gettext
2379 emulate, see
<CODE>include/wx/intl.h
</CODE> and
<CODE>src/common/intl.cpp
</CODE>
2383 <CODE>xgettext
</CODE>
2385 <DT>Formatting with positions
2400 <H3><A NAME=
"SEC268" HREF=
"gettext_toc.html#TOC268">13.5.16 YCP - YaST2 scripting language
</A></H3>
2402 <A NAME=
"IDX1142"></A>
2403 <A NAME=
"IDX1143"></A>
2410 libycp, libycp-devel, yast2-core, yast2-core-devel
2420 <DT>gettext shorthand
2422 <CODE>_(
"abc")
</CODE>
2424 <DT>gettext/ngettext functions
2426 <CODE>_()
</CODE> with
1 or
3 arguments
2430 <CODE>textdomain
</CODE> statement
2444 <DT>Use or emulate GNU gettext
2450 <CODE>xgettext
</CODE>
2452 <DT>Formatting with positions
2454 <CODE>sformat
"%2 %1"</CODE>
2466 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-ycp
</CODE>.
2471 <H3><A NAME=
"SEC269" HREF=
"gettext_toc.html#TOC269">13.5.17 Tcl - Tk's scripting language
</A></H3>
2473 <A NAME=
"IDX1144"></A>
2474 <A NAME=
"IDX1145"></A>
2491 <DT>gettext shorthand
2493 <CODE>[_
"abc"]
</CODE>
2495 <DT>gettext/ngettext functions
2497 <CODE>::msgcat::mc
</CODE>
2505 ---, use
<CODE>::msgcat::mcload
</CODE> instead
2509 automatic, uses LANG, but ignores LC_MESSAGES and LC_ALL
2513 <CODE>package require msgcat
</CODE>
2514 <BR><CODE>proc _ {s} {return [::msgcat::mc $s]}
</CODE>
2516 <DT>Use or emulate GNU gettext
2518 ---, uses a Tcl specific message catalog format
2522 <CODE>xgettext -k_
</CODE>
2524 <DT>Formatting with positions
2526 <CODE>format
"%2\$d %1\$d"</CODE>
2538 Two examples are available in the
<TT>`examples
´</TT> directory:
2539 <CODE>hello-tcl
</CODE>,
<CODE>hello-tcl-tk
</CODE>.
2543 Before marking strings as internationalizable, substitutions of variables
2544 into the string need to be converted to
<CODE>format
</CODE> applications. For
2545 example,
<CODE>"file $filename not found"</CODE> becomes
2546 <CODE>[format
"file %s not found" $filename]
</CODE>.
Only after this is done can the strings be marked and extracted.
2548 After marking, this example becomes
2549 <CODE>[format [_
"file %s not found"] $filename]
</CODE> or
2550 <CODE>[msgcat::mc
"file %s not found" $filename]
</CODE>. Note that the
2551 <CODE>msgcat::mc
</CODE> function implicitly calls
<CODE>format
</CODE> when more than one
2557 <H3><A NAME=
"SEC270" HREF=
"gettext_toc.html#TOC270">13.5.18 Perl
</A></H3>
2559 <A NAME=
"IDX1146"></A>
2570 <CODE>pl
</CODE>,
<CODE>PL
</CODE>,
<CODE>pm
</CODE>,
<CODE>cgi
</CODE>
2577 <LI><CODE>"abc"</CODE>
2579 <LI><CODE>'abc'
</CODE>
2581 <LI><CODE>qq (abc)
</CODE>
2583 <LI><CODE>q (abc)
</CODE>
2585 <LI><CODE>qr /abc/
</CODE>
2587 <LI><CODE>qx (/bin/date)
</CODE>
2589 <LI><CODE>/pattern match/
</CODE>
2591 <LI><CODE>?pattern match?
</CODE>
2593 <LI><CODE>s/substitution/operators/
</CODE>
<LI><CODE>$tied_hash{"message"}
</CODE>
<LI><CODE>$tied_hash_reference->{"message"}
</CODE>
2599 <LI>etc., issue the command
<SAMP>`man perlsyn
´</SAMP> for details
2603 <DT>gettext shorthand
2605 <CODE>__
</CODE> (double underscore)
2607 <DT>gettext/ngettext functions
2609 <CODE>gettext
</CODE>,
<CODE>dgettext
</CODE>,
<CODE>dcgettext
</CODE>,
<CODE>ngettext
</CODE>,
2610 <CODE>dngettext
</CODE>,
<CODE>dcngettext
</CODE>
2614 <CODE>textdomain
</CODE> function
2618 <CODE>bindtextdomain
</CODE> function
2620 <DT>bind_textdomain_codeset
2622 <CODE>bind_textdomain_codeset
</CODE> function
2626 Use
<CODE>setlocale (LC_ALL,
"");
</CODE>
2630 <CODE>use POSIX;
</CODE>
2631 <BR><CODE>use Locale::TextDomain;
</CODE> (included in the package libintl-perl
2632 which is available on the Comprehensive Perl Archive Network CPAN,
2633 http://www.cpan.org/).
2635 <DT>Use or emulate GNU gettext
2637 platform dependent: gettext_pp emulates, gettext_xs uses GNU gettext
<CODE>xgettext -k__ -k\$__ -k%__ -k__x -k__n:1,2 -k__nx:1,2 -k__xn:1,2 -kN__ -k
</CODE>
2643 <DT>Formatting with positions
2645 Both kinds of format strings support formatting with positions.
2646 <BR><CODE>printf
"%2\$d %1\$d", ...
</CODE> (requires Perl
5.8.0 or newer)
<BR><CODE>__expand("[new] replaces [old]", old => $oldvalue, new => $newvalue)
</CODE>
2651 The
<CODE>libintl-perl
</CODE> package is platform independent but is not
2652 part of the Perl core. The programmer is responsible for
2653 providing a dummy implementation of the required functions if the
2654 package is not installed on the target system.
2662 Included in
<CODE>libintl-perl
</CODE>, available on CPAN
2663 (http://www.cpan.org/).
2668 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-perl
</CODE>.
2672 <A NAME=
"IDX1147"></A>
The <CODE>xgettext</CODE> parser backend for Perl differs significantly from
the parser backends for other programming languages, just as Perl
itself differs significantly from other programming languages. The
Perl parser backend offers many more string marking facilities than
the other backends, but it also has some Perl-specific limitations, the
most serious probably being that it is imperfect.
2687 <H4><A NAME=
"SEC271" HREF=
"gettext_toc.html#TOC271">13.5.18.1 General Problems Parsing Perl Code
</A></H4>
It is often heard that only Perl can parse Perl. This is not true.
Perl cannot be <EM>parsed</EM> at all; it can only be <EM>executed</EM>.
Perl has various built-in ambiguities that can only be resolved at runtime.
2696 The following example may illustrate one common problem:
<PRE>
print gettext "Hello World!";
</PRE>
2705 Although this example looks like a bullet-proof case of a function
2706 invocation, it is not:
<PRE>
open gettext, "&gt;testfile" or die;
print gettext "Hello world!"
</PRE>
2716 In this context, the string
<CODE>gettext
</CODE> looks more like a
2717 file handle. But not necessarily:
<PRE>
use Locale::Messages qw (:libintl_h);
open gettext "&gt;testfile" or die;
print gettext "Hello world!";
</PRE>
2728 Now, the file is probably syntactically incorrect, provided that the module
2729 <CODE>Locale::Messages
</CODE> found first in the Perl include path exports a
2730 function
<CODE>gettext
</CODE>. But what if the module
2731 <CODE>Locale::Messages
</CODE> really looks like this?
2736 use vars qw (*gettext);
2742 In this case, the string
<CODE>gettext
</CODE> will be interpreted as a file
2743 handle again, and the above example will create a file
<TT>`testfile
´</TT>
2744 and write the string
"Hello world!" into it. Even advanced
2745 control flow analysis will not really help:
2750 if (
0.5 < rand) {
2755 print gettext
"Hello world!";
2759 If the module
<CODE>Sane
</CODE> exports a function
<CODE>gettext
</CODE> that does
2760 what we expect, and the module
<CODE>InSane
</CODE> opens a file for writing
2761 and associates the
<EM>handle
</EM> <CODE>gettext
</CODE> with this output
2762 stream, we are clueless again about what will happen at runtime. It is
2763 completely unpredictable. The truth is that Perl has so many ways to
2764 fill its symbol table at runtime that it is impossible to interpret a
2765 particular piece of code without executing it.
2769 Of course,
<CODE>xgettext
</CODE> will not execute your Perl sources while
2770 scanning for translatable strings, but rather use heuristics in order
2771 to guess what you meant.
2775 Another problem is the ambiguity of the slash and the question mark.
2776 Their interpretation depends on the context:
2782 print
"OK\n" if /foobar/;
2787 # Another pattern match.
2788 print
"OK\n" if ?foobar?;
2791 print $x ?
"foo" :
"bar";
2795 The slash may either act as the division operator or introduce a
2796 pattern match, whereas the question mark may act as the ternary
2797 conditional operator or as a pattern match, too. Other programming
2798 languages like
<CODE>awk
</CODE> present similar problems, but the consequences of a
2799 misinterpretation are particularly nasty with Perl sources. In
<CODE>awk
</CODE>
2800 for instance, a statement can never exceed one line and the parser
2801 can recover from a parsing error at the next newline and interpret
2802 the rest of the input stream correctly. Perl is different, as a
2803 pattern match is terminated by the next appearance of the delimiter
2804 (the slash or the question mark) in the input stream, regardless of
2805 the semantic context. If a slash is really a division sign but
2806 mis-interpreted as a pattern match, the rest of the input file is most
2807 probably parsed incorrectly.
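<P>
The context dependence described above can be seen in a few lines of
plain Perl. This is only an illustrative sketch; the variable names are
made up for the example, and it says nothing about how
<CODE>xgettext</CODE> itself resolves the ambiguity.
</P>

```perl
use strict;
use warnings;

my ($total, $count) = (10, 4);

# After a term (here a variable), the slash is the division operator.
my $average = $total / $count;

# Where Perl expects a term, the slash starts a pattern match on $_,
# and the question mark that follows is the ternary operator.
local $_ = "foobar";
my $matched = /foo/ ? "yes" : "no";

print "average=$average matched=$matched\n";
```

<P>
A scanner that does not track whether a term or an operator is expected
at each slash has no way to tell these two statements apart.
</P>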
If you find that <CODE>xgettext</CODE> fails to extract strings from
portions of your sources, you should therefore look out for slashes
and/or question marks preceding these sections. You may have come
across a bug in <CODE>xgettext</CODE>'s Perl parser (and of course you
should report that bug). In the meantime you should consider
reformulating your code in a manner less challenging to <CODE>xgettext</CODE>.
2821 <H4><A NAME=
"SEC272" HREF=
"gettext_toc.html#TOC272">13.5.18.2 Which keywords will xgettext look for?
</A></H4>
2823 <A NAME=
"IDX1148"></A>
2827 Unless you instruct
<CODE>xgettext
</CODE> otherwise by invoking it with one
2828 of the options
<CODE>--keyword
</CODE> or
<CODE>-k
</CODE>, it will recognize the
2829 following keywords in your Perl sources:
2835 <LI><CODE>gettext
</CODE>
2837 <LI><CODE>dgettext
</CODE>
2839 <LI><CODE>dcgettext
</CODE>
2841 <LI><CODE>ngettext:
1,
2</CODE>
2843 The first (singular) and the second (plural) argument will be
2846 <LI><CODE>dngettext:
1,
2</CODE>
2848 The first (singular) and the second (plural) argument will be
2851 <LI><CODE>dcngettext:
1,
2</CODE>
2853 The first (singular) and the second (plural) argument will be
2856 <LI><CODE>gettext_noop
</CODE>
2858 <LI><CODE>%gettext
</CODE>
2860 The keys of lookups into the hash
<CODE>%gettext
</CODE> will be extracted.
2862 <LI><CODE>$gettext
</CODE>
2864 The keys of lookups into the hash reference
<CODE>$gettext
</CODE> will be extracted.
2870 <H4><A NAME=
"SEC273" HREF=
"gettext_toc.html#TOC273">13.5.18.3 How to Extract Hash Keys
</A></H4>
2872 <A NAME=
"IDX1149"></A>
2876 Translating messages at runtime is normally performed by looking up the
2877 original string in the translation database and returning the
2878 translated version. The
"natural" Perl implementation is a hash
2879 lookup, and, of course,
<CODE>xgettext
</CODE> supports such practice.
<PRE>
print __ "Hello world!";
print $__{"Hello world!"};
print $__-&gt;{"Hello world!"};
print $$__{"Hello world!"};
</PRE>
2891 The above four lines all do the same thing. The Perl module
2892 <CODE>Locale::TextDomain
</CODE> exports by default a hash
<CODE>%__
</CODE> that
2893 is tied to the function
<CODE>__()
</CODE>. It also exports a reference
2894 <CODE>$__
</CODE> to
<CODE>%__
</CODE>.
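<P>
How a hash can be "tied to a function" may be easier to see in a small
self-contained sketch. The package <CODE>TranslatingHash</CODE>, the
function <CODE>my_gettext</CODE> and the one-entry catalog are
hypothetical names invented for this illustration; this is not the
actual implementation used by <CODE>Locale::TextDomain</CODE>.
</P>

```perl
use strict;
use warnings;

# Hypothetical one-entry catalog and lookup function standing in for
# a real translation lookup.
my %catalog = ("Hello world!" => "Hallo Welt!");

sub my_gettext {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

# A minimal tie class: every read access calls the stored function.
package TranslatingHash;

sub TIEHASH {
    my ($class, $lookup) = @_;
    return bless({ lookup => $lookup }, $class);
}

sub FETCH {
    my ($self, $key) = @_;
    return $self->{lookup}->($key);
}

package main;

tie my %gettext, 'TranslatingHash', \&my_gettext;
my $gettext = \%gettext;    # reference to the tied hash

print $gettext{"Hello world!"}, "\n";     # lookup through the hash
print $gettext->{"Hello world!"}, "\n";   # lookup through the reference
print "$gettext{'Hello world!'}\n";       # lookup inside an interpolated string
```

<P>
All three print statements go through <CODE>FETCH</CODE> and therefore
through the lookup function, which is what makes the hash notation
usable inside interpolated strings.
</P>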
If an argument to the <CODE>xgettext</CODE> option <CODE>--keyword</CODE>
or <CODE>-k</CODE> starts with a percent sign, the rest of the keyword is
interpreted as the name of a hash. If it starts with a dollar
sign, the rest of the keyword is interpreted as a reference to a hash.
2906 Note that you can omit the quotation marks (single or double) around
2907 the hash key (almost) whenever Perl itself allows it:
2912 print $gettext{Error};
The exact rule is: you can omit the surrounding quotes when the hash
key is a valid C (!) identifier, i.e. when it starts with an
underscore or an ASCII letter and is followed by an arbitrary number
of underscores, ASCII letters or digits. Other Unicode characters
are <EM>not</EM> allowed, regardless of the <CODE>use utf8</CODE> pragma.
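<P>
The rule can be written down as a regular expression. The helper name
<CODE>key_may_be_unquoted</CODE> is an assumption made for this sketch;
it restates the rule given above and is not taken from the
<CODE>xgettext</CODE> sources.
</P>

```perl
use strict;
use warnings;

# Returns 1 if the key is a valid C identifier (ASCII only), else 0.
sub key_may_be_unquoted {
    my ($key) = @_;
    return $key =~ /\A[A-Za-z_][A-Za-z0-9_]*\z/ ? 1 : 0;
}

for my $key ("Error", "_private", "file_2", "2nd", "not-found", "Hello world!") {
    printf "%-14s %s\n", $key,
        key_may_be_unquoted($key) ? "may be unquoted" : "needs quotes";
}
```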
2925 <H4><A NAME=
"SEC274" HREF=
"gettext_toc.html#TOC274">13.5.18.4 What are Strings And Quote-like Expressions?
</A></H4>
2927 <A NAME=
"IDX1150"></A>
2931 Perl offers a plethora of different string constructs. Those that can
2932 be used either as arguments to functions or inside braces for hash
2933 lookups are generally supported by
<CODE>xgettext
</CODE>.
2938 <LI><STRONG>double-quoted strings
</STRONG>
2943 print gettext
"Hello World!";
2946 <LI><STRONG>single-quoted strings
</STRONG>
2951 print gettext 'Hello World!';
2954 <LI><STRONG>the operator qq
</STRONG>
2959 print gettext qq |Hello World!|;
2960 print gettext qq
<E-mail:
<guido\@imperia.net
>>;
2963 The operator
<CODE>qq
</CODE> is fully supported. You can use arbitrary
2964 delimiters, including the four bracketing delimiters (round, angle,
2965 square, curly) that nest.
2967 <LI><STRONG>the operator q
</STRONG>
2972 print gettext q |Hello World!|;
2973 print gettext q
<E-mail:
<guido@imperia.net
>>;
2976 The operator
<CODE>q
</CODE> is fully supported. You can use arbitrary
2977 delimiters, including the four bracketing delimiters (round, angle,
2978 square, curly) that nest.
2980 <LI><STRONG>the operator qx
</STRONG>
2985 print gettext qx ;LANGUAGE=C /bin/date;
2986 print gettext qx [/usr/bin/ls | grep '^[A-Z]*'];
2989 The operator
<CODE>qx
</CODE> is fully supported. You can use arbitrary
2990 delimiters, including the four bracketing delimiters (round, angle,
2991 square, curly) that nest.
2993 The example is actually a useless use of
<CODE>gettext
</CODE>. It will
2994 invoke the
<CODE>gettext
</CODE> function on the output of the command
2995 specified with the
<CODE>qx
</CODE> operator. The feature was included
2996 in order to make the interface consistent (the parser will extract
2997 all strings and quote-like expressions).
2999 <LI><STRONG>here documents
</STRONG>
3004 print gettext
<<'EOF';
3005 program not found in $PATH
3008 print ngettext
<<EOF,
<<"EOF";
3011 several files deleted
3015 Here-documents are recognized. If the delimiter is enclosed in single
3016 quotes, the string is not interpolated. If it is enclosed in double
3017 quotes or has no quotes at all, the string is interpolated.
3019 Delimiters that start with a digit are not supported!
3025 <H4><A NAME=
"SEC275" HREF=
"gettext_toc.html#TOC275">13.5.18.5 Invalid Uses Of String Interpolation
</A></H4>
3027 <A NAME=
"IDX1151"></A>
3031 Perl is capable of interpolating variables into strings. This offers
3032 some nice features in localized programs but can also lead to
3037 A common error is a construct like the following:
<PRE>
print gettext "This is the program $0!\n";
</PRE>
3046 Perl will interpolate at runtime the value of the variable
<CODE>$
0</CODE>
3047 into the argument of the
<CODE>gettext()
</CODE> function. Hence, this
3048 argument is not a string constant but a variable argument (
<CODE>$
0</CODE>
3049 is a global variable that holds the name of the Perl script being
3050 executed). The interpolation is performed by Perl before the string
3051 argument is passed to
<CODE>gettext()
</CODE> and will therefore depend on
3052 the name of the script which can only be determined at runtime.
3053 Consequently, it is almost impossible that a translation can be looked
3054 up at runtime (except if, by accident, the interpolated string is found
3055 in the message catalog).
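<P>
The effect can be reproduced without any gettext installation. In the
following sketch, <CODE>lookup</CODE> and the one-entry catalog are
hypothetical stand-ins for <CODE>gettext()</CODE> and a message catalog,
and <CODE>$script</CODE> stands in for <CODE>$0</CODE>.
</P>

```perl
use strict;
use warnings;

# Hypothetical catalog; lookup() returns the msgid itself when no
# translation is found.
my %catalog = ("This is the program %s!\n" => "Dies ist das Programm %s!\n");

sub lookup {
    my ($msgid) = @_;
    return exists $catalog{$msgid} ? $catalog{$msgid} : $msgid;
}

my $script = "frobnicate.pl";    # stands in for $0

# Interpolation happens before the call, so the msgid depends on the
# script name and is not found in the catalog.
my $interpolated = lookup("This is the program $script!\n");

# A constant msgid with a placeholder is found first and filled in later.
my $formatted = sprintf lookup("This is the program %s!\n"), $script;

print $interpolated, $formatted;
```

<P>
Only the second form yields a translated message, because only there is
the string passed to the lookup known before the program runs.
</P>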
3059 The
<CODE>xgettext
</CODE> program will therefore terminate parsing with a fatal
3060 error if it encounters a variable inside of an extracted string. In
3061 general, this will happen for all kinds of string interpolations that
3062 cannot be safely performed at compile time. If you absolutely know
3063 what you are doing, you can always circumvent this behavior:
<PRE>
my $know_what_i_am_doing = "This is program $0!\n";
print gettext $know_what_i_am_doing;
</PRE>
3073 Since the parser only recognizes strings and quote-like expressions,
3074 but not variables or other terms, the above construct will be
3075 accepted. You will have to find another way, however, to let your
3076 original string make it into your message catalog.
If invoked with the option <CODE>--extract-all</CODE> or <CODE>-a</CODE>,
variable interpolation will be accepted. Rationale: You will
generally use this option in order to prepare your sources for
internationalization.
3087 Please see the manual page
<SAMP>`man perlop
´</SAMP> for details of strings and
3088 quote-like expressions that are subject to interpolation and those
3089 that are not. Safe interpolations (that will not lead to a fatal
3096 <LI>the escape sequences
<CODE>\t
</CODE> (tab, HT, TAB),
<CODE>\n
</CODE>
3098 (newline, NL),
<CODE>\r
</CODE> (return, CR),
<CODE>\f
</CODE> (form feed, FF),
3099 <CODE>\b
</CODE> (backspace, BS),
<CODE>\a
</CODE> (alarm, bell, BEL), and
<CODE>\e
</CODE>
3102 <LI>octal chars, like
<CODE>\
033</CODE>
3105 Note that octal escapes in the range of
400-
777 are translated into a
3106 UTF-
8 representation, regardless of the presence of the
<CODE>use utf8
</CODE> pragma.
3108 <LI>hex chars, like
<CODE>\x1b
</CODE>
3110 <LI>wide hex chars, like
<CODE>\x{
263a}
</CODE>
3113 Note that this escape is translated into a UTF-
8 representation,
3114 regardless of the presence of the
<CODE>use utf8
</CODE> pragma.
3116 <LI>control chars, like
<CODE>\c[
</CODE> (CTRL-[)
3118 <LI>named Unicode chars, like
<CODE>\N{LATIN CAPITAL LETTER C WITH CEDILLA}
</CODE>
3121 Note that this escape is translated into a UTF-
8 representation,
3122 regardless of the presence of the
<CODE>use utf8
</CODE> pragma.
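<P>
Several of the escapes listed above are different spellings of the same
character, which is why they can be resolved without running the
program. The following check uses plain Perl only and does not involve
<CODE>xgettext</CODE>; the variable names are chosen for the example.
</P>

```perl
use strict;
use warnings;

# Four spellings of the ESC character (code point 27).
my @escapes = ("\e", "\033", "\x1b", "\c[");
print join(" ", map { ord } @escapes), "\n";

# A wide hex escape denotes a single character above code point 255.
my $smiley = "\x{263a}";
printf "U+%04X, length %d\n", ord($smiley), length($smiley);
```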
3126 The following escapes are considered partially safe:
3132 <LI><CODE>\l
</CODE> lowercase next char
3134 <LI><CODE>\u
</CODE> uppercase next char
3136 <LI><CODE>\L
</CODE> lowercase till \E
3138 <LI><CODE>\U
</CODE> uppercase till \E
3140 <LI><CODE>\E
</CODE> end case modification
3142 <LI><CODE>\Q
</CODE> quote non-word characters till \E
3147 These escapes are only considered safe if the string consists of
3148 ASCII characters only. Translation of characters outside the range
3149 defined by ASCII is locale-dependent and can actually only be performed
3150 at runtime;
<CODE>xgettext
</CODE> doesn't do these locale-dependent translations
3155 Except for the modifier
<CODE>\Q
</CODE>, these translations, albeit valid,
3156 are generally useless and only obfuscate your sources. If a
3157 translation can be safely performed at compile time you can just as
3158 well write what you mean.
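<P>
For ASCII-only strings the result of each case-modifying escape is fixed,
as this short plain-Perl sketch shows. The variable names are arbitrary;
each assignment could equally have been written as the literal named in
the comment.
</P>

```perl
use strict;
use warnings;

my $upper_first = "\uhello";           # same as "Hello"
my $lower_first = "\lHello";           # same as "hello"
my $lower_span  = "\LHELLO\E world";   # same as "hello world"
my $upper_span  = "\UHello\E";         # same as "HELLO"
my $quoted      = "\Qa.b\E";           # same as 'a\.b'

print "$upper_first $lower_first $lower_span $upper_span $quoted\n";
```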
3163 <H4><A NAME=
"SEC276" HREF=
"gettext_toc.html#TOC276">13.5.18.6 Valid Uses Of String Interpolation
</A></H4>
3165 <A NAME=
"IDX1152"></A>
Perl is often used to generate sources for other programming languages
or arbitrary file formats. Web applications that output HTML code
are a prominent example of such usage.
3175 You will often come across situations where you want to intersperse
3176 code written in the target (programming) language with translatable
3177 messages, like in the following HTML example:
3182 print gettext
<<EOF;
3183 <h1
>My Homepage
</h1
>
3184 <script
language=
"JavaScript"><!--
3185 for (i =
0; i
< 100; ++i) {
3186 alert (
"Thank you so much for visiting my homepage!");
3188 //--
></script
>
The parser will extract the entire here document, and it will appear
entirely in the resulting PO file, including the JavaScript snippet
embedded in the HTML code. If you overuse constructs like
the above, you run the risk that the translators of your package
will look for a less challenging project. You should consider an
alternative expression here:
3203 print
<<EOF;
3204 <h1
>$gettext{
"My Homepage"}
</h1
>
3205 <script
language=
"JavaScript"><!--
3206 for (i =
0; i
< 100; ++i) {
3207 alert (
"$gettext{'Thank you so much for visiting my homepage!'}");
3209 //--
></script
>
Only the translatable portions of the code will be extracted here, and
the resulting PO file will become considerably more readable.
3219 You can interpolate hash lookups in all strings or quote-like
3220 expressions that are subject to interpolation (see the manual page
3221 <SAMP>`man perlop
´</SAMP> for details). Double interpolation is invalid, however:
<PRE>
# TRANSLATORS: Replace "the earth" with the name of your planet.
print gettext qq{Welcome to $gettext-&gt;{"the earth"}};
</PRE>
The <CODE>qq</CODE>-quoted string is recognized as an argument to
<CODE>gettext</CODE> in the first place, and checked for invalid variable
interpolation. The dollar sign of the hash dereference will therefore
terminate the parser with an "invalid interpolation" error.
3238 It is valid to interpolate hash lookups in regular expressions:
3243 if ($var =~ /$gettext{
"the earth"}/) {
3244 print gettext
"Match!\n";
3246 s/$gettext{
"U. S. A."}/$gettext{
"U. S. A."} $gettext{
"(dial +0)"}/g;
3251 <H4><A NAME=
"SEC277" HREF=
"gettext_toc.html#TOC277">13.5.18.7 When To Use Parentheses
</A></H4>
3253 <A NAME=
"IDX1153"></A>
In Perl, parentheses around function arguments are mostly optional.
<CODE>xgettext</CODE> will always assume that all
recognized keywords (except for hashes and hash references) are names
of properly prototyped functions, and will (hopefully) only require
parentheses where Perl itself requires them. All constructs in the
following example are therefore OK to use:
3267 print gettext (
"Hello World!\n");
3268 print gettext
"Hello World!\n";
3269 print dgettext ($package =
> "Hello World!\n");
3270 print dgettext $package,
"Hello World!\n";
3272 # The
"fat comma" =
> turns the left-hand side argument into a
3273 # single-quoted string!
3274 print dgettext smellovision =
> "Hello World!\n";
3276 # The following assignment only works with prototyped functions.
3277 # Otherwise, the functions will act as
"greedy" list operators and
3278 # eat up all following arguments.
3279 my $anonymous_hash = {
3280 planet =
> gettext
"earth",
3281 cakes =
> ngettext
"one cake",
"several cakes", $n,
3282 still =
> $works,
3284 # The same without fat comma:
3286 'planet', gettext
"earth",
3287 'cakes', ngettext
"one cake",
"several cakes", $n,
3291 # Parentheses are only significant for the first argument.
3292 print dngettext 'package', (
"one cake",
"several cakes", $n), $discarded;
3297 <H4><A NAME=
"SEC278" HREF=
"gettext_toc.html#TOC278">13.5.18.8 How To Grok with Long Lines
</A></H4>
3299 <A NAME=
"IDX1154"></A>
3303 The necessity of long messages can often lead to a cumbersome or
3304 unreadable coding style. Perl has several options that may prevent
3305 you from writing unreadable code, and
3306 <CODE>xgettext
</CODE> does its best to do likewise. This is where the dot
3307 operator (the string concatenation operator) may come in handy:
3312 print gettext (
"This is a very long"
3313 .
" message that is still"
3314 .
" readable, because"
3315 .
" it is split into"
3316 .
" multiple lines.\n");
3320 Perl is smart enough to concatenate these constant string fragments
3321 into one long string at compile time, and so is
3322 <CODE>xgettext
</CODE>. You will only find one long message in the resulting
3327 Note that the future Perl
6 will probably use the underscore
3328 (
<SAMP>`_
´</SAMP>) as the string concatenation operator, and the dot
3329 (
<SAMP>`.
´</SAMP>) for dereferencing. This new syntax is not yet supported by
3330 <CODE>xgettext
</CODE>.
3334 If embedded newline characters are not an issue, or even desired, you
3335 may also insert newline characters inside quoted strings wherever you
3341 print gettext (
"<em>In HTML output
3342 embedded newlines are generally no
3343 problem, since adjacent whitespace
3344 is always rendered into a single
3345 space character.</em>");
You may also consider using here documents:
3354 print gettext
<<EOF;
3355 <em
>In HTML output
3356 embedded newlines are generally no
3357 problem, since adjacent whitespace
3358 is always rendered into a single
3359 space character.
</em
>
Please do not forget that the line breaks are real, i.e. they
translate into newline characters that will consequently show up in
the resulting POT file.
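<P>
That the line breaks are part of the string is easy to verify in plain
Perl. The sketch below uses a shortened version of the message and only
counts newline characters; the variable names are chosen for the example.
</P>

```perl
use strict;
use warnings;

my $msgid = <<EOF;
<em>In HTML output
embedded newlines are generally no
problem.</em>
EOF

# tr/// in counting mode returns the number of matching characters.
my $newlines = ($msgid =~ tr/\n//);
print "newline characters in msgid: $newlines\n";
```

<P>
Every one of these newline characters becomes part of the message that
translators see, including the final one before the terminator.
</P>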
3371 <H4><A NAME=
"SEC279" HREF=
"gettext_toc.html#TOC279">13.5.18.9 Bugs, Pitfalls, And Things That Do Not Work
</A></H4>
3373 <A NAME=
"IDX1155"></A>
The foregoing sections should have shown that
<CODE>xgettext</CODE> is quite smart in extracting translatable strings from
Perl sources. Yet some more or less exotic constructs that could be
expected to work actually do not.
3384 One of the more relevant limitations can be found in the
3385 implementation of variable interpolation inside quoted strings. Only
3386 simple hash lookups can be used there:
3391 print
<<EOF;
3392 $gettext{
"The dot operator"
3395 Likewise, you cannot @{[ gettext (
"interpolate function calls") ]}
3396 inside quoted strings or quote-like expressions.
3401 This is valid Perl code and will actually trigger invocations of the
3402 <CODE>gettext
</CODE> function at runtime. Yet, the Perl parser in
3403 <CODE>xgettext
</CODE> will fail to recognize the strings. A less obvious
3404 example can be found in the interpolation of regular expressions:
<PRE>
s/&lt;!--START_OF_WEEK--&gt;/gettext ("Sunday")/e;
</PRE>
3413 The modifier
<CODE>e
</CODE> will cause the substitution to be interpreted as
3414 an evaluable statement. Consequently, at runtime the function
3415 <CODE>gettext()
</CODE> is called, but again, the parser fails to extract the
3416 string
"Sunday". Use a temporary variable as a simple workaround if
3417 you really happen to need this feature:
<PRE>
my $sunday = gettext "Sunday";
s/&lt;!--START_OF_WEEK--&gt;/$sunday/;
</PRE>
3427 Hash slices would also be handy but are not recognized:
3432 my @weekdays = @gettext{'Sunday', 'Monday', 'Tuesday', 'Wednesday',
3433 'Thursday', 'Friday', 'Saturday'};
3435 @weekdays = @gettext{qw (Sunday Monday Tuesday Wednesday Thursday
3440 This is perfectly valid usage of the tied hash
<CODE>%gettext
</CODE> but the
3441 strings are not recognized and therefore will not be extracted.
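<P>
One possible way around the missing support for hash slices is to spell
out the individual lookups, since simple hash lookups are the form
described as supported in the section on hash keys. The tie class
<CODE>IdentityHash</CODE> below is a hypothetical stand-in that returns
each key unchanged, so that the sketch runs without a message catalog;
whether a given <CODE>xgettext</CODE> version extracts every key in such
a list has not been verified here.
</P>

```perl
use strict;
use warnings;

# Hypothetical tie class: a lookup returns the key itself.
package IdentityHash;

sub TIEHASH {
    my ($class) = @_;
    return bless({}, $class);
}

sub FETCH {
    my ($self, $key) = @_;
    return $key;
}

package main;

tie my %gettext, 'IdentityHash';

# Individual lookups instead of the slice @gettext{...}.
my @weekdays = (
    $gettext{Sunday},   $gettext{Monday}, $gettext{Tuesday},
    $gettext{Wednesday}, $gettext{Thursday}, $gettext{Friday},
    $gettext{Saturday},
);

print "@weekdays\n";
```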
Another caveat of the current version is its rudimentary support for
non-ASCII characters in identifiers. You may encounter serious
problems if you use identifiers with characters outside the range of
'A'-'Z', 'a'-'z', '0'-'9' and the underscore '_'.
3452 Maybe some of these missing features will be implemented in future
3453 versions, but since you can always make do without them at minimal effort,
3454 these todos have very low priority.
Brace format strings that already contain braces as part of the normal
text are a nasty problem, for example the usage strings typically
encountered in programs:
<PRE>
die "usage: $0 {OPTIONS} FILENAME...\n";
</PRE>
3469 If you want to internationalize this code with Perl brace format strings,
3470 you will run into a problem:
<PRE>
die __x ("usage: {program} {OPTIONS} FILENAME...\n", program =&gt; $0);
</PRE>
3479 Whereas
<SAMP>`{program}
´</SAMP> is a placeholder,
<SAMP>`{OPTIONS}
´</SAMP>
3480 is not and should probably be translated. Yet, there is no way to teach
3481 the Perl parser in
<CODE>xgettext
</CODE> to recognize the first one, and leave
3482 the other one alone.
There are two possible workarounds for this problem. If you are
sure that your program will run under Perl 5.8.0 or newer (these
Perl versions handle positional parameters in <CODE>printf()</CODE>), or
if you are sure that the translator will not have to reorder the arguments
in her translation (for example if you have only one brace placeholder
in your string, or if it describes a syntax, as in this one), you can
mark the string as <CODE>no-perl-brace-format</CODE> and use <CODE>printf()</CODE>:
<PRE>
# xgettext: no-perl-brace-format
die sprintf ("usage: %s {OPTIONS} FILENAME...\n", $0);
</PRE>
If you want to use the more portable Perl brace format, you will have to
put placeholders in place of the literal braces:
<PRE>
die __x ("usage: {program} {[}OPTIONS{]} FILENAME...\n",
         program =&gt; $0, '[' =&gt; '{', ']' =&gt; '}');
</PRE>
Perl brace format strings have no escaping mechanism. No matter what
such an escaping mechanism looked like, it would either give the programmer a
hard time, make translating Perl brace format strings heavy-going, or
result in a performance penalty at runtime, when the format directives
get executed. Most of the time you will get along fine with
<CODE>printf()</CODE> for this special case.
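<P>
To see why the placeholder trick works, it helps to look at a simplified
expander. The function <CODE>expand_braces</CODE> below is a hypothetical
stand-in written for this illustration, and <CODE>"frob"</CODE> stands in
for <CODE>$0</CODE>; it is not the implementation of <CODE>__x</CODE> in
<CODE>libintl-perl</CODE>.
</P>

```perl
use strict;
use warnings;

# Replace each {name} by the value passed for name; placeholders
# without a value are left untouched in this sketch.
sub expand_braces {
    my ($format, %args) = @_;
    $format =~ s/\{([^{}]+)\}/exists $args{$1} ? $args{$1} : '{' . $1 . '}'/ge;
    return $format;
}

my $usage = expand_braces(
    "usage: {program} {[}OPTIONS{]} FILENAME...\n",
    program => "frob",    # stands in for $0
    '['     => '{',
    ']'     => '}',
);

print $usage;
```

<P>
The placeholders <CODE>{[}</CODE> and <CODE>{]}</CODE> expand to literal
braces, and because the substituted text is not scanned again, the
resulting <CODE>{OPTIONS}</CODE> is not treated as a placeholder.
</P>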
3523 <H3><A NAME=
"SEC280" HREF=
"gettext_toc.html#TOC280">13.5.19 PHP Hypertext Preprocessor
</A></H3>
3525 <A NAME=
"IDX1156"></A>
3532 mod_php4, mod_php4-core, phpdoc
3536 <CODE>php
</CODE>,
<CODE>php3
</CODE>,
<CODE>php4
</CODE>
3540 <CODE>"abc"</CODE>,
<CODE>'abc'
</CODE>
3542 <DT>gettext shorthand
3544 <CODE>_(
"abc")
</CODE>
3546 <DT>gettext/ngettext functions
3548 <CODE>gettext
</CODE>,
<CODE>dgettext
</CODE>,
<CODE>dcgettext
</CODE>; starting with PHP
4.2.0
3549 also
<CODE>ngettext
</CODE>,
<CODE>dngettext
</CODE>,
<CODE>dcngettext
</CODE>
3553 <CODE>textdomain
</CODE> function
3557 <CODE>bindtextdomain
</CODE> function
3561 Programmer must call
<CODE>setlocale (LC_ALL,
"")
</CODE>
3567 <DT>Use or emulate GNU gettext
3573 <CODE>xgettext
</CODE>
3575 <DT>Formatting with positions
3577 <CODE>printf
"%2\$d %1\$d"</CODE>
3581 On platforms without gettext, the functions are not available.
3589 An example is available in the
<TT>`examples
´</TT> directory:
<CODE>hello-php
</CODE>.
3594 <H3><A NAME=
"SEC281" HREF=
"gettext_toc.html#TOC281">13.5.20 Pike
</A></H3>
3596 <A NAME=
"IDX1157"></A>
3613 <DT>gettext shorthand
3617 <DT>gettext/ngettext functions
3619 <CODE>gettext
</CODE>,
<CODE>dgettext
</CODE>,
<CODE>dcgettext
</CODE>
3623 <CODE>textdomain
</CODE> function
3627 <CODE>bindtextdomain
</CODE> function
3631 <CODE>setlocale
</CODE> function
3635 <CODE>import Locale.Gettext;
</CODE>
3637 <DT>Use or emulate GNU gettext
3645 <DT>Formatting with positions
3651 On platforms without gettext, the functions are not available.
3660 <H3><A NAME=
"SEC282" HREF=
"gettext_toc.html#TOC282">13.5.21 GNU Compiler Collection sources
</A></H3>
3662 <A NAME=
"IDX1158"></A>
3673 <CODE>c
</CODE>,
<CODE>h
</CODE>.
3679 <DT>gettext shorthand
3681 <CODE>_(
"abc")
</CODE>
3683 <DT>gettext/ngettext functions
3685 <CODE>gettext
</CODE>,
<CODE>dgettext
</CODE>,
<CODE>dcgettext
</CODE>,
<CODE>ngettext
</CODE>,
3686 <CODE>dngettext
</CODE>,
<CODE>dcngettext
</CODE>
3690 <CODE>textdomain
</CODE> function
3694 <CODE>bindtextdomain
</CODE> function
3698 Programmer must call
<CODE>setlocale (LC_ALL,
"")
</CODE>
3702 <CODE>#include
"intl.h"</CODE>
3704 <DT>Use or emulate GNU gettext
3710 <CODE>xgettext -k_
</CODE>
3712 <DT>Formatting with positions
3718 Uses autoconf macros
3727 <H2><A NAME=
"SEC283" HREF=
"gettext_toc.html#TOC283">13.6 Internationalizable Data
</A></H2>
3730 Here is a list of other data formats which can be internationalized
3737 <H3><A NAME=
"SEC284" HREF=
"gettext_toc.html#TOC284">13.6.1 POT - Portable Object Template
</A></H3>
3747 <CODE>pot
</CODE>,
<CODE>po
</CODE>
3751 <CODE>xgettext
</CODE>
3756 <H3><A NAME=
"SEC285" HREF=
"gettext_toc.html#TOC285">13.6.2 Resource String Table
</A></H3>
3758 <A NAME=
"IDX1159"></A>
3773 <CODE>xgettext
</CODE>,
<CODE>rstconv
</CODE>
3778 <H3><A NAME=
"SEC286" HREF=
"gettext_toc.html#TOC286">13.6.3 Glade - GNOME user interface description
</A></H3>
3784 glade, libglade, glade2, libglade2, intltool
3788 <CODE>glade
</CODE>,
<CODE>glade2
</CODE>
3792 <CODE>xgettext
</CODE>,
<CODE>libglade-xgettext
</CODE>,
<CODE>xml-i18n-extract
</CODE>,
<CODE>intltool-extract
</CODE>