<!-- This HTML file has been created by texi2html 1.52a
from gettext.texi on 11 April 2005 -->
<TITLE>GNU gettext utilities - 2 PO Files and PO Mode Basics</TITLE>
Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_1.html">previous</A>, <A HREF="gettext_3.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
<H1><A NAME="SEC7" HREF="gettext_toc.html#TOC7">2 PO Files and PO Mode Basics</A></H1>
The GNU <CODE>gettext</CODE> toolset helps programmers and translators
produce, update and use translation files, mainly PO files, which are
textual, editable files. This chapter explains the format of PO files
and contains a PO mode starter. The description of PO mode is spread
throughout this manual instead of being concentrated in one place;
here we present only the basics.
<H2><A NAME="SEC8" HREF="gettext_toc.html#TOC8">2.1 Completing GNU <CODE>gettext</CODE> Installation</A></H2>
Once you have received, unpacked, configured and compiled the GNU
<CODE>gettext</CODE> distribution, the <SAMP>`make install´</SAMP> command puts in
place the programs <CODE>xgettext</CODE>, <CODE>msgfmt</CODE>, <CODE>gettext</CODE>, and
<CODE>msgmerge</CODE>, as well as their available message catalogs. To
top off a comfortable installation, you might also want to make
PO mode available to your Emacs users.
During the installation of PO mode, you might want to modify your
file <TT>`.emacs´</TT>, once and for all, so it contains a few lines looking
like:

<PRE>
(setq auto-mode-alist
      (cons '("\\.po\\'\\|\\.po\\." . po-mode) auto-mode-alist))
(autoload 'po-mode "po-mode" "Major mode for translators to edit PO files" t)
</PRE>
Later, whenever you edit some <TT>`.po´</TT>
file, or any file having the string <SAMP>`.po.´</SAMP> within its name,
Emacs loads <TT>`po-mode.elc´</TT> (or <TT>`po-mode.el´</TT>) as needed, and
automatically activates PO mode commands for the associated buffer.
The string <EM>PO</EM> appears in the mode line for any buffer for
which PO mode is active. Many PO files may be active at once in a
single Emacs session.
If you are using Emacs version 20 or newer, and have already installed
the appropriate international fonts on your system, you may also tell
Emacs how to determine automatically the coding system of every PO file.
This will often (but not always) cause the necessary fonts to be loaded
and used for displaying the translations on your Emacs screen. For this
to happen, add the lines:

<PRE>
(modify-coding-system-alist 'file "\\.po\\'\\|\\.po\\."
                            'po-find-file-coding-system)
(autoload 'po-find-file-coding-system "po-mode")
</PRE>

to your <TT>`.emacs´</TT> file. If, with this, you still see boxes instead
of international characters, try a different font set (via Shift Mouse
button 1).
<H2><A NAME="SEC9" HREF="gettext_toc.html#TOC9">2.2 The Format of PO Files</A></H2>
A PO file is made up of many entries, each entry holding the relation
between an original untranslated string and its corresponding
translation. All entries in a given PO file usually pertain
to a single project, and all translations are expressed in a single
target language. One PO file <EM>entry</EM> has the following schematic
structure:

<PRE>
<VAR>white-space</VAR>
#  <VAR>translator-comments</VAR>
#. <VAR>automatic-comments</VAR>
#: <VAR>reference</VAR>...
#, <VAR>flag</VAR>...
msgid <VAR>untranslated-string</VAR>
msgstr <VAR>translated-string</VAR>
</PRE>
The general structure of a PO file should be well understood by
the translator. When using PO mode, very little has to be known
about the format details, as PO mode takes care of them for her.

A simple entry can look like this:

<PRE>
msgid "Unknown system error"
msgstr "Error desconegut del sistema"
</PRE>
Entries begin with some optional white space. Usually, when generated
through GNU <CODE>gettext</CODE> tools, there is exactly one blank line
between entries. Then comments follow, on lines all starting with the
character <CODE>#</CODE>. There are two kinds of comments: those which have
some white space immediately following the <CODE>#</CODE>, which are
created and maintained exclusively by the translator, and those which
have some non-white character just after the <CODE>#</CODE>, which
are created and maintained automatically by GNU <CODE>gettext</CODE> tools.
All comments, of either kind, are optional.
After white space and comments, entries show two strings: first
the untranslated string as it appears in the original program
sources, and then the translation of this string. The original
string is introduced by the keyword <CODE>msgid</CODE>, and the translation
by <CODE>msgstr</CODE>. The two strings, untranslated and translated,
are quoted in various ways in the PO file, using <CODE>"</CODE>
delimiters and <CODE>\</CODE> escapes, but the translator does not really
have to pay attention to the precise quoting format, as PO mode fully
takes care of quoting for her.
The <CODE>msgid</CODE> strings, as well as automatic comments, are produced
and managed by other GNU <CODE>gettext</CODE> tools, and PO mode does not
provide means for the translator to alter these. The most she can
do is delete them, and only by deleting the whole entry.
On the other hand, the <CODE>msgstr</CODE> string, as well as translator
comments, are really meant for the translator, and PO mode gives her
the full control she needs.

The comment lines beginning with <CODE>#,</CODE> are special because they are
not completely ignored by the programs as comments generally are. The
comma separated list of <VAR>flag</VAR>s is used by the <CODE>msgfmt</CODE>
program to give the user some better diagnostic messages. Currently
there are two forms of flags defined:
<DT><CODE>fuzzy</CODE>
This flag can be generated by the <CODE>msgmerge</CODE> program or it can be
inserted by the translator herself. It shows that the <CODE>msgstr</CODE>
string might not be a correct translation (anymore). Only the translator
can judge if the translation requires further modification, or is
acceptable as is. Once satisfied with the translation, she then removes
this <CODE>fuzzy</CODE> attribute. The <CODE>msgmerge</CODE> program inserts this
flag when it has combined the <CODE>msgid</CODE> and <CODE>msgstr</CODE> entries after
fuzzy search only. See section <A HREF="gettext_6.html#SEC51">6.3 Fuzzy Entries</A>.
<DT><CODE>c-format</CODE>
<DT><CODE>no-c-format</CODE>
These flags should not be added by a human. Instead only the
<CODE>xgettext</CODE> program adds them. In an automated PO file processing
system as proposed here, the user changes would be thrown away again as
soon as the <CODE>xgettext</CODE> program generates a new template file.

The <CODE>c-format</CODE> flag tells that the untranslated string and the
translation are supposed to be C format strings. The <CODE>no-c-format</CODE>
flag tells that they are not C format strings, even though the untranslated
string happens to look like a C format string (with <SAMP>`%´</SAMP> directives).

If the <CODE>c-format</CODE> flag is given for a string, the <CODE>msgfmt</CODE>
program does some more tests to check the validity of the translation.
See section <A HREF="gettext_8.html#SEC135">8.1 Invoking the <CODE>msgfmt</CODE> Program</A>, section <A HREF="gettext_3.html#SEC18">3.5 Special Comments preceding Keywords</A> and section <A HREF="gettext_13.html#SEC225">13.3.1 C Format Strings</A>.
<DT><CODE>objc-format</CODE>
<DT><CODE>no-objc-format</CODE>
Likewise for Objective C, see section <A HREF="gettext_13.html#SEC226">13.3.2 Objective C Format Strings</A>.
<DT><CODE>sh-format</CODE>
<DT><CODE>no-sh-format</CODE>
Likewise for Shell, see section <A HREF="gettext_13.html#SEC227">13.3.3 Shell Format Strings</A>.
<DT><CODE>python-format</CODE>
<DT><CODE>no-python-format</CODE>
Likewise for Python, see section <A HREF="gettext_13.html#SEC228">13.3.4 Python Format Strings</A>.
<DT><CODE>lisp-format</CODE>
<DT><CODE>no-lisp-format</CODE>
Likewise for Lisp, see section <A HREF="gettext_13.html#SEC229">13.3.5 Lisp Format Strings</A>.
<DT><CODE>elisp-format</CODE>
<DT><CODE>no-elisp-format</CODE>
Likewise for Emacs Lisp, see section <A HREF="gettext_13.html#SEC230">13.3.6 Emacs Lisp Format Strings</A>.
<DT><CODE>librep-format</CODE>
<DT><CODE>no-librep-format</CODE>
Likewise for librep, see section <A HREF="gettext_13.html#SEC231">13.3.7 librep Format Strings</A>.
<DT><CODE>scheme-format</CODE>
<DT><CODE>no-scheme-format</CODE>
Likewise for Scheme, see section <A HREF="gettext_13.html#SEC232">13.3.8 Scheme Format Strings</A>.
<DT><CODE>smalltalk-format</CODE>
<DT><CODE>no-smalltalk-format</CODE>
Likewise for Smalltalk, see section <A HREF="gettext_13.html#SEC233">13.3.9 Smalltalk Format Strings</A>.
<DT><CODE>java-format</CODE>
<DT><CODE>no-java-format</CODE>
Likewise for Java, see section <A HREF="gettext_13.html#SEC234">13.3.10 Java Format Strings</A>.
<DT><CODE>csharp-format</CODE>
<DT><CODE>no-csharp-format</CODE>
Likewise for C#, see section <A HREF="gettext_13.html#SEC235">13.3.11 C# Format Strings</A>.
<DT><CODE>awk-format</CODE>
<DT><CODE>no-awk-format</CODE>
Likewise for awk, see section <A HREF="gettext_13.html#SEC236">13.3.12 awk Format Strings</A>.
<DT><CODE>object-pascal-format</CODE>
<DT><CODE>no-object-pascal-format</CODE>
Likewise for Object Pascal, see section <A HREF="gettext_13.html#SEC237">13.3.13 Object Pascal Format Strings</A>.
<DT><CODE>ycp-format</CODE>
<DT><CODE>no-ycp-format</CODE>
Likewise for YCP, see section <A HREF="gettext_13.html#SEC238">13.3.14 YCP Format Strings</A>.
<DT><CODE>tcl-format</CODE>
<DT><CODE>no-tcl-format</CODE>
Likewise for Tcl, see section <A HREF="gettext_13.html#SEC239">13.3.15 Tcl Format Strings</A>.
<DT><CODE>perl-format</CODE>
<DT><CODE>no-perl-format</CODE>
Likewise for Perl, see section <A HREF="gettext_13.html#SEC240">13.3.16 Perl Format Strings</A>.
<DT><CODE>perl-brace-format</CODE>
<DT><CODE>no-perl-brace-format</CODE>
Likewise for Perl brace, see section <A HREF="gettext_13.html#SEC240">13.3.16 Perl Format Strings</A>.
<DT><CODE>php-format</CODE>
<DT><CODE>no-php-format</CODE>
Likewise for PHP, see section <A HREF="gettext_13.html#SEC241">13.3.17 PHP Format Strings</A>.
<DT><CODE>gcc-internal-format</CODE>
<DT><CODE>no-gcc-internal-format</CODE>
Likewise for the GCC sources, see section <A HREF="gettext_13.html#SEC242">13.3.18 GCC internal Format Strings</A>.
<DT><CODE>qt-format</CODE>
<DT><CODE>no-qt-format</CODE>
Likewise for Qt, see section <A HREF="gettext_13.html#SEC243">13.3.19 Qt Format Strings</A>.
A different kind of entry is used for translations which involve
plural forms.

<PRE>
<VAR>white-space</VAR>
#  <VAR>translator-comments</VAR>
#. <VAR>automatic-comments</VAR>
#: <VAR>reference</VAR>...
#, <VAR>flag</VAR>...
msgid <VAR>untranslated-string-singular</VAR>
msgid_plural <VAR>untranslated-string-plural</VAR>
msgstr[0] <VAR>translated-string-case-0</VAR>
...
msgstr[N] <VAR>translated-string-case-n</VAR>
</PRE>

Such an entry can look like this:

<PRE>
#: src/msgcmp.c:338 src/po-lex.c:699
msgid "found %d fatal error"
msgid_plural "found %d fatal errors"
msgstr[0] "s'ha trobat %d error fatal"
msgstr[1] "s'han trobat %d errors fatals"
</PRE>
It sometimes happens that some lines, usually whitespace or comments,
follow the very last entry of a PO file. Such lines are not part of any entry,
and PO mode is unable to take action on those lines. By using the
PO mode function <KBD>M-x po-normalize</KBD>, the translator may get
rid of those spurious lines. See section <A HREF="gettext_2.html#SEC12">2.5 Normalizing Strings in Entries</A>.

The remainder of this section may be safely skipped by those using
PO mode, yet it may be interesting for everybody to have a better
idea of the precise format of a PO file. On the other hand, those
not having Emacs handy should read on carefully.
Each of <VAR>untranslated-string</VAR> and <VAR>translated-string</VAR> respects
the C syntax for a character string, including the surrounding quotes
and embedded backslashed escape sequences. When the time comes
to write multi-line strings, one should not use escaped newlines.
Instead, a closing quote should follow the last character on the
line to be continued, and an opening quote should resume the string
at the beginning of the following PO file line. For example:

<PRE>
msgid ""
"Here is an example of how one might continue a very long string\n"
"for the common case the string represents multi-line output.\n"
</PRE>
In this example, the empty string is used on the first line, to
allow better alignment of the <CODE>H</CODE> from the word <SAMP>`Here´</SAMP>
over the <CODE>f</CODE> from the word <SAMP>`for´</SAMP>. The
<CODE>msgid</CODE> keyword is followed by three strings, which are meant
to be concatenated. Concatenating the empty string does not change
the resulting overall string, but it is a way for us to comply with
the necessity of <CODE>msgid</CODE> to be followed by a string on the same
line, while keeping the multi-line presentation left-justified, as
we find this to be a cleaner disposition. The empty string could have
been omitted, but only if the string starting with <SAMP>`Here´</SAMP> were
moved up to the first line, right after <CODE>msgid</CODE>.<A NAME="DOCF2" HREF="gettext_foot.html#FOOT2">(2)</A> It was not really necessary
either to switch between the last two quoted strings immediately after
the newline <SAMP>`\n´</SAMP>; the switch could have occurred after <EM>any</EM>
other character. We just did it this way because it is neater.
One should carefully distinguish between end of lines marked as
<SAMP>`\n´</SAMP> <EM>inside</EM> quotes, which are part of the represented
string, and end of lines in the PO file itself, outside string quotes,
which have no effect on the represented string.
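
As a rough illustration of this distinction, the following Python sketch joins the quoted pieces of a PO string into the string they represent. The name <CODE>join_po_string</CODE> is hypothetical, and decoding with <CODE>ast.literal_eval</CODE> is an assumption that only holds for escapes Python shares with C (such as <CODE>\n</CODE>, <CODE>\t</CODE>, <CODE>\"</CODE> and <CODE>\\</CODE>); it is not how the GNU <CODE>gettext</CODE> tools are implemented.

```python
import ast


def join_po_string(lines):
    """Concatenate the quoted pieces found on consecutive PO file lines."""
    pieces = []
    for line in lines:
        start = line.find('"')
        if start == -1:
            continue  # no quoted piece on this line
        # Keep only the quoted part, dropping a keyword such as msgid.
        quoted = line[start:].strip()
        # Decode the backslash escapes inside the quotes (assumption:
        # only escapes that Python and C have in common are present).
        pieces.append(ast.literal_eval(quoted))
    return "".join(pieces)


entry_lines = [
    'msgid ""',
    r'"Here is an example of how one might continue a very long string\n"',
    r'"for the common case the string represents multi-line output.\n"',
]
text = join_po_string(entry_lines)
print(repr(text))
```

The line breaks between the three PO file lines leave no trace in the result; only the two <CODE>\n</CODE> escapes written inside the quotes become newlines.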
Outside strings, white lines and comments may be used freely.
Comments start at the beginning of a line with <SAMP>`#´</SAMP> and extend
until the end of the PO file line. Comments written by translators
should have the initial <SAMP>`#´</SAMP> immediately followed by some white
space. If the <SAMP>`#´</SAMP> is not immediately followed by white space,
this comment is most likely generated and managed by specialized GNU
tools, and might disappear or be replaced unexpectedly when the PO
file is given to <CODE>msgmerge</CODE>.
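
The comment conventions of this section can be sketched in Python as follows. The function names, the label strings and the sample comment texts are hypothetical, and treating a bare <SAMP>`#´</SAMP> line as a translator comment is an assumption; the sketch only mirrors the rule that the character after <SAMP>`#´</SAMP> decides who owns the comment, and that <SAMP>`#,´</SAMP> lines carry a comma separated list of flags.

```python
def comment_kind(line):
    """Classify one PO file line by the character that follows '#'."""
    if not line.startswith("#"):
        return None  # not a comment line
    if len(line) == 1 or line[1].isspace():
        # White space after '#': written and maintained by the translator.
        return "translator"
    # Any other character: generated and managed by the tools.
    kinds = {".": "automatic", ":": "reference", ",": "flag"}
    return kinds.get(line[1], "other generated")


def parse_flags(line):
    """Return the comma separated flags carried by a '#,' comment line."""
    if comment_kind(line) != "flag":
        return []
    return [flag.strip() for flag in line[2:].split(",") if flag.strip()]


sample = [
    "# checked with the maintainer",
    "#. automatically extracted note",
    "#: src/msgcmp.c:338 src/po-lex.c:699",
    "#, fuzzy, c-format",
    'msgid "Unknown system error"',
]
kinds = [comment_kind(line) for line in sample]
flags = parse_flags(sample[3])
is_fuzzy = "fuzzy" in flags
```

A tool built along these lines would leave <CODE>translator</CODE> lines untouched and feel free to regenerate the others, which is the behavior the text warns about for <CODE>msgmerge</CODE>.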
<H2><A NAME="SEC10" HREF="gettext_toc.html#TOC10">2.3 Main PO mode Commands</A></H2>
After setting up Emacs with something similar to the lines in
section <A HREF="gettext_2.html#SEC8">2.1 Completing GNU <CODE>gettext</CODE> Installation</A>, PO mode is activated for a window when Emacs finds a
PO file in that window. This makes the window read-only and establishes a
po-mode-map. PO mode is a genuine Emacs mode, not derived
from text mode in any way. Functions found on <CODE>po-mode-hook</CODE>,
if any, will be executed.
When PO mode is active in a window, the letters <SAMP>`PO´</SAMP> appear
in the mode line for that window. The mode line also displays how
many entries of each kind are held in the PO file. For example,
the string <SAMP>`132t+3f+10u+2o´</SAMP> would tell the translator that the
PO file contains 132 translated entries (see section <A HREF="gettext_6.html#SEC50">6.2 Translated Entries</A>),
3 fuzzy entries (see section <A HREF="gettext_6.html#SEC51">6.3 Fuzzy Entries</A>), 10 untranslated entries
(see section <A HREF="gettext_6.html#SEC52">6.4 Untranslated Entries</A>) and 2 obsolete entries (see section <A HREF="gettext_6.html#SEC53">6.5 Obsolete Entries</A>). Items with a zero count are not shown. So, in this example, if
the fuzzy entries were unfuzzied, the untranslated entries were translated
and the obsolete entries were deleted, the mode line would merely display
<SAMP>`145t´</SAMP> for the counters.
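
The counter string can be reproduced with a few lines of Python. This is only a sketch of the format as described here, not PO mode's own code; the function name is hypothetical, and returning an empty string when every count is zero is an assumption.

```python
def mode_line_counters(translated, fuzzy, untranslated, obsolete):
    """Format entry counts the way the mode line example shows them."""
    counts = [
        (translated, "t"),
        (fuzzy, "f"),
        (untranslated, "u"),
        (obsolete, "o"),
    ]
    # Items with a zero count are left out of the display.
    return "+".join(f"{count}{letter}" for count, letter in counts if count != 0)


before = mode_line_counters(132, 3, 10, 2)
# Unfuzzy 3 entries, translate 10 more, delete the 2 obsolete ones.
after = mode_line_counters(132 + 3 + 10, 0, 0, 0)
print(before, after)
```

The second call makes the arithmetic explicit: the 3 fuzzy and 10 untranslated entries join the 132 translated ones, giving 145, while the obsolete entries simply disappear.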
The main PO commands are those which do not fit into the other categories of
subsequent sections. These allow for quitting PO mode or for managing windows
in special ways.

<DT><KBD>_</KBD>
Undo last modification to the PO file (<CODE>po-undo</CODE>).
<DT><KBD>Q</KBD>
Quit processing and save the PO file (<CODE>po-quit</CODE>).
<DT><KBD>q</KBD>
Quit processing, possibly after confirmation (<CODE>po-confirm-and-quit</CODE>).
<DT><KBD>0</KBD>
Temporarily leave the PO file window (<CODE>po-other-window</CODE>).
<DT><KBD>?</KBD>
<DT><KBD>h</KBD>
<A NAME="IDX100"></A>
Show help about PO mode (<CODE>po-help</CODE>).
<DT><KBD>=</KBD>
<A NAME="IDX101"></A>
Give some PO file statistics (<CODE>po-statistics</CODE>).
<DT><KBD>V</KBD>
<A NAME="IDX102"></A>
Batch validate the format of the whole PO file (<CODE>po-validate</CODE>).
<A NAME="IDX103"></A>
<A NAME="IDX104"></A>
The command <KBD>_</KBD> (<CODE>po-undo</CODE>) interfaces to the Emacs
<EM>undo</EM> facility. See section `Undoing Changes' in <CITE>The Emacs Editor</CITE>.
Each time <KBD>_</KBD> is typed, modifications which the translator
made to the PO file are undone a little more. For the purpose of
undoing, each PO mode command is atomic. This is especially true for
the <KBD>RET</KBD> command: the whole edit made by a single
use of this command is undone at once, even if the edit itself
implied several actions. However, while in the editing window, one
can undo the editing work in much finer steps.
<A NAME="IDX105"></A>
<A NAME="IDX106"></A>
<A NAME="IDX107"></A>
<A NAME="IDX108"></A>
The commands <KBD>Q</KBD> (<CODE>po-quit</CODE>) and <KBD>q</KBD>
(<CODE>po-confirm-and-quit</CODE>) are used when the translator is done with the
PO file. The former is a bit less verbose than the latter. If the file
has been modified, it is saved to disk first. In both cases, and prior to
all this, the commands check if any untranslated messages remain in the
PO file and, if so, the translator is asked if she really wants to stop
working with this PO file. This is the preferred way of getting rid
of an Emacs PO file buffer. Merely killing it through the usual command
<KBD>C-x k</KBD> (<CODE>kill-buffer</CODE>) is not the tidiest way to proceed.
<A NAME="IDX109"></A>
<A NAME="IDX110"></A>
The command <KBD>0</KBD> (<CODE>po-other-window</CODE>) is another, softer way
to leave PO mode temporarily. It just moves the cursor to some other
Emacs window, and pops one up if necessary. For example, if the translator
just got PO mode to show some source context in some other window, she might
discover some apparent bug in the program source that needs correction.
This command lets the translator switch roles, become a programmer,
and have the cursor right in the window containing the program she
wants to modify. By later getting the cursor back
in the PO file window, or by asking Emacs to edit this file once again,
PO mode is then recovered.
<A NAME="IDX111"></A>
<A NAME="IDX112"></A>
<A NAME="IDX113"></A>
The command <KBD>h</KBD> (<CODE>po-help</CODE>) displays a summary of all available PO
mode commands. The translator should then type any character to resume
normal PO mode operations. The command <KBD>?</KBD> has the same effect
as <KBD>h</KBD>.
<A NAME="IDX114"></A>
<A NAME="IDX115"></A>
The command <KBD>=</KBD> (<CODE>po-statistics</CODE>) computes the total number of
entries in the PO file, the ordinal of the current entry (counted from
1), the number of untranslated entries, the number of obsolete entries,
and displays all these numbers.
<A NAME="IDX116"></A>
<A NAME="IDX117"></A>
The command <KBD>V</KBD> (<CODE>po-validate</CODE>) launches <CODE>msgfmt</CODE> in
checking and verbose
mode over the current PO file. This command first offers to save the
current PO file on disk. The <CODE>msgfmt</CODE> tool, from GNU <CODE>gettext</CODE>,
has the purpose of creating a MO file out of a PO file, and PO mode uses
the features of this program for checking the overall format of a PO file,
as well as all individual entries.
<A NAME="IDX118"></A>
The program <CODE>msgfmt</CODE> runs asynchronously with Emacs, so the
translator regains control immediately while her PO file is being studied.
Error output is collected in the Emacs <SAMP>`*compilation*´</SAMP> buffer,
displayed in another window. The regular Emacs command <KBD>C-x`</KBD>
(<CODE>next-error</CODE>), as well as other usual compile commands, allow the
translator to reposition quickly to the offending parts of the PO file.
Once the cursor is on the line in error, the translator may decide on
any PO mode action which would help correct the error.
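
Outside Emacs, a comparable check can be sketched in Python. The options <CODE>--check</CODE>, <CODE>--verbose</CODE> and <CODE>-o</CODE> are regular <CODE>msgfmt</CODE> options, but whether PO mode passes exactly this command line is an assumption, as are the function names and the sample file name. Unlike PO mode, this sketch waits for <CODE>msgfmt</CODE> to finish.

```python
import os
import shutil
import subprocess


def validation_command(po_path):
    """Build a msgfmt command line that checks a PO file.

    The compiled catalog is written to the null device, since only the
    diagnostics are of interest here.
    """
    return ["msgfmt", "--check", "--verbose", "-o", os.devnull, po_path]


def validate(po_path):
    """Run the check; return (exit status, diagnostics), or None if
    msgfmt is not installed."""
    if shutil.which("msgfmt") is None:
        return None
    result = subprocess.run(
        validation_command(po_path), capture_output=True, text=True
    )
    # msgfmt writes its diagnostics to standard error.
    return result.returncode, result.stderr


command = validation_command("ca.po")  # "ca.po" is a made-up file name
```

A nonzero exit status from <CODE>validate</CODE> would correspond to the errors that PO mode collects in the <SAMP>`*compilation*´</SAMP> buffer.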
<H2><A NAME="SEC11" HREF="gettext_toc.html#TOC11">2.4 Entry Positioning</A></H2>
<A NAME="IDX119"></A>
The cursor in a PO file window is almost always part of
an entry. The only exceptions are the special case when the cursor
is after the last entry in the file, or when the PO file is
empty. The entry where the cursor is found to be is said to be the
current entry. Many PO mode commands operate on the current entry,
so moving the cursor does more than allow the translator to browse
the PO file; it also selects the entry on which commands operate.
<A NAME="IDX120"></A>
Some PO mode commands alter the position of the cursor in a specialized
way. A few of those special-purpose positioning commands are described here,
the others are described in following sections (for a complete list try
<KBD>C-h m</KBD>):

<DT><KBD>.</KBD>
<A NAME="IDX121"></A>
Redisplay the current entry (<CODE>po-current-entry</CODE>).
<DT><KBD>n</KBD>
<A NAME="IDX122"></A>
Select the entry after the current one (<CODE>po-next-entry</CODE>).
<DT><KBD>p</KBD>
<A NAME="IDX123"></A>
Select the entry before the current one (<CODE>po-previous-entry</CODE>).
<DT><KBD>&lt;</KBD>
<A NAME="IDX124"></A>
Select the first entry in the PO file (<CODE>po-first-entry</CODE>).
<DT><KBD>&gt;</KBD>
<A NAME="IDX125"></A>
Select the last entry in the PO file (<CODE>po-last-entry</CODE>).
<DT><KBD>m</KBD>
<A NAME="IDX126"></A>
Record the location of the current entry for later use
(<CODE>po-push-location</CODE>).
<DT><KBD>r</KBD>
<A NAME="IDX127"></A>
Return to a previously saved entry location (<CODE>po-pop-location</CODE>).
<DT><KBD>x</KBD>
<A NAME="IDX128"></A>
Exchange the current entry location with the previously saved one
(<CODE>po-exchange-location</CODE>).
<A NAME="IDX129"></A>
<A NAME="IDX130"></A>
Any Emacs command able to reposition the cursor may be used
to select the current entry in PO mode, including commands which
move by characters, lines, paragraphs, screens or pages, and search
commands. However, there is a kind of standard way to display the
current entry in PO mode, which usual Emacs commands moving
the cursor do not especially try to enforce. The command <KBD>.</KBD>
(<CODE>po-current-entry</CODE>) has the sole purpose of redisplaying the
current entry properly, after the current entry has been changed by
means external to PO mode, or the Emacs screen otherwise altered.
It is yet to be decided if PO mode helps the translator, or otherwise
irritates her, by forcing a rigid window disposition while she
is doing her work. We originally had quite precise ideas about
how windows should behave, but on the other hand, anyone used to
Emacs is often happy to keep full control. Maybe a fixed window
disposition might be offered as a PO mode option that the translator
might activate or deactivate at will, so it could be offered on an
experimental basis. If nobody feels a real need for using it, or
a compulsion for writing it, we should drop this whole idea.
The incentive for doing it should come from translators rather than
programmers, as opinions from an experienced translator are surely
worth more to me than opinions from programmers <EM>thinking</EM> about
how <EM>others</EM> should do translation.
<A NAME="IDX131"></A>
<A NAME="IDX132"></A>
<A NAME="IDX133"></A>
<A NAME="IDX134"></A>
The commands <KBD>n</KBD> (<CODE>po-next-entry</CODE>) and <KBD>p</KBD>
(<CODE>po-previous-entry</CODE>) move the cursor to the entry following,
or preceding, the current one. If <KBD>n</KBD> is given while the
cursor is on the last entry of the PO file, or if <KBD>p</KBD>
is given while the cursor is on the first entry, no move is done.
<A NAME="IDX135"></A>
<A NAME="IDX136"></A>
<A NAME="IDX137"></A>
<A NAME="IDX138"></A>
The commands <KBD>&lt;</KBD> (<CODE>po-first-entry</CODE>) and <KBD>&gt;</KBD>
(<CODE>po-last-entry</CODE>) move the cursor to the first entry, or last
entry, of the PO file. When the cursor is located past the last
entry in a PO file, most PO mode commands will return an error saying
<SAMP>`After last entry´</SAMP>. Moreover, the commands <KBD>&lt;</KBD> and <KBD>&gt;</KBD>
have the special property of being able to work even when the cursor
is not within any PO file entry, and one may use them for nicely
correcting this situation. But even these commands will fail on a
truly empty PO file. There are development plans for PO mode
to interactively fill an empty PO file from sources. See section <A HREF="gettext_3.html#SEC17">3.4 Marking Translatable Strings</A>.
The translator may decide, before working at the translation of
a particular entry, that she needs to browse the remainder of the
PO file, maybe for finding the terminology or phraseology used
in related entries. She can of course use the standard Emacs idioms
for saving the current cursor location in some register, and use that
register for getting back, or else, use the location ring.
<A NAME="IDX139"></A>
<A NAME="IDX140"></A>
<A NAME="IDX141"></A>
<A NAME="IDX142"></A>
PO mode offers another approach, by which cursor locations may be saved
onto a special stack. The command <KBD>m</KBD> (<CODE>po-push-location</CODE>)
merely adds the location of current entry to the stack, pushing
the already saved locations under the new one. The command
<KBD>r</KBD> (<CODE>po-pop-location</CODE>) consumes the top stack element and
repositions the cursor to the entry associated with that top element.
This position is then lost, for the next <KBD>r</KBD> will move the cursor
to the previously saved location, and so on until no locations remain
on the stack.

If the translator wants the position to be kept on the location stack,
maybe for taking a look at the entry associated with the top
element, then go elsewhere with the intent of getting back later, she
ought to use <KBD>m</KBD> immediately after <KBD>r</KBD>.
<A NAME="IDX143"></A>
<A NAME="IDX144"></A>
The command <KBD>x</KBD> (<CODE>po-exchange-location</CODE>) simultaneously
repositions the cursor to the entry associated with the top element of
the stack of saved locations, and replaces that top element with the
location of the current entry before the move. Consequently, repeating
the <KBD>x</KBD> command toggles between two entries.
For achieving this, the translator will position the cursor on the
first entry, use <KBD>m</KBD>, then position to the second entry, and
merely use <KBD>x</KBD> for making the switch.
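
The behavior of <KBD>m</KBD>, <KBD>r</KBD> and <KBD>x</KBD> can be modeled with a small Python class. This is a sketch of the stack discipline described above, not PO mode's Emacs Lisp; the class name, the method names and the use of plain integers as entry locations are all assumptions made for illustration.

```python
class LocationStack:
    """Model of the saved-location stack used by m, r and x."""

    def __init__(self):
        self._saved = []

    def push(self, entry):
        """m: save the current entry on top of the stack."""
        self._saved.append(entry)

    def pop(self):
        """r: consume the top element and return the entry to move to."""
        if not self._saved:
            raise IndexError("no saved location")
        return self._saved.pop()

    def exchange(self, current):
        """x: move to the top element, storing the current entry in its place."""
        if not self._saved:
            raise IndexError("no saved location")
        target = self._saved[-1]
        self._saved[-1] = current
        return target


stack = LocationStack()
current = 12          # cursor on a first entry (the numbers are arbitrary)
stack.push(current)   # m
current = 40          # the translator moves to a second entry
visited = []
for _ in range(3):
    current = stack.exchange(current)  # x
    visited.append(current)
print(visited)
```

Each <CODE>exchange</CODE> leaves exactly one element on the stack, which is why repeating <KBD>x</KBD> alternates between the same two entries. Calling <CODE>pop</CODE> and then <CODE>push</CODE> with the returned entry models the advice to use <KBD>m</KBD> immediately after <KBD>r</KBD>.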
<H2><A NAME="SEC12" HREF="gettext_toc.html#TOC12">2.5 Normalizing Strings in Entries</A></H2>
<A NAME="IDX145"></A>
There are many different ways for encoding a particular string into a
PO file entry, because there are so many different ways to split and
quote multi-line strings, and even to represent special characters
by backslashed escape sequences. Some features of PO mode rely on
the ability for PO mode to scan an already existing PO file for a
particular string encoded into the <CODE>msgid</CODE> field of some entry.
Even if PO mode has internally all the built-in machinery for
implementing this recognition easily, doing it fast is technically
difficult. To facilitate a solution to this efficiency problem,
we decided on a canonical representation for strings.
A conventional representation of strings in a PO file is currently
under discussion, and PO mode experiments with a canonical representation.
Having both <CODE>xgettext</CODE> and PO mode converging towards a uniform
way of representing equivalent strings would be useful, as the internal
normalization needed by PO mode could be automatically satisfied
when using <CODE>xgettext</CODE> from GNU <CODE>gettext</CODE>. An explicit
PO mode normalization should then be only necessary for PO files
imported from elsewhere, or for when the convention itself evolves.

So, for achieving normalization of at least the strings of a given
PO file needing a canonical representation, the following PO mode
command is available:
<A NAME="IDX146"></A>
<DT><KBD>M-x po-normalize</KBD>
<A NAME="IDX147"></A>
Tidy the whole PO file by making entries more uniform.
The special command <KBD>M-x po-normalize</KBD>, which has no associated
keys, revises all entries, ensuring that strings of both original
and translated entries use uniform internal quoting in the PO file.
It also removes any crumb after the last entry. This command may be
useful for PO files freshly imported from elsewhere, or if we ever
improve on the canonical quoting format we use. This canonical format
is not only meant for getting cleaner PO files, but also for greatly
speeding up <CODE>msgid</CODE> string lookup for some other PO mode commands.
<KBD>M-x po-normalize</KBD> presently makes three passes over the entries.
The first implements heuristics for converting PO files for GNU
<CODE>gettext</CODE> 0.6 and earlier, in which <CODE>msgid</CODE> and <CODE>msgstr</CODE>
fields were using K&amp;R style C string syntax for multi-line strings.
These heuristics may fail for comments not related to obsolete
entries and ending with a backslash; they also depend on subsequent
passes for finalizing the proper commenting of continued lines for
obsolete entries. This first pass might disappear once all older PO
files have been adjusted. The second and third passes normalize
all <CODE>msgid</CODE> and <CODE>msgstr</CODE> strings respectively. They also
clean out those trailing backslashes used by XView's <CODE>msgfmt</CODE>
for continued lines.
<A NAME="IDX148"></A>
Having such an explicit normalizing command allows for importing PO
files from other sources, but also eases the evolution of the current
convention, evolution driven mostly by aesthetic concerns, as of now.
It is easy to make suggested adjustments at a later time, as the
normalizing command and eventually, other GNU <CODE>gettext</CODE> tools
should greatly automate conformance. A description of the canonical
string format is given below, for the particular benefit of those not
having Emacs handy, and who would nevertheless want to handcraft
their PO files in nice ways.
<A NAME="IDX149"></A>
Right now, in PO mode, strings are single line or multi-line. A string
goes multi-line if and only if it has <EM>embedded</EM> newlines, that
is, if it matches <SAMP>`[^\n]\n+[^\n]´</SAMP>. So, we would have:

<PRE>
msgstr "\n\nHello, world!\n\n\n"
</PRE>

but, replacing the space by a newline, this becomes:

<PRE>
msgstr ""
"\n"
"\n"
"Hello,\n"
"world!\n"
"\n"
"\n"
</PRE>

We are deliberately using a caricatural example, here, to make the
point clearer. Usually, multi-lines are not that bad looking.
It is probable that we will implement the following suggestion.
We might lump together all initial newlines into the empty string,
and also all newlines introducing empty lines (that is, for <VAR>n</VAR>
&gt; 1, the <VAR>n</VAR>-1'th last newlines would go together on a separate
string), so making the previous example appear:

<PRE>
msgstr "\n\n"
"Hello,\n"
"world!\n"
"\n\n"
</PRE>

There are a few yet undecided little points about string normalization,
to be documented in this manual, once these questions settle.
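
The multi-line rule stated in this section can be tried out in Python, whose <CODE>re</CODE> module accepts the pattern <SAMP>`[^\n]\n+[^\n]´</SAMP> unchanged. The function names are hypothetical, and cutting a multi-line string after every newline is an assumption about the presentation, one piece per PO file line; PO mode itself is written in Emacs Lisp and may proceed differently.

```python
import re

# A string goes multi-line when a newline run sits between two
# characters that are not newlines, i.e. the newline is embedded.
EMBEDDED_NEWLINE = re.compile(r"[^\n]\n+[^\n]")


def goes_multi_line(string):
    """Tell whether the string has embedded newlines."""
    return EMBEDDED_NEWLINE.search(string) is not None


def split_after_newlines(string):
    """Cut the string after each newline, one piece per PO file line."""
    return re.findall(r"[^\n]*\n|[^\n]+", string)


single = "\n\nHello, world!\n\n\n"
multi = "\n\nHello,\nworld!\n\n\n"

single_is_multi = goes_multi_line(single)
multi_is_multi = goes_multi_line(multi)
pieces = split_after_newlines(multi)
```

Leading and trailing newlines alone never trigger the rule, since they lack a non-newline character on one side; only the newline between <SAMP>`Hello,´</SAMP> and <SAMP>`world!´</SAMP> does.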