1 <HTML>
2 <HEAD>
3 <!-- This HTML file has been created by texi2html 1.52a
4 from gettext.texi on 11 April 2005 -->
6 <TITLE>GNU gettext utilities - 8 Producing Binary MO Files</TITLE>
7 </HEAD>
8 <BODY>
9 Go to the <A HREF="gettext_1.html">first</A>, <A HREF="gettext_7.html">previous</A>, <A HREF="gettext_9.html">next</A>, <A HREF="gettext_22.html">last</A> section, <A HREF="gettext_toc.html">table of contents</A>.
10 <P><HR><P>
13 <H1><A NAME="SEC134" HREF="gettext_toc.html#TOC134">8 Producing Binary MO Files</A></H1>
17 <H2><A NAME="SEC135" HREF="gettext_toc.html#TOC135">8.1 Invoking the <CODE>msgfmt</CODE> Program</A></H2>
19 <P>
20 <A NAME="IDX858"></A>
21 <A NAME="IDX859"></A>
23 <PRE>
24 msgfmt [<VAR>option</VAR>] <VAR>filename</VAR>.po ...
25 </PRE>
27 <P>
28 <A NAME="IDX860"></A>
The <CODE>msgfmt</CODE> program generates a binary message catalog from a textual
translation description.
32 </P>
35 <H3><A NAME="SEC136" HREF="gettext_toc.html#TOC136">8.1.1 Input file location</A></H3>
37 <DL COMPACT>
39 <DT><SAMP>`<VAR>filename</VAR>.po ...&acute;</SAMP>
40 <DD>
41 <DT><SAMP>`-D <VAR>directory</VAR>&acute;</SAMP>
42 <DD>
43 <DT><SAMP>`--directory=<VAR>directory</VAR>&acute;</SAMP>
44 <DD>
45 <A NAME="IDX861"></A>
46 <A NAME="IDX862"></A>
Add <VAR>directory</VAR> to the list of directories. Input files are
searched relative to this list of directories. The resulting binary
file will be written relative to the current directory, though.
51 </DL>
53 <P>
54 If an input file is <SAMP>`-&acute;</SAMP>, standard input is read.
56 </P>
59 <H3><A NAME="SEC137" HREF="gettext_toc.html#TOC137">8.1.2 Operation mode</A></H3>
61 <DL COMPACT>
63 <DT><SAMP>`-j&acute;</SAMP>
64 <DD>
65 <DT><SAMP>`--java&acute;</SAMP>
66 <DD>
67 <A NAME="IDX863"></A>
68 <A NAME="IDX864"></A>
69 <A NAME="IDX865"></A>
70 Java mode: generate a Java <CODE>ResourceBundle</CODE> class.
72 <DT><SAMP>`--java2&acute;</SAMP>
73 <DD>
74 <A NAME="IDX866"></A>
Like <CODE>--java</CODE>, and assume Java2 (JDK 1.2 or higher).
77 <DT><SAMP>`--csharp&acute;</SAMP>
78 <DD>
79 <A NAME="IDX867"></A>
80 <A NAME="IDX868"></A>
81 C# mode: generate a .NET .dll file containing a subclass of
82 <CODE>GettextResourceSet</CODE>.
84 <DT><SAMP>`--csharp-resources&acute;</SAMP>
85 <DD>
86 <A NAME="IDX869"></A>
87 <A NAME="IDX870"></A>
88 C# resources mode: generate a .NET <TT>`.resources&acute;</TT> file.
90 <DT><SAMP>`--tcl&acute;</SAMP>
91 <DD>
92 <A NAME="IDX871"></A>
93 <A NAME="IDX872"></A>
94 Tcl mode: generate a tcl/msgcat <TT>`.msg&acute;</TT> file.
96 <DT><SAMP>`--qt&acute;</SAMP>
97 <DD>
98 <A NAME="IDX873"></A>
99 <A NAME="IDX874"></A>
100 Qt mode: generate a Qt <TT>`.qm&acute;</TT> file.
102 </DL>
106 <H3><A NAME="SEC138" HREF="gettext_toc.html#TOC138">8.1.3 Output file location</A></H3>
108 <DL COMPACT>
110 <DT><SAMP>`-o <VAR>file</VAR>&acute;</SAMP>
111 <DD>
112 <DT><SAMP>`--output-file=<VAR>file</VAR>&acute;</SAMP>
113 <DD>
114 <A NAME="IDX875"></A>
115 <A NAME="IDX876"></A>
116 Write output to specified file.
118 <DT><SAMP>`--strict&acute;</SAMP>
119 <DD>
120 <A NAME="IDX877"></A>
121 Direct the program to work strictly following the Uniforum/Sun
122 implementation. Currently this only affects the naming of the output
123 file. If this option is not given the name of the output file is the
124 same as the domain name. If the strict Uniforum mode is enabled the
125 suffix <TT>`.mo&acute;</TT> is added to the file name if it is not already
126 present.
128 We find this behaviour of Sun's implementation rather silly and so by
129 default this mode is <EM>not</EM> selected.
131 </DL>
134 If the output <VAR>file</VAR> is <SAMP>`-&acute;</SAMP>, output is written to standard output.
136 </P>
139 <H3><A NAME="SEC139" HREF="gettext_toc.html#TOC139">8.1.4 Output file location in Java mode</A></H3>
141 <DL COMPACT>
143 <DT><SAMP>`-r <VAR>resource</VAR>&acute;</SAMP>
144 <DD>
145 <DT><SAMP>`--resource=<VAR>resource</VAR>&acute;</SAMP>
146 <DD>
147 <A NAME="IDX878"></A>
148 <A NAME="IDX879"></A>
149 Specify the resource name.
151 <DT><SAMP>`-l <VAR>locale</VAR>&acute;</SAMP>
152 <DD>
153 <DT><SAMP>`--locale=<VAR>locale</VAR>&acute;</SAMP>
154 <DD>
155 <A NAME="IDX880"></A>
156 <A NAME="IDX881"></A>
157 Specify the locale name, either a language specification of the form <VAR>ll</VAR>
158 or a combined language and country specification of the form <VAR>ll_CC</VAR>.
160 <DT><SAMP>`-d <VAR>directory</VAR>&acute;</SAMP>
161 <DD>
162 <A NAME="IDX882"></A>
Specify the base directory of the classes directory hierarchy.
165 </DL>
168 The class name is determined by appending the locale name to the resource name,
169 separated with an underscore. The <SAMP>`-d&acute;</SAMP> option is mandatory. The class
170 is written under the specified directory.
172 </P>
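<P>
The naming rule above can be sketched in a few lines of Python. The function
names are invented for this illustration. The mapping of package components to
subdirectories is an assumption based on the usual Java class layout, not
something this section states.
</P>

```python
from pathlib import PurePosixPath


def java_class_name(resource: str, locale: str) -> str:
    """Append the locale name to the resource name, separated by an underscore."""
    return f"{resource}_{locale}"


def java_class_file(directory: str, resource: str, locale: str) -> PurePosixPath:
    """Guess where the class file lands under the -d directory.

    Assumption: package components become subdirectories, as is usual for Java.
    """
    parts = java_class_name(resource, locale).split(".")
    return PurePosixPath(directory, *parts[:-1], parts[-1] + ".class")


print(java_class_name("com.example.Messages", "de_AT"))
print(java_class_file("classes", "com.example.Messages", "de_AT"))
```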
175 <H3><A NAME="SEC140" HREF="gettext_toc.html#TOC140">8.1.5 Output file location in C# mode</A></H3>
177 <DL COMPACT>
179 <DT><SAMP>`-r <VAR>resource</VAR>&acute;</SAMP>
180 <DD>
181 <DT><SAMP>`--resource=<VAR>resource</VAR>&acute;</SAMP>
182 <DD>
183 <A NAME="IDX883"></A>
184 <A NAME="IDX884"></A>
185 Specify the resource name.
187 <DT><SAMP>`-l <VAR>locale</VAR>&acute;</SAMP>
188 <DD>
189 <DT><SAMP>`--locale=<VAR>locale</VAR>&acute;</SAMP>
190 <DD>
191 <A NAME="IDX885"></A>
192 <A NAME="IDX886"></A>
193 Specify the locale name, either a language specification of the form <VAR>ll</VAR>
194 or a combined language and country specification of the form <VAR>ll_CC</VAR>.
196 <DT><SAMP>`-d <VAR>directory</VAR>&acute;</SAMP>
197 <DD>
198 <A NAME="IDX887"></A>
199 Specify the base directory for locale dependent <TT>`.dll&acute;</TT> files.
201 </DL>
204 The <SAMP>`-l&acute;</SAMP> and <SAMP>`-d&acute;</SAMP> options are mandatory. The <TT>`.dll&acute;</TT> file is
205 written in a subdirectory of the specified directory whose name depends on the
206 locale.
208 </P>
211 <H3><A NAME="SEC141" HREF="gettext_toc.html#TOC141">8.1.6 Output file location in Tcl mode</A></H3>
213 <DL COMPACT>
215 <DT><SAMP>`-l <VAR>locale</VAR>&acute;</SAMP>
216 <DD>
217 <DT><SAMP>`--locale=<VAR>locale</VAR>&acute;</SAMP>
218 <DD>
219 <A NAME="IDX888"></A>
220 <A NAME="IDX889"></A>
221 Specify the locale name, either a language specification of the form <VAR>ll</VAR>
222 or a combined language and country specification of the form <VAR>ll_CC</VAR>.
224 <DT><SAMP>`-d <VAR>directory</VAR>&acute;</SAMP>
225 <DD>
226 <A NAME="IDX890"></A>
227 Specify the base directory of <TT>`.msg&acute;</TT> message catalogs.
229 </DL>
232 The <SAMP>`-l&acute;</SAMP> and <SAMP>`-d&acute;</SAMP> options are mandatory. The <TT>`.msg&acute;</TT> file is
233 written in the specified directory.
235 </P>
238 <H3><A NAME="SEC142" HREF="gettext_toc.html#TOC142">8.1.7 Input file syntax</A></H3>
240 <DL COMPACT>
242 <DT><SAMP>`-P&acute;</SAMP>
243 <DD>
244 <DT><SAMP>`--properties-input&acute;</SAMP>
245 <DD>
246 <A NAME="IDX891"></A>
247 <A NAME="IDX892"></A>
248 Assume the input files are Java ResourceBundles in Java <CODE>.properties</CODE>
249 syntax, not in PO file syntax.
251 <DT><SAMP>`--stringtable-input&acute;</SAMP>
252 <DD>
253 <A NAME="IDX893"></A>
254 Assume the input files are NeXTstep/GNUstep localized resource files in
255 <CODE>.strings</CODE> syntax, not in PO file syntax.
257 </DL>
261 <H3><A NAME="SEC143" HREF="gettext_toc.html#TOC143">8.1.8 Input file interpretation</A></H3>
263 <DL COMPACT>
265 <DT><SAMP>`-c&acute;</SAMP>
266 <DD>
267 <DT><SAMP>`--check&acute;</SAMP>
268 <DD>
269 <A NAME="IDX894"></A>
270 <A NAME="IDX895"></A>
271 Perform all the checks implied by <CODE>--check-format</CODE>, <CODE>--check-header</CODE>,
272 <CODE>--check-domain</CODE>.
274 <DT><SAMP>`--check-format&acute;</SAMP>
275 <DD>
276 <A NAME="IDX896"></A>
277 <A NAME="IDX897"></A>
278 Check language dependent format strings.
If the string represents a format string used in a
<CODE>printf</CODE>-like function, the original string and its translation
should have the same number of <SAMP>`%&acute;</SAMP> format specifiers, with
matching types. The check is performed if the flag
<CODE>c-format</CODE> or <CODE>possible-c-format</CODE> appears in the special
comment <KBD>#,</KBD> for this entry. For example, the
check will diagnose using <SAMP>`%.*s&acute;</SAMP> against <SAMP>`%s&acute;</SAMP>, or <SAMP>`%d&acute;</SAMP>
against <SAMP>`%s&acute;</SAMP>, or <SAMP>`%d&acute;</SAMP> against <SAMP>`%x&acute;</SAMP>. It can even handle
positional parameters.
289 Normally the <CODE>xgettext</CODE> program automatically decides whether a
290 string is a format string or not. This algorithm is not perfect,
291 though. It might regard a string as a format string though it is not
292 used in a <CODE>printf</CODE>-like function and so <CODE>msgfmt</CODE> might report
293 errors where there are none.
295 To solve this problem the programmer can dictate the decision to the
296 <CODE>xgettext</CODE> program (see section <A HREF="gettext_13.html#SEC225">13.3.1 C Format Strings</A>). The translator should not
297 consider removing the flag from the <KBD>#,</KBD> line. This "fix" would be
298 reversed again as soon as <CODE>msgmerge</CODE> is called the next time.
300 <DT><SAMP>`--check-header&acute;</SAMP>
301 <DD>
302 <A NAME="IDX898"></A>
303 Verify presence and contents of the header entry. See section <A HREF="gettext_5.html#SEC39">5.2 Filling in the Header Entry</A>,
304 for a description of the various fields in the header entry.
306 <DT><SAMP>`--check-domain&acute;</SAMP>
307 <DD>
308 <A NAME="IDX899"></A>
Check for conflicts between domain directives and the <CODE>--output-file</CODE>
option.
312 <DT><SAMP>`-C&acute;</SAMP>
313 <DD>
314 <DT><SAMP>`--check-compatibility&acute;</SAMP>
315 <DD>
316 <A NAME="IDX900"></A>
317 <A NAME="IDX901"></A>
318 <A NAME="IDX902"></A>
319 Check that GNU msgfmt behaves like X/Open msgfmt. This will give an error
320 when attempting to use the GNU extensions.
322 <DT><SAMP>`--check-accelerators[=<VAR>char</VAR>]&acute;</SAMP>
323 <DD>
324 <A NAME="IDX903"></A>
325 <A NAME="IDX904"></A>
326 <A NAME="IDX905"></A>
327 <A NAME="IDX906"></A>
328 Check presence of keyboard accelerators for menu items. This is based on
329 the convention used in some GUIs that a keyboard accelerator in a menu
330 item string is designated by an immediately preceding <SAMP>`&#38;&acute;</SAMP> character.
331 Sometimes a keyboard accelerator is also called "keyboard mnemonic".
332 This check verifies that if the untranslated string has exactly one
333 <SAMP>`&#38;&acute;</SAMP> character, the translated string has exactly one <SAMP>`&#38;&acute;</SAMP> as well.
334 If this option is given with a <VAR>char</VAR> argument, this <VAR>char</VAR> should
335 be a non-alphanumeric character and is used as keyboard accelerator mark
336 instead of <SAMP>`&#38;&acute;</SAMP>.
338 <DT><SAMP>`-f&acute;</SAMP>
339 <DD>
340 <DT><SAMP>`--use-fuzzy&acute;</SAMP>
341 <DD>
342 <A NAME="IDX907"></A>
343 <A NAME="IDX908"></A>
344 <A NAME="IDX909"></A>
345 Use fuzzy entries in output. Note that using this option is usually wrong,
346 because fuzzy messages are exactly those which have not been validated by
347 a human translator.
349 </DL>
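<P>
The rule stated for <CODE>--check-accelerators</CODE> above can be expressed as
a small Python sketch. It follows only the rule as worded in this section.
The real <CODE>msgfmt</CODE> may apply further conditions, and the function name
is invented for this illustration.
</P>

```python
def check_accelerator(msgid: str, msgstr: str, mark: str = "&") -> bool:
    """Return True if the pair passes the accelerator check as worded above.

    The check only applies when the untranslated string has exactly one mark;
    in that case the translation must have exactly one mark as well.
    """
    if msgid.count(mark) != 1:
        return True
    return msgstr.count(mark) == 1


print(check_accelerator("&File", "&Datei"))          # True
print(check_accelerator("&File", "Datei"))           # False
print(check_accelerator("Save && Quit", "Beenden"))  # True, rule does not apply
print(check_accelerator("_Open", "Öffnen", "_"))     # False, custom mark missing
```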
353 <H3><A NAME="SEC144" HREF="gettext_toc.html#TOC144">8.1.9 Output details</A></H3>
355 <DL COMPACT>
357 <DT><SAMP>`-a <VAR>number</VAR>&acute;</SAMP>
358 <DD>
359 <DT><SAMP>`--alignment=<VAR>number</VAR>&acute;</SAMP>
360 <DD>
361 <A NAME="IDX910"></A>
362 <A NAME="IDX911"></A>
363 Align strings to <VAR>number</VAR> bytes (default: 1).
365 <DT><SAMP>`--no-hash&acute;</SAMP>
366 <DD>
367 <A NAME="IDX912"></A>
368 Don't include a hash table in the binary file. Lookup will be more expensive
369 at run time (binary search instead of hash table lookup).
371 </DL>
375 <H3><A NAME="SEC145" HREF="gettext_toc.html#TOC145">8.1.10 Informative output</A></H3>
377 <DL COMPACT>
379 <DT><SAMP>`-h&acute;</SAMP>
380 <DD>
381 <DT><SAMP>`--help&acute;</SAMP>
382 <DD>
383 <A NAME="IDX913"></A>
384 <A NAME="IDX914"></A>
385 Display this help and exit.
387 <DT><SAMP>`-V&acute;</SAMP>
388 <DD>
389 <DT><SAMP>`--version&acute;</SAMP>
390 <DD>
391 <A NAME="IDX915"></A>
392 <A NAME="IDX916"></A>
393 Output version information and exit.
395 <DT><SAMP>`--statistics&acute;</SAMP>
396 <DD>
397 <A NAME="IDX917"></A>
398 Print statistics about translations.
400 <DT><SAMP>`-v&acute;</SAMP>
401 <DD>
402 <DT><SAMP>`--verbose&acute;</SAMP>
403 <DD>
404 <A NAME="IDX918"></A>
405 <A NAME="IDX919"></A>
406 Increase verbosity level.
408 </DL>
412 <H2><A NAME="SEC146" HREF="gettext_toc.html#TOC146">8.2 Invoking the <CODE>msgunfmt</CODE> Program</A></H2>
415 <A NAME="IDX920"></A>
416 <A NAME="IDX921"></A>
418 <PRE>
419 msgunfmt [<VAR>option</VAR>] [<VAR>file</VAR>]...
420 </PRE>
423 <A NAME="IDX922"></A>
424 The <CODE>msgunfmt</CODE> program converts a binary message catalog to a
425 Uniforum style .po file.
427 </P>
430 <H3><A NAME="SEC147" HREF="gettext_toc.html#TOC147">8.2.1 Operation mode</A></H3>
432 <DL COMPACT>
434 <DT><SAMP>`-j&acute;</SAMP>
435 <DD>
436 <DT><SAMP>`--java&acute;</SAMP>
437 <DD>
438 <A NAME="IDX923"></A>
439 <A NAME="IDX924"></A>
440 <A NAME="IDX925"></A>
441 Java mode: input is a Java <CODE>ResourceBundle</CODE> class.
443 <DT><SAMP>`--csharp&acute;</SAMP>
444 <DD>
445 <A NAME="IDX926"></A>
446 <A NAME="IDX927"></A>
447 C# mode: input is a .NET .dll file containing a subclass of
448 <CODE>GettextResourceSet</CODE>.
450 <DT><SAMP>`--csharp-resources&acute;</SAMP>
451 <DD>
452 <A NAME="IDX928"></A>
453 <A NAME="IDX929"></A>
454 C# resources mode: input is a .NET <TT>`.resources&acute;</TT> file.
456 <DT><SAMP>`--tcl&acute;</SAMP>
457 <DD>
458 <A NAME="IDX930"></A>
459 <A NAME="IDX931"></A>
460 Tcl mode: input is a tcl/msgcat <TT>`.msg&acute;</TT> file.
462 </DL>
466 <H3><A NAME="SEC148" HREF="gettext_toc.html#TOC148">8.2.2 Input file location</A></H3>
468 <DL COMPACT>
470 <DT><SAMP>`<VAR>file</VAR> ...&acute;</SAMP>
471 <DD>
472 Input .mo files.
474 </DL>
477 If no input <VAR>file</VAR> is given or if it is <SAMP>`-&acute;</SAMP>, standard input is read.
479 </P>
482 <H3><A NAME="SEC149" HREF="gettext_toc.html#TOC149">8.2.3 Input file location in Java mode</A></H3>
484 <DL COMPACT>
486 <DT><SAMP>`-r <VAR>resource</VAR>&acute;</SAMP>
487 <DD>
488 <DT><SAMP>`--resource=<VAR>resource</VAR>&acute;</SAMP>
489 <DD>
490 <A NAME="IDX932"></A>
491 <A NAME="IDX933"></A>
492 Specify the resource name.
494 <DT><SAMP>`-l <VAR>locale</VAR>&acute;</SAMP>
495 <DD>
496 <DT><SAMP>`--locale=<VAR>locale</VAR>&acute;</SAMP>
497 <DD>
498 <A NAME="IDX934"></A>
499 <A NAME="IDX935"></A>
500 Specify the locale name, either a language specification of the form <VAR>ll</VAR>
501 or a combined language and country specification of the form <VAR>ll_CC</VAR>.
503 </DL>
506 The class name is determined by appending the locale name to the resource name,
507 separated with an underscore. The class is located using the <CODE>CLASSPATH</CODE>.
509 </P>
512 <H3><A NAME="SEC150" HREF="gettext_toc.html#TOC150">8.2.4 Input file location in C# mode</A></H3>
514 <DL COMPACT>
516 <DT><SAMP>`-r <VAR>resource</VAR>&acute;</SAMP>
517 <DD>
518 <DT><SAMP>`--resource=<VAR>resource</VAR>&acute;</SAMP>
519 <DD>
520 <A NAME="IDX936"></A>
521 <A NAME="IDX937"></A>
522 Specify the resource name.
524 <DT><SAMP>`-l <VAR>locale</VAR>&acute;</SAMP>
525 <DD>
526 <DT><SAMP>`--locale=<VAR>locale</VAR>&acute;</SAMP>
527 <DD>
528 <A NAME="IDX938"></A>
529 <A NAME="IDX939"></A>
530 Specify the locale name, either a language specification of the form <VAR>ll</VAR>
531 or a combined language and country specification of the form <VAR>ll_CC</VAR>.
533 <DT><SAMP>`-d <VAR>directory</VAR>&acute;</SAMP>
534 <DD>
535 <A NAME="IDX940"></A>
536 Specify the base directory for locale dependent <TT>`.dll&acute;</TT> files.
538 </DL>
The <SAMP>`-l&acute;</SAMP> and <SAMP>`-d&acute;</SAMP> options are mandatory. The <TT>`.dll&acute;</TT> file is
located in a subdirectory of the specified directory whose name depends on the
locale.
545 </P>
548 <H3><A NAME="SEC151" HREF="gettext_toc.html#TOC151">8.2.5 Input file location in Tcl mode</A></H3>
550 <DL COMPACT>
552 <DT><SAMP>`-l <VAR>locale</VAR>&acute;</SAMP>
553 <DD>
554 <DT><SAMP>`--locale=<VAR>locale</VAR>&acute;</SAMP>
555 <DD>
556 <A NAME="IDX941"></A>
557 <A NAME="IDX942"></A>
558 Specify the locale name, either a language specification of the form <VAR>ll</VAR>
559 or a combined language and country specification of the form <VAR>ll_CC</VAR>.
561 <DT><SAMP>`-d <VAR>directory</VAR>&acute;</SAMP>
562 <DD>
563 <A NAME="IDX943"></A>
564 Specify the base directory of <TT>`.msg&acute;</TT> message catalogs.
566 </DL>
569 The <SAMP>`-l&acute;</SAMP> and <SAMP>`-d&acute;</SAMP> options are mandatory. The <TT>`.msg&acute;</TT> file is
570 located in the specified directory.
572 </P>
575 <H3><A NAME="SEC152" HREF="gettext_toc.html#TOC152">8.2.6 Output file location</A></H3>
577 <DL COMPACT>
579 <DT><SAMP>`-o <VAR>file</VAR>&acute;</SAMP>
580 <DD>
581 <DT><SAMP>`--output-file=<VAR>file</VAR>&acute;</SAMP>
582 <DD>
583 <A NAME="IDX944"></A>
584 <A NAME="IDX945"></A>
585 Write output to specified file.
587 </DL>
590 The results are written to standard output if no output file is specified
591 or if it is <SAMP>`-&acute;</SAMP>.
593 </P>
596 <H3><A NAME="SEC153" HREF="gettext_toc.html#TOC153">8.2.7 Output details</A></H3>
598 <DL COMPACT>
600 <DT><SAMP>`--force-po&acute;</SAMP>
601 <DD>
602 <A NAME="IDX946"></A>
603 Always write an output file even if it contains no message.
605 <DT><SAMP>`-i&acute;</SAMP>
606 <DD>
607 <DT><SAMP>`--indent&acute;</SAMP>
608 <DD>
609 <A NAME="IDX947"></A>
610 <A NAME="IDX948"></A>
611 Write the .po file using indented style.
613 <DT><SAMP>`--strict&acute;</SAMP>
614 <DD>
615 <A NAME="IDX949"></A>
616 Write out a strict Uniforum conforming PO file. Note that this
617 Uniforum format should be avoided because it doesn't support the
618 GNU extensions.
620 <DT><SAMP>`-p&acute;</SAMP>
621 <DD>
622 <DT><SAMP>`--properties-output&acute;</SAMP>
623 <DD>
624 <A NAME="IDX950"></A>
625 <A NAME="IDX951"></A>
626 Write out a Java ResourceBundle in Java <CODE>.properties</CODE> syntax. Note
627 that this file format doesn't support plural forms and silently drops
628 obsolete messages.
630 <DT><SAMP>`--stringtable-output&acute;</SAMP>
631 <DD>
632 <A NAME="IDX952"></A>
633 Write out a NeXTstep/GNUstep localized resource file in <CODE>.strings</CODE> syntax.
634 Note that this file format doesn't support plural forms.
636 <DT><SAMP>`-w <VAR>number</VAR>&acute;</SAMP>
637 <DD>
638 <DT><SAMP>`--width=<VAR>number</VAR>&acute;</SAMP>
639 <DD>
640 <A NAME="IDX953"></A>
641 <A NAME="IDX954"></A>
Set the output page width. Long strings in the output files will be
split across multiple lines in order to ensure that each line's width
(= number of screen columns) is less than or equal to the given <VAR>number</VAR>.
646 <DT><SAMP>`--no-wrap&acute;</SAMP>
647 <DD>
648 <A NAME="IDX955"></A>
649 Do not break long message lines. Message lines whose width exceeds the
650 output page width will not be split into several lines. Only file reference
651 lines which are wider than the output page width will be split.
653 <DT><SAMP>`-s&acute;</SAMP>
654 <DD>
655 <DT><SAMP>`--sort-output&acute;</SAMP>
656 <DD>
657 <A NAME="IDX956"></A>
658 <A NAME="IDX957"></A>
659 <A NAME="IDX958"></A>
660 Generate sorted output. Note that using this option makes it much harder
661 for the translator to understand each message's context.
663 </DL>
667 <H3><A NAME="SEC154" HREF="gettext_toc.html#TOC154">8.2.8 Informative output</A></H3>
669 <DL COMPACT>
671 <DT><SAMP>`-h&acute;</SAMP>
672 <DD>
673 <DT><SAMP>`--help&acute;</SAMP>
674 <DD>
675 <A NAME="IDX959"></A>
676 <A NAME="IDX960"></A>
677 Display this help and exit.
679 <DT><SAMP>`-V&acute;</SAMP>
680 <DD>
681 <DT><SAMP>`--version&acute;</SAMP>
682 <DD>
683 <A NAME="IDX961"></A>
684 <A NAME="IDX962"></A>
685 Output version information and exit.
687 <DT><SAMP>`-v&acute;</SAMP>
688 <DD>
689 <DT><SAMP>`--verbose&acute;</SAMP>
690 <DD>
691 <A NAME="IDX963"></A>
692 <A NAME="IDX964"></A>
693 Increase verbosity level.
695 </DL>
699 <H2><A NAME="SEC155" HREF="gettext_toc.html#TOC155">8.3 The Format of GNU MO Files</A></H2>
701 <A NAME="IDX965"></A>
702 <A NAME="IDX966"></A>
704 </P>
706 The format of the generated MO files is best described by a picture,
707 which appears below.
709 </P>
711 <A NAME="IDX967"></A>
The first two words serve to identify the file. The magic
number will always signal GNU MO files. The number is stored in the
byte order of the generating machine, so a reader will see one of
two numbers: <CODE>0x950412de</CODE> if its byte order matches that of the
generating machine, or the byte-swapped <CODE>0xde120495</CODE> if it does
not. The second
word describes the current revision of the file format. For now the
revision is 0. This might change in future versions, and ensures
that the readers of MO files can distinguish new formats from old
ones, so that both can be handled correctly. The version is kept
separate from the magic number, instead of using different magic
numbers for different formats, mainly because <TT>`/etc/magic&acute;</TT> is
not updated often. It might be better to have magic separated from
internal format version identification.
725 </P>
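<P>
A reader can use the magic number to discover the byte order of the file before
looking at anything else. The following Python sketch tries both byte orders
on the first two words. The names <CODE>MO_MAGIC</CODE> and
<CODE>read_mo_prefix</CODE> are invented for this illustration and are not part
of GNU <CODE>gettext</CODE>.
</P>

```python
import struct

MO_MAGIC = 0x950412DE  # magic number given in the text


def read_mo_prefix(data: bytes):
    """Return (byte_order, revision) from the first two words of MO data.

    byte_order is "<" (little-endian) or ">" (big-endian), in struct notation.
    """
    if len(data) < 8:
        raise ValueError("too short to be an MO file")
    for byte_order in ("<", ">"):
        magic, revision = struct.unpack(byte_order + "II", data[:8])
        if magic == MO_MAGIC:
            return byte_order, revision
    raise ValueError("not a GNU MO file (bad magic number)")


little = struct.pack("<II", MO_MAGIC, 0)
big = struct.pack(">II", MO_MAGIC, 0)
print(read_mo_prefix(little))  # ('<', 0)
print(read_mo_prefix(big))     # ('>', 0)
```

<P>
Read with the wrong byte order, the same four bytes yield
<CODE>0xde120495</CODE>, which is why the sketch simply retries with the other
order instead of comparing against a second constant.
</P>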
A number of pointers to later tables in the file follow, allowing
for the extension of the prefix part of MO files without having to
recompile programs reading them. This might become useful for later
inserting a few flag bits, an indication of the charset used, new
tables, or other things.
733 </P>
Then, at offset <VAR>O</VAR> and offset <VAR>T</VAR> in the picture, two tables
of string descriptors can be found. In both tables, each string
descriptor uses two 32-bit integers, one for the string length,
another for the offset of the string in the MO file, counting in bytes
from the start of the file. The first table contains descriptors
for the original strings, and is sorted so the original strings
are in increasing lexicographical order. The second table contains
descriptors for the translated strings, and is parallel to the first
table: to find the corresponding translation one has to access the
array slot in the second array with the same index.
746 </P>
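<P>
The two parallel descriptor tables can be walked as in the following Python
sketch. It assumes a little-endian file for brevity, and the function name
<CODE>read_string_tables</CODE> and the hand-assembled sample are invented for
this illustration.
</P>

```python
import struct


def read_string_tables(data: bytes):
    """Return a list of (original, translation) byte strings from MO data.

    Assumes little-endian layout; a real reader would first inspect the
    magic number to pick the byte order.
    """
    magic, revision, n, o_off, t_off = struct.unpack_from("<5I", data, 0)
    if magic != 0x950412DE:
        raise ValueError("not a little-endian MO file")
    pairs = []
    for i in range(n):
        # Each descriptor is two 32-bit integers: length, then file offset.
        o_len, o_pos = struct.unpack_from("<II", data, o_off + 8 * i)
        t_len, t_pos = struct.unpack_from("<II", data, t_off + 8 * i)
        pairs.append((data[o_pos:o_pos + o_len], data[t_pos:t_pos + t_len]))
    return pairs


# Hand-assembled sample with N = 1 and no hash table (S = 0).
header = struct.pack("<7I", 0x950412DE, 0, 1, 28, 36, 0, 44)
originals = struct.pack("<II", 5, 44)     # "hello" starts at byte 44
translations = struct.pack("<II", 5, 50)  # "hallo" starts at byte 50
sample = header + originals + translations + b"hello\0hallo\0"
print(read_string_tables(sample))  # [(b'hello', b'hallo')]
```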
Having the original strings sorted enables the use of simple binary
search, for when the MO file does not contain a hashing table, or
for when it is not practical to use the hashing table provided in
the MO file. This also has another advantage, as the empty string
in a PO file is usually <EM>translated</EM> by GNU <CODE>gettext</CODE> into
some system information attached to that particular MO file, and the
empty string necessarily becomes the first in both the original and
translated tables, making the system information very easy to find.
757 </P>
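<P>
A lookup without the hash table might look like the following Python sketch,
which runs a binary search over the sorted original strings. The lists stand in
for the two tables after they have been read from the file, and the name
<CODE>lookup</CODE> and the sample data are invented for this illustration.
</P>

```python
from bisect import bisect_left


def lookup(originals, translations, msgid: bytes):
    """Binary-search the sorted originals; return the translation or None."""
    i = bisect_left(originals, msgid)
    if i < len(originals) and originals[i] == msgid:
        return translations[i]  # parallel table, same index
    return None


# The empty string sorts first, so slot 0 carries the system information.
originals = [b"", b"apple", b"banana", b"cherry"]
translations = [
    b"Content-Type: text/plain; charset=UTF-8\n",
    b"Apfel",
    b"Banane",
    b"Kirsche",
]

print(lookup(originals, translations, b"banana"))  # b'Banane'
print(lookup(originals, translations, b"durian"))  # None
print(translations[0])                              # the header entry
```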
759 <A NAME="IDX968"></A>
The size <VAR>S</VAR> of the hash table can be zero. In this case, the
hash table itself is not contained in the MO file. Some people might
prefer this because a precomputed hashing table takes disk space, and
does not gain <EM>that</EM> much speed. The hash table contains indices
to the sorted array of strings in the MO file. Conflict resolution is
done by double hashing. The precise hashing algorithm used is fairly
dependent on GNU <CODE>gettext</CODE> code, and is not documented here.
768 </P>
As for the strings themselves, they follow the hash table, and each
is terminated with a <KBD>NUL</KBD>, and this <KBD>NUL</KBD> is not counted in
the length which appears in the string descriptor. The <CODE>msgfmt</CODE>
program has an option selecting the alignment for MO file strings.
With this option, each string is separately aligned so it starts at
an offset which is a multiple of the alignment value. On some RISC
machines, a correct alignment will speed things up.
778 </P>
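<P>
The effect of the alignment option on string offsets can be worked through with
a short Python sketch. Filling the gaps with <KBD>NUL</KBD> bytes is an
assumption of this sketch, and the function names are invented for this
illustration.
</P>

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def place_strings(strings, start: int, alignment: int = 1):
    """Lay out NUL-terminated strings from file offset start.

    Returns (offsets, blob). Gaps are filled with NUL bytes, which is an
    assumption of this sketch.
    """
    offsets = []
    blob = bytearray()
    for s in strings:
        pos = align_up(start + len(blob), alignment)
        blob.extend(b"\0" * (pos - start - len(blob)))  # padding up to pos
        offsets.append(pos)
        blob.extend(s + b"\0")  # terminating NUL, not counted in the length
    return offsets, bytes(blob)


print(place_strings([b"hello", b"hi"], start=44, alignment=1)[0])  # [44, 50]
print(place_strings([b"hello", b"hi"], start=44, alignment=4)[0])  # [44, 52]
```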
780 <A NAME="IDX969"></A>
Plural forms are stored by letting the plural of the original string
follow the singular of the original string, separated by a
<KBD>NUL</KBD> byte. The length which appears in the string descriptor
includes both. However, only the singular of the original string
takes part in the hash table lookup. The plural variants of the
translation are all stored consecutively, separated by a
<KBD>NUL</KBD> byte. Here also, the length in the string descriptor
includes all of them.
790 </P>
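<P>
Splitting such an entry back into its parts is straightforward, as the
following Python sketch suggests. The function name and the sample strings are
invented for this illustration.
</P>

```python
def split_plural_entry(original: bytes, translation: bytes):
    """Split a stored entry into (singular, plural, translated_forms).

    plural is None when the original holds no embedded NUL byte.
    """
    singular, sep, plural = original.partition(b"\0")
    forms = translation.split(b"\0")
    return singular, (plural if sep else None), forms


original = b"%d file\0%d files"
translation = b"%d Datei\0%d Dateien"

# The descriptor length covers singular, separator and plural together.
print(len(original))  # 16
print(split_plural_entry(original, translation))
```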
Nothing prevents an MO file from having embedded <KBD>NUL</KBD>s in strings.
However, the program interface currently used already presumes
that strings are <KBD>NUL</KBD> terminated, so embedded <KBD>NUL</KBD>s are
somewhat useless. But the MO file format is general enough so other
interfaces would later be possible, if for example, we ever want to
implement wide characters right in MO files, where <KBD>NUL</KBD> bytes may
accidentally appear. (No, we don't want to have wide characters in MO
files. They would make the file unnecessarily large, and the
<SAMP>`wchar_t&acute;</SAMP> type being platform dependent, MO files would be
platform dependent as well.)
803 </P>
This particular issue has been strongly debated in the GNU
<CODE>gettext</CODE> development forum, and it is to be expected that the MO file
format will evolve or change over time. It is even possible that many
formats may later be supported concurrently. But surely, we have to
start somewhere, and the MO file format described here is a good start.
Nothing is cast in concrete, and the format may later evolve fairly
easily, so we should feel comfortable with the current approach.
813 </P>
<PRE>
        byte
             +------------------------------------------+
          0  | magic number = 0x950412de                |
             |                                          |
          4  | file format revision = 0                 |
             |                                          |
          8  | number of strings                        |  == N
             |                                          |
         12  | offset of table with original strings    |  == O
             |                                          |
         16  | offset of table with translation strings |  == T
             |                                          |
         20  | size of hashing table                    |  == S
             |                                          |
         24  | offset of hashing table                  |  == H
             |                                          |
             .                                          .
             .    (possibly more entries later)         .
             .                                          .
             |                                          |
          O  | length &#38; offset 0th string  ----------------.
      O + 8  | length &#38; offset 1st string  ------------------.
              ...                                    ...   | |
O + ((N-1)*8)| length &#38; offset (N-1)th string           |  | |
             |                                          |  | |
          T  | length &#38; offset 0th translation  ---------------.
      T + 8  | length &#38; offset 1st translation  -----------------.
              ...                                    ...   | | | |
T + ((N-1)*8)| length &#38; offset (N-1)th translation      |  | | | |
             |                                          |  | | | |
          H  | start hash table                         |  | | | |
              ...                                    ...   | | | |
  H + S * 4  | end hash table                           |  | | | |
             |                                          |  | | | |
             | NUL terminated 0th string  &#60;----------------' | | |
             |                                          |    | | |
             | NUL terminated 1st string  &#60;------------------' | |
             |                                          |      | |
              ...                                    ...       | |
             |                                          |      | |
             | NUL terminated 0th translation  &#60;---------------' |
             |                                          |        |
             | NUL terminated 1st translation  &#60;-----------------'
             |                                          |
              ...                                    ...
             |                                          |
             +------------------------------------------+
</PRE>
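<P>
To tie the picture together, the following Python sketch writes a minimal MO
file with that layout and reads it back with the <CODE>gettext</CODE> module
from the Python standard library. It is not how <CODE>msgfmt</CODE> is
implemented. It assumes little-endian output, omits the hash table
(<VAR>S</VAR> = 0), and uses an invented function name,
<CODE>build_mo</CODE>.
</P>

```python
import gettext
import io
import struct

MAGIC = 0x950412DE
HEADER_WORDS = 7  # magic, revision, N, O, T, S, H


def build_mo(messages: dict) -> bytes:
    """Serialize {original: translation} (bytes to bytes) as little-endian MO data."""
    originals = sorted(messages)  # table O must be in lexicographical order
    n = len(originals)
    o_off = HEADER_WORDS * 4      # descriptors of the original strings
    t_off = o_off + 8 * n         # descriptors of the translations
    h_off = t_off + 8 * n         # hash table position; S = 0, so it is empty
    pos = h_off                   # strings start where the empty hash table ends
    o_table, t_table, strings = [], [], []
    for texts, table in (
        (originals, o_table),
        ([messages[k] for k in originals], t_table),
    ):
        for text in texts:
            table.append(struct.pack("<II", len(text), pos))
            strings.append(text + b"\0")  # NUL is stored but not counted
            pos += len(text) + 1
    header = struct.pack("<7I", MAGIC, 0, n, o_off, t_off, 0, h_off)
    return header + b"".join(o_table) + b"".join(t_table) + b"".join(strings)


data = build_mo({b"hello": b"hallo", b"world": b"Welt"})
catalog = gettext.GNUTranslations(io.BytesIO(data))
print(catalog.gettext("hello"))  # hallo
print(catalog.gettext("world"))  # Welt
```

<P>
The sample catalog has no header entry (the empty original string), so the
Python reader falls back to its default charset. A catalog with non-ASCII
text would need that entry.
</P>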
867 </BODY>
868 </HTML>