<!-- doc/src/sgml/README.non-ASCII -->

Representation of non-ASCII characters
--------------------------------------

Find non-ASCII characters using:

    grep --recursive --color='auto' -P '[\x80-\xFF]' .
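
Where grep's -P option is unavailable, the same search can be
approximated in Python. This is only a sketch: the function name
find_non_ascii is hypothetical, and it assumes that scanning raw bytes,
as the pattern above appears intended to do, is what is wanted.

```python
import os
import re

# Matches any byte outside the 7-bit ASCII range, like [\x80-\xFF] above.
NON_ASCII = re.compile(rb"[\x80-\xFF]")


def find_non_ascii(root):
    """Return (path, line_number, raw_line) for each line with a non-ASCII byte."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Read bytes, not text, so no encoding has to be assumed.
            with open(path, "rb") as handle:
                for number, line in enumerate(handle, start=1):
                    if NON_ASCII.search(line):
                        hits.append((path, number, line.rstrip(b"\r\n")))
    return hits
```

Calling find_non_ascii(".") from the directory to be checked returns one
tuple per offending line, which can then be printed or reviewed by hand.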

Convert to HTML4 named entity (&) escapes
-----------------------------------------

We support several output formats:

* html (supports all Unicode characters)
* man (supports all Unicode characters)
* pdf (supports only Latin-1 characters)

While some output formatting tools support all Unicode characters,
others only support Latin-1 characters. Specifically, the PDF rendering
engine can only display Latin-1 characters; non-Latin-1 Unicode
characters are displayed as "###".
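
One way to spot characters that the PDF output would show as "###" is
to look for code points above U+00FF, the upper end of Latin-1. A
minimal Python sketch; the helper name non_latin1_chars is hypothetical
and not part of the documentation tooling:

```python
def non_latin1_chars(text):
    """Return the distinct characters in text above U+00FF, sorted by code point."""
    # Latin-1 covers U+0000 through U+00FF; anything higher is outside it.
    return sorted({ch for ch in text if ord(ch) > 0xFF})


# U+00C1 (A with acute) is Latin-1; U+2192 and U+0141 are not.
sample = "\u00c1lvaro \u2192 \u0141ukasz"
outside = non_latin1_chars(sample)
```

Here outside holds the two characters U+0141 and U+2192, while U+00C1 is
accepted because it falls inside the Latin-1 range.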

Therefore, in the SGML files, we use only Latin-1 characters. We
typically encode these characters as HTML entities, e.g., &Aacute;lvaro.
It is also possible to safely represent Latin-1 characters in UTF-8
encoding for all output formats.
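
The conversion to named entities can be sketched with the
codepoint2name table from Python's standard html.entities module, which
maps code points to HTML4 entity names. The function name
to_named_entities is hypothetical, and rejecting anything outside
Latin-1 is an assumption made here to match the restriction above.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace non-ASCII Latin-1 characters with HTML4 named entities."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            # Plain ASCII, including any existing markup, passes through.
            out.append(ch)
        elif code <= 0xFF and code in codepoint2name:
            out.append("&" + codepoint2name[code] + ";")
        else:
            # Outside Latin-1, or a Latin-1 code point with no named entity.
            raise ValueError("no Latin-1 named entity for U+%04X" % code)
    return "".join(out)


converted = to_named_entities("\u00c1lvaro")  # "&Aacute;lvaro"
```

Raising an error instead of silently passing the character through makes
non-Latin-1 input visible before it reaches the PDF build.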

Do not use Unicode numeric character escapes (&#nnn;).
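
Numeric escapes that have slipped into a file can be located with a
regular expression. The sketch below is an illustration only:
find_numeric_escapes is a hypothetical name, and treating the
hexadecimal form (&#xhhh;) as equally unwanted is an assumption, since
the text names only the decimal form.

```python
import re

# Decimal form (&#nnn;) plus, as an assumption, the hexadecimal form (&#xhhh;).
NUMERIC_ESCAPE = re.compile(r"&#(?:[0-9]+|[xX][0-9A-Fa-f]+);")


def find_numeric_escapes(text):
    """Return every numeric character escape found in text, in order."""
    return NUMERIC_ESCAPE.findall(text)


found = find_numeric_escapes("&#193;lvaro, &Aacute;lvaro, &#xC1;lvaro")
```

Named entities such as &Aacute; are not reported, so found contains only
the two numeric forms from the sample string.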

HTML entities
    official:  http://www.w3.org/TR/html4/sgml/entities.html
    one page:  http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
    other lists:  http://www.zipcon.net/~swhite/docs/computers/browsers/entities.html
                  https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references