\section{\module{stringprep} ---
         Internet String Preparation}

\declaremodule{standard}{stringprep}
\modulesynopsis{String preparation, as per RFC 3454}
\moduleauthor{Martin v. L\"owis}{martin@v.loewis.de}
\sectionauthor{Martin v. L\"owis}{martin@v.loewis.de}

When identifying things (such as host names) on the Internet, it is
often necessary to compare such identifications for
``equality''. Exactly how this comparison is executed may depend on
the application domain, e.g. whether it should be case-insensitive or
not. It may also be necessary to restrict the possible
identifications, for example to allow only identifications consisting
of ``printable'' characters.

\rfc{3454} defines a procedure for ``preparing'' Unicode strings in
Internet protocols. Before strings are passed onto the wire, they are
processed with the preparation procedure, after which they have a
certain normalized form. The RFC defines a set of tables, which can be
combined into profiles. Each profile must define which tables it uses,
and which other optional parts of the \code{stringprep} procedure are
part of the profile. One example of a \code{stringprep} profile is
\code{nameprep}, which is used for internationalized domain names.

The module \module{stringprep} only exposes the tables from
\rfc{3454}. Because these tables would be very large if represented as
dictionaries or lists, the module uses the Unicode character database
internally. The module source code itself was generated using the
\code{mkstringprep.py} utility.

As a result, these tables are exposed as functions, not as data
structures. There are two kinds of tables in the RFC: sets and
mappings. For a set, \module{stringprep} provides the ``characteristic
function'', i.e. a function that returns true if the parameter is part
of the set. For mappings, it provides the mapping function: given the
key, it returns the associated value. Below is a list of all functions
available in the module.

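As a brief illustration of the two kinds of functions, the following
sketch calls one characteristic function and one mapping function. It
assumes a Python 3 interpreter, where each function is passed a
one-character string; the sample characters and variable names are
chosen only for illustration.

```python
import stringprep

# Characteristic function: reports whether the argument is in a set table.
# Table C.1.1 contains only the ASCII space character.
space_is_c11 = stringprep.in_table_c11(" ")
letter_is_c11 = stringprep.in_table_c11("a")

# Mapping function: returns the value associated with the key.
# Table B.2 folds case, so an upper-case ASCII letter maps to lower case.
folded = stringprep.map_table_b2("A")

print(space_is_c11, letter_is_c11, folded)
```
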
\begin{funcdesc}{in_table_a1}{code}
Determine whether \var{code} is in table~A.1 (Unassigned code points
in Unicode 3.2).
\end{funcdesc}

\begin{funcdesc}{in_table_b1}{code}
Determine whether \var{code} is in table~B.1 (Commonly mapped to
nothing).
\end{funcdesc}

\begin{funcdesc}{map_table_b2}{code}
Return the mapped value for \var{code} according to table~B.2
(Mapping for case-folding used with NFKC).
\end{funcdesc}

\begin{funcdesc}{map_table_b3}{code}
Return the mapped value for \var{code} according to table~B.3
(Mapping for case-folding used with no normalization).
\end{funcdesc}

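Because a mapping function works on one code point at a time, folding
a whole string means applying it to every character. The sketch below
does this with table~B.3; the helper name \code{fold_b3} is
hypothetical, and passing one-character strings assumes Python 3. Note
that a mapped value may be longer than one character.

```python
import stringprep

def fold_b3(text):
    """Apply table B.3 to every character of text (hypothetical helper)."""
    return "".join(stringprep.map_table_b3(ch) for ch in text)

# U+00DF (LATIN SMALL LETTER SHARP S) is mapped to the two letters "ss".
folded = fold_b3("Stra\u00dfe")
print(folded)
```
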
\begin{funcdesc}{in_table_c11}{code}
Determine whether \var{code} is in table~C.1.1
(ASCII space characters).
\end{funcdesc}

\begin{funcdesc}{in_table_c12}{code}
Determine whether \var{code} is in table~C.1.2
(Non-ASCII space characters).
\end{funcdesc}

\begin{funcdesc}{in_table_c11_c12}{code}
Determine whether \var{code} is in table~C.1
(Space characters, union of C.1.1 and C.1.2).
\end{funcdesc}

\begin{funcdesc}{in_table_c21}{code}
Determine whether \var{code} is in table~C.2.1
(ASCII control characters).
\end{funcdesc}

\begin{funcdesc}{in_table_c22}{code}
Determine whether \var{code} is in table~C.2.2
(Non-ASCII control characters).
\end{funcdesc}

\begin{funcdesc}{in_table_c21_c22}{code}
Determine whether \var{code} is in table~C.2
(Control characters, union of C.2.1 and C.2.2).
\end{funcdesc}

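The combined functions for tables C.1 and C.2 should agree with
testing the two sub-tables separately. The following sketch checks
this on a handful of sample characters; the samples are arbitrary, and
passing one-character strings assumes Python 3.

```python
import stringprep

# Tab, NEXT LINE (U+0085), space, NO-BREAK SPACE (U+00A0) and a letter.
samples = ["\t", "\u0085", " ", "\u00a0", "a"]

for ch in samples:
    # Membership in either sub-table, computed separately.
    in_c1 = stringprep.in_table_c11(ch) or stringprep.in_table_c12(ch)
    in_c2 = stringprep.in_table_c21(ch) or stringprep.in_table_c22(ch)
    # The combined functions should report the same result.
    assert stringprep.in_table_c11_c12(ch) == in_c1
    assert stringprep.in_table_c21_c22(ch) == in_c2
```
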
\begin{funcdesc}{in_table_c3}{code}
Determine whether \var{code} is in table~C.3
(Private use).
\end{funcdesc}

\begin{funcdesc}{in_table_c4}{code}
Determine whether \var{code} is in table~C.4
(Non-character code points).
\end{funcdesc}

\begin{funcdesc}{in_table_c5}{code}
Determine whether \var{code} is in table~C.5
(Surrogate codes).
\end{funcdesc}

\begin{funcdesc}{in_table_c6}{code}
Determine whether \var{code} is in table~C.6
(Inappropriate for plain text).
\end{funcdesc}

\begin{funcdesc}{in_table_c7}{code}
Determine whether \var{code} is in table~C.7
(Inappropriate for canonical representation).
\end{funcdesc}

\begin{funcdesc}{in_table_c8}{code}
Determine whether \var{code} is in table~C.8
(Change display properties or are deprecated).
\end{funcdesc}

\begin{funcdesc}{in_table_c9}{code}
Determine whether \var{code} is in table~C.9
(Tagging characters).
\end{funcdesc}

\begin{funcdesc}{in_table_d1}{code}
Determine whether \var{code} is in table~D.1
(Characters with bidirectional property ``R'' or ``AL'').
\end{funcdesc}

\begin{funcdesc}{in_table_d2}{code}
Determine whether \var{code} is in table~D.2
(Characters with bidirectional property ``L'').
\end{funcdesc}
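
To show how a profile might combine these tables, the sketch below
strings together the mapping, normalization, prohibition and
bidirectional checks of \rfc{3454}. It is an illustrative profile
only, not \code{nameprep}: the function name \code{prepare}, the
choice of prohibited tables and the use of \code{ValueError} are
assumptions, the check for unassigned code points (table~A.1) is
omitted, and normalization uses the interpreter's current Unicode
database as a simplification. Python 3 is assumed.

```python
import stringprep
import unicodedata

# Prohibited-output tables chosen for this illustrative profile only.
PROHIBITED = (
    stringprep.in_table_c11_c12,
    stringprep.in_table_c21_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)

def prepare(text):
    """Hypothetical profile: map, normalize, prohibit, then check bidi."""
    # Map: drop characters in table B.1 and fold case with table B.2.
    mapped = "".join(
        stringprep.map_table_b2(ch)
        for ch in text
        if not stringprep.in_table_b1(ch)
    )
    # Normalize: B.2 is the case-folding table intended for use with NFKC.
    normalized = unicodedata.normalize("NFKC", mapped)
    # Prohibit: reject any character found in one of the chosen C tables.
    for ch in normalized:
        if any(check(ch) for check in PROHIBITED):
            raise ValueError("prohibited character %r" % ch)
    # Bidi: a string containing R or AL characters must not contain L
    # characters, and must both start and end with an R or AL character.
    r_or_al = [stringprep.in_table_d1(ch) for ch in normalized]
    if any(r_or_al):
        if any(stringprep.in_table_d2(ch) for ch in normalized):
            raise ValueError("mixed right-to-left and left-to-right text")
        if not (r_or_al[0] and r_or_al[-1]):
            raise ValueError("right-to-left text has a non-R/AL first or last character")
    return normalized

# U+00AD (SOFT HYPHEN) is in table B.1, so it is removed by the map step.
print(prepare("Ex\u00adample"))
```

A real profile would select its own tables; for instance, a profile
that permits spaces would leave table~C.1 out of the prohibited set.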