1 .\" $NetBSD: idnconv.1,v 1.4 2014/12/10 04:37:56 christos Exp $
3 .\" Id: idnconv.1,v 1.1 2003/06/04 00:27:10 marka Exp
5 .\" Copyright (c) 2000,2001,2002 Japan Network Information Center.
6 .\" All rights reserved.
8 .\" By using this file, you agree to the terms and conditions set forth below.
10 .\" LICENSE TERMS AND CONDITIONS
12 .\" The following License Terms and Conditions apply, unless a different
13 .\" license is obtained from Japan Network Information Center ("JPNIC"),
14 .\" a Japanese association, Kokusai-Kougyou-Kanda Bldg 6F, 2-3-4 Uchi-Kanda,
15 .\" Chiyoda-ku, Tokyo 101-0047, Japan.
17 .\" 1. Use, Modification and Redistribution (including distribution of any
18 .\" modified or derived work) in source and/or binary forms is permitted
19 .\" under this License Terms and Conditions.
21 .\" 2. Redistribution of source code must retain the copyright notices as they
22 .\" appear in each source code file, this License Terms and Conditions.
24 .\" 3. Redistribution in binary form must reproduce the Copyright Notice,
25 .\" this License Terms and Conditions, in the documentation and/or other
26 .\" materials provided with the distribution. For the purposes of binary
27 .\" distribution the "Copyright Notice" refers to the following language:
28 .\" "Copyright (c) 2000-2002 Japan Network Information Center. All rights reserved."
30 .\" 4. The name of JPNIC may not be used to endorse or promote products
31 .\" derived from this Software without specific prior written approval of JPNIC.
34 .\" 5. Disclaimer/Limitation of Liability: THIS SOFTWARE IS PROVIDED BY JPNIC
35 .\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
36 .\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
37 .\" PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL JPNIC BE LIABLE
38 .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
39 .\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
40 .\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
41 .\" BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
42 .\" WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
43 .\" OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
44 .\" ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
46 .TH IDNCONV 1 "Mar 3, 2001"
49 idnconv \- codeset converter for named.conf and zone master files
52 \fBidnconv\fP [\fIoptions\fP...] [\fIfile\fP...]
55 \fBidnconv\fR is a codeset converter for named configuration files
56 and zone master files.
57 \fBidnconv\fR performs codeset conversion specified either
58 by the command-line arguments or by the configuration file,
59 and writes the converted text to stdout.
61 If a file name is specified, \fBidnconv\fR converts the contents of the file.
63 Otherwise, \fBidnconv\fR converts \fIstdin\fR.
65 Since \fBidnconv\fR is specifically designed for converting
66 internationalized domain names, it may not be suitable as a general codeset converter.
70 \fBidnconv\fR has two operation modes.
72 One mode converts local-encoded domain names to IDN-encoded
73 ones. Usually this mode is used for preparing domain names to be
74 listed in named configuration files or zone master files.
75 In this mode, the following processes are performed in addition to
76 the codeset (encoding) conversion.
81 standard domain name preparation (NAMEPREP)
84 The other mode is a reverse conversion, from IDN-encoded domain names to
85 local-encoded domain names.
86 In this mode, local mapping and NAMEPREP are not performed since
87 IDN-encoded names should already be normalized.
88 Instead, a check is done in order to make sure the IDN-encoded domain name
89 is properly NAMEPREP'ed. If it is not, the name will be output in
90 IDN encoding, not in the local encoding.
93 Normally \fBidnconv\fR reads the system's default configuration file
94 (idn.conf) and performs conversion or name preparation according to
95 the parameters specified in the file.
96 You can override the settings in the configuration file with the
97 command-line options below.
99 \fB\-in\fP \fIin-code\fP, \fB\-i\fP \fIin-code\fP
100 Specify the codeset name of the input text.
101 Any of the following codeset names can be specified.
104 Any codeset name that the \fIiconv_open()\fP library function accepts
110 Any alias name for the above, defined by the codeset alias file.
113 If this option is not specified, the default codeset is determined
114 from the locale in normal conversion mode.
115 In reverse conversion mode, the default codeset is the IDN encoding
116 specified by the configuration file (``idn-encoding'' entry).
118 \fB\-out\fP \fIout-code\fP, \fB\-o\fP \fIout-code\fP
119 Specify the codeset name of the output text. \fIout-code\fP can be any
120 codeset name that can be specified for the \fB\-in\fR option.
122 If this option is not specified, the default is the IDN encoding
123 specified by the configuration file (``idn-encoding'' entry) in
124 normal conversion mode.
125 In reverse conversion mode, the default codeset is determined from the locale.
128 \fB\-conf\fP \fIpath\fP, \fB\-c\fP \fIpath\fP
129 Specify the pathname of the idnkit configuration file (``idn.conf'').
130 If not specified, the system's default file is used, unless the \-noconf option is specified.
133 \fB\-noconf\fP, \fB\-C\fP
134 Specify that no configuration file is to be used.
136 \fB\-reverse\fP, \fB\-r\fP
137 Specify reverse conversion mode.
139 If this option is not specified, the normal conversion mode is used.
141 \fB\-nameprep\fR \fIversion\fR, \fB\-n\fR \fIversion\fR
142 Specify the version of NAMEPREP.
143 The following is a list of currently available versions.
145 .IP \f(CWRFC3491\fR 4
146 Perform NAMEPREP according to RFC3491.
150 \fB\-nonameprep\fR, \fB\-N\fR
151 Skip the NAMEPREP process (or the NAMEPREP verification process
152 in the reverse conversion mode).
153 This option implies \-nounassigncheck and \-nobidicheck.
155 \fB\-localmap\fR \fImap\fR
156 Specify the name of the local mapping rule.
157 Currently, the following maps are available.
159 .IP \f(CWRFC3491\fR 4
160 Use the list of mappings specified by RFC3491.
161 .IP \f(CWfilemap:\fR\fIpath\fR 4
162 Use the list of mappings specified by the mapfile \fIpath\fR.
163 See idn.conf(5) for the format of a mapfile.
166 This option can be specified more than once.
167 In that case, the mappings are performed in the order in which they are specified.
170 \fB\-nounassigncheck\fR, \fB\-U\fR
171 Skip unassigned codepoint check.
173 \fB\-nobidicheck\fR, \fB\-B\fR
174 Skip bidi character check.
176 \fB\-nolengthcheck\fR
177 Do not check the label length of the normal conversion result.
178 This option is only meaningful in the normal conversion mode.
180 \fB\-noasciicheck\fR, \fB\-A\fR
181 Do not check ASCII range characters.
182 This option is only meaningful in the normal conversion mode.
184 \fB\-noroundtripcheck\fR
185 Do not perform round trip check.
186 This option is only meaningful in the reverse conversion mode.
188 \fB\-delimiter\fR \fIcodepoint\fP
189 Specify a character to be mapped to the domain name delimiter (period).
190 This option can be specified more than once in order to specify multiple characters.
193 This option is only meaningful in the normal conversion mode.
195 \fB\-whole\fP, \fB\-w\fP
196 Perform local mapping, NAMEPREP, and conversion to the output codeset for the entire
197 input text. If this option is not specified, only non-ASCII characters
198 and their surrounding text will be processed.
199 See ``NORMAL CONVERSION MECHANISM'' and ``REVERSE CONVERSION MECHANISM'' for details.
202 \fB\-alias\fP \fIpath\fP, \fB\-a\fP \fIpath\fP
203 Specify a codeset alias file. It is a simple text file in which
204 each line contains an alias name and a real name separated by one
205 or more whitespace characters, as shown below:
209 \fIalias-codeset-name\fP \fIreal-codeset-name\fP
213 Lines starting with ``#'' are treated as comments.
216 Force line-buffering mode.
218 \fB\-version\fP, \fB\-v\fP
219 Print version information and quit.
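.PP
As a rough illustration of the alias file format described for the
\fB\-alias\fR option above, the following Python sketch parses such a file
into a table.
It is not part of idnkit.
The function name, the sample alias names, and the handling of blank or
malformed lines (they are silently skipped) are assumptions made for this
example only.

```python
def parse_alias_file(text):
    """Return a dict mapping alias codeset names to real codeset names."""
    aliases = {}
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # lines starting with "#" are comments
        fields = line.split()  # split on one or more whitespace characters
        if len(fields) != 2:
            continue  # assumption: skip blank or malformed lines
        alias_name, real_name = fields
        aliases[alias_name] = real_name
    return aliases


# Hypothetical alias entries, for illustration only.
sample = "# local aliases\nsjis   Shift_JIS\neuc\tEUC-JP\n"
table = parse_alias_file(sample)
```

Here \fItable\fR maps ``sjis'' to ``Shift_JIS'' and ``euc'' to ``EUC-JP''.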
222 \fBidnconv\fR guesses the local codeset from the locale and environment variables.
223 See the ``LOCAL CODESET'' section in idn.conf(5) for more details.
225 .SH NORMAL CONVERSION MECHANISM
226 \fBidnconv\fR performs conversion line by line.
227 The following describes how \fBidnconv\fR processes each line.
229 .IP "1. read a line from input text" 4
230 .IP "2. convert the line to UTF-8" 4
231 \fBidnconv\fR converts the line from local encoding to UTF-8.
232 .IP "3. find internationalized domain names" 4
233 If the \-whole\ (or \-w) option is specified, the entire line is
234 assumed to be an internationalized domain name.
235 Otherwise, \fBidnconv\fR recognizes any character sequences having
236 the following properties in the line as internationalized domain names.
239 containing at least one non-ASCII character, and
241 consisting of legal domain name characters (letters, digits, hyphens),
242 non-ASCII characters, and periods.
244 .IP "4. convert internationalized domain names to ACE" 4
245 For each internationalized domain name found in the line,
246 \fBidnconv\fR converts the name to ACE.
247 The conversion procedure consists of the following steps:
249 .IP "4.1. delimiter mapping" 4
250 Replace each character specified as a domain name delimiter with a period.
252 .IP "4.2. local mapping" 4
253 Perform local mapping.
254 If a local mapping is specified by the command-line option \-localmap,
255 the specified mapping rule is applied. Otherwise, the mapping rule
256 in the configuration file that matches the TLD of the name is looked up,
257 and mapping is performed according to the matched rule.
259 This step is skipped if the \-nolocalmap (or \-L) option is specified.
260 .IP "4.3. NAMEPREP" 4
261 Perform name preparation (NAMEPREP).
262 Mapping, normalization, prohibited character checking, unassigned
263 codepoint checking, and bidirectional character checking are done in that order.
265 If the prohibited character check, unassigned codepoint check, or
266 bidi character check fails, the normal conversion procedure aborts.
268 This step is skipped if the \-nonameprep (or \-N) option is specified.
269 .IP "4.4. ASCII character checking" 4
270 Check the ASCII range characters in the domain name.
271 The normal conversion procedure aborts if the domain name has a label
272 that begins or ends with a hyphen (U+002D), or if it contains an ASCII range
273 character other than alphanumerics and hyphens.
275 This step is skipped if the \-noasciicheck (or \-A) option is specified.
276 .IP "4.5. ACE conversion" 4
277 Convert the string to ACE.
278 .IP "4.6. label length checking" 4
279 The normal conversion procedure aborts if the domain name has an empty
280 label or a label that is too long (64 characters or more).
282 This step is skipped if the \-nolengthcheck option is specified.
284 .IP "5. output the result" 4
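.PP
The normal conversion steps above can be approximated with the following
Python sketch.
It is only an illustration and is not how \fBidnconv\fR is implemented.
It assumes Punycode as the IDN encoding and relies on Python's built-in
``idna'' codec, which applies NAMEPREP and rejects empty labels and labels of
64 characters or more.
Delimiter mapping and local mapping are omitted, the hyphen check runs before
NAMEPREP rather than after it, and all function names are hypothetical.

```python
import re

# Step 3: runs of letters, digits, hyphens, periods, and non-ASCII characters.
# The class is assembled with chr() so the pattern needs no backslash escapes.
CANDIDATE = re.compile("[A-Za-z0-9." + chr(0x80) + "-" + chr(0x10FFFF) + "-]+")


def check_hyphens(name):
    """Part of step 4.4: reject labels that begin or end with a hyphen."""
    for label in name.split("."):
        if label.startswith("-") or label.endswith("-"):
            raise ValueError("label begins or ends with a hyphen: " + repr(label))


def to_ace(name):
    """Steps 4.3, 4.5 and 4.6, delegated to the built-in "idna" codec."""
    check_hyphens(name)
    # Raises UnicodeError for an empty label or a label that is too long.
    return name.encode("idna").decode("ascii")


def convert_line(line):
    """Convert only the candidates that contain a non-ASCII character."""
    def replace(match):
        name = match.group(0)
        return name if name.isascii() else to_ace(name)
    return CANDIDATE.sub(replace, line)


converted = convert_line("www.bücher.example. IN A 192.0.2.1")
```

In this sketch a failed check raises an exception, which loosely mirrors the
conversion procedure aborting.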
287 .SH REVERSE CONVERSION MECHANISM
288 This is similar to the normal conversion mechanism, but the two are not symmetric.
289 \fBidnconv\fR processes each line as follows.
291 .IP "1. read a line from input text" 4
292 .IP "2. convert the line to UTF-8" 4
293 \fBidnconv\fR converts the line from local encoding to UTF-8.
294 .IP "3. find internationalized domain names" 4
295 If the \-whole\ (or \-w) option is specified, the entire line is
296 assumed to be an internationalized domain name.
297 Otherwise, \fBidnconv\fR decodes any valid ASCII domain names
298 including ACE names in the line.
299 .IP "4. convert domain names to local encoding"
300 Then, \fBidnconv\fR decodes the domain names.
301 The decode procedure consists of the following steps.
303 .IP "4.1. Delimiter mapping" 4
304 Replace each character specified as a domain name delimiter with a period.
307 .IP "4.2. NAMEPREP" 4
308 Perform name preparation (NAMEPREP) for each label in the domain name.
309 Mapping, normalization, prohibited character checking, unassigned
310 codepoint checking, and bidirectional character checking are done in that order.
312 If the prohibited character check, unassigned codepoint check, or
313 bidi character check fails, disqualified labels are restored to
314 the original input strings and further conversion of those labels is not performed.
317 This step is skipped if the \-nonameprep (or \-N) option is specified.
318 .IP "4.3. ACE conversion" 4
319 Convert the string from ACE to UTF-8.
320 .IP "4.4. Round trip checking" 4
321 For each label, perform the normal conversion and compare it with
322 the result of step 4.2.
323 This check succeeds if the two strings are equivalent.
324 In case of failure, disqualified labels are restored to the original
325 input strings and further conversion of those labels is not performed.
328 This step is skipped if the \-noroundtripcheck option is specified.
329 .IP "4.5. local encoding conversion" 4
330 Convert the result of step 4.3 from UTF-8 to the local encoding.
331 If a label in the domain name contains a character which cannot be
332 represented in the local encoding, the label is restored to the
333 original input string.
335 .IP "5. output the result" 4
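.PP
The per-label fallback behaviour of the reverse conversion can be sketched in
Python as follows.
This is an illustration under stated assumptions, not the idnkit
implementation: it assumes Punycode (the ``xn--'' prefix) as the IDN encoding,
uses Python's built-in ``idna'' codec for decoding and for the round-trip
check, and uses hypothetical function and parameter names.

```python
def decode_label(label, local_encoding):
    """Decode one ACE label; return the original label if any step fails."""
    if not label.lower().startswith("xn--"):
        return label  # not an ACE label, leave it untouched
    try:
        # ACE conversion: from ACE to a Unicode string.
        decoded = label.encode("ascii").decode("idna")
        # Round-trip check: converting back must reproduce the input label.
        if decoded.encode("idna").decode("ascii") != label.lower():
            return label
        # Local encoding conversion: probe whether the label is representable.
        decoded.encode(local_encoding)
    except UnicodeError:
        return label  # restore the original input string
    return decoded


def decode_name(name, local_encoding="utf-8"):
    """Apply decode_label to every label of a domain name."""
    return ".".join(decode_label(label, local_encoding)
                    for label in name.split("."))


in_latin1 = decode_name("www.xn--bcher-kva.example", "latin-1")
in_ascii = decode_name("www.xn--bcher-kva.example", "ascii")
```

With a Latin-1 local encoding the ACE label is decoded, whereas with an ASCII
local encoding the label cannot be represented and is left in its ACE form.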
339 Perhaps the best way to manage named.conf or zone master files that contain
340 internationalized domain names is to keep them in your local codeset so that
341 they can be edited with your favorite editor, and generate a version in
342 the IDN encoding using \fBidnconv\fP.
344 `make' is a convenient tool for this purpose.
345 Suppose the local codeset version has the suffix `.lc', and its ACE version
346 has the suffix `.ace'. The following Makefile enables you to generate
347 the ACE version from the local codeset version by just typing `make'.
352 \&.SUFFIXES: .lc .ace
354 idnconv -in $(LOCALCODE) -out $(IDNCODE) \\
355 $(IDNCONVOPT) $< > $@
361 DESTFILES = db.zone1.ace db.zone2.ace
373 The automatic input-code selection depends on your system, and sometimes
374 it cannot guess the codeset or guesses wrong. It is better to specify it explicitly with the \-in option.