.\" Id: libidnkit.3.in,v 1.1 2003/06/04 00:27:15 marka Exp
.\"
.\" Copyright (c) 2001,2002 Japan Network Information Center.
.\" All rights reserved.
.\"
.\" By using this file, you agree to the terms and conditions set forth below.
.\"
.\" LICENSE TERMS AND CONDITIONS
.\"
.\" The following License Terms and Conditions apply, unless a different
.\" license is obtained from Japan Network Information Center ("JPNIC"),
.\" a Japanese association, Kokusai-Kougyou-Kanda Bldg 6F, 2-3-4 Uchi-Kanda,
.\" Chiyoda-ku, Tokyo 101-0047, Japan.
.\"
.\" 1. Use, Modification and Redistribution (including distribution of any
.\" modified or derived work) in source and/or binary forms is permitted
.\" under this License Terms and Conditions.
.\"
.\" 2. Redistribution of source code must retain the copyright notices as they
.\" appear in each source code file, this License Terms and Conditions.
.\"
.\" 3. Redistribution in binary form must reproduce the Copyright Notice,
.\" this License Terms and Conditions, in the documentation and/or other
.\" materials provided with the distribution. For the purposes of binary
.\" distribution the "Copyright Notice" refers to the following language:
.\" "Copyright (c) 2000-2002 Japan Network Information Center. All rights reserved."
.\"
.\" 4. The name of JPNIC may not be used to endorse or promote products
.\" derived from this Software without specific prior written approval of
.\" JPNIC.
.\"
.\" 5. Disclaimer/Limitation of Liability: THIS SOFTWARE IS PROVIDED BY JPNIC
.\" "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
.\" LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
.\" PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL JPNIC BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
.\" BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
.\" WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
.\" OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
.\" ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
.\"
.TH libidnkit 3 "Mar 11, 2002"
.SH NAME
libidnkit, libidnkitlite \- Internationalized Domain Name Handling Libraries
.SH SYNOPSIS
.nf
\fBidn_nameinit\fP(int\ load_file)

\fBidn_encodename\fP(int\ actions,\ const\ char\ *from,\ char\ *to,\ size_t\ tolen)

\fBidn_decodename\fP(int\ actions,\ const\ char\ *from,\ char\ *to,\ size_t\ tolen)

\fBidn_decodename2\fP(int\ actions,\ const\ char\ *from,\ char\ *to,\ size_t\ tolen,
                const\ char\ *auxencoding)

\fBidn_enable\fP(int\ on_off)

#include <idn/result.h>

\fBidn_result_tostring\fP(idn_result_t\ result)
.fi
.SH DESCRIPTION
The \fBlibidnkit\fR and \fBlibidnkitlite\fR libraries support various
manipulations of internationalized domain names, including encoding
conversion and name preparation.
.PP
They are designed according to the IDNA framework, in which each application
must prepare internationalized domain names itself before
passing them to the resolver.
.PP
To help applications with this preparation, the libraries provide an
easy-to-use, high-level interface.
.PP
Both libraries provide almost the same API.
The difference between them is that \fBlibidnkit\fR internally uses the
\fIiconv\fR function to provide encoding conversion from UTF-8 to the
local encoding
(such as iso-8859-1, usually determined by the current locale), and vice
versa.
\fBlibidnkitlite\fR is a lightweight version of libidnkit.
It assumes that the local encoding is UTF-8, so it never uses \fIiconv\fR.
.PP
This manual describes only a small subset of the API that the libraries
provide: the functions most important to application programmers.
For the rest of the API, please refer to idnkit's specification document
(which is not yet available) or the header files, typically found under
`/usr/local/include/idn/' on your system.
.PP
The \fBidn_nameinit\fR function initializes the library.
It also sets the default configuration if \fIload_file\fR is 0; otherwise
it tries to read a configuration file.
If \fBidn_nameinit\fR is called more than once, library initialization
takes place only on the first call, while the configuration
procedure is performed on every call.
.PP
If there are no errors, \fBidn_nameinit\fR returns \fBidn_success\fR.
Otherwise, the returned value indicates the cause of the error.
See the section ``RETURN VALUES'' below for the error codes.
.PP
Usually you don't have to call this function explicitly, because
it is called implicitly the first time \fBidn_encodename\fR or
\fBidn_decodename\fR is called without a prior call to \fBidn_nameinit\fR.
In that case, initialization takes place without the configuration file.
.PP
The \fBidn_encodename\fR function performs name preparation and encoding
conversion on the internationalized domain name specified by \fIfrom\fR,
and stores the result in \fIto\fR, whose length is specified by
\fItolen\fR.
\fIactions\fR is a bitwise-OR of the following macros, specifying which
subprocesses of the encoding process are to be employed.
.RS 2
.nf
IDN_LOCALCONV     Local encoding to UTF-8 conversion
IDN_DELIMMAP      Delimiter mapping
IDN_LOCALMAP      Local mapping
IDN_NAMEPREP      NAMEPREP mapping, normalization,
                  prohibited character check and bidirectional
                  text check
IDN_UNASCHECK     NAMEPREP unassigned codepoint check
IDN_ASCCHECK      ASCII range character check
IDN_IDNCONV       UTF-8 to IDN encoding conversion
IDN_LENCHECK      Label length check
.fi
.RE
.PP
Details of this encoding process can be found in the section ``NAME ENCODING''.
.PP
For convenience, the \fBIDN_ENCODE_QUERY\fR, \fBIDN_ENCODE_APP\fR
and \fBIDN_ENCODE_STORED\fR macros are also provided.
\fBIDN_ENCODE_QUERY\fR is used to encode a ``query string''
(see the IDNA specification).
It is equal to
.RS 4
.nf
(IDN_LOCALCONV | IDN_DELIMMAP | IDN_LOCALMAP | IDN_NAMEPREP
 | IDN_IDNCONV | IDN_LENCHECK)
.fi
.RE
.PP
if you are using \fBlibidnkit\fR, and equal to
.RS 4
.nf
(IDN_DELIMMAP | IDN_LOCALMAP | IDN_NAMEPREP | IDN_IDNCONV
 | IDN_LENCHECK)
.fi
.RE
.PP
if you are using \fBlibidnkitlite\fR.
.PP
\fBIDN_ENCODE_APP\fR is used by ordinary applications to encode a
domain name.
It performs \fBIDN_ASCCHECK\fR in addition to \fBIDN_ENCODE_QUERY\fR.
\fBIDN_ENCODE_STORED\fR is used to encode a ``stored string''
(see the IDNA specification).
It performs \fBIDN_ENCODE_APP\fR plus \fBIDN_UNASCHECK\fR.
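.PP
The way the three convenience macros build on one another can be sketched
in C.
This is only an illustration: the SK_ names and their bit values are
hypothetical and are not taken from the idnkit headers, and the
composition shown is the \fBlibidnkit\fR variant described above.
.PP
.nf
```c
/* Hypothetical bit values, for illustration only. The real macros are
 * defined in the idnkit headers and may use different values. */
enum {
    SK_LOCALCONV = 1 << 0,
    SK_DELIMMAP  = 1 << 1,
    SK_LOCALMAP  = 1 << 2,
    SK_NAMEPREP  = 1 << 3,
    SK_UNASCHECK = 1 << 4,
    SK_ASCCHECK  = 1 << 5,
    SK_IDNCONV   = 1 << 6,
    SK_LENCHECK  = 1 << 7
};

/* Composition as described in the text (libidnkit variant). */
enum {
    SK_ENCODE_QUERY  = SK_LOCALCONV | SK_DELIMMAP | SK_LOCALMAP |
                       SK_NAMEPREP | SK_IDNCONV | SK_LENCHECK,
    SK_ENCODE_APP    = SK_ENCODE_QUERY | SK_ASCCHECK,
    SK_ENCODE_STORED = SK_ENCODE_APP | SK_UNASCHECK
};

/* Return nonzero if every bit of `step` is set in `actions`. */
static int sk_has_step(int actions, int step)
{
    return (actions & step) == step;
}
```
.fi
.PP
With these definitions, sk_has_step(SK_ENCODE_STORED, SK_UNASCHECK) is
nonzero, while sk_has_step(SK_ENCODE_QUERY, SK_ASCCHECK) is zero.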
.PP
The \fBidn_decodename\fR function performs the reverse of \fBidn_encodename\fR.
It converts the internationalized domain name given by \fIfrom\fR,
which is represented in a special encoding called ACE,
to the application's local codeset and stores it in \fIto\fR,
whose length is specified by \fItolen\fR.
As in \fBidn_encodename\fR, \fIactions\fR is a bitwise-OR of the following
macros.
.RS 2
.nf
IDN_DELIMMAP      Delimiter mapping
IDN_NAMEPREP      NAMEPREP mapping, normalization,
                  prohibited character check and bidirectional
                  text check
IDN_UNASCHECK     NAMEPREP unassigned codepoint check
IDN_IDNCONV       IDN encoding to UTF-8 conversion
IDN_RTCHECK       Round trip check
IDN_ASCCHECK      ASCII range character check
IDN_LOCALCONV     UTF-8 to local encoding conversion
.fi
.RE
.PP
Details of this decoding process can be found in the section ``NAME DECODING''.
.PP
For convenience, the \fBIDN_DECODE_QUERY\fR, \fBIDN_DECODE_APP\fR
and \fBIDN_DECODE_STORED\fR macros are also provided.
\fBIDN_DECODE_QUERY\fR is used to decode a ``query string''
(see the IDNA specification).
It is equal to
.RS 4
.nf
(IDN_DELIMMAP | IDN_NAMEPREP | IDN_IDNCONV | IDN_RTCHECK
 | IDN_LOCALCONV)
.fi
.RE
.PP
if you are using \fBlibidnkit\fR, and equal to
.RS 4
.nf
(IDN_DELIMMAP | IDN_NAMEPREP | IDN_IDNCONV | IDN_RTCHECK)
.fi
.RE
.PP
if you are using \fBlibidnkitlite\fR.
.PP
\fBIDN_DECODE_APP\fR is used by ordinary applications to decode a
domain name.
It performs \fBIDN_ASCCHECK\fR in addition to \fBIDN_DECODE_QUERY\fR.
\fBIDN_DECODE_STORED\fR is used to decode a ``stored string''
(see the IDNA specification).
It performs \fBIDN_DECODE_APP\fR plus \fBIDN_UNASCHECK\fR.
.PP
The \fBidn_decodename2\fR function provides the same functionality as
\fBidn_decodename\fR, except that the character encoding of \fIfrom\fR is
assumed to be \fIauxencoding\fR.
For example, if the IDN encoding is Punycode and \fIauxencoding\fR is
ISO 8859-2, the Punycode string stored in
\fIfrom\fR is assumed to be written in ISO 8859-2.
.PP
In the IDN decode procedure, \fBIDN_NAMEPREP\fR is done before
\fBIDN_IDNCONV\fR, and some non-ASCII characters are converted to
ASCII characters as a result of \fBIDN_NAMEPREP\fR.
Therefore, the ACE string given by \fIfrom\fR may contain such non-ASCII
characters.
That is the reason \fBidn_decodename2\fR exists.
.PP
All of the functions above return an error code of type \fBidn_result_t\fR.
Any code other than \fBidn_success\fR indicates some kind of failure.
The \fBidn_result_tostring\fR function takes an error code \fIresult\fR
and returns a pointer to the corresponding message string.
.SH "NAME ENCODING"
Name encoding is a process that transforms the specified
internationalized domain name into a string suitable for name
resolution.
For each label in a given domain name, the encoding processor performs:
.IP "(1) Convert to UTF-8 (IDN_LOCALCONV)"
Convert the encoding of the given domain name from the application's local
encoding (e.g. ISO-8859-1) to UTF-8.
Note that \fBlibidnkitlite\fR doesn't support this step.
.IP "(2) Delimiter mapping (IDN_DELIMMAP)"
Map domain name delimiters to `.' (U+002E).
The recognized delimiters are: U+3002 (ideographic full stop),
U+FF0E (fullwidth full stop), U+FF61 (halfwidth ideographic full stop).
.IP "(3) Local mapping (IDN_LOCALMAP)"
Apply character mapping whose rule is determined by the TLD of the name.
.IP "(4) NAMEPREP (IDN_NAMEPREP, IDN_UNASCHECK)"
Perform name preparation (NAMEPREP), which is a standard process for
name canonicalization of internationalized domain names.
NAMEPREP consists of 5 steps:
mapping, normalization, prohibited character check, bidirectional
text check and unassigned codepoint check.
The first four steps are done by IDN_NAMEPREP, and the last step is
done by IDN_UNASCHECK.
.IP "(5) ASCII range character check (IDN_ASCCHECK)"
Check whether the domain name contains a non-LDH ASCII character (one that
is not alphanumeric or a hyphen), or whether it begins or ends with a hyphen.
.IP "(6) Convert to ACE (IDN_IDNCONV)"
Convert the NAMEPREPed name to a special encoding designed for representing
internationalized domain names.
The encoding is also known as ACE (ASCII Compatible Encoding), since
a string in this encoding looks just like a traditional ASCII domain name
consisting of only letters, numbers and hyphens.
.IP "(7) Label length check (IDN_LENCHECK)"
For each label, check the number of characters in it.
It must be in the range 1 to 63.
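.PP
Step (2) above, delimiter mapping, can be sketched for UTF-8 input as
follows.
This is a minimal illustration and not the library's implementation:
the function name sk_map_delimiters and its return convention are
hypothetical, and the input is assumed to be a NUL-terminated UTF-8 string.
.PP
.nf
```c
#include <stddef.h>

/*
 * Copy `src` to `dst`, replacing the three-byte UTF-8 sequences for
 * U+3002, U+FF0E and U+FF61 with an ASCII full stop (U+002E).
 * Returns the output length (excluding the terminating NUL), or
 * (size_t)-1 if `dst` is too small to hold the result.
 */
static size_t sk_map_delimiters(const char *src, char *dst, size_t dstlen)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    if (dstlen == 0)
        return (size_t)-1;

    while (*p != 0) {
        unsigned char out = *p;
        size_t step = 1;

        if (p[0] == 0xE3 && p[1] == 0x80 && p[2] == 0x82) {
            out = '.'; step = 3;        /* U+3002 ideographic full stop */
        } else if (p[0] == 0xEF && p[1] == 0xBC && p[2] == 0x8E) {
            out = '.'; step = 3;        /* U+FF0E fullwidth full stop */
        } else if (p[0] == 0xEF && p[1] == 0xBD && p[2] == 0xA1) {
            out = '.'; step = 3;        /* U+FF61 halfwidth ideographic full stop */
        }

        /* Keep room for this byte and the terminating NUL. */
        if (n + 1 >= dstlen)
            return (size_t)-1;
        dst[n++] = (char)out;
        p += step;
    }
    dst[n] = 0;
    return n;
}
```
.fi
.PP
Because the mapping only ever shrinks the input, a destination buffer as
large as the source string (including its NUL) is always sufficient in
this sketch.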
.PP
There are many configuration parameters for this process, such as the
ACE or the local mapping rules. These parameters are read from
idnkit's default configuration file, \fBidn.conf\fR.
See idn.conf(5) for details.
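.PP
The two purely syntactic checks of the encoding process, steps (5) and (7),
can be sketched as follows.
This is an illustration under simplifying assumptions, not the library's
code: all sk_ names are hypothetical, labels are assumed to be separated
by `.' only, and a trailing dot is treated as an empty (invalid) label.
.PP
.nf
```c
#include <stddef.h>
#include <string.h>

/* Nonzero if `c` is an ASCII letter, digit or hyphen (LDH). */
static int sk_is_ldh(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '-';
}

/* ASCII range character check for one label: every ASCII byte must be
 * LDH, and the label must not begin or end with a hyphen. Non-ASCII
 * bytes are left to the other steps. */
static int sk_asccheck_label(const char *label, size_t len)
{
    size_t i;

    if (len > 0 && (label[0] == '-' || label[len - 1] == '-'))
        return 0;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)label[i];
        if (c < 0x80 && !sk_is_ldh(c))
            return 0;
    }
    return 1;
}

/* Label length check: the length must be in the range 1 to 63. */
static int sk_lencheck_label(size_t len)
{
    return len >= 1 && len <= 63;
}

/* Apply both checks to every dot-separated label of `name`.
 * Returns nonzero only if all labels pass. */
static int sk_check_name(const char *name)
{
    const char *start = name;

    for (;;) {
        const char *dot = strchr(start, '.');
        size_t len = dot ? (size_t)(dot - start) : strlen(start);

        if (!sk_lencheck_label(len) || !sk_asccheck_label(start, len))
            return 0;
        if (dot == NULL)
            return 1;
        start = dot + 1;
    }
}
```
.fi
.PP
In this sketch the length check is applied to whatever string is passed in;
in the process described above it follows the ACE conversion, so it would
be applied to the ACE form of each label.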
.SH "NAME DECODING"
Name decoding is the reverse of name encoding.
It transforms the specified
internationalized domain name from a special encoding suitable for name
resolution into the normal name string in the application's current codeset.
However, name encoding and name decoding are not symmetric.
.PP
For each label in a given domain name, the decoding processor performs:
.IP "(1) Delimiter mapping (IDN_DELIMMAP)"
Map domain name delimiters to `.' (U+002E).
The recognized delimiters are: U+3002 (ideographic full stop),
U+FF0E (fullwidth full stop), U+FF61 (halfwidth ideographic full stop).
.IP "(2) NAMEPREP (IDN_NAMEPREP, IDN_UNASCHECK)"
Perform name preparation (NAMEPREP), which is a standard process for
name canonicalization of internationalized domain names.
.IP "(3) Convert to UTF-8 (IDN_IDNCONV)"
Convert the encoding of the given domain name from ACE to UTF-8.
.IP "(4) Round trip check (IDN_RTCHECK)"
Encode the result of (3) using the ``NAME ENCODING'' scheme, and then
compare it with the result of step (2).
If they differ, the check fails.
If IDN_UNASCHECK, IDN_ASCCHECK or both are specified, they are also
performed during this encoding.
.IP "(5) Convert to local encoding"
Convert the result of (3) from UTF-8 to the application's local
encoding (e.g. ISO-8859-1).
Note that \fBlibidnkitlite\fR doesn't support this step.
.PP
If the prohibited character check, unassigned codepoint check or
bidirectional text check at step (2) fails, or the round trip check
at step (4) fails, the original input label is returned.
.PP
The configuration parameters for this process
are also read from the configuration file \fBidn.conf\fR.
.PP
If the \fBIDN_DISABLE\fR environment variable is defined at run time,
the libraries disable internationalized domain name support by default.
In this case, \fBidn_encodename\fR and \fBidn_decodename\fR do not
encode or decode the input name; they simply output a copy
of the input name as the result.
.PP
If your application should always enable multilingual domain name
support regardless of whether \fBIDN_DISABLE\fR is defined, call
.RS 4
.nf
idn_enable(1);
.fi
.RE
before performing encoding/decoding.
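.PP
The interaction between \fBIDN_DISABLE\fR and an explicit enable call can be
modelled with a small sketch.
This is a hypothetical model of the behaviour described above, not the
library's implementation: the names sk_enable and sk_idn_is_enabled are
invented for illustration, and the assumption that an explicit call with 0
disables support is not stated in this manual.
.PP
.nf
```c
#include <stdlib.h>

/* -1: no explicit choice made; 0 or 1: value given to sk_enable(). */
static int sk_enable_override = -1;

/* Record an explicit choice, overriding the environment default. */
static void sk_enable(int on_off)
{
    sk_enable_override = on_off ? 1 : 0;
}

/* Support is on unless IDN_DISABLE is defined, and an explicit
 * choice made through sk_enable() always wins. */
static int sk_idn_is_enabled(void)
{
    if (sk_enable_override >= 0)
        return sk_enable_override;
    return getenv("IDN_DISABLE") == NULL;
}
```
.fi
.PP
In this model, defining IDN_DISABLE makes sk_idn_is_enabled() return zero
until sk_enable(1) is called, after which it returns nonzero regardless of
the environment.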
.SH "RETURN VALUES"
Most of the API functions return values of type \fBidn_result_t\fR
to indicate the status of the call.
.PP
The following is a complete list of the status codes. Note that some
of them are never returned by the functions described in this manual.
.TP
.SB idn_success
Not an error. The call succeeded.
.PP
Specified information does not exist.
.TP
.SB idn_invalid_encoding
The encoding of the specified string is invalid.
.TP
.SB idn_invalid_syntax
There is a syntax error in the configuration file.
.PP
The specified name is not valid.
.TP
.SB idn_invalid_message
The specified DNS message is not valid.
.TP
.SB idn_invalid_action
The specified action contains invalid flags.
.TP
.SB idn_invalid_codepoint
The specified Unicode code point value is not valid.
.TP
.SB idn_invalid_length
The number of characters in an ACE label is not in the range 1 to 63.
.TP
.SB idn_buffer_overflow
The specified buffer is too small to hold the result.
.PP
The specified key does not exist in the hash table.
.PP
Memory allocation using malloc failed.
.PP
The specified file could not be opened.
.PP
Some characters do not have a mapping to the target character set.
.TP
.SB idn_context_required
Context information is required.
.PP
The specified string contains some prohibited characters.
.PP
Generic error which is not covered by the above codes.
.SH EXAMPLES
To get the address of an internationalized domain name in the application's
local codeset, use \fBidn_encodename\fR to convert the name to the format
suitable for passing to resolver functions.
.RS 4
.nf
idn_result_t r;
char ace_name[256];
struct hostent *hp;
r = idn_encodename(IDN_ENCODE_APP, name, ace_name, sizeof(ace_name));
if (r != idn_success) {
    fprintf(stderr, "idn_encodename failed: %s\en",
            idn_result_tostring(r));
    exit(1);
}
hp = gethostbyname(ace_name);
.fi
.RE
.PP
To decode the internationalized domain name returned from a resolver function,
use \fBidn_decodename\fR.
.RS 4
.nf
idn_result_t r;
char local_name[256];
struct hostent *hp;
hp = gethostbyname(name);
if (hp == NULL)
    exit(1);
r = idn_decodename(IDN_DECODE_APP, hp->h_name, local_name,
                   sizeof(local_name));
if (r != idn_success) {
    fprintf(stderr, "idn_decodename failed: %s\en",
            idn_result_tostring(r));
    exit(1);
}
printf("name: %s\en", local_name);
.fi
.RE