 * Copyright 2011 Haiku, Inc. All rights reserved.
 * Distributed under the terms of the MIT License.
 * Axel Dörfler, axeld@pinc-software.de
 * John Scipione, jscipione@gmail.com
 * headers/os/locale/UnicodeChar.h rev 42274
 * src/kits/locale/UnicodeChar.cpp rev 42274
\brief Provides the BUnicodeChar class.
\brief Management of all information about characters.

This class provides a set of tools for managing the whole set of characters
defined by Unicode. This includes information about special sets of
characters, such as whether a character is whitespace or alphanumeric. It
also provides the uppercase equivalent of a character and determines whether
a character can be ornamented with accents.

This class consists entirely of static methods, so you do not have to
instantiate it. Call a method directly, passing in the character that you
want to examine.

Note that all of the functions work with characters encoded in UTF-32. This
is not the most common way to handle characters, but it is the fastest. To
convert a UTF-8 string to a UTF-32 character, use the FromUTF8() method.
\fn static bool BUnicodeChar::IsAlpha(uint32 c)
\brief Determine if \a c is alphabetic.
\returns \c true if the specified Unicode character is an
alphabetic character.
\fn static bool BUnicodeChar::IsAlNum(uint32 c)
\brief Determine if \a c is alphanumeric.
\returns \c true if the specified Unicode character is an
alphabetic or numeric character.
\fn static bool BUnicodeChar::IsDigit(uint32 c)
\brief Determine if \a c is numeric.
\returns \c true if the specified Unicode character is a
numeric character.
\fn static bool BUnicodeChar::IsHexDigit(uint32 c)
\brief Determine if \a c is a hexadecimal digit.
\returns \c true if the specified Unicode character is a
hexadecimal digit character.
\fn static bool BUnicodeChar::IsUpper(uint32 c)
\brief Determine if \a c is uppercase.
\returns \c true if the specified Unicode character is an
uppercase character.
\fn static bool BUnicodeChar::IsLower(uint32 c)
\brief Determine if \a c is lowercase.
\returns \c true if the specified Unicode character is a
lowercase character.
\fn static bool BUnicodeChar::IsSpace(uint32 c)
\brief Determine if \a c is a space.
Unlike IsWhitespace(), this method returns \c true for non-breaking
spaces. It is useful for determining whether the character will render
as an empty space that can be stretched on-screen.
\returns \c true if the specified Unicode character is some
kind of space character.
\fn static bool BUnicodeChar::IsWhitespace(uint32 c)
\brief Determine if \a c is whitespace.
This method is essentially the same as IsSpace(), but excludes all
non-breaking spaces.
\returns \c true if the specified Unicode character is a whitespace
character.
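The relationship between IsSpace() and IsWhitespace() can be sketched with a
deliberately tiny table. This is only an illustration, assuming a handful of
well-known code points: the real methods consult the full Unicode character
database, and the helper names \c IsNoBreakSpace, \c SketchIsSpace and
\c SketchIsWhitespace are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: the three non-breaking space characters
// (NO-BREAK SPACE, FIGURE SPACE, NARROW NO-BREAK SPACE).
bool
IsNoBreakSpace(uint32_t c)
{
    return c == 0x00A0 || c == 0x2007 || c == 0x202F;
}

// Hypothetical stand-in for IsSpace(): a small subset of space characters,
// including the non-breaking ones.
bool
SketchIsSpace(uint32_t c)
{
    return c == 0x0020 || c == 0x0009 || c == 0x000A || c == 0x000D
        || c == 0x3000 || IsNoBreakSpace(c);
}

// Hypothetical stand-in for IsWhitespace(): the same set with the
// non-breaking spaces removed.
bool
SketchIsWhitespace(uint32_t c)
{
    return SketchIsSpace(c) && !IsNoBreakSpace(c);
}
```

With these stand-ins, U+00A0 counts as a space but not as whitespace, while
U+0020 counts as both.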
\fn static bool BUnicodeChar::IsControl(uint32 c)
\brief Determine if \a c is a control character.
Example control characters are the non-printable ASCII characters from
0x00 to 0x1F.
\returns \c true if the specified Unicode character is a control
character.
\fn static bool BUnicodeChar::IsPunctuation(uint32 c)
\brief Determine if \a c is a punctuation character.
\returns \c true if the specified Unicode character is a
punctuation character.
\fn static bool BUnicodeChar::IsPrintable(uint32 c)
\brief Determine if \a c is printable.
Printable characters are not control characters.
\returns \c true if the specified Unicode character is a printable
character.
\fn static bool BUnicodeChar::IsTitle(uint32 c)
\brief Determine if \a c is title case.
Title case characters are letters such as the digraph U+01C5, whose first
component is uppercase and whose second component is lowercase.
\returns \c true if the specified Unicode character is a title case
character.
\fn static bool BUnicodeChar::IsDefined(uint32 c)
\brief Determine if \a c is defined.
In Unicode, some code points are invalid or not yet assigned.
For these code points this method returns \c false.
\returns \c true if the specified Unicode character is defined.
\fn static bool BUnicodeChar::IsBase(uint32 c)
\brief Determine if \a c can be used with a diacritic.
\note IsBase() does not determine if a Unicode character is distinct.
\returns \c true if the specified Unicode character is a base
form character that can be used with a diacritic.
\fn static int8 BUnicodeChar::Type(uint32 c)
\brief Gets the type of a character.
\returns A member of the \c unicode_char_category enum.
\fn uint32 BUnicodeChar::ToLower(uint32 c)
\brief Transforms \a c to lowercase.
\returns The lowercase version of the specified Unicode character.
\fn uint32 BUnicodeChar::ToUpper(uint32 c)
\brief Transforms \a c to uppercase.
\returns The uppercase version of the specified Unicode character.
\fn uint32 BUnicodeChar::ToTitle(uint32 c)
\brief Transforms \a c to title case.
\returns The title case version of the specified Unicode character.
\fn int32 BUnicodeChar::DigitValue(uint32 c)
\brief Gets the numeric value of \a c.
\returns The numeric value of the specified Unicode character.
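As a rough illustration of what a digit value means for non-ASCII digits,
the sketch below maps three decimal digit ranges to their values. The name
\c SketchDigitValue is hypothetical, only three scripts are covered, and
returning -1 for characters that are not digits is an assumption rather
than documented behavior of DigitValue().

```cpp
#include <cstdint>

// Hypothetical stand-in for DigitValue(), limited to three digit ranges.
// The -1 result for non-digits is an assumption made for this sketch.
int32_t
SketchDigitValue(uint32_t c)
{
    // DIGIT ZERO..DIGIT NINE
    if (c >= 0x0030 && c <= 0x0039)
        return static_cast<int32_t>(c - 0x0030);
    // ARABIC-INDIC DIGIT ZERO..NINE
    if (c >= 0x0660 && c <= 0x0669)
        return static_cast<int32_t>(c - 0x0660);
    // DEVANAGARI DIGIT ZERO..NINE
    if (c >= 0x0966 && c <= 0x096F)
        return static_cast<int32_t>(c - 0x0966);
    return -1;
}
```

For example, both U+0037 and U+0667 map to 7 in this sketch.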
\fn void BUnicodeChar::ToUTF8(uint32 c, char** out)
\brief Transform a character to UTF-8 encoding.
The UTF-8 encoding of the specified Unicode character is written to the
buffer referenced by \a out; the method itself returns nothing.
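The encoding step can be sketched in standalone form. \c EncodeUTF8 below is
a hypothetical stand-in, not Haiku's implementation. It assumes that the
\c char** parameter is used to advance the caller's pointer past the bytes
written, mirroring FromUTF8(), and that the caller provides room for up to
four bytes. No \c NUL terminator is written.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the ToUTF8() contract: write the UTF-8 bytes
// for c at *out and advance *out past them (the advancing is an assumption).
void
EncodeUTF8(uint32_t c, char** out)
{
    assert(out != nullptr && *out != nullptr);
    char* s = *out;

    if (c < 0x80) {
        // 1 byte: 0xxxxxxx
        *s++ = static_cast<char>(c);
    } else if (c < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        *s++ = static_cast<char>(0xC0 | (c >> 6));
        *s++ = static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        *s++ = static_cast<char>(0xE0 | (c >> 12));
        *s++ = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        *s++ = static_cast<char>(0x80 | (c & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        *s++ = static_cast<char>(0xF0 | ((c >> 18) & 0x07));
        *s++ = static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        *s++ = static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        *s++ = static_cast<char>(0x80 | (c & 0x3F));
    }

    *out = s;
}
```

Because the pointer moves forward, several characters can be appended to one
buffer by calling the function repeatedly with the same pointer variable.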
\fn uint32 BUnicodeChar::FromUTF8(const char** in)
\brief Transform a UTF-8 string to a UTF-32 character.
If the string contains multiple characters, only the first one is used.
This function updates the \a in pointer so that it points to the next
character for the following call.
\returns The UTF-32 encoded version of the first character of \a in.
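The pointer-advancing contract can be illustrated with a standalone sketch.
\c DecodeUTF8 below is a hypothetical stand-in, not Haiku's implementation.
It assumes well-formed input and does no real validation beyond stopping
early when a continuation byte is missing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the FromUTF8() contract: decode one code point
// and advance *in past the bytes that were consumed.
uint32_t
DecodeUTF8(const char** in)
{
    assert(in != nullptr && *in != nullptr);
    const unsigned char* s = reinterpret_cast<const unsigned char*>(*in);
    uint32_t c = *s++;
    int extra = 0;

    // The lead byte tells how many continuation bytes follow.
    if (c >= 0xF0) {
        c &= 0x07;
        extra = 3;
    } else if (c >= 0xE0) {
        c &= 0x0F;
        extra = 2;
    } else if (c >= 0xC0) {
        c &= 0x1F;
        extra = 1;
    }

    // Each continuation byte (10xxxxxx) contributes six more bits.
    while (extra-- > 0 && (*s & 0xC0) == 0x80)
        c = (c << 6) | (*s++ & 0x3F);

    *in = reinterpret_cast<const char*>(s);
    return c;
}
```

Calling the function in a loop until the pointer reaches the terminating
\c NUL walks a string one character at a time.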
\fn size_t BUnicodeChar::UTF8StringLength(const char* string)
\brief Counts the characters in the given \c NUL terminated string.
\returns The number of UTF-8 characters in the \c NUL terminated string.
\sa BString::CountChars()
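Counting characters rather than bytes can be sketched as follows. This is
one common technique, not necessarily the one used by UTF8StringLength():
every byte that is not a UTF-8 continuation byte starts a new character.
The name \c CountUTF8Chars is hypothetical, and well-formed input is
assumed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: count the characters in a NUL terminated UTF-8
// string by counting the bytes that are not continuation bytes (10xxxxxx).
size_t
CountUTF8Chars(const char* string)
{
    assert(string != nullptr);
    size_t count = 0;

    for (; *string != '\0'; string++) {
        if ((static_cast<unsigned char>(*string) & 0xC0) != 0x80)
            count++;
    }

    return count;
}
```

A string such as "naïve" occupies six bytes in UTF-8 but is counted as five
characters by this sketch.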
\fn size_t BUnicodeChar::UTF8StringLength(const char* string,
\brief Counts the characters in the given string up to \a maxLength.
\param string The string to examine. It does not need to be \c NUL
terminated if \a maxLength is shorter than the length of the string.
\param maxLength The maximum length of the string in bytes.
\returns The number of UTF-8 characters in the string,
up to \a maxLength characters.