unicode(n)						       Unicode normalization							unicode(n)


NAME
       unicode - Implementation of Unicode normalization

SYNOPSIS
       package require Tcl 8.3
       package require unicode 1.0

       ::unicode::fromstring string
       ::unicode::tostring uclist
       ::unicode::normalize form uclist
       ::unicode::normalizeS form string
_________________________________________________________________

DESCRIPTION
       This is an implementation in Tcl of the Unicode normalization forms.

COMMANDS
       ::unicode::fromstring string
              Converts string to a list of integer Unicode character codes, which is the representation the unicode package uses internally for strings.

       ::unicode::tostring uclist
              Converts the list of integers uclist back to a Tcl string.

       ::unicode::normalize form uclist
              Normalizes the list of Unicode characters uclist according to form and returns the normalized list. form takes one of the following values: D (canonical decomposition), C (canonical decomposition, followed by canonical composition), KD (compatibility decomposition), or KC (compatibility decomposition, followed by canonical composition).

       ::unicode::normalizeS form string
              A shortcut for ::unicode::tostring [::unicode::normalize $form [::unicode::fromstring $string]]. Normalizes a Tcl string and returns the normalized string.

EXAMPLES
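       As a cross-check of the four forms outside Tcl, the following Python sketch mirrors the commands described above using the standard unicodedata module. The names fromstring, tostring, normalize and normalize_s are hypothetical stand-ins chosen to echo the Tcl commands; they are not part of this package, and the sketch assumes that Python's normalization data agrees with the package's for the inputs shown.

```python
import unicodedata

# Map the form names used by this package to the names Python expects.
FORMS = {"D": "NFD", "C": "NFC", "KD": "NFKD", "KC": "NFKC"}


def fromstring(s):
    """Return the list of integer code points in the string s."""
    return [ord(ch) for ch in s]


def tostring(uclist):
    """Return the string built from a list of integer code points."""
    return "".join(chr(cp) for cp in uclist)


def normalize(form, uclist):
    """Normalize a list of code points according to form (D, C, KD or KC)."""
    text = unicodedata.normalize(FORMS[form], tostring(uclist))
    return fromstring(text)


def normalize_s(form, s):
    """String-in, string-out shortcut, analogous to normalizeS."""
    return tostring(normalize(form, fromstring(s)))


if __name__ == "__main__":
    # U+1E0C (D with dot below) followed by U+0307 (combining dot above).
    print(normalize("D", [7692, 775]))   # [68, 803, 775]
    # U+1D2C (modifier letter capital A) has a compatibility mapping to "A".
    print(normalize_s("KD", "\u1d2c"))   # A
```

       Form D splits 7692 into the base letter 68 and the combining mark 803 and leaves 775 after it, because canonical ordering places the dot below (combining class 220) before the dot above (combining class 230).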
       % ::unicode::fromstring "\u0410\u0411\u0412\u0413"
       1040 1041 1042 1043
       % ::unicode::tostring {49 50 51 52 53}
       12345
       %

       % ::unicode::normalize D {7692 775}
       68 803 775
       % ::unicode::normalizeS KD "\u1d2c"
       A
       %

REFERENCES
       [1]    "Unicode Standard Annex #15: Unicode Normalization Forms"

AUTHORS
       Sergei Golovan

BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report them in the category stringprep of the Tcllib SF Trackers. Please also report any ideas you may have for enhancing either the package or its documentation.

SEE ALSO
       stringprep(n)

KEYWORDS
       normalization, unicode

COPYRIGHT
       Copyright (c) 2007, Sergei Golovan

stringprep 1.0.0						unicode(n)

ASCII(1)						      General Commands Manual							  ASCII(1)

NAME
       ascii, unicode - interpret ASCII, Unicode characters

SYNOPSIS
       ascii [ -8 ] [ -oxdbn ] [ -nct ] [ text ]

       unicode [ -nt ] hexmin-hexmax

       unicode [ -t ] hex [ ... ]

       unicode [ -n ] characters

       look hex /lib/unicode

DESCRIPTION
       Ascii prints the ASCII values corresponding to characters and vice versa; under the -8 option, the ISO Latin-1 extensions (codes 0200-0377) are included. The values are interpreted in a settable numeric base; -o specifies octal, -d decimal, -x hexadecimal (the default), and -bn base n.

       With no arguments, ascii prints a table of the character set in the specified base. Characters of text are converted to their ASCII values, one per line. If, however, the first text argument is a valid number in the specified base, conversion goes the opposite way. Control characters are printed as two- or three-character mnemonics. Other options are:

       -n     Force numeric output.

       -c     Force character output.

       -t     Convert from numbers to running text; do not interpret control characters or insert newlines.

       Unicode is similar; it converts between UTF and character values from the Unicode Standard (see utf(7)). If given a range of hexadecimal numbers, unicode prints a table of the specified Unicode characters: their values and UTF representations. Otherwise it translates from UTF to numeric value or vice versa, depending on the appearance of the supplied text; the -n option forces numeric output to avoid ambiguity with numeric characters. If converting to UTF, the characters are printed one per line unless the -t flag is set, in which case the output is a single string containing only the specified characters. Unlike ascii, unicode treats no characters specially.

       The output of ascii and unicode may be unhelpful if the characters printed are not available in the current font.

       The file /lib/unicode contains a table of characters and descriptions, sorted in hexadecimal order, suitable for look(1) on the lower case hex values of characters.

EXAMPLES
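       The two conversion directions and the range table described above can be sketched in Python. This is an illustration only: the helper names to_base, chars_to_values, values_to_text and range_table are hypothetical, the output layout is this sketch's own choice rather than that of ascii or unicode, and the control-character mnemonics are not modelled.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base(n, base):
    """Render a non-negative integer in the given base (2 to 36), lower case."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))


def chars_to_values(text, base=16):
    """Character to number: one value per character, hexadecimal by default."""
    return [to_base(ord(ch), base) for ch in text]


def values_to_text(values, base=16):
    """Number to character: join the characters into running text."""
    return "".join(chr(int(v, base)) for v in values)


def range_table(hexmin, hexmax):
    """List (hex value, character) pairs for an inclusive hexadecimal range."""
    lo, hi = int(hexmin, 16), int(hexmax, 16)
    return [(to_base(cp, 16), chr(cp)) for cp in range(lo, hi + 1)]


if __name__ == "__main__":
    print(chars_to_values("p"))              # ['70']
    print(chars_to_values("p", base=10))     # ['112']
    print(values_to_text(["3b1", "3b2"]))    # Greek alpha followed by beta
    for value, ch in range_table("2200", "2203"):
        print(value, ch)
```

       Keeping the values as lower case hexadecimal strings matches the form that the description says /lib/unicode is keyed on for look(1).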
       ascii -d
              Print the ASCII table base 10.

       unicode p
              Print the hex value of `p'.

       unicode 2200-22f1
              Print a table of miscellaneous mathematical symbols.

       look 039 /lib/unicode
              See the start of the Greek alphabet's encoding in the Unicode Standard.

FILES
       /lib/unicode
              table of characters and descriptions.

SOURCE
       /src/cmd/ascii.c
       /src/cmd/unicode.c

SEE ALSO
       look(1), tcs(1), utf(7), font(7)

										  ASCII(1)
