Unicode Utilities 2.26 (Default branch)


 
Posted 04-04-2008

The Unicode Utilities are a set of programs for manipulating and analyzing Unicode text. uniname defaults to printing the character offset of each character, its byte offset, its hex code value, its encoding, the glyph itself, and its name. Command line options allow undesired information to be suppressed and the Unicode range to be added. unidesc reports the character ranges to which different portions of the text belong. unihist generates a histogram of the characters in its input. ExplicateUTF8 is intended for debugging or for learning about Unicode. It determines and explains the validity of a sequence of bytes as a UTF-8 encoding. unirev reverses UTF-8 strings.

License: GNU General Public License (GPL)

Changes:
This release adds a new utility, unifuzz, which generates test input for programs expecting Unicode. In addition to generating random sequences of characters, unifuzz can generate a character from each range, tokens of various potentially problematic characters and sequences, very long lines, strings with embedded nulls, and ill-formed UTF-8.
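
To make the uniname description above more concrete, here is a minimal Python sketch of the same kind of per-character report: character offset, byte offset, hex code value, UTF-8 encoding, glyph, and name. It is not the uniname source. The function name `describe`, the column order, and the formatting are assumptions chosen for illustration, and the names come from Python's `unicodedata` module rather than from the utility itself.

```python
import unicodedata


def describe(text):
    """Return one tuple per character of `text`.

    Each tuple holds: character offset, byte offset in the UTF-8
    encoding, code point as six hex digits, the UTF-8 bytes as hex,
    the character itself, and its Unicode name.
    The layout is illustrative only; it is not uniname's real output.
    """
    rows = []
    byte_offset = 0
    for char_offset, ch in enumerate(text):
        encoded = ch.encode("utf-8")
        rows.append((
            char_offset,
            byte_offset,
            f"{ord(ch):06X}",
            " ".join(f"{b:02X}" for b in encoded),
            ch,
            # Some code points (e.g. controls) have no name.
            unicodedata.name(ch, "<no name>"),
        ))
        # The next character starts after this one's encoded bytes.
        byte_offset += len(encoded)
    return rows


if __name__ == "__main__":
    for row in describe("a\u00e9\u20ac"):
        print(*row, sep="\t")
```

The character offset and the byte offset diverge as soon as a character needs more than one byte in UTF-8, which is why a report of this kind lists both.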

Tcl_UniCharIsAlpha(3)					      Tcl Library Procedures					     Tcl_UniCharIsAlpha(3)

__________________________________________________________________________________________________________________________________________________

NAME
Tcl_UniCharIsAlnum, Tcl_UniCharIsAlpha, Tcl_UniCharIsControl, Tcl_UniCharIsDigit, Tcl_UniCharIsGraph, Tcl_UniCharIsLower, Tcl_UniCharIsPrint, Tcl_UniCharIsPunct, Tcl_UniCharIsSpace, Tcl_UniCharIsUpper, Tcl_UniCharIsWordChar - routines for classification of Tcl_UniChar characters

SYNOPSIS
#include <tcl.h>

int Tcl_UniCharIsAlnum(ch)
int Tcl_UniCharIsAlpha(ch)
int Tcl_UniCharIsControl(ch)
int Tcl_UniCharIsDigit(ch)
int Tcl_UniCharIsGraph(ch)
int Tcl_UniCharIsLower(ch)
int Tcl_UniCharIsPrint(ch)
int Tcl_UniCharIsPunct(ch)
int Tcl_UniCharIsSpace(ch)
int Tcl_UniCharIsUpper(ch)
int Tcl_UniCharIsWordChar(ch)

ARGUMENTS
int ch (in)    The Tcl_UniChar to be examined.
_________________________________________________________________

DESCRIPTION
All of the routines described examine Tcl_UniChars and return a boolean value. A non-zero return value means that the character does belong to the character class associated with the called routine. The rest of this document just describes the character classes associated with the various routines.

Note: A Tcl_UniChar is a Unicode character represented as an unsigned, fixed-size quantity.

CHARACTER CLASSES
Tcl_UniCharIsAlnum tests if the character is an alphanumeric Unicode character.
Tcl_UniCharIsAlpha tests if the character is an alphabetic Unicode character.
Tcl_UniCharIsControl tests if the character is a Unicode control character.
Tcl_UniCharIsDigit tests if the character is a numeric Unicode character.
Tcl_UniCharIsGraph tests if the character is any Unicode print character except space.
Tcl_UniCharIsLower tests if the character is a lowercase Unicode character.
Tcl_UniCharIsPrint tests if the character is a Unicode print character.
Tcl_UniCharIsPunct tests if the character is a Unicode punctuation character.
Tcl_UniCharIsSpace tests if the character is a whitespace Unicode character.
Tcl_UniCharIsUpper tests if the character is an uppercase Unicode character.
Tcl_UniCharIsWordChar tests if the character is alphanumeric or a connector punctuation mark.

KEYWORDS
unicode, classification

Tcl 8.1					      Tcl_UniCharIsAlpha(3)
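
Two of the character classes above are defined in terms of others: the word-character class is "alphanumeric or connector punctuation", and the graph class is "print character except space". The following Python sketch expresses those two compositions. It is only an analogue, not the Tcl implementation: the helper names `is_word_char` and `is_graph` are hypothetical, and the classification comes from Python's own Unicode tables, so results may differ from the Tcl routines for some code points.

```python
import unicodedata


def is_word_char(ch):
    """True if `ch` (a single character) is alphanumeric or has the
    Unicode general category Pc (connector punctuation).

    Assumption: Python's str.isalnum() stands in for the
    alphanumeric test described in the man page.
    """
    return ch.isalnum() or unicodedata.category(ch) == "Pc"


def is_graph(ch):
    """True if `ch` (a single character) is printable and is not
    whitespace, following "any print character except space".

    Assumption: str.isprintable() and str.isspace() stand in for
    the print and space classes described in the man page.
    """
    return ch.isprintable() and not ch.isspace()


if __name__ == "__main__":
    for ch in ["a", "_", "-", " ", "\u00e9"]:
        print(repr(ch), is_word_char(ch), is_graph(ch))
```

The underscore is the usual example of connector punctuation, which is why it counts as a word character here even though it is not alphanumeric.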