uni2ascii 4.5 (Default branch)


 
03-21-2008

uni2ascii and ascii2uni convert between UTF-8 Unicode and 29 7-bit ASCII equivalents, including hexadecimal and decimal HTML and SGML numeric character references, \u-escapes, standard hexadecimal, raw hexadecimal, and RFC2396 URI format. Such ASCII equivalents are useful for entering Unicode in program source or in programs that are not 8-bit safe, and for testing and debugging. Several options allow Unicode to be converted to approximately equivalent ASCII, e.g. by stripping diacritics. An optional GUI is provided.

License: GNU General Public License v3

Changes:
Microsoft-style HTML entities lacking the final semicolon are now passed through by default rather than converted by ascii2uni. The new -m flag causes them to be converted. Error messages and warnings from ascii2uni now include the line number.
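
As a rough illustration of one of the ASCII equivalents mentioned above (hexadecimal HTML numeric character references), here is a minimal C sketch. It is not uni2ascii's code: the function names `utf8_decode` and `utf8_to_html_hex` are hypothetical, the decoder is deliberately simple, and padding to at least four hex digits is an assumption, so the tool's exact output may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s, store the
   code point in *cp, and return the number of bytes consumed (0 if the
   input is malformed). Minimal decoder: it does not reject overlong forms
   or surrogates. */
static size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    size_t len, i;

    if (s[0] < 0x80)                { *cp = s[0];        return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; len = 4; }
    else return 0;

    for (i = 1; i < len; i++) {
        /* A terminating NUL is not a continuation byte, so this check
           also prevents reading past the end of the string. */
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return len;
}

/* Hypothetical helper: copy src (NUL-terminated UTF-8) to dst as 7-bit
   ASCII, replacing every multi-byte character with a reference of the
   form &#xHHHH;. Returns 0 on success, -1 on malformed input or when dst
   is too small. */
int utf8_to_html_hex(const char *src, char *dst, size_t dstsize)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0)
        return -1;

    while (*s != 0) {
        unsigned long cp;
        char tmp[16];   /* large enough for "&#x1FFFFF;" plus NUL */
        size_t len;
        size_t n = utf8_decode(s, &cp);

        if (n == 0)
            return -1;
        if (n == 1) {
            /* Plain ASCII passes through unchanged. */
            tmp[0] = (char)cp;
            tmp[1] = '\0';
        } else {
            snprintf(tmp, sizeof tmp, "&#x%04lX;", cp);
        }
        len = strlen(tmp);
        if (out + len + 1 > dstsize)
            return -1;
        memcpy(dst + out, tmp, len);
        out += len;
        s += n;
    }
    dst[out] = '\0';
    return 0;
}
```

Under these assumptions, the UTF-8 bytes for "café" come out as `caf&#x00E9;`, which is safe to pass through programs that are not 8-bit clean.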

set_ucodepage(3alleg4)						  Allegro manual					    set_ucodepage(3alleg4)

NAME
set_ucodepage - Sets 8-bit to Unicode conversion tables. Allegro game programming library.

SYNOPSIS
#include <allegro.h>

void set_ucodepage(const unsigned short *table, const unsigned short *extras);

DESCRIPTION
When you select the U_ASCII_CP encoding mode, a set of tables is used to convert between 8-bit characters and their Unicode equivalents. You can use this function to specify a custom set of mapping tables, which allows you to support different 8-bit codepages.

The `table' parameter points to an array of 256 shorts, which contain the Unicode value for each character in your codepage. The `extras' parameter, if not NULL, points to a list of mapping pairs, which will be used when reducing Unicode data to your codepage. Each pair consists of a Unicode value, followed by the way it should be represented in your codepage. The list is terminated by a zero Unicode value. This allows you to create a many->one mapping, where many different Unicode characters can be represented by a single codepage value (e.g. for reducing accented vowels to 7-bit ASCII).

Allegro will use the `table' parameter when it needs to convert an ASCII string to a Unicode string. But when Allegro converts a Unicode string to ASCII, it will use both parameters. First, it will loop through the `table' parameter looking for an index position pointing at the Unicode value it is trying to convert (i.e. the `table' parameter is also used for reverse matching). If that fails, the `extras' list is used. If that fails too, Allegro gives up on the conversion and emits the character `^'.

Note that Allegro comes with default `table' and `extras' parameters set internally. The default `table' will convert 8-bit characters to `^'. The default `extras' list reduces Latin-1 and Extended-A characters to 7 bits in a sensible way (e.g. an accented vowel will be reduced to the same vowel without the accent).

SEE ALSO
set_uformat(3alleg4)

Allegro version 4.4.2						    set_ucodepage(3alleg4)
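
The lookup order given in the DESCRIPTION (reverse match in `table', then the `extras' pairs, then `^') can be sketched in plain C without linking against Allegro. This is only an illustration of the documented behaviour, not Allegro's source: `unicode_to_codepage', `make_ascii_table' and `example_extras' are hypothetical names, and the table contents are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example data: (Unicode value, codepage value) pairs in the
   layout the manual describes, terminated by a zero Unicode value. Several
   accented letters map to one plain letter (many->one). */
const unsigned short example_extras[] = {
    0x00E8, 'e',    /* e with grave */
    0x00E9, 'e',    /* e with acute */
    0x00E0, 'a',    /* a with grave */
    0
};

/* Hypothetical helper: fill a 256-entry table where 7-bit values map to
   themselves and the upper half maps to '^'. Illustrative contents only. */
void make_ascii_table(unsigned short table[256])
{
    int i;

    assert(table != NULL);
    for (i = 0; i < 128; i++)
        table[i] = (unsigned short)i;
    for (i = 128; i < 256; i++)
        table[i] = '^';
}

/* Hypothetical helper, not part of Allegro: reduce one Unicode value to an
   8-bit codepage value using the documented lookup order. */
int unicode_to_codepage(unsigned short uc,
                        const unsigned short table[256],
                        const unsigned short *extras)
{
    int i;

    assert(table != NULL);

    /* 1. Reverse match: look for an index whose entry is this value. */
    for (i = 0; i < 256; i++)
        if (table[i] == uc)
            return i;

    /* 2. Fall back to the extras pairs, if a list was supplied. */
    if (extras != NULL)
        for (; extras[0] != 0; extras += 2)
            if (extras[0] == uc)
                return extras[1];

    /* 3. Neither lookup matched: give up and emit '^'. */
    return '^';
}
```

In a real Allegro 4 program the same two arrays would be passed to set_ucodepage(table, extras), and the library would perform the lookup itself.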