That is a complex specification to read (and understand). If I did understand it, which I don't dare to assert, a solution to replace "extended" characters with "normal" ones would be
Here the characters in question are replaced by X or Y, depending on their representation in UTF-8; replace them with whatever you want, e.g. spaces or nothing. If you want to translate € to EUR, do so BEFORE those substitutions above, but be aware that your "fixed length" will then be lost.
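The idea in this reply can be sketched in Python. This is not the original command, only a minimal illustration: the function name `mask_non_ascii` and the rule "two-byte UTF-8 characters become X, longer ones become Y" are assumptions made for the example.

```python
def mask_non_ascii(text, two_byte="X", longer="Y"):
    """Replace each non-ASCII character with a one-character placeholder.

    Assumption for this sketch: characters that need two bytes in UTF-8
    become `two_byte`, characters that need three or four become `longer`.
    """
    out = []
    for ch in text:
        n = len(ch.encode("utf-8"))
        if n == 1:
            out.append(ch)        # plain ASCII, kept unchanged
        elif n == 2:
            out.append(two_byte)
        else:
            out.append(longer)
    return "".join(out)


if __name__ == "__main__":
    print(mask_non_ascii("café costs 5€"))  # cafX costs 5Y
```

Because every character is replaced by exactly one placeholder, the number of characters per line stays the same. Translating € to EUR first would add two characters, which is why the fixed length is lost in that case.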
While working with Russian text under FreeBSD and MySQL I need to convert a string from MySQL to the Unicode format.
I've just started out with C++ under FreeBSD, so please explain how I can get the ASCII code of a char variable, and also how I can get a character into a variable with the specified ascii... (3 Replies)
Hi All,
In the HP Unix that I'm using, when I initialise a string as Stalled="'30¬G'"
Stalled=$Stalled" '30¬C'", it takes the character ¬ as a comma. I need to grep for 30¬G 30¬C in a file and take its count. But since the character ¬ is not being understood, the count returns zero.
The... (2 Replies)
Hi,
I am trying to get tr to replace multibyte characters with their ASCII equivalents. For example:
"Je vais à l'école" ---> "Je vais a l'ecole"
But my version of tr (5.97) doesn't seem to support multibyte sets.
$ locale charmap; echo "Je vais à l'école" | tr éà ea
UTF-8
Je vais aa l'aacole
I try to... (2 Replies)
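The output shown is consistent with tr working on single bytes: "é" and "à" each occupy two bytes in UTF-8, and each of those bytes gets mapped on its own. One way around this, sketched here in Python rather than tr, is to decompose the accented letters and drop the combining marks. The function name `strip_accents` is made up for the example, and the approach only covers letters that decompose into a base letter plus an accent.

```python
import unicodedata


def strip_accents(text):
    # NFD splits "é" into "e" followed by a combining acute accent (U+0301)
    decomposed = unicodedata.normalize("NFD", text)
    # keep the base letters, drop the combining marks
    return "".join(c for c in decomposed if not unicodedata.combining(c))


if __name__ == "__main__":
    print(strip_accents("Je vais à l'école"))  # Je vais a l'ecole
```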
Can someone help me write a script or command that reads in a file character by character, replaces any unknown ASCII characters with a space, and then writes the file out under a new filename?
Thanks! (1 Reply)
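A minimal Python sketch of such a script follows. It assumes that "unknown" means any byte outside the 7-bit ASCII range (0x80 and above); the function names and the idea of passing source and destination paths are illustrative only.

```python
def blank_non_ascii(data: bytes) -> bytes:
    # keep bytes 0x00-0x7F, turn every other byte into a space (0x20)
    return bytes(b if b < 0x80 else 0x20 for b in data)


def convert_file(src_path, dst_path):
    # read and write in binary mode so no decoding is attempted
    with open(src_path, "rb") as fin, open(dst_path, "wb") as fout:
        fout.write(blank_non_ascii(fin.read()))


if __name__ == "__main__":
    print(blank_non_ascii(b"a\xe9b"))  # b'a b'
```

Note that this works byte by byte, so a multibyte character in a UTF-8 file is replaced by several spaces.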
Hi gurus,
I have a file on Unix containing ASCII values. I need to convert all the ASCII values in the file to ASCII characters. The file contains nearly 20000 records with ASCII values. (10 Replies)
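The record layout is not shown in the post, so the following Python sketch assumes each record is a line of decimal ASCII codes separated by whitespace. The function name `codes_to_text` is hypothetical.

```python
def codes_to_text(record: str) -> str:
    # assumed format: decimal codes separated by whitespace, e.g. "72 101 108"
    return "".join(chr(int(code)) for code in record.split())


if __name__ == "__main__":
    print(codes_to_text("72 101 108 108 111"))  # Hello
```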
Hi, please help me write a script that replaces all non-ISO-8859-1 characters with question marks.
I need a pattern of this kind:
"sed s/<non-iso char range>/?/g < ipfile > opfile"
Please help me with this. (2 Replies)
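As an alternative to a sed character range, here is a small Python sketch. It assumes the input is already decoded text (for example read from a UTF-8 file); the function name `to_latin1_or_qmark` is made up for illustration. It relies on the `errors="replace"` mode of `str.encode`, which substitutes `?` for every character the target encoding cannot represent.

```python
def to_latin1_or_qmark(text: str) -> str:
    # characters outside ISO-8859-1 are encoded as "?" by errors="replace"
    return text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")


if __name__ == "__main__":
    print(to_latin1_or_qmark("prix: 5€, café"))  # prix: 5?, café
```

The euro sign is not part of ISO-8859-1, so it becomes a question mark, while "é" is kept.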
Hello,
I am trying to convert a 7-bit ASCII file to UTF-8.
I have used iconv before, but for some reason it can't recognize the file and reports an unknown file encoding.
When I used ascii2uni package with different package, ./ascii2uni -a K -a I -a J -a X test_file > new_test_file
It still... (2 Replies)
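One point that may help here: every 7-bit ASCII byte sequence is already valid UTF-8 with exactly the same bytes, so a truly pure ASCII file needs no conversion. The Python sketch below (the function name is hypothetical) checks that assumption and fails loudly if the data contains any byte of 0x80 or above, which would mean the file is not plain ASCII after all.

```python
def ascii_to_utf8(data: bytes) -> bytes:
    # decode() raises UnicodeDecodeError if any byte is >= 0x80
    text = data.decode("ascii")
    # for pure ASCII input the UTF-8 bytes are identical to the input
    return text.encode("utf-8")


if __name__ == "__main__":
    sample = b"plain 7-bit text\n"
    print(ascii_to_utf8(sample) == sample)  # True
```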
Hi,
I have tried to convert a UTF-8 file to a Windows UTF-16 format file from a Unix machine, as below:
unix2dos < testing.txt | iconv -f UTF-8 -t UTF-16 > out.txt
and I am getting some Chinese characters, as below, when I open the converted file on a Windows machine.
LANG=en_US.UTF-8... (3 Replies)
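Unexpected CJK characters in a UTF-16 file are often a sign of a byte-order mismatch or a missing byte order mark, though the excerpt does not show enough to confirm that this is the cause here. As a sketch under that assumption, the Python function below (its name is made up) writes little-endian UTF-16 with an explicit BOM and CRLF line endings, which is what Windows tools commonly expect.

```python
import codecs


def to_windows_utf16(utf8_bytes: bytes) -> bytes:
    text = utf8_bytes.decode("utf-8")
    # normalise line endings to CRLF, similar to what unix2dos does
    text = text.replace("\r\n", "\n").replace("\n", "\r\n")
    # explicit little-endian byte order, preceded by the matching BOM
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


if __name__ == "__main__":
    print(to_windows_utf16(b"a\nb"))
```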
We are getting extended ASCII characters in the input file, and my requirement is to search for them and replace them with a space. I am using the following command:
LANG=C sed -e 's// /g'
It is doing a good job, but in some cases it is replacing the extended characters with two spaces. So my input... (12 Replies)
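One possible explanation for the double spaces, offered as an assumption since the input data is not shown: if some of the input is UTF-8 encoded, a single non-ASCII character occupies two or more bytes, and a byte-wise replacement under LANG=C turns each of those bytes into its own space. The Python sketch below contrasts the two behaviours; both function names are hypothetical.

```python
import re


def blank_bytewise(data: bytes) -> bytes:
    # one space per byte >= 0x80: a two-byte UTF-8 character yields two spaces
    return re.sub(rb"[\x80-\xff]", b" ", data)


def blank_charwise(data: bytes, encoding="utf-8") -> bytes:
    # decode first, then substitute one space per non-ASCII character
    text = data.decode(encoding)
    return re.sub(r"[^\x00-\x7f]", " ", text).encode(encoding)


if __name__ == "__main__":
    raw = "naïve".encode("utf-8")
    print(blank_bytewise(raw))  # b'na  ve'
    print(blank_charwise(raw))  # b'na ve'
```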
Hi All,
I have an ASCII file in which a few columns contain hex values that I need to convert to ASCII. Could you suggest a command that can be used in Unix shell scripting?
Thanks in advance (2 Replies)
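The file layout is not given, so this Python sketch assumes whitespace-separated columns in which a hex column holds an even number of hex digits (two per character). The function names and the default column index are illustrative only.

```python
def hex_to_ascii(field: str) -> str:
    # "48656c6c6f" -> bytes 0x48 0x65 0x6c 0x6c 0x6f -> "Hello"
    return bytes.fromhex(field).decode("ascii")


def convert_line(line: str, hex_columns=(1,)) -> str:
    # hex_columns holds the zero-based indexes of the columns to decode
    parts = line.split()
    for i in hex_columns:
        parts[i] = hex_to_ascii(parts[i])
    return " ".join(parts)


if __name__ == "__main__":
    print(convert_line("id1 48656c6c6f 42"))  # id1 Hello 42
```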
Discussion started by: HemaV
2 Replies
LEARN ABOUT PHP
mb_list_encodings
MB_LIST_ENCODINGS(3) 1 MB_LIST_ENCODINGS(3)
mb_list_encodings - Returns an array of all supported encodings
SYNOPSIS
array mb_list_encodings (void )
DESCRIPTION
Returns an array containing all supported encodings.
PARAMETERS
This function has no parameters.
RETURN VALUES
Returns a numerically indexed array.
ERRORS/EXCEPTIONS
This function does not emit any errors.
EXAMPLES
Example #1
mb_list_encodings(3) example
<?php
print_r(mb_list_encodings());
?>
The above example will output something similar to:
Array
(
[0] => pass
[1] => auto
[2] => wchar
[3] => byte2be
[4] => byte2le
[5] => byte4be
[6] => byte4le
[7] => BASE64
[8] => UUENCODE
[9] => HTML-ENTITIES
[10] => Quoted-Printable
[11] => 7bit
[12] => 8bit
[13] => UCS-4
[14] => UCS-4BE
[15] => UCS-4LE
[16] => UCS-2
[17] => UCS-2BE
[18] => UCS-2LE
[19] => UTF-32
[20] => UTF-32BE
[21] => UTF-32LE
[22] => UTF-16
[23] => UTF-16BE
[24] => UTF-16LE
[25] => UTF-8
[26] => UTF-7
[27] => UTF7-IMAP
[28] => ASCII
[29] => EUC-JP
[30] => SJIS
[31] => eucJP-win
[32] => SJIS-win
[33] => JIS
[34] => ISO-2022-JP
[35] => Windows-1252
[36] => ISO-8859-1
[37] => ISO-8859-2
[38] => ISO-8859-3
[39] => ISO-8859-4
[40] => ISO-8859-5
[41] => ISO-8859-6
[42] => ISO-8859-7
[43] => ISO-8859-8
[44] => ISO-8859-9
[45] => ISO-8859-10
[46] => ISO-8859-13
[47] => ISO-8859-14
[48] => ISO-8859-15
[49] => EUC-CN
[50] => CP936
[51] => HZ
[52] => EUC-TW
[53] => BIG-5
[54] => EUC-KR
[55] => UHC
[56] => ISO-2022-KR
[57] => Windows-1251
[58] => CP866
[59] => KOI8-R
)
PHP Documentation Group MB_LIST_ENCODINGS(3)