UTF8_ENCODE(3) 1 UTF8_ENCODE(3)
utf8_encode - Encodes an ISO-8859-1 string to UTF-8
SYNOPSIS
string utf8_encode (string $data)
DESCRIPTION
This function encodes the string $data to UTF-8 and returns the encoded version. UTF-8 is a standard mechanism used by Unicode for
encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronizing (meaning a
program can determine where in the byte stream each character starts), and can be used with normal string comparison functions for
sorting and similar tasks. PHP encodes UTF-8 characters in up to four bytes, like this:
UTF-8 encoding
+------+------+-------------------------------------+
|bytes | bits | representation                      |
+------+------+-------------------------------------+
|  1   |   7  | 0bbbbbbb                            |
|  2   |  11  | 110bbbbb 10bbbbbb                   |
|  3   |  16  | 1110bbbb 10bbbbbb 10bbbbbb          |
|  4   |  21  | 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb |
+------+------+-------------------------------------+
Each b represents a bit that can be used to store character data.
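The bit layout in the table can be sketched in a few lines of PHP. This is only an illustration: encode_code_point() is a hypothetical helper, not part of PHP and not the implementation behind utf8_encode(). It takes a Unicode code point as an integer and packs its bits into one to four bytes as the table describes.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only (not a PHP built-in):
// encode one Unicode code point using the bit layout from the table.
function encode_code_point(int $cp): string
{
    // Reject values outside Unicode and the UTF-16 surrogate range.
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        throw new InvalidArgumentException('not a Unicode scalar value');
    }
    if ($cp < 0x80) {                 // up to 7 bits: 0bbbbbbb
        return chr($cp);
    }
    if ($cp < 0x800) {                // up to 11 bits: 110bbbbb 10bbbbbb
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {              // up to 16 bits: 1110bbbb 10bbbbbb 10bbbbbb
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // up to 21 bits: 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(encode_code_point(0x41)), "\n";   // 41       (1 byte,  'A')
echo bin2hex(encode_code_point(0xE9)), "\n";   // c3a9     (2 bytes, U+00E9)
echo bin2hex(encode_code_point(0x20AC)), "\n"; // e282ac   (3 bytes, Euro sign)
```

Every continuation byte starts with the bits 10, which is what lets a reader of the byte stream find character boundaries.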
PARAMETERS
o $data
- An ISO-8859-1 string.
RETURN VALUES
Returns the UTF-8 translation of $data.
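As a rough sketch of what the translation amounts to, the code below converts an ISO-8859-1 string byte by byte. The function latin1_to_utf8() is a hypothetical stand-in written for illustration and is not PHP's actual implementation. Because ISO-8859-1 covers only U+0000 to U+00FF, each input byte becomes either one or two output bytes.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for illustration; not PHP's own implementation.
function latin1_to_utf8(string $data): string
{
    $out = '';
    $len = strlen($data);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($data[$i]);
        if ($byte < 0x80) {
            // Plain ASCII passes through unchanged.
            $out .= $data[$i];
        } else {
            // 0x80..0xFF needs two bytes: 110bbbbb 10bbbbbb
            $out .= chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
        }
    }
    return $out;
}

$latin1 = "caf\xE9";                 // "café" in ISO-8859-1, 4 bytes
$utf8   = latin1_to_utf8($latin1);

echo strlen($latin1), ' -> ', strlen($utf8), "\n"; // 4 -> 5
echo bin2hex($utf8), "\n";                         // 636166c3a9
```

Input that is already UTF-8 would be encoded a second time by a conversion like this, so it should be applied only to data known to be ISO-8859-1.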
SEE ALSO
utf8_decode(3).
PHP Documentation Group UTF8_ENCODE(3)