01-12-2011
How to find a file's encoding and convert it to UTF-8?
Hi,
I am a beginner to Unix.
My requirement is to validate the encoding used in incoming files (csv, txt). If a file is already encoded as UTF-8, it should remain as it is; otherwise I need to change its encoding to UTF-8.
Please advise me on how to proceed.
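One possible approach, sketched under assumptions: a file counts as valid UTF-8 if iconv can decode it as UTF-8 without error, so a UTF-8 to UTF-8 pass works as a check. The function name ensure_utf8 is hypothetical, and the fallback source encoding ISO-8859-1 is only an assumption. The real source encoding has to be known in advance, because it cannot be detected reliably from the bytes alone.

```sh
#!/bin/sh
# Sketch: leave a file alone if it is already valid UTF-8, otherwise
# convert it. ensure_utf8 is a hypothetical helper name, and the default
# source encoding (ISO-8859-1) is an assumption; pass the real one as $2.
ensure_utf8() {
    file=$1
    from=${2:-ISO-8859-1}

    # iconv exits non-zero when the input is not valid in the -f encoding,
    # so a UTF-8 to UTF-8 pass serves as a validity check.
    if iconv -f UTF-8 -t UTF-8 "$file" >/dev/null 2>&1; then
        echo "$file: already valid UTF-8, left unchanged"
        return 0
    fi

    # Convert into a temporary file first so a failed conversion
    # never clobbers the original.
    tmp="$file.tmp.$$"
    if iconv -f "$from" -t UTF-8 "$file" >"$tmp"; then
        mv "$tmp" "$file"
        echo "$file: converted from $from to UTF-8"
    else
        rm -f "$tmp"
        echo "$file: conversion from $from failed" >&2
        return 1
    fi
}
```

It could be called as ensure_utf8 incoming.csv, or as ensure_utf8 incoming.txt WINDOWS-1252 when the source encoding is known to be something else. A pure ASCII file passes the check unchanged, since ASCII is a subset of UTF-8.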
10 More Discussions You Might Find Interesting
1. UNIX for Dummies Questions & Answers
1. I have a shell script which creates a file using cat command. How can i find what encoding the file follows (e.g. UTF8, ANSI)?
2. I want to convert that file to PC-ANSI format. How can i achieve that?
I am using HP-Unix. (6 Replies)
Discussion started by: ssmallya
6 Replies
2. Shell Programming and Scripting
In a bash script:
src="cooltrack.wav"
dst="cooltrack.mp3"
lame "$src" "$dst"
I would like to add some line that would delete the source wav file like:
rm "$src"
but I would like this only if the encoding was successful.
What should I include before deleting the original to check that the... (2 Replies)
Discussion started by: Aia
2 Replies
3. Shell Programming and Scripting
Hello!
The system is AIX 5.3
Please give a command or script to get the file encoding
Thanks (2 Replies)
Discussion started by: vinment
2 Replies
4. AIX
Hello!
The system is AIX 5.3
Please give a command or script to get the file encoding (1 Reply)
Discussion started by: vinment
1 Reply
5. Shell Programming and Scripting
Hi,
I have got a zip (binary) file transferred from MacOS (thus it has additional __MACOSX directory packed inside). On extracting this zip, there are few *.xml files available. When I opened this *.xml file in vim editor using Cygwin (on windows) the editor displayed in the bottom. I tried... (4 Replies)
Discussion started by: royalibrahim
4 Replies
6. HP-UX
how to find the character encoding of a file in hp_ux (1 Reply)
Discussion started by: alokjyotibal
1 Reply
7. UNIX for Dummies Questions & Answers
Hi, I am trying to determine the encoding for the file, because to convert to UTF-8, it seems as though I have to know the encoding of the source.
Tried this
file <filename>
give me this:
<filename>:data or International Language text
Tried to see the locale and this is the output:... (6 Replies)
Discussion started by: MIA651
6 Replies
8. UNIX for Advanced & Expert Users
Hello Experts, please help to provide any insight as I am facing issue migrating java application from hpux to redhat. The java program is using InputStreamReader to read a file without specifying any charset parameter.
However, in the new Linux Redhat 5.6 environment, when reading a file that... (1 Reply)
Discussion started by: sonic_air
1 Reply
9. Solaris
Hi all!!
I'm using the command file -i myfile.xml to validate an XML file's encoding, but it just says regular file. I'm expecting output such as UTF8 or ANSI / ASCII.
Is there a command to display the file's encoding?
Thank you! (2 Replies)
Discussion started by: mrreds
2 Replies
10. Shell Programming and Scripting
How can I know what format a file is in?
* example:
UTF-8
ANSI
UCS2
i am in a... (8 Replies)
Discussion started by: tricampeon81
8 Replies
LEARN ABOUT PHP
utf8_encode
UTF8_ENCODE(3) 1 UTF8_ENCODE(3)
utf8_encode - Encodes an ISO-8859-1 string to UTF-8
SYNOPSIS
string utf8_encode (string $data)
DESCRIPTION
This function encodes the string $data to UTF-8, and returns the encoded version. UTF-8 is a standard mechanism used by Unicode for
encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronized (meaning it is
possible for a program to figure out where in the bytestream characters start) and can be used with normal string comparison functions for
sorting and such. PHP encodes UTF-8 characters in up to four bytes, like this:
UTF-8 encoding
+-------+------+-------------------------------------+
| bytes | bits | representation                      |
+-------+------+-------------------------------------+
|   1   |   7  | 0bbbbbbb                            |
|   2   |  11  | 110bbbbb 10bbbbbb                   |
|   3   |  16  | 1110bbbb 10bbbbbb 10bbbbbb          |
|   4   |  21  | 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb |
+-------+------+-------------------------------------+
Each b represents a bit that can be used to store character data.
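The bit layouts in the table can be illustrated with a small shell sketch. The helper name utf8_bytes is hypothetical and not part of PHP or any standard tool. It takes a code point (decimal or 0x-prefixed hex) and prints the resulting bytes in hex. It does no validation, so surrogates and values above U+10FFFF are not rejected.

```sh
#!/bin/sh
# Sketch: print the UTF-8 bytes (in hex) for one code point, following
# the 1- to 4-byte layouts. utf8_bytes is a hypothetical helper name.
utf8_bytes() {
    cp=$(( $1 ))
    if [ "$cp" -lt 128 ]; then
        # 1 byte, 7 bits: 0bbbbbbb
        printf '%02x\n' "$cp"
    elif [ "$cp" -lt 2048 ]; then
        # 2 bytes, 11 bits: 110bbbbb 10bbbbbb
        printf '%02x %02x\n' \
            $(( 0xC0 | (cp >> 6) )) \
            $(( 0x80 | (cp & 0x3F) ))
    elif [ "$cp" -lt 65536 ]; then
        # 3 bytes, 16 bits: 1110bbbb 10bbbbbb 10bbbbbb
        printf '%02x %02x %02x\n' \
            $(( 0xE0 | (cp >> 12) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) ))
    else
        # 4 bytes, 21 bits: 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
        printf '%02x %02x %02x %02x\n' \
            $(( 0xF0 | (cp >> 18) )) \
            $(( 0x80 | ((cp >> 12) & 0x3F) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) ))
    fi
}

utf8_bytes 0x41      # prints: 41
utf8_bytes 0xE9      # prints: c3 a9
utf8_bytes 0x20AC    # prints: e2 82 ac
utf8_bytes 0x1F600   # prints: f0 9f 98 80
```

The second call is the case that matters for ISO-8859-1 input: every byte from 0x80 to 0xFF maps to a code point below 0x800, so it becomes exactly two bytes in UTF-8.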
PARAMETERS
o $data
- An ISO-8859-1 string.
RETURN VALUES
Returns the UTF-8 translation of $data.
SEE ALSO
utf8_decode(3).
PHP Documentation Group UTF8_ENCODE(3)