Sponsored Content
Operating Systems Linux Help to Convert file from UNIX UTF-8 to Windows UTF-16 Post 302886930 by fpmurphy on Tuesday 4th of February 2014 09:51:27 AM
Old 02-04-2014
iconv and BOMs are a gray area in the Unicode specification. A useful discussion regarding iconv and presence or lack of a BOM is here
 

10 More Discussions You Might Find Interesting

1. Programming

Howto convert Ascii -> UTF-8 & back C++

While working with russian text under FreeBSD&MySQL I need to convert a string from MySQL to the Unicode format. I've just started my way in C++ under FreeBSD , so please explain me how can I get ascii code of Char variable and also how can i get a character into variable with the specified ascii... (3 Replies)
Discussion started by: macron
3 Replies

2. UNIX for Dummies Questions & Answers

grep and UNICODE (utf-16) file

I'm using shell scripting in Applescript. When searching a file with the ANSEL character set (for GEDCOM files) using (grep '1 CHAR ANSEL' filepath) gives the expected result. When searching a UNICODE formatted file (utf-16), searching for text known to exist in the file using (grep '1 CHAR... (4 Replies)
Discussion started by: Whiterock
4 Replies

3. UNIX for Advanced & Expert Users

Convert UTF-8 encoded hex value to a character

Hi, I have a non-ascii character (Ŵ), which can be represented in UTF-8 encoding as equivalent hex value (\xC5B4). Is there a function in unix to convert this hex value back to display the charcter ? (10 Replies)
Discussion started by: sumirmehta
10 Replies

4. UNIX for Advanced & Expert Users

UTF-8 to EBCDIC conversion in UNIX

Hi all, At present a file from AS400 system is being FTPed to an AIX system. Now, a similar file needs to be sent from our Unix box (Solaris) Is there any tool available which does the conversion in Unix from UTF-8 to EBCDIC? Any suggestions/ pointers are really appreciated. Thanks,... (4 Replies)
Discussion started by: sridhar_423
4 Replies

5. Red Hat

Can't convert 7bit ASCII to UTF-8

Hello, I am trying to convert a 7bit ASCII file to UTF-8. I have used iconv before though it can't recognize it for some reason and says unknown file encoding. When I used ascii2uni package with different package, ./ascii2uni -a K -a I -a J -a X test_file > new_test_file It still... (2 Replies)
Discussion started by: rockf1bull
2 Replies

6. UNIX for Dummies Questions & Answers

Issue with UTF-8 BOM character in text file

Sometimes we recieve some excel files containing French/Japanese characters over the mail, and these files are manually transferred to the server by using SFTP (security is not a huge concern here). The data is changed to text format before transferring it using Notepad. Problem is: When saving... (4 Replies)
Discussion started by: jawsnnn
4 Replies

7. Shell Programming and Scripting

Trying to convert utf-8 to WINDOWS-1251

Hello all i have utf-8 file that i try to convert to WINDOWS-1251 on linux without any success the file name is utf-8 when i try to do : file -bi test.txt it gives me : text/plain; charset=utf-8 when i try to convert the file i do : /usr/bin/iconv -f UTF-8 -t WINDOWS-1251 test.txt >... (1 Reply)
Discussion started by: umen
1 Replies

8. Shell Programming and Scripting

Copying a file with UTF char on UNIX server

Hi, I need to run a SQL which check for special UTF char in DB. When I try to copy that in UNIX file it changes it to some wierd chat. How can in retain the UTF chars in my script? e.g. ο|π|ρ|σ|τ|υ|φ|χ|ψ Any help will be appriciated. Thanks, (14 Replies)
Discussion started by: varun22486
14 Replies

9. Shell Programming and Scripting

Convert UTF-8 file to ASCII/ISO8859-1 OR replace characters

I am trying to develop a script which will work on a source UTF-8 file and perform one or more of the following It will accept the target encoding as an argument e.g. US-ASCII or ISO-8859-1, etc 1. It should replace all occurrences of characters outside target character set by " " (space) or... (3 Replies)
Discussion started by: hemkiran.s
3 Replies

10. UNIX for Beginners Questions & Answers

Convert files to UTF-8 on AIX 7.1

Dears, I have a shell script - working perfectly on Oracle Linux - that detects the encoding (the charset to be exact) of the files in a specified directory using the "file" command (The file command outputs the charset in Linux, but doesn't do that in AIX), then if the file isn't a UTF-8 text... (4 Replies)
Discussion started by: JeanM-1
4 Replies
UTF8_ENCODE(3)								 1							    UTF8_ENCODE(3)

utf8_encode - Encodes an ISO-8859-1 string to UTF-8

SYNOPSIS
string utf8_encode (string $data) DESCRIPTION
This function encodes the string $data to UTF-8, and returns the encoded version. UTF-8 is a standard mechanism used by Unicode for encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronized (meaning it is possible for a program to figure out where in the bytestream characters start) and can be used with normal string comparison functions for sorting and such. PHP encodes UTF-8 characters in up to four bytes, like this: UTF-8 encoding +------+-------------------------------------+---+ |bytes | | | | | | | | | bits | | | | | | | | representation | | | | | | +------+-------------------------------------+---+ | 1 | | | | | | | | | 7 | | | | | | | | 0bbbbbbb | | | | | | | 2 | | | | | | | | | 11 | | | | | | | | 110bbbbb 10bbbbbb | | | | | | | 3 | | | | | | | | | 16 | | | | | | | | 1110bbbb 10bbbbbb 10bbbbbb | | | | | | | 4 | | | | | | | | | 21 | | | | | | | | 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb | | | | | | +------+-------------------------------------+---+ Each b represents a bit that can be used to store character data. PARAMETERS
o $data - An ISO-8859-1 string. RETURN VALUES
Returns the UTF-8 translation of $data. SEE ALSO
utf8_decode(3). PHP Documentation Group UTF8_ENCODE(3)
All times are GMT -4. The time now is 01:35 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy