Post 302886237 by phanidhar6039 on Thursday 30th of January 2014 08:20:52 AM
Help to Convert file from UNIX UTF-8 to Windows UTF-16

Hi,

I have tried to convert a UTF-8 file to a Windows UTF-16 file on a Unix machine, as shown below:

Code:
unix2dos < testing.txt | iconv -f UTF-8 -t UTF-16 > out.txt

and I am getting some Chinese characters, as shown below, when I open the converted file on a Windows machine.

Code:
LANG=en_US.UTF-8
਍䰀䌀开䌀吀夀倀䔀㴀∀攀渀开唀匀⸀唀吀䘀ⴀ㠀∀ഀഀ
LC_NUMERIC="en_US.UTF-8"
਍䰀䌀开吀䤀䴀䔀㴀∀攀渀开唀匀⸀唀吀䘀ⴀ㠀∀ഀഀ
LC_COLLATE="en_US.UTF-8"
਍䰀䌀开䴀伀一䔀吀䄀刀夀㴀∀攀渀开唀匀⸀唀吀䘀ⴀ㠀∀ഀഀ
LC_MESSAGES="en_US.UTF-8"
਍䰀䌀开倀䄀倀䔀刀㴀∀攀渀开唀匀⸀唀吀䘀ⴀ㠀∀ഀഀ
LC_NAME="en_US.UTF-8"
਍䰀䌀开䄀䐀䐀刀䔀匀匀㴀∀攀渀开唀匀⸀唀吀䘀ⴀ㠀∀ഀഀ
LC_TELEPHONE="en_US.UTF-8"
਍䰀䌀开䴀䔀䄀匀唀刀䔀䴀䔀一吀㴀∀攀渀开唀匀⸀唀吀䘀ⴀ㠀∀ഀഀ
LC_IDENTIFICATION="en_US.UTF-8"
਍䰀䌀开䄀䰀䰀㴀ഀഀ

This is just a test file I was working on; the actual file contains a number of rows and columns. Am I missing anything in the command above?

The requirements for the output are as follows:

- UTF-16 Little endian
- preceded by a byte order mark (bytes FF FE)
- Windows line endings
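
The three requirements above could be met along these lines. This is a minimal sketch, not a tested fix for the original file: testing.txt and out.txt are the names used in the command earlier in the post, the sample content is made up, and awk stands in for unix2dos only so the example runs where unix2dos is not installed. Naming the byte order explicitly (UTF-16LE) makes iconv skip its own BOM, so the two BOM bytes are written by hand.

```shell
# Made-up sample input standing in for the real UTF-8 file.
printf 'LANG=en_US.UTF-8\nLC_ALL=\n' > testing.txt

{
  # Byte order mark for UTF-16 little endian: FF FE (octal 377 376).
  printf '\377\376'
  # Normalise every line ending to CR LF while the data is still UTF-8,
  # then encode with an explicit byte order.
  awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' testing.txt |
    iconv -f UTF-8 -t UTF-16LE
} > out.txt

# Inspect the first bytes of the result.
od -An -tx1 -N 8 out.txt
```

The line endings are converted before the encoding step, and the result then has to reach the Windows machine as a binary file; any tool that rewrites line endings on the way would damage it.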

Any pointers would be great.

Thanks,
P

Last edited by phanidhar6039; 01-30-2014 at 09:35 AM.. Reason: More info
 

LOCALE(1)                                                        Linux User Manual                                                       LOCALE(1)

NAME
       locale - get locale-specific information

SYNOPSIS
       locale [option]
       locale [option] -a
       locale [option] -m
       locale [option] name...

DESCRIPTION
       The locale command displays information about the current locale, or all locales, on standard output.

       When invoked without arguments, locale displays the current locale settings for each locale category (see locale(5)), based on the settings of the environment variables that control the locale (see locale(7)). Values for variables set in the environment are printed without double quotes, implied values are printed with double quotes.

       If either the -a or the -m option (or one of their long-format equivalents) is specified, the behavior is as follows:

       -a, --all-locales
              Display a list of all available locales. The -v option causes the LC_IDENTIFICATION metadata about each locale to be included in the output.

       -m, --charmaps
              Display the available charmaps (character set description files). To display the current character set for the locale, use locale -c charmap.

       The locale command can also be provided with one or more arguments, which are the names of locale keywords (for example, date_fmt, ctype-class-names, yesexpr, or decimal_point) or locale categories (for example, LC_CTYPE or LC_TIME). For each argument, the following is displayed:

       *  For a locale keyword, the value of that keyword to be displayed.

       *  For a locale category, the values of all keywords in that category are displayed.

       When arguments are supplied, the following options are meaningful:

       -c, --category-name
              For a category name argument, write the name of the locale category on a separate line preceding the list of keyword values for that category.

              For a keyword name argument, write the name of the locale category for this keyword on a separate line preceding the keyword value.

              This option improves readability when multiple name arguments are specified. It can be combined with the -k option.

       -k, --keyword-name
              For each keyword whose value is being displayed, include also the name of that keyword, so that the output has the format:

                  keyword="value"

       The locale command also knows about the following options:

       -v, --verbose
              Display additional information for some command-line option and argument combinations.

       -?, --help
              Display a summary of command-line options and arguments and exit.

       --usage
              Display a short usage message and exit.

       -V, --version
              Display the program version and exit.

FILES
       /usr/lib/locale/locale-archive
              Usual default locale archive location.

       /usr/share/i18n/locales
              Usual default path for locale definition files.

CONFORMING TO
       POSIX.1-2001, POSIX.1-2008.

EXAMPLE
       $ locale
       LANG=en_US.UTF-8
       LC_CTYPE="en_US.UTF-8"
       LC_NUMERIC="en_US.UTF-8"
       LC_TIME="en_US.UTF-8"
       LC_COLLATE="en_US.UTF-8"
       LC_MONETARY="en_US.UTF-8"
       LC_MESSAGES="en_US.UTF-8"
       LC_PAPER="en_US.UTF-8"
       LC_NAME="en_US.UTF-8"
       LC_ADDRESS="en_US.UTF-8"
       LC_TELEPHONE="en_US.UTF-8"
       LC_MEASUREMENT="en_US.UTF-8"
       LC_IDENTIFICATION="en_US.UTF-8"
       LC_ALL=

       $ locale date_fmt
       %a %b %e %H:%M:%S %Z %Y

       $ locale -k date_fmt
       date_fmt="%a %b %e %H:%M:%S %Z %Y"

       $ locale -ck date_fmt
       LC_TIME
       date_fmt="%a %b %e %H:%M:%S %Z %Y"

       $ locale LC_TELEPHONE
       +%c (%a) %l
       (%a) %l
       11
       1
       UTF-8

       $ locale -k LC_TELEPHONE
       tel_int_fmt="+%c (%a) %l"
       tel_dom_fmt="(%a) %l"
       int_select="11"
       int_prefix="1"
       telephone-codeset="UTF-8"

       The following example compiles a custom locale from the ./wrk directory with the localedef(1) utility under the $HOME/.locale directory, then tests the result with the date(1) command, and then sets the environment variables LOCPATH and LANG in the shell profile file so that the custom locale will be used in the subsequent user sessions:

       $ mkdir -p $HOME/.locale
       $ I18NPATH=./wrk/ localedef -f UTF-8 -i fi_SE $HOME/.locale/fi_SE.UTF-8
       $ LOCPATH=$HOME/.locale LC_ALL=fi_SE.UTF-8 date
       $ echo "export LOCPATH=$HOME/.locale" >> $HOME/.bashrc
       $ echo "export LANG=fi_SE.UTF-8" >> $HOME/.bashrc

SEE ALSO
       localedef(1), charmap(5), locale(5), locale(7)

COLOPHON
       This page is part of release 4.15 of the Linux man-pages project. A description of the project, information about reporting bugs, and the latest version of this page, can be found at https://www.kernel.org/doc/man-pages/.

Linux                                  2017-09-15                                  LOCALE(1)