Converting Unicode file to UTF8 format
Post 302340765 by vfrg on Tuesday 4th of August 2009 09:38:25 AM
Hi,

I was able to convert the file to UTF-8 format successfully using the following command:

iconv -f ISO8859-9 -t UTF-8 <input_file> > <output_file>

I still have one issue. We will receive files that are ANSI-encoded and, in some cases, UTF-8-encoded.

If the file arrives ANSI-encoded, the above command converts it to UTF-8. This is not an issue.

But if the file arrives already in UTF-8 and we run the above command, the special characters in the file do not come out properly.

We need to run the iconv command only if the file is ANSI-encoded. If it is UTF-8, we should not run iconv. How do we identify the encoding of a file in UNIX? Please help me find this.

Thanks.
Venkat
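
One possible approach, offered as a sketch rather than a definitive answer: try to decode the file as UTF-8 and run the real conversion only if that fails. The function names is_utf8 and to_utf8 and the SRC_ENCODING variable are made up for this illustration; the only thing taken from the post is the ISO8859-9 source encoding. This assumes an iconv that reports an error on invalid UTF-8 input (GNU iconv does; some implementations may treat a UTF-8 to UTF-8 conversion as a plain copy). The check is a heuristic: a file that decodes cleanly as UTF-8 is almost certainly UTF-8 or plain ASCII, but a short ISO8859-9 file could in rare cases pass by coincidence.

```shell
#!/bin/sh
# Sketch: convert a file to UTF-8 only when it is not already valid UTF-8.
# SRC_ENCODING is the encoding assumed for non-UTF-8 input (ISO8859-9 as in
# the post above); the variable name itself is hypothetical.
SRC_ENCODING="${SRC_ENCODING:-ISO8859-9}"

# is_utf8 FILE: succeeds if iconv can decode FILE as UTF-8 without errors.
# Pure ASCII files also pass, which is harmless: ASCII is a subset of UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# to_utf8 INPUT OUTPUT: copy INPUT unchanged if it is already valid UTF-8,
# otherwise convert it from SRC_ENCODING to UTF-8.
to_utf8() {
    if is_utf8 "$1"; then
        cp "$1" "$2"
    else
        iconv -f "$SRC_ENCODING" -t UTF-8 "$1" > "$2"
    fi
}
```

With these functions loaded, `to_utf8 <input_file> <output_file>` would replace the unconditional iconv call. The copy branch is what avoids the garbled special characters: running the ISO8859-9 conversion on a file that is already UTF-8 re-encodes each byte of every multi-byte character as a separate character.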
 

sdeckanji(5)                     File Formats Manual                     sdeckanji(5)

NAME
       sdeckanji - A character encoding system (codeset) for Japanese

DESCRIPTION
       The Super DEC Kanji codeset extends the DEC Kanji codeset to support the CS2 (JIS Katakana) and CS3 (JIS X0212) character sets that are also included in the Japanese EUC codeset. Super DEC Kanji is therefore a superset of both DEC Kanji and Japanese EUC and can handle data encoded in either DEC Kanji or Japanese EUC. The codeset was implemented to ease the transition from DEC Kanji, which is proprietary encoding, to Japanese EUC, which is encoding specified by an industry standard.

       In addition to supporting standard Japanese character sets, Super DEC Kanji provides three areas for defining User-Defined Characters (UDC). The UDC areas are as follows:

       ------------------------------------------------------------
       Area Usage   Row Range   Number of     Code Range
                                Characters
       ------------------------------------------------------------
       JIS X0208    85-94       940           F5A1-FEFE
       JIS X0212    78-94       1598          SS3 [EEA1-FEFE]
       UDC          1-94        8836          A121-FE7E
       ------------------------------------------------------------

       The representation of ASCII/JIS Roman, JIS X0208 and User-Defined Characters (UDC) in Super DEC Kanji is the same as that in DEC Kanji. The representation of CS2 and CS3 in Super DEC Kanji is the same as that in Japanese EUC.

   Codeset Conversion
       The following codeset converter pairs are available for converting Japanese characters between sdeckanji and other encoding formats. Refer to iconv_intro(5) for an introduction to codeset conversion. For more information about the other codeset for which sdeckanji is the input or output, see the reference page specified in the list item.

       deckanji_sdeckanji, sdeckanji_deckanji
              Converting from and to the DEC Kanji codeset: deckanji(5).

       eucJP_sdeckanji, sdeckanji_eucJP
              Converting from and to Japanese Extended UNIX Code: eucJP(5).

       ISO-2022-JP_sdeckanji, sdeckanji_ISO-2022JP
              Converting from and to the ISO 2022-JP codeset: iso2022jp(5).

       ISO-2022-JPext_sdeckanji, sdeckanji_ISO-2022JPext
              Converting from and to the ISO 2022-JP Extended codeset: iso2022jp(5).

       JIS7_sdeckanji or jiskanji7_sdeckanji, sdeckanji_JIS7 or sdeckanji_jiskanji7
              Converting from and to 7-bit JIS Kanji code: jiskanji(5).

       SJIS_sdeckanji, sdeckanji_SJIS
              Converting from and to the Shift JIS codeset: SJIS(5). Shift JIS encoding is identical to encoding used in the Microsoft Japanese code page for PC systems. Therefore, you can use these converters to convert Japanese characters between Super DEC Kanji and PC code-page format. For information on how the operating system supports PC code pages, see code_page(5).

       UCS-2_sdeckanji, sdeckanji_UCS-2
              Converting from and to UCS-2 format: Unicode(5).

       UCS-4_sdeckanji, sdeckanji_UCS-4
              Converting from and to UCS-4 format: Unicode(5).

       UTF-8_sdeckanji, sdeckanji_UTF-8
              Converting from and to UTF-8 format: Unicode(5).

   Font Support for Super DEC Kanji
       For display devices, the operating system supports sdeckanji code by converting it to deckanji and using fonts available for deckanji. Refer to i18n_printing(5) and Japanese(5) for information about supporting print jobs that contain Japanese characters.

SEE ALSO
       Commands: locale(1)

       Others: ascii(5), code_page(5), deckanji(5), eucJP(5), i18n_intro(5), i18n_printing(5), iconv_intro(5), iso2022jp(5), Japanese(5), jiskanji(5), l10n_intro(5), shiftjis(5), Unicode(5)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.