Convert UTF-8 file to ASCII/ISO8859-1 OR replace characters
Post 302980623 by Neo on Wednesday 31st of August 2016 12:43:46 AM
Everyone at the UNIX and Linux Forums gives their best effort to reply to all questions in a timely manner. For this reason, posting questions with subjects like "Urgent!" or "Emergency" or "Help Me!" and demanding a fast reply are not permitted in the regular forums.

For members who want higher visibility for their questions, we suggest posting in the Emergency UNIX and Linux Support Forum. This forum is given a higher priority than our regular forums.

Posting a new question in the Emergency UNIX and Linux Support Forum requires forum Bits. We monitor this forum to help people with emergencies, but we do not guarantee response time or best answers. However, we will treat your post with a higher priority and give our best efforts to help you.

If you have posted a question in the regular forum with a subject such as "Urgent", "Emergency", or something similar, we will more than likely close your thread and post this reply, redirecting you to the proper forum.

Of course, you can always write a descriptive subject, remove words like "Urgent" from your subject and post, and post in the regular forums at any time.


Thank you.

The UNIX and Linux Forums
 

10 More Discussions You Might Find Interesting

1. Programming

Howto convert Ascii -> UTF-8 & back C++

While working with russian text under FreeBSD&MySQL I need to convert a string from MySQL to the Unicode format. I've just started my way in C++ under FreeBSD , so please explain me how can I get ascii code of Char variable and also how can i get a character into variable with the specified ascii... (3 Replies)
Discussion started by: macron
3 Replies

2. Shell Programming and Scripting

Replace characters in a string using their ascii value

Hi All, In the HP Unix that i'm using when i initialise a string as Stalled="'30¬G'" Stalled=$Stalled" '30¬C'", it is taking the character ¬ as a comma. I need to grep for 30¬G 30¬C in a file and take its count. But since this character ¬ is not being understood, the count returns a zero. The... (2 Replies)
Discussion started by: roops
2 Replies

3. Shell Programming and Scripting

replace UTF-8 characters with tr

Hi, I try to get tr to replace multibytes characters by ascii equivalent. For example "Je vais à l'école" ---> 'Je vais a l'ecole" But my version of tr (5.97) doesn't seem to support multibyte sets. $ locale charmap; echo "Je vais à l'école" | tr éà ea UTF-8 Je vais aa l'aacole I try to... (2 Replies)
Discussion started by: ripat
2 Replies

4. Shell Programming and Scripting

read in a file character by character - replace any unknown ASCII characters with spa

Can someone help me to write a script / command to read in a file, character by character, replace any unknown ASCII characters with space. then write out the file to a new filename/ Thanks! (1 Reply)
Discussion started by: raghav525
1 Reply

5. Shell Programming and Scripting

convert ascii values into ascii characters

Hi gurus, I have a file in unix with ascii values. I need to convert all the ascii values in the file to ascii characters. File contains nearly 20000 records with ascii values. (10 Replies)
Discussion started by: sandeeppvk
10 Replies

6. Shell Programming and Scripting

Remove characters other than ISO8859-1

Hi please help in writing a script for replacing all the non-iso8859-1 characters to question marks. I need a pattern of this kind "sed s/<non-iso char range>/?/g < ipfile > opfile" Please help me in this. (2 Replies)
Discussion started by: rprajendran
2 Replies

7. Red Hat

Can't convert 7bit ASCII to UTF-8

Hello, I am trying to convert a 7bit ASCII file to UTF-8. I have used iconv before though it can't recognize it for some reason and says unknown file encoding. When I used ascii2uni package with different package, ./ascii2uni -a K -a I -a J -a X test_file > new_test_file It still... (2 Replies)
Discussion started by: rockf1bull
2 Replies

8. Linux

Help to Convert file from UNIX UTF-8 to Windows UTF-16

Hi, I have tried to convert a UTF-8 file to windows UTF-16 format file as below from unix machine unix2dos < testing.txt | iconv -f UTF-8 -t UTF-16 > out.txt and i am getting some chinese characters as below which l opened the converted file on windows machine. LANG=en_US.UTF-8... (3 Replies)
Discussion started by: phanidhar6039
3 Replies

9. Shell Programming and Scripting

Search and Replace Extended Ascii Characters

We are getting extended Ascii characters in the input file and my requirement is to search and replace them with a space. I am using the following command LANG=C sed -e 's// /g' It is doing a good job, but in some cases it is replacing the extended characters with two spaces. So my input... (12 Replies)
Discussion started by: ysvsr1
12 Replies

10. Shell Programming and Scripting

Convert Hex to Ascii in a Ascii file

Hi All, I have an ascii file in which few columns are having hex values which i need to convert into ascii. Kindly suggest me what command can be used in unix shell scripting? Thanks in Advance (2 Replies)
Discussion started by: HemaV
2 Replies
TCS(1)							      General Commands Manual							    TCS(1)

NAME
       tcs - translate character sets

SYNOPSIS
       tcs [ -slcv ] [ -f ics ] [ -t ocs ] [ file ... ]

DESCRIPTION
       Tcs interprets the named file(s) (standard input default) as a stream
       of characters from the ics character set or format, converts them to
       runes, and then converts them into a stream of characters from the ocs
       character set or format on the standard output. The default value for
       ics and ocs is utf, the UTF encoding described in utf(6). The -l option
       lists the character sets known to tcs. Processing continues in the face
       of conversion errors (the -s option prevents reporting of these
       errors). The -c option forces the output to contain only correctly
       converted characters; otherwise, 0x80 characters will be substituted
       for UTF encoding errors and 0xFFFD characters will be substituted for
       unknown characters. The -v option generates various diagnostic and
       summary information on standard error, or makes the -l output more
       verbose.

       Tcs recognizes an ever changing list of character sets. In particular,
       it supports a variety of Russian and Japanese encodings. Some of the
       supported encodings are

       utf        The Plan 9 UTF encoding, known by ISO as UTF-8
       utf1       The deprecated original UTF encoding from ISO 10646
       ascii      7-bit ASCII
       8859-1     Latin-1 (Central European)
       8859-2     Latin-2 (Czech .. Slovak)
       8859-3     Latin-3 (Dutch .. Turkish)
       8859-4     Latin-4 (Scandinavian)
       8859-5     Part 5 (Cyrillic)
       8859-6     Part 6 (Arabic)
       8859-7     Part 7 (Greek)
       8859-8     Part 8 (Hebrew)
       8859-9     Latin-5 (Finnish .. Portuguese)
       koi8       KOI-8 (GOST 19769-74)
       jis-kanji  ISO 2022-JP
       ujis       EUC-JX: JIS 0208
       ms-kanji   Microsoft, or Shift-JIS
       jis        (from only) guesses between ISO 2022-JP, EUC or Shift-Jis
       gb         Chinese national standard (GB2312-80)
       big5       Big 5 (HKU version)
       unicode    Unicode Standard 1.0
       tis        Thai character set plus ASCII (TIS 620-1986)
       msdos      IBM PC: CP 437
       atari      Atari-ST character set

EXAMPLES
       tcs -f 8859-1
              Convert 8859-1 (Latin-1) characters into UTF format.

       tcs -s -f jis
              Convert characters encoded in one of several shift JIS encodings
              into UTF format. Unknown Kanji will be converted into 0xFFFD
              characters.

       tcs -lv
              Print an up to date list of the supported character sets.

SOURCE
       /sys/src/cmd/tcs

SEE ALSO
       ascii(1), rune(2), utf(6).

TCS(1)
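
The thread subject asks about going the other way, from UTF-8 to ISO 8859-1. A minimal sketch of how that might look, using only the -s, -f and -t options and the utf and 8859-1 set names listed in the manual page above. The helper name to_latin1 is hypothetical, and the iconv fallback is a swapped-in alternative for systems without tcs, not something the manual page describes. Note that the two tools differ on unconvertible input: per the page, tcs keeps going and substitutes characters, whereas iconv normally stops with an error.

```shell
#!/bin/sh
# Sketch only: to_latin1 is a hypothetical helper, not part of tcs or iconv.
# It reads UTF-8 from the named files (or standard input) and writes
# ISO 8859-1 to standard output.
to_latin1() {
    if command -v tcs >/dev/null 2>&1; then
        # -f / -t choose the input and output character sets;
        # -s suppresses the reporting of conversion errors.
        tcs -s -f utf -t 8859-1 "$@"
    else
        # Assumption: iconv is installed as a fallback where tcs is absent.
        iconv -f UTF-8 -t ISO-8859-1 "$@"
    fi
}

# Usage: the two-byte UTF-8 sequence for "é" (octal 303 251) should come
# out as the single Latin-1 byte e9 in the hex dump.
printf 'caf\303\251\n' | to_latin1 | od -An -tx1
```

Wrapping the call in a function keeps the choice of tool in one place, so a script can call to_latin1 on files or pipes without caring which converter is present.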