Replacing string/special characters using a 'conversion' table

 
# 1  
Old 07-15-2016

Hi,

Does anyone know if there is a script or program out there that uses a conversion table to replace special characters in a file?

I am trying to clean up a file that contains several unprintable/control characters. Some of them I need to remove, but others I need to replace with one, two or three spaces instead.

For example, I want to replace CTRL-I with an underscore, tabs with 3 spaces, CTRL-M with UNIX's newline etc.

I thought it would be easier to have a conversion table for this instead of using tr. I also need to make the changes in a particular sequence, i.e. do the CTRL-I's first, then the tabs, etc.

Anyway, here's hoping someone has done this before or knows of a script/program that does it.

Any reply much appreciated. Thanks in advance.
# 2  
Old 07-15-2016
dos2unix, recode and iconv come to mind.
# 3  
Old 07-15-2016
Quote:
Originally Posted by newbie_01
For example, I want to replace CTRL-I with an underscore, tabs with 3 spaces, CTRL-M with UNIX's newline etc.
You can simply use sed to do that, possibly embedded in a script (tabs and spaces are written as <t> and <b> for clarity; use literal tabs and spaces when typing it in):

Code:
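#! /bin/ksh
# Usage: script <input-file>; the converted text goes to stdout.
typeset fIn="$1"

# Refuse to run without a readable input file.
if [ ! -r "$fIn" ] ; then
     print -u2 "File $fIn not readable or not existing."
     exit 1
fi
# The rules are applied to each line in the order written.
# Note: ^I (CTRL-I) is the tab character itself, so after the first
# rule no tabs are left for the second one; keep only the rule you need.
sed 's/^I/_/g
     s/<t>/<b><b><b>/g
     s/^M$//
     [....]' "$fIn"

exit 0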

Use the script like:

Code:
/path/to/script /some/input.file > /some/output.file
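
Since the question asked for a conversion table, the rule list inside the sed call could also live in a separate file. What follows is only a sketch under assumptions: the function name apply_table, the table format (octal character code, a "|", then the replacement, with an optional "$" after the code meaning "only at the end of a line") and the use of mktemp are all made up for illustration and are not part of any existing tool. The rules are turned into sed commands and applied top to bottom, so the sequence in the table is the sequence of the changes.

```shell
#!/bin/sh
# apply_table TABLE < input > output
# Hypothetical table format, one rule per line, applied top to bottom:
#   OCTAL|REPLACEMENT     replace every occurrence of the character
#   OCTAL$|REPLACEMENT    replace it only at the end of a line
# An empty REPLACEMENT deletes the character; lines starting with # are skipped.
# Intended for control characters; newline (012) cannot be used as a source.
apply_table() {
    rules=$(mktemp) || return 1
    while IFS='|' read -r code repl; do
        case $code in ''|'#'*) continue ;; esac
        anchor=
        case $code in *'$') anchor='$'; code=${code%?} ;; esac
        # Skip anything that is not a plain octal number.
        case $code in ''|*[!0-7]*) echo "bad code: $code" >&2; continue ;; esac
        ch=$(printf "\\$code")      # octal code -> the character itself
        # Escape \, / and & so that sed takes the replacement literally.
        repl=$(printf '%s' "$repl" | sed 's|[\\/&]|\\&|g')
        printf 's/%s%s/%s/g\n' "$ch" "$anchor" "$repl"
    done < "$1" > "$rules"
    sed -f "$rules"                 # filters the function's stdin
    rc=$?
    rm -f "$rules"
    return $rc
}

# Example table: tab -> three spaces, CR at end of line -> removed.
table=$(mktemp)
printf '%s\n' '# octal|replacement' '011|   ' '015$|' > "$table"
printf 'a\tb\r\n' | apply_table "$table"    # prints: a   b
rm -f "$table"
```

The table is written with printf here only so that the three trailing spaces of the first rule stay visible; in practice it would be an ordinary text file.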

As you type the sed commands in vi, note that you can enter any non-printing character by pressing <CTRL>-<V> followed by the character, e.g. <CTRL>-<V> <CTRL>-<I> for a literal CTRL-I.
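
If typing literal control characters is inconvenient (they are also easy to lose when the script is copied and pasted), one possible alternative is to let printf produce them and keep the script itself printable. This is a sketch, not the script from above: the names TAB, CR and clean_ctrl are chosen here for illustration only.

```shell
#!/bin/sh
# Build the control characters with printf so the script contains only
# printable text. Command substitution strips trailing newlines only,
# so the tab and the carriage return survive.
TAB=$(printf '\t')
CR=$(printf '\r')

# clean_ctrl: filter stdin to stdout.
#   - every tab becomes three spaces
#   - a carriage return at the end of a line is removed
clean_ctrl() {
    sed -e "s/${TAB}/   /g" -e "s/${CR}\$//"
}

# Example: a tab-separated line with a DOS line ending.
printf 'a\tb\r\n' | clean_ctrl    # prints: a   b
```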

A word of caution about "^M" characters: look at how I handle them in the script. You probably don't want to change every "^M", only those at line ends. And the ones at line ends you don't want to change into anything, just remove. They are probably left over from a DOS<->UNIX file transfer: DOS marks a line break with two characters (CR and LF) and UNIX with only one (LF), so you simply remove the extra CR.
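
To see the difference between removing "^M" only at line ends and removing it everywhere, here is a small comparison. It is a sketch with made-up function names (strip_eol_cr, strip_all_cr); od -c is used only to make the remaining control characters visible.

```shell
#!/bin/sh
CR=$(printf '\r')

# Remove a carriage return only when it is the last character of a line.
strip_eol_cr() {
    sed "s/${CR}\$//"
}

# Remove every carriage return, wherever it occurs.
strip_all_cr() {
    tr -d '\r'
}

# The sample has one CR in the middle of the line and one at the end.
printf 'one\rtwo\r\n' | strip_eol_cr | od -c   # the embedded \r is kept
printf 'one\rtwo\r\n' | strip_all_cr | od -c   # no \r is left at all
```

Running od -c (or cat -v, where available) on the input file first is a quick way to check where the "^M" characters actually are before deciding which variant to use.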

I hope this helps.

bakunin