Replacing string/special characters using a 'conversion' table
Post 302977271 by bakunin on Friday 15th of July 2016 10:42:04 AM
Quote:
Originally Posted by newbie_01
For example, I want to replace CTRL-I with an underscore, tabs with 3 spaces, CTRL-M with UNIX's newline etc.
You can simply use sed to do that, possibly embedded in a script (tabs and spaces are written as <t> and <b> for clarity; use literal tabs/spaces when writing):

Code:
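#! /bin/ksh
# The input file is the first argument.
typeset fIn="$1"

if [ ! -r "$fIn" ] ; then
     print -u2 "File $fIn not readable or not existing."
     exit 1
fi
# ^I and ^M stand for the literal control characters (CTRL-I, CTRL-M).
# CTRL-I is the tab character, so the first rule already replaces every tab:
# keep only one of the first two rules.
sed 's/^I/_/g
     s/<t>/<b><b><b>/g
     s/^M$//
     [....]' "$fIn"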

exit 0

Use the script like this:

Code:
/path/to/script /some/input.file > /some/output.file
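
Since the thread asks about a 'conversion' table, one possible variation (not part of the script above) is to keep the rules in a separate file and pass it to sed with -f. This is only a sketch: the function name convert_with_table and the temporary rules file are hypothetical, and printf is used to write the tab and carriage return into the rules file so nothing has to be typed literally.

```shell
#!/bin/sh
# Sketch: keep the substitution rules in their own file (a simple
# "conversion table") and hand that file to sed with -f.
# The names rules and convert_with_table are hypothetical.
rules=$(mktemp)
trap 'rm -f "$rules"' EXIT

# printf turns \t and \r in its format string into the real control
# characters, so the rules file contains a literal tab and a literal CR.
printf 's/\t/   /g\ns/\r$//\n' > "$rules"

convert_with_table() {
    # Reads the named files, or standard input when none are given.
    sed -f "$rules" "$@"
}

printf 'x\ty\r\n' | convert_with_table
```

Adding a further replacement then means adding one line to the rules file rather than editing the script.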

When you enter the script in vi, note that you can insert any non-printing character by pressing <CTRL>-<V> followed by your character, e.g. <CTRL>-<V> <CTRL>-<I> for a literal CTRL-I.
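
If typing literal control characters is inconvenient, a possible alternative is to let printf produce them and store them in shell variables. The following is a minimal sketch under that assumption; the variable names TAB and CR and the function name convert_ctrl are made up for illustration.

```shell
#!/bin/sh
# Sketch: build the control characters with printf instead of typing
# them literally. TAB, CR and convert_ctrl are hypothetical names.
TAB=$(printf '\t')      # CTRL-I
CR=$(printf '\r')       # CTRL-M

convert_ctrl() {
    # Replace each tab with three spaces and remove a carriage return
    # that sits at the end of a line. The sed script is double-quoted so
    # the shell expands the variables; \$ keeps the end-of-line anchor.
    sed "s/${TAB}/   /g
         s/${CR}\$//"
}

printf 'a\tb\r\n' | convert_ctrl
```

The script then contains only printable characters, which makes it easier to copy, paste and review.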

A word of caution about "^M" characters: look at how I handled them. You probably don't want to change every "^M", only those at line ends. And the ones at line ends you don't want to change into anything, you just want to remove them. They are probably left over from a DOS<->UNIX file transfer: DOS uses two characters to mark a line change and UNIX only one, so you simply remove the extra one.
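
To illustrate the difference, here is a small sketch comparing removal of carriage returns at line ends only with removal of every carriage return. The function names strip_trailing_cr and strip_all_cr are hypothetical, and the sample input is invented.

```shell
#!/bin/sh
# Sketch: compare "remove ^M at line ends only" with "remove every ^M".
# strip_trailing_cr and strip_all_cr are hypothetical names.
CR=$(printf '\r')

strip_trailing_cr() {
    # Only a carriage return directly before the newline is removed.
    sed "s/${CR}\$//"
}

strip_all_cr() {
    # Every carriage return is removed, including those inside a line.
    tr -d '\r'
}

# The sample line has one CR in the middle and one at the end.
printf 'one\rtwo\r\n' | strip_trailing_cr | od -c
printf 'one\rtwo\r\n' | strip_all_cr | od -c
```

The od -c output makes the remaining control characters visible, so you can check which variant matches what you intended.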

I hope this helps.

bakunin
 
