Search and replace with mapping from a mapper file in a target file


 
# 1  
Old 11-24-2012

Hello,
I have a special problem. I have a file in an 8-bit encoding and would like to convert the whole database to 16-bit Unicode.
The mapping file has the following structure, one mapping per line:

Quote:
8bitcode>16bit code
The mapper is provided as a zip file

The target file to be converted contains data in English and in 8-bit encoded Urdu, a sample of which is given below:
Quote:
'œ„
±'œ„
±•œ„
±¤¥œ„
±§œ„
‚ªœ„
¬¸œ„
¡„
±‰²‚¡„
±²‚¡„
‚²¡„
±‰§£„
£§„
œ‚¨„
Ž‚ª„
±‰§‚ª„
±°ª„
±œ‰¬„
±œ¬„
±¬µ¬„
I have tried to use the mapper to convert the file but keep getting entangled in the 8-bit to 16-bit conversion.

The only other way I can see is to map to hex values and write a program in C, which would be long and cumbersome.
Any help would be greatly appreciated.
Many thanks in advance.
# 2  
Old 11-24-2012
As far as I can see, "sed" should be up to the task: go through your mapper file one line at a time. There are 4 bytes in every line:

1. a "src" character
2. a ">"
3,4. a "target" character.

It is easy to split each line into two shell variables:

Code:
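# read the mapper line by line; -r keeps backslashes literal
while IFS= read -r line ; do
     src="${line%???}"     # drop the last 3 characters: ">" and the target
     tgt="${line#??}"      # drop the first 2 characters: the source and ">"
done < mapper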
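
To see what the two expansions do, here is a small sketch with a made-up ASCII mapper line. The value "a>XY" is only a stand-in for a real 8-bit source byte followed by ">" and a 2-byte target, and the sketch assumes the shell treats each byte as one character (which usually means a single-byte locale such as LC_ALL=C when real 8-bit data is involved):

```shell
# Hypothetical mapper line: 1-character source, ">", 2-character target.
line='a>XY'

src="${line%???}"   # strip the last three characters  -> a
tgt="${line#??}"    # strip the first two characters   -> XY

printf '%s -> %s\n' "$src" "$tgt"
```

With this sample line the script prints "a -> XY".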

Now use these variables to replace all original characters with their translated ones. You have to mark your original characters somehow to make sure you don't translate characters twice during the various passes. In this case we initially escape every character with a backslash and remove the backslash as we translate. This way we can find out whether the mapper file completely covers the input by searching for backslashes in the result file: if there are still backslashes left, some characters didn't get translated.

Code:
sed 's/./\\&/g' /path/to/infile > workfile # mask all characters with "\"

while IFS= read -r line ; do
     src="${line%???}"
     tgt="${line#??}"
     # unmask and translate every occurrence of this source character
     sed 's/\\'"${src}"'/'"${tgt}"'/g' workfile > workfile.tmp
     mv workfile.tmp workfile
done < mapper
mv workfile /path/to/outfile

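This silently assumes that the backslash "\" is not used in your input file. Replace it with another (unused) character if this is not the case. It also assumes that no source or target character is special to sed (such as "/", "&", "." or "*"); any such character would have to be escaped first.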
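
As a rough end-to-end illustration, the following sketch runs the mask-and-translate idea on made-up ASCII data and then counts the lines that still contain a backslash. The file names, the scratch directory and the two mappings (a to XY, b to ZW) are assumptions for the demo only; real 8-bit data would additionally need a single-byte locale such as LC_ALL=C:

```shell
# Work in a scratch directory; all file names here are made up for the demo.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'a>XY\nb>ZW\n' > mapper      # hypothetical mapper: a -> XY, b -> ZW
printf 'abc\n' > infile             # "c" has no mapping on purpose

sed 's/./\\&/g' infile > workfile   # mask every character: \a\b\c

while IFS= read -r line ; do
     src="${line%???}"
     tgt="${line#??}"
     sed 's/\\'"${src}"'/'"${tgt}"'/g' workfile > workfile.tmp
     mv workfile.tmp workfile
done < mapper

result=$(cat workfile)
printf '%s\n' "$result"

# Any remaining backslash marks a character the mapper did not cover.
unmapped=$(grep -c '\\' workfile)
printf 'lines with unmapped characters: %s\n' "$unmapped"
```

Here the unmapped "c" keeps its backslash, so the count is 1; a count of 0 would mean the mapper covered every character in the input.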

I hope this helps.

bakunin
# 3  
Old 11-24-2012
Many thanks. I forgot to mention that I work on Windows and do not have sed. Is there a Perl or awk solution for the same task?
Many thanks once again
# 4  
Old 11-25-2012
If you use "awk", I suppose you already have some part of the "MKS-toolkit" installed (or whatever it is called today, "Unix Tools for Windows" or something like that; I don't use Windows myself). You can download and install "sed" from the same source you already got "awk" from.

I hope this helps.

bakunin
# 5  
Old 11-25-2012
Many thanks. I will try the solution out and get back to you.