Search in a file and replace the mapped entity in another file


 
# 1  
Old 08-04-2013

Hello,
I looked through the search-and-replace routines in the forum but found none that meets my requirement.
I have a file that provides a set of mappings. Its structure is as follows:
Code:
a=b
c=d
e=f

A simple English-to-French example should make this clear:
Code:
John=Jean
eats=mange
a=un
cake=gateau

This is the rule file, which is the basis of all operations.
I also have a target file of running English text, which I need to convert to French.
Given a sentence like:
Code:
John eats a cake

the script should read the mappings from the rule file and convert the text to
Code:
Jean mange un gateau

If a word has no match in the rule file, the "English" word should be retained.
The rule file contains around 500,000 such mappings and is in 16-bit Unicode. The target file is running text.
Since I work under Windows and am limited to DOS, a Perl or AWK script would be of great help.

I am not targeting Machine Translation with the tool but chose this as an example.

Many thanks in advance.
# 2  
Old 08-04-2013
Similar variants have come up often in this forum.
The first file, "rules", is read into a hashed (associative) array.
Then the remaining input (here "-", i.e. stdin) is processed.
In a loop, every word is looked up in the array and, if found, replaced by its mapped word.
It may look unusual to change the field separator FS between the files, but awk applies each FS assignment on the command line just before it reads the file that follows it:
Code:
<text awk '
NR==FNR {tr[$1]=$2; next}
{for(i=1; i<=NF; i++) if ($i in tr) $i=tr[$i]; print}
' FS="=" rules FS=" " -

<text awk ... reads from a file "text". You can of course do echo "John eats a cake" | awk ...
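For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same lookup. The sample data comes from the first post plus one extra line with an unmapped word ("Mary"); the temporary directory and the variable names tmp and out are just assumptions for the demo.

```shell
#!/bin/sh
# Demo of the two-pass awk lookup. File names "rules" and "text" follow
# the post; the temp directory is only an assumption for this sketch.
tmp=$(mktemp -d)

# Rule file: one key=value mapping per line.
cat > "$tmp/rules" <<'EOF'
John=Jean
eats=mange
a=un
cake=gateau
EOF

# Target text; "Mary" has no mapping and should be kept as-is.
printf '%s\n' 'John eats a cake' 'Mary eats a cake' > "$tmp/text"

# Pass 1 (FS="="): load mappings into tr[].
# Pass 2 (FS=" "): replace every word that has a mapping, then print.
out=$(awk '
NR==FNR {tr[$1]=$2; next}
{for(i=1; i<=NF; i++) if ($i in tr) $i=tr[$i]; print}
' FS="=" "$tmp/rules" FS=" " "$tmp/text")

printf '%s\n' "$out"
rm -rf "$tmp"
```

This should print "Jean mange un gateau" followed by "Mary mange un gateau". Note that the match is per whitespace-separated word, so a word with attached punctuation such as "cake." would not be found in the array.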
# 3  
Old 08-04-2013
Code:
$ 
$ cat mapper
a=un
about=pour
are=allez
boy=garçon
cake=gâteau
eats=mange
friend=ami
green=verte
how=comment
is=est
my=mon
the=le
this=cette
you=vous
$ 
$ 
$ cat input
how are you
my friend eats a cake
this apple is green
about the boy
$ 
$ 
$ perl -lne 'if ($ARGV eq "mapper"){($k,$v)=m/^(.*?)=(.*)$/; $x{$k}=$v} else {@y=split/[ ]+/; print join(" ", map{ defined $x{$_} ? $x{$_} : $_} @y)}' mapper input
comment allez vous
mon ami mange un gâteau
cette apple est verte
pour le garçon
$ 
$ 
$ # "apple" didn't get translated.
$ # Add the map for "apple" and try again
$ 
$ echo "apple=pomme" >> mapper
$ 
$ perl -lne 'if ($ARGV eq "mapper"){($k,$v)=m/^(.*?)=(.*)$/; $x{$k}=$v} else {@y=split/[ ]+/; print join(" ", map{ defined $x{$_} ? $x{$_} : $_} @y)}' mapper input
comment allez vous
mon ami mange un gâteau
cette pomme est verte
pour le garçon
$ 
$ 
$
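The first post mentions that the files are in 16-bit Unicode, and byte-oriented tools will generally not find "=" or spaces correctly in such files. One possible workaround, sketched here with awk, is to convert to UTF-8 first. This assumes that iconv is installed (on Windows, e.g. through a Unix-tools port) and that the files are UTF-16LE without a byte order mark; the file names rules16 and text16 are hypothetical.

```shell
#!/bin/sh
# Sketch: convert UTF-16 input to UTF-8, then run the same awk lookup.
# Assumptions: iconv is available, files are UTF-16LE without a BOM,
# and rules16/text16 are made-up names for this demo.
tmp=$(mktemp -d)

# Build small UTF-16LE sample files so the demo is self-contained.
printf 'cake=gâteau\nboy=garçon\n' | iconv -f UTF-8 -t UTF-16LE > "$tmp/rules16"
printf 'the boy eats a cake\n'     | iconv -f UTF-8 -t UTF-16LE > "$tmp/text16"

# Convert both files to UTF-8 so "=" and " " are single bytes again.
iconv -f UTF-16LE -t UTF-8 "$tmp/rules16" > "$tmp/rules"
iconv -f UTF-16LE -t UTF-8 "$tmp/text16"  > "$tmp/text"

# Same two-pass lookup as in the awk reply earlier in the thread.
result=$(awk '
NR==FNR {tr[$1]=$2; next}
{for(i=1; i<=NF; i++) if ($i in tr) $i=tr[$i]; print}
' FS="=" "$tmp/rules" FS=" " "$tmp/text")

printf '%s\n' "$result"
rm -rf "$tmp"
```

If the result has to be 16-bit again, the output can be piped through iconv -f UTF-8 -t UTF-16LE. Files that start with a byte order mark would probably need -f UTF-16 instead, so that the mark is consumed rather than ending up in the first key.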
