Creating unique mapping from multiple mapping


 
# 1  
Old 06-27-2012

Hello,
I do not know if this is the right title to use. I have a large dictionary database which has the following structure:
Quote:
aSPACEbSPACEcSPACEdSPACEe=pSPACEqSPACErSPACEsSPACEt
where a b c d e are in English and p q r s t are in a target language, the two sides separated by the delimiter =.
What I am looking for is a Perl script which will take each line of such a data structure and reduce it so that the first word in English maps to the first word in the target language, the second to the second, and so on until all the words are paired, to arrive at the following database:
Quote:
a=p
b=q
c=r
d=s
e=t
The database is in Unicode.
At present I am using macros to do the job, but the macros are terribly slow. I am still a novice in Perl, and my knowledge of awk does not seem sufficient to handle this type of database.
Could someone please help out with a script in Perl or awk? The database is huge and runs to over 500,000 lines.
Many thanks in advance.
# 2  
Old 06-27-2012
It would be better if you posted the sample data in its real format, i.e.:
Code:
a b c d e=p q r s t

And should the desired output be the following?
Code:
a=p
b=q
c=r
d=s
e=t

# 3  
Old 06-27-2012
I am posting a few lines of the database.
Language 1, to the left of the delimiter, is English (need I say that?); Language 2 is Hindi.
Quote:
Lata Sanjay Patil=लता संजय पाटील
Lata Sanjay Patole=लता संजय पाटोळे
Lata Sanjay Pawar=लता संजय पवार
The desired output which my macro spewed out is:
Quote:
Lata=लता
Sanjay=संजय
Patil=पाटील
Lata=लता
Sanjay=संजय
Patole=पाटोळे
Lata=लता
Sanjay=संजय
Pawar=पवार
I hope the sample is readable.
Here each sample line has three names on each side, but in some cases there can be as many as five names to a side.
Many thanks for your interest.
# 4  
Old 06-27-2012
Try:
Code:
awk -F"=" '{n=split($1,a," ");split($2,b," ");for (i=1;i<=n;i++) print a[i]"="b[i]}' file

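One caveat with the one-liner above: when the two sides of a line do not contain the same number of words, it still prints a pair for every left-hand word, leaving the right-hand side empty. Below is a minimal sketch of a more defensive variant, assuming such lines should be skipped and reported on stderr. The function name `split_pairs` and the warning format are illustrative and not part of the original answer.

```shell
# split_pairs: hypothetical helper name. Reads "w1 w2 ...=t1 t2 ..." lines
# from the given files (or stdin) and prints one "wi=ti" pair per line.
split_pairs() {
    awk -F'=' '
    {
        n = split($1, src, " ")   # words to the left of "="
        m = split($2, tgt, " ")   # words to the right of "="
        if (n != m) {
            # Assumption: mismatched lines are skipped and reported.
            printf "line %d: %d vs %d words, skipped\n", NR, n, m > "/dev/stderr"
            next
        }
        for (i = 1; i <= n; i++)
            print src[i] "=" tgt[i]
    }' "$@"
}
```

It can be called as `split_pairs file > pairs.txt 2> mismatches.txt` to keep the warnings separate from the pairs.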
# 5  
Old 06-27-2012
Many thanks. The script worked beautifully. I coupled it with an awk script that counts frequencies, so I can now see which name glosses are more frequent and which are less.
I tested it on around 200 thousand lines and it did the job in around 22 seconds. (I had set a time-in/time-out function to measure it.)
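The frequency script mentioned above was not posted in the thread. Here is a rough sketch of what such a count could look like, assuming the pairs arrive one per line as `word=gloss`. The function name `count_pairs` and the tab-separated output layout are assumptions made for illustration.

```shell
# count_pairs: hypothetical helper name. Counts how often each "word=gloss"
# line occurs and lists the pairs, most frequent first.
count_pairs() {
    awk '
    { count[$0]++ }                     # tally each distinct pair line
    END {
        for (pair in count)
            print count[pair] "\t" pair # "count<TAB>pair"
    }' "$@" | sort -k1,1nr -k2          # by count descending, then by pair
}
```

The output of the splitting command can be piped straight into it, for example `awk -F"=" '...' file | count_pairs`.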
# 6  
Old 06-28-2012
In Perl (with the counter reset for every line, because -n keeps global variables across input lines):
Code:
$ perl -F'=|\s' -lane '$a=($#F+1)/2;for($i=0;$i<$a;$i++){print "$F[$i]=$F[$i+$a]"}' infile
