Post 302970300 by gimley on Tuesday 5th of April 2016 05:58:21 AM
Search and replace of strings with spaces between words

Dear all,
I have gone through the existing search-and-replace requests, but none of them meets my particular need. I have a huge file in which all Unicode characters are stored as names; a sample is given below. I want to replace strings in that file using a mapping from another file, master.dic. One peculiarity is that the strings in the master dictionary have spaces between the words and may also have a space at the end. The other peculiarity is that some of the mapping strings can map to a
Code:
null

What I am looking for is a Perl or awk script that can perform this search and replace on a file of around 100,000 individual string sets, where each string set can contain up to 6 or 7 names.
Some samples are given below.
Code:
Master.dic: Only sample rules are given
telugu letter=
a telugu vowel sign uu=u
a telugu vowel sign ii=i
telugu sign anusvara=n
a telugu vowel sign e=e
a telugu vowel sign o=o
a telugu vowel sign aa=a
a telugu sign virama=
vowel sign vocalic r=ri
a telugu vowel sign ii=i
telugu vowel sign=
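
The rule format above (search string, `=`, replacement, where the search string contains spaces and the replacement may be empty) can be read as sketched below. This is only an illustration in Python of logic that should port to Perl or awk; the function name `load_rules` and the inline sample list are assumptions, not part of master.dic. The key point is to strip only the line ending, so that spaces inside or at the end of a search string survive, and to split on the first `=` only.

```python
def load_rules(lines):
    """Parse 'search=replacement' lines into a dict (hypothetical helper).

    Spaces inside or at the end of the search string are preserved,
    and an empty replacement (a "null" mapping) is stored as "".
    """
    rules = {}
    for raw in lines:
        line = raw.rstrip("\r\n")       # remove only the line ending, never spaces
        if "=" not in line:
            continue                    # skip blank lines and header text
        key, _, value = line.partition("=")   # split on the first '=' only
        if key:
            rules[key] = value          # a repeated key overwrites the earlier one
    return rules


# A few of the sample rules from the post, as they would be read from the file.
sample = [
    "Master.dic: Only sample rules are given\n",
    "telugu letter=\n",
    "a telugu vowel sign uu=u\n",
    "telugu sign anusvara=n\n",
    "a telugu sign virama=\n",
]
rules = load_rules(sample)
for key, value in rules.items():
    print(repr(key), "->", repr(value))
```

With a real file, the same function could be fed an open file handle, for example `load_rules(open("master.dic", encoding="utf-8"))`, assuming the dictionary is saved as UTF-8.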

Code:
Input file: The file on which the operation is to be carried out
telugu letter aa
telugu letter aa  telugu letter ii telugu sign anusvara  telugu letter dda telugu letter la
telugu letter aa  telugu letter ii telugu sign anusvara telugu letter dda
telugu letter aa  telugu letter ii telugu sign anusvara telugu letter dda telugu sign virama telugu letter la
telugu letter aa  telugu letter ii telugu letter aa
telugu letter aa  telugu letter ii telugu letter aa  telugu letter sha telugu vowel sign o telugu letter ka telugu sign virama
telugu letter aa  telugu letter ii telugu letter ka
telugu letter aa  telugu letter ii telugu letter ka telugu vowel sign aa
telugu letter aa  telugu letter ii telugu letter dda telugu sign virama telugu letter ra telugu vowel sign uu telugu letter sa telugu sign virama
telugu letter aa  telugu letter ii telugu letter ta  telugu letter ga telugu vowel sign o telugu letter na telugu vowel sign ii
telugu letter aa  telugu letter ii telugu letter ta telugu vowel sign aa
telugu letter aa  telugu letter ii telugu letter ta telugu vowel sign aa telugu letter ra  telugu letter va telugu vowel sign e telugu letter na telugu vowel sign ii
telugu letter aa  telugu letter ii telugu letter ta telugu vowel sign ii
telugu letter aa  telugu letter ii telugu letter ta telugu vowel sign uu
telugu letter aa  telugu letter ii telugu letter ta telugu vowel sign e
telugu letter aa  telugu letter ii telugu letter ta telugu vowel sign e telugu sign anusvara
telugu letter aa  telugu letter ii telugu letter ta telugu sign virama
telugu letter aa  telugu letter ii telugu letter ta telugu sign virama telugu letter ya
telugu letter aa  telugu letter ii telugu letter ta telugu sign virama telugu letter la
telugu letter aa  telugu letter ii telugu letter ta telugu sign virama telugu letter la telugu vowel sign aa
telugu letter aa  telugu letter ii telugu letter dha
telugu letter aa  telugu letter ii telugu letter na
telugu letter aa  telugu letter ii telugu letter na telugu vowel sign aa
telugu letter aa  telugu letter ii telugu letter na telugu sign virama
telugu letter aa  telugu letter ii telugu letter ma
telugu letter aa  telugu letter ii telugu letter ra  telugu letter ma telugu vowel sign e telugu letter sha telugu sign virama
telugu letter aa  telugu letter ii telugu letter ra  telugu letter va telugu vowel sign ii telugu sign anusvara  telugu letter dda telugu letter ra telugu sign virama
telugu letter aa  telugu letter ii telugu letter la

Code:
Expected output
aa
aa ii n dda la
aa ii n dda
aa ii n ddla
aa ii aa
aa ii aa sho ka
aa ii ka
aa ii kaa
aa ii ddruu sa
aa ii ta go nii
aa ii taa
aa ii taa ra ve nii
aa ii tii
aa ii tuu
aa ii te
aa ii te n
aa ii ta
aa ii tya
aa ii tla
aa ii tlaa
aa ii dha
aa ii na
aa ii naa
aa ii na
aa ii ma
aa ii ra me sha
aa ii ra vii n dda ra
aa ii la

Many thanks for your help. I work in a Windows environment, so a Perl or awk script would be most helpful.
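
One possible way to apply such rules is sketched here in Python; the same idea should carry over to Perl (a hash plus a single alternation regex) or awk. It is an assumption-laden illustration, not a finished answer: the names `build_replacer` and `convert` are made up, only a subset of the sample rules is used, and because only sample rules are given it is not expected to reproduce every line of the expected output. The search strings are tried longest first, so a longer rule is preferred over a shorter one that starts at the same position, and leftover runs of spaces are collapsed at the end.

```python
import re


def build_replacer(rules):
    """Return a function that applies all rules to one line (hypothetical helper)."""
    # Longest search strings first, so overlapping rules prefer the longer match.
    keys = sorted((k for k in rules if k), key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys)) if keys else None

    def convert(line):
        if pattern is not None:
            # Single pass over the line: each match is looked up in the dict.
            line = pattern.sub(lambda m: rules[m.group(0)], line)
        # Collapse the spaces left behind by empty ("null") replacements.
        return " ".join(line.split())

    return convert


# A subset of the sample rules from the post.
rules = {
    "telugu letter": "",
    "telugu sign anusvara": "n",
    "a telugu vowel sign uu": "u",
    "a telugu sign virama": "",
}
convert = build_replacer(rules)

print(convert("telugu letter aa"))
print(convert("telugu letter aa  telugu letter ii telugu sign anusvara telugu letter dda"))
```

Compiling one regex up front keeps the per-line work small, which matters for an input of around 100,000 lines; reading the input line by line and writing each converted line out avoids holding the whole file in memory.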
 
