

How to replace matching words defined in one file on another file?


 
# 1  
Old 06-15-2019

I have file1 and file2 as shown below,
file1:
Code:
((org14/1-131541:0.11535,((org29/1-131541:0.00055,org7/1-131541:0.00055)1.000:0.10112,((org17/1-131541:0.07344,(org23/1-131541:0.07426,((org10/1-131541:0.00201,org22/1-131541:0.00243)1.000:0.02451,

file2:
Code:
org14=india
org29=america
org7=srilanka
org17=africa
org23=europe
org10=brazil
org22=china

I need to replace the words in file1, based on the matching words defined in file2.

The expected outcome is shown below,
Code:
((india/1-131541:0.11535,((america/1-131541:0.00055,srilanka/1-131541:0.00055)1.000:0.10112,((africa/1-131541:0.07344,(europe/1-131541:0.07426,((brazil/1-131541:0.00201,china/1-131541:0.00243)1.000:0.02451,


I could use the replace option in gedit, but here I need to replace a whole list of words. Please help me do this.

Thank you in advance.

Last edited by Scrutinizer; 06-15-2019 at 04:14 AM.. Reason: Quote tags -> code tags; removed some superfluous quote tags
# 2  
Old 06-15-2019
This problem has been solved umpteen times in these fora. Did you bother to search, or look into the proposals given below under "More UNIX and Linux Forum Topics You Might Find Helpful"?


Howsoever, try

Code:
awk 'FNR==NR{REP[$1]=$2; next} {for (r in REP) gsub(r, REP[r])}1' FS="=" file2 file1

This User Gave Thanks to RudiC For This Post:
# 3  
Old 06-15-2019
Hi, try:
Code:
awk '
  NR==FNR {
    A[$1]=$2
    next
  } 
  {
    for(i=1; i<=NF; i++)
      if($i in A)
        sub($i,A[$i])
    print
  }
'  FS="=" file2 FS='[(/,]' file1

This User Gave Thanks to Scrutinizer For This Post:
# 4  
Old 06-15-2019
Note that RudiC's and Scrutinizer's suggestions both depend on the fact that the orgX and orgXX strings in file2 are distinct. Had file2 also contained the line:
Code:
org2=japan

both of those suggestions might randomly have resulted in japan9 appearing in the output instead of america, japan3 appearing instead of europe, and japan2 appearing instead of china.
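The prefix effect can be reproduced in isolation. This is only an illustrative sketch: the input line is shortened, the variable name clobbered is made up, and only org2 is replaced, so the outcome does not depend on the order in which awk walks an array.

```shell
# Replace the pattern org2 everywhere, as a loop over the mapping keys would do
# when it reaches the org2 entry. org29 also contains org2, so it is hit as well.
clobbered=$(printf '%s\n' '(org29/1-131541:0.00055,org2/1-131541:0.00201)' |
  awk '{ gsub("org2", "japan") } 1')

printf '%s\n' "$clobbered"
```

The org29 label comes out as japan9, which is the kind of damage described above.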

If this might be a problem for you, you would either need to be sure that all of your orgXX strings are the same length or sort your orgXX values by decreasing numerical value of XX and process the substitutions from beginning to end in sequence (like Scrutinizer did) instead of using for (r in REP) (like RudiC did).

And, if using Scrutinizer's code and a single orgXX string might occur more than once in a line of input (which does not happen in your sample), you would need to use gsub() instead of sub() to get the desired results.
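Another way to sidestep the prefix problem, not proposed in the posts above, is to scan each line for complete tokens and look each one up as an exact array key instead of using the keys as regular expressions. The following is a minimal sketch, assuming every key in file2 has the form org followed by digits; the file names map.txt and tree.txt, the temporary directory and the variable result are made up for the demonstration.

```shell
tmp=$(mktemp -d)

# Hypothetical sample data, including the troublesome org2 entry.
cat > "$tmp/map.txt" <<'EOF'
org2=japan
org29=america
org23=europe
org22=china
EOF

cat > "$tmp/tree.txt" <<'EOF'
((org29/1-131541:0.00055,org2/1-131541:0.00055)1.000:0.10112,(org23/1-131541:0.07426,org22/1-131541:0.00243)
EOF

result=$(awk '
  NR==FNR {                      # first file: build the lookup table
    A[$1]=$2
    next
  }
  {
    out = ""
    rest = $0
    # Assumption: names look like "org" followed by digits.
    # match() returns the leftmost, longest token, so org29 is never split into org2 + 9.
    while (match(rest, /org[0-9]+/)) {
      tok = substr(rest, RSTART, RLENGTH)
      rep = (tok in A) ? A[tok] : tok      # unknown names are left unchanged
      out = out substr(rest, 1, RSTART-1) rep
      rest = substr(rest, RSTART+RLENGTH)
    }
    print out rest
  }
' FS="=" "$tmp/map.txt" "$tmp/tree.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because each token is compared as a whole string, the order of the entries in the mapping file does not matter, and a name that occurs several times on a line is replaced every time.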

Last edited by Don Cragun; 06-16-2019 at 06:17 PM.. Reason: Fix broken ICODE tag.
These 3 Users Gave Thanks to Don Cragun For This Post:
# 5  
Old 06-16-2019
In post #3, isn't
Code:
      if($i in A)
        $i=A[$i]

more correct?
--
I see now, awk will reformat the line, substituting the FS characters with spaces.
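The reformatting is easy to see on a shortened line. This is just a sketch; the variable names line and rebuilt are made up, and the field number 3 is specific to this sample.

```shell
line='((org14/1-131541:0.11535,((org29/1-131541:0.00055'

# Assigning to a field makes awk rebuild $0 from its fields, joined by OFS
# (a single space by default), so the "(", "/" and "," separators disappear.
rebuilt=$(printf '%s\n' "$line" | awk -F'[(/,]' '{ $3 = "india"; print }')

printf '%s\n' "$rebuilt"
```

The name is replaced, but the tree punctuation is gone, which is why a plain field assignment is not enough with that FS.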

Last edited by MadeInGermany; 06-16-2019 at 05:37 AM..
This User Gave Thanks to MadeInGermany For This Post:
# 6  
Old 06-16-2019
Yes, that is correct. #3 uses exact strings, so it correctly identifies the right field. The sub() in itself isn't the problem either, since the iteration is over the fields and not over the key-value pairs (therefore it can substitute multiple occurrences on one line). The problem is in the replacement part: it used sub() on the whole record instead of a direct assignment to the field, in order to avoid losing the field separators.

This adaptation should fix that:
Code:
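awk '
  NR==FNR {                      # first file (file2): build the lookup table
    A[$1]=$2
    next
  }
  {
    for(i=1; i<=NF; i++) {
      n=split($i, F, /[(,]/)     # the org name is whatever follows the last "(" or ","
      org=F[n]
      if(org in A)
        sub(org, A[org], $i)     # change only this field; $0 is rebuilt with OFS
    }
    print
  }
'  FS="=" file2 FS=/ OFS=/ file1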
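A quick way to check this against the sample data from post #1 is to recreate the two files and compare the output with the expected line. This is only a sketch: the temporary directory and the variable names tmp, expected and actual are made up for the test.

```shell
tmp=$(mktemp -d)

cat > "$tmp/file1" <<'EOF'
((org14/1-131541:0.11535,((org29/1-131541:0.00055,org7/1-131541:0.00055)1.000:0.10112,((org17/1-131541:0.07344,(org23/1-131541:0.07426,((org10/1-131541:0.00201,org22/1-131541:0.00243)1.000:0.02451,
EOF

cat > "$tmp/file2" <<'EOF'
org14=india
org29=america
org7=srilanka
org17=africa
org23=europe
org10=brazil
org22=china
EOF

expected='((india/1-131541:0.11535,((america/1-131541:0.00055,srilanka/1-131541:0.00055)1.000:0.10112,((africa/1-131541:0.07344,(europe/1-131541:0.07426,((brazil/1-131541:0.00201,china/1-131541:0.00243)1.000:0.02451,'

# Same awk program as above, run on the recreated sample files.
actual=$(awk '
  NR==FNR { A[$1]=$2; next }
  {
    for(i=1; i<=NF; i++) {
      n=split($i, F, /[(,]/)
      org=F[n]
      if(org in A)
        sub(org, A[org], $i)
    }
    print
  }
' FS="=" "$tmp/file2" FS=/ OFS=/ "$tmp/file1")

if [ "$actual" = "$expected" ]; then echo "match"; else echo "differs"; fi
rm -rf "$tmp"
```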


Last edited by Scrutinizer; 06-16-2019 at 06:37 AM..
This User Gave Thanks to Scrutinizer For This Post:
