Multiple replacement
# 1  
Old 05-30-2013

Hi friends

I hope you are all doing well.

I need your help with replacing multiple strings. I have a file with more than 1000 lines, and I want to replace several elements in a single run. The equivalences are listed in another file.

Example: target file

Code:
Start (bp)	 End (bp)  Strand	Rank	Gene            ID	           Chromosome 
51066374	51066598	-1	1	ARSA	ENST00000356098	22
51063446	51063892	-1	8	ARSA	ENST00000216124	22
51064007	51064109	-1	8	ARSA	ENST00000395621	22
51063457	51063892	-1	8	ARSA	ENST00000453344	22

Example: equivalences file

Code:
ID	                        RefSeq
ENST00000216124	NM_000487
ENST00000547805	NM_001085425
ENST00000356098	NM_001085426
ENST00000395619	NM_001085427
ENST00000453344	NM_001085428

I'd like to replace each "ID" string with its corresponding "RefSeq" value.

Thank you so much!

Last edited by Scott; 05-30-2013 at 05:42 AM.. Reason: Code tags
# 2  
Old 05-30-2013
Code:
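# Use the default IFS so that read splits on tabs as well as spaces.
while read -r old new
do
    # Rewrite the target once per mapping; only replace it if sed succeeded.
    sed "s/$old/$new/g" target_file > temp_file &&
    mv temp_file target_file
done < equiv_file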
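
The loop above rewrites target_file once for every line in equiv_file. A minimal sketch of a single-pass variant follows: it turns the equivalences into a sed script first and then runs sed once. The scratch directory, the shortened sample data, and the name subst.sed are assumptions made for illustration; the IDs are still treated as regular expressions and matched anywhere on a line.

```shell
# Work in a scratch directory with small sample files (names and data are assumptions).
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
printf 'ID\tRefSeq\nENST00000216124\tNM_000487\nENST00000356098\tNM_001085426\n' > equiv_file
printf 'Gene\tID\tChromosome\nARSA\tENST00000356098\t22\nARSA\tENST00000395621\t22\n' > target_file

# Turn every two-field line of equiv_file into one sed substitution command.
awk 'NF == 2 { printf "s/%s/%s/g\n", $1, $2 }' equiv_file > subst.sed

# Apply all substitutions in a single pass over target_file.
sed -f subst.sed target_file > temp_file && mv temp_file target_file
cat target_file
```

On this sample the "ID" header becomes "RefSeq" and the one mapped ID is translated, while the line whose ID has no equivalence passes through unchanged.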

# 3  
Old 05-30-2013
Another way to do this, running awk once instead of running sed once for each line in the equivalences file, is:
Code:
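awk '
BEGIN { OFS = "\t"}
# First file (equivalences): map each ID to its RefSeq value.
FNR == NR {ID[$1] = $2;next}
# Second file (target): fix the header text, then translate field 6 when a mapping exists.
{       if(FNR == 1) sub(/ID/, ID["ID"])
        else if($6 in ID) $6 = ID[$6]
        print
}' equivalences target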

If you're using a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of awk.

For the input files shown in the 1st message in this thread, the output produced is:
Code:
Start (bp)       End (bp)  Strand       Rank    Gene            RefSeq             Chromosome
51066374        51066598        -1      1       ARSA    NM_001085426    22
51063446        51063892        -1      8       ARSA    NM_000487       22
51064007        51064109        -1      8       ARSA    ENST00000395621 22
51063457        51063892        -1      8       ARSA    NM_001085428    22

This User Gave Thanks to Don Cragun For This Post:
# 4  
Old 05-30-2013
To make Don's proposal a bit more flexible, you could introduce a variable for the column. If the header's fields do not contain FS chars, you can omit the extra treatment for line 1:
Code:
awk     'BEGIN          {FS = OFS = "\t"}
         FNR == NR      {ID[$1] = $2; next}
         $COL in ID     {$COL = ID[$COL]}
         1
        ' COL=6 equivalences target


Last edited by RudiC; 05-30-2013 at 09:22 AM..
# 5  
Old 05-30-2013
Thank you so much guys!

Problem solved!
# 6  
Old 05-30-2013
Quote:
Originally Posted by RudiC
To make Don's proposal a bit more flexible, you could introduce a variable for the column. If the header's fields do not contain FS chars, you can omit the extra treatment for line 1:
Code:
awk     'BEGIN          {FS = OFS = "\t"}
         FNR == NR      {ID[$1] = $2; next}
         $COL in ID     {$COL = ID[$COL]}
         1
        ' COL=6 equivalences target

I considered that, but decided against it: the header line contains a mix of spaces and tabs, and the spaces around "ID" keep it from matching a key in the ID array. As a result, $6 = ID[$6] never changes "ID" to "RefSeq" in the header.
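
One way to combine the two ideas is sketched below: the column number stays a variable, while the header is still handled separately with sub() so that its mixed spacing does not matter. This is only an illustration. It assumes, as in the sample files, that the first line of the equivalences file holds the old and new column names; the scratch directory, the output file result, and the variable names oldname and newname are hypothetical.

```shell
# Scratch directory with shortened copies of the sample files (an assumption for testing).
tmpdir=$(mktemp -d) && cd "$tmpdir" || exit 1
printf 'ID\t                        RefSeq\nENST00000216124\tNM_000487\nENST00000356098\tNM_001085426\n' > equivalences
printf 'Start (bp)\t End (bp)  Strand\tRank\tGene            ID\t           Chromosome\n' > target
printf '51066374\t51066598\t-1\t1\tARSA\tENST00000356098\t22\n' >> target
printf '51064007\t51064109\t-1\t8\tARSA\tENST00000395621\t22\n' >> target

awk -v COL=6 '
BEGIN      { OFS = "\t" }
# First file: line 1 holds the column names, the other lines are ID/RefSeq pairs.
FNR == NR  { if (FNR == 1) { oldname = $1; newname = $2 } else ID[$1] = $2; next }
# Header of the target: replace the column name in place and keep its spacing.
FNR == 1   { sub(oldname, newname); print; next }
# Data lines: translate column COL when a mapping exists.
$COL in ID { $COL = ID[$COL] }
1' equivalences target > result
cat result
```

Because FS is left at its default, the data lines split on any run of blanks or tabs, and only the lines that are actually changed get rebuilt with tab separators.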
# 7  
Old 05-30-2013
Another solution:
Code:
xargs printf 's/\\b%s\\b/%s/;\n' <equiv_file >subst.pl
perl -p subst.pl target_file

The backslash is doubled (\\b) because \b has a special meaning in the printf format string.
In perl, \b is a "word boundary", which gives a more precise match.
Code:
perl -i -p subst.pl target_file

will directly modify target_file.
The following is even more precise here, because column 1 is not involved:
Code:
xargs printf 's/(\s)%s(\s)/$1%s$2/;\n' <equiv_file >subst.pl


Last edited by MadeInGermany; 05-30-2013 at 03:17 PM..