Replace text in column1 of a file matching columns of another file


 
# 1  
Old 12-11-2012

Hi all,

I have 2 files:

species-names.txt

Code:
Abaca-bunchy-top-virus	((((Abaca-bunchy-top-virus((Babuvirus((Unassigned((Nanoviridae((Unassigned))))
Abutilon-mosaic-virus	((((Abutilon-mosaic-virus((Begomovirus((Unassigned((Geminiviridae((Unassigned))))
Abutilon-yellows-virus	((((Abutilon-yellows-virus((Crinivirus((Unassigned((Closteroviridae((Unassigned))))

sequence-names.txt

Code:
gi|145845934|gb|EF546810.1|-Abaca-bunchy-top-virus-isolate-Q767-segment-DNA-S,-complete-sequence	GGCAGGGGGGCTTATTATTACCCCCCCTGCC
gi|145845936|gb|EF546811.1|-Abutilon-mosaic-virus-isolate-Q767-segment-DNA-M,-complete-sequence	GGGGCTGGGGCTTATTATTACCCCCAGCCCCGGAACGGGACATCAC
gi|145845938|gb|EF546812.1|-Abutilon-yellows-virus-isolate-Q767-segment-DNA-C,-complete-sequence	GGCAGGGGGGCTTATTATTACCCCCCCTGCCCGGG

In the 1st column of sequence-names.txt, I need to replace the text that matches the 1st column of species-names.txt with the corresponding text from the 2nd column of species-names.txt. The output should be:

Code:
gi|145845934|gb|EF546810.1|-((((Abaca-bunchy-top-virus((Babuvirus((Unassigned((Nanoviridae((Unassigned))))-isolate-Q767-segment-DNA-S,-complete-sequence	GGCAGGGGGGCTTATTATTACCCCCCCTGCC
gi|145845936|gb|EF546811.1|-((((Abutilon-mosaic-virus((Begomovirus((Unassigned((Geminiviridae((Unassigned))))-isolate-Q767-segment-DNA-M,-complete-sequence	GGGGCTGGGGCTTATTATTACCCCCAGCCCCGGAACGGGACATCAC
gi|145845938|gb|EF546812.1|-((((Abutilon-yellows-virus((Crinivirus((Unassigned((Closteroviridae((Unassigned))))-isolate-Q767-segment-DNA-C,-complete-sequence	GGCAGGGGGGCTTATTATTACCCCCCCTGCCCGGG

Thanks a lot!
# 2  
Old 12-11-2012
try:
Code:
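awk '
NR==FNR {a[$1]=$2; next}                           # 1st file: map species name -> replacement text
{for (s in a) if ($0 ~ s) {sub(s, a[s]); break}}   # 2nd file: replace the first matching name, once per line
1                                                  # print every line, changed or not
' species-names.txt sequence-names.txt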
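
A possible variant, offered only as a sketch: the script above treats each species name as a regular expression, so a name containing characters such as "." or "+", or a replacement containing "&", could behave unexpectedly. The version below matches the name literally with index() and looks only at column 1. The function name replace_species is made up for illustration, and it assumes both files are tab-separated, as the samples appear to be.

```shell
# Hypothetical helper (name invented for this sketch).
# $1: species-names file, $2: sequence-names file; both assumed tab-separated.
replace_species() {
    awk -F'\t' -v OFS='\t' '
    NR==FNR { map[$1] = $2; next }      # 1st file: species name -> replacement
    {
        for (s in map) {
            p = index($1, s)            # literal substring search, no regex
            if (p) {
                # splice the replacement into column 1 at the match position
                $1 = substr($1, 1, p - 1) map[s] substr($1, p + length(s))
                break                   # one replacement per line
            }
        }
        print
    }' "$1" "$2"
}
```

Usage would be: replace_species species-names.txt sequence-names.txt > output.txt. As with the original script, if one species name is a substring of another, which one gets applied depends on awk's array traversal order.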

# 3  
Old 12-11-2012
Thanks a lot rdrtx1,

Your script worked nicely.

Because column 2 of sequence-names.txt was very large, I used awk to print only column 1, ran your script on that, and then pasted the output back together as column 1 next to column 2 of the original sequence-names.txt. Same result, and quicker.

Really appreciate it again!
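
The column-splitting workaround described in this post might look roughly like the sketch below. It uses cut and paste rather than awk to split and rejoin the columns, which is a substitution on my part; the function name replace_first_column and the temporary file names are hypothetical, and the input is assumed to be tab-separated.

```shell
# Hypothetical helper (name invented for this sketch).
# $1: species-names file, $2: sequence-names file; tab-separated input assumed.
replace_first_column() {
    work=$(mktemp -d) || return 1
    cut -f1  "$2" > "$work/col1.txt"    # names only, without the long sequences
    cut -f2- "$2" > "$work/rest.txt"    # everything after column 1
    # run the replacement from post #2 on the short column only
    awk '
    NR==FNR {a[$1]=$2; next}
    {for (s in a) if ($0 ~ s) {sub(s, a[s]); break}}
    1
    ' "$1" "$work/col1.txt" > "$work/col1.new.txt"
    # rejoin line by line; paste uses a tab as the delimiter by default
    paste "$work/col1.new.txt" "$work/rest.txt"
    rm -rf "$work"
}
```

This relies on the awk step printing exactly one output line per input line, so the two halves stay aligned when pasted back together.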