Post 302844326 by pathunkathunk on Sunday 18th of August 2013 01:46:36 PM
Join lines from two files based on match

I have two files.
File1
Code:
>gi|11320906|gb|AF197889.1|_Buchnera_aphidicola
ATGAAATTTAAGATAAAAAATAGTATTTT
>gi|11320898|gb|AF197885.1|_Buchnera_aphidicola
ATGAAATTTAATATAAACAATAAAA
>gi|11320894|gb|AF197883.1|_Buchnera_aphidicola
ATGAAATTTAATATAAACAATAAAATTTTT

File2
Code:
AF197885	Uroleucon aeneum
AF197886	Uroleucon jaceae
AF197889	Uroleucon obscurum
AF197883	Uroleucon astronomus
AF197893	Uroleucon erigeronense

For every header line in file1, I want to match the term bracketed by "gb|" and "." (i.e. AF197889 in the first line) to a line in file2. In this example of file1, all terms of interest start with "AF", but this isn't always the case.

If there's a match, I'd like to append the species name from file2, preceded by "_host_", to the matching line in file1, with the spaces replaced by underscores. Desired output:
Code:
>gi|11320906|gb|AF197889.1|_Buchnera_aphidicola_host_Uroleucon_obscurum
ATGAAATTTAAGATAAAAAATAGTATTTT
>gi|11320898|gb|AF197885.1|_Buchnera_aphidicola_host_Uroleucon_aeneum
ATGAAATTTAATATAAACAATAAAA
>gi|11320894|gb|AF197883.1|_Buchnera_aphidicola_host_Uroleucon_astronomus
ATGAAATTTAATATAAACAATAAAATTTTT

With the meager skills I have, I could use "|" as a field separator for file1 and use awk to fill an array to find matches. But I'm not sure how to append the file2 data, or how to accomplish it all in one step. Can anyone help?

Last edited by Don Cragun; 08-18-2013 at 02:56 PM.. Reason: CODE tags; not QUOTE tags for input, output, and code samples.
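
One way this could be done in a single awk pass, as a minimal sketch rather than a definitive answer. It reads file2 into an array first, then rewrites the header lines of file1. The function name annotate_hosts is made up for illustration. It assumes file2 is tab-separated (accession, then species), that file2 is not empty, and that the accession is always the "|"-delimited field right after "gb"; header lines without a match are printed unchanged.

```shell
# Sketch: append "_host_<species>" to FASTA headers whose accession is in file2.
# Assumptions: file2 is "accession<TAB>species name"; headers look like
# ">gi|...|gb|ACCESSION.VERSION|...". annotate_hosts is a hypothetical name.
annotate_hosts() {
    # $1 = lookup table (file2), $2 = FASTA file (file1)
    awk -F'\t' '
        # First file: store the species per accession, spaces -> underscores.
        NR == FNR {
            species = $2
            gsub(/ /, "_", species)
            host[$1] = species
            next
        }
        # Second file, header lines only: locate the field that follows "gb".
        /^>/ {
            n = split($0, part, "|")
            for (i = 1; i < n; i++) {
                if (part[i] == "gb") {
                    acc = part[i + 1]
                    sub(/\..*/, "", acc)    # drop the version suffix, e.g. ".1"
                    if (acc in host)
                        $0 = $0 "_host_" host[acc]
                    break
                }
            }
        }
        # Print every line of the FASTA file, modified or not.
        { print }
    ' "$1" "$2"
}
```

It would be called as `annotate_hosts file2 file1 > file1.annotated`. Splitting the header on "|" and looking for the "gb" field, instead of matching a fixed prefix, is meant to cover accessions that do not start with "AF".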
 
