Shell Programming and Scripting: post 302844326 by pathunkathunk on Sunday 18th of August 2013 01:46:36 PM
Join lines from two files based on match

I have two files.
File1
Code:
>gi|11320906|gb|AF197889.1|_Buchnera_aphidicola
ATGAAATTTAAGATAAAAAATAGTATTTT
>gi|11320898|gb|AF197885.1|_Buchnera_aphidicola
ATGAAATTTAATATAAACAATAAAA
>gi|11320894|gb|AF197883.1|_Buchnera_aphidicola
ATGAAATTTAATATAAACAATAAAATTTTT

File2
Code:
AF197885	Uroleucon aeneum
AF197886	Uroleucon jaceae
AF197889	Uroleucon obscurum
AF197883	Uroleucon astronomus
AF197893	Uroleucon erigeronense

For each header line in file1 (the lines starting with ">"), I want to match the term bracketed by "gb|" and "." (i.e. AF197889 in the first line) to a line in file2. In this example of file1, all terms of interest start with "AF", but this isn't always the case.

If there's a match, I'd like to append the species name from file2, preceded by "_host_", to the matching line in file1, with underscores in place of spaces. Desired output:
Code:
>gi|11320906|gb|AF197889.1|_Buchnera_aphidicola_host_Uroleucon_obscurum
ATGAAATTTAAGATAAAAAATAGTATTTT
>gi|11320898|gb|AF197885.1|_Buchnera_aphidicola_host_Uroleucon_aeneum
ATGAAATTTAATATAAACAATAAAA
>gi|11320894|gb|AF197883.1|_Buchnera_aphidicola_host_Uroleucon_astronomus
ATGAAATTTAATATAAACAATAAAATTTTT

With the meager skills I have, I could use "|" as a field separator for file1 and use awk to fill an array to find matches. But I'm not sure how to append the file2 data, or how to accomplish it all in one step. Can anyone help?
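
One way this could be done in a single awk pass is sketched below. It is only a sketch under a few assumptions: the inputs are named file1 and file2 as in the post, the first whitespace-delimited field of each file2 line is the accession, and the output name joined.fa is made up for illustration. The script reads file2 first to fill an array keyed by accession, then appends "_host_" plus the species name to each matching header line of file1. The sample data is recreated at the top so the script runs on its own.

```shell
#!/bin/sh
# Recreate the sample inputs from the post (file names are assumptions).
cat > file1 <<'EOF'
>gi|11320906|gb|AF197889.1|_Buchnera_aphidicola
ATGAAATTTAAGATAAAAAATAGTATTTT
>gi|11320898|gb|AF197885.1|_Buchnera_aphidicola
ATGAAATTTAATATAAACAATAAAA
>gi|11320894|gb|AF197883.1|_Buchnera_aphidicola
ATGAAATTTAATATAAACAATAAAATTTTT
EOF

printf '%s\t%s\n' \
    AF197885 'Uroleucon aeneum' \
    AF197886 'Uroleucon jaceae' \
    AF197889 'Uroleucon obscurum' \
    AF197883 'Uroleucon astronomus' \
    AF197893 'Uroleucon erigeronense' > file2

# joined.fa is a hypothetical output name.
awk '
NR == FNR {                              # first file named (file2): build the lookup
    name = $0
    sub(/^[^ \t]+[ \t]+/, "", name)      # drop the accession and its separator
    gsub(/[ \t]+/, "_", name)            # spaces in the species name become underscores
    host[$1] = name
    next
}
/^>/ && match($0, /gb[|][^.|]+/) {       # header line: find the text after "gb|" up to "."
    acc = substr($0, RSTART + 3, RLENGTH - 3)
    if (acc in host)
        $0 = $0 "_host_" host[acc]
}
{ print }                                # sequence lines and unmatched headers pass through
' file2 file1 > joined.fa

cat joined.fa
```

Header lines whose accession is not listed in file2 are printed unchanged, which seemed the safest default. Note that the NR == FNR test relies on file2 being non-empty.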

Last edited by Don Cragun; 08-18-2013 at 02:56 PM.. Reason: CODE tags; not QUOTE tags for input, output, and code samples.
 
