Concatenating 2 lines from 2 files having matching strings


 
# 1  
Old 06-07-2013

Hello All Unix Users,

I am still new to Unix, but I am eager to learn it.
I have two files in which some lines share a matching substring. I would like to concatenate each pair of matching lines into one line and leave the other lines untouched. Here is an example:

File 1 (fasta file):

Code:
>292183
AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGACAGGCTTAACACATGCAAGTCGAGGGGCAGCGGGGAGGAAGCTTGCTTTCTCTGCCGGCGACCGGCGCACGGGTGAGT
>551166
GTCGAGCGGCGAACGGGTGAGTAACGCGTGGATTATCTGCCCCGAGGTGGGGGATAACCCGGGGAAACTCGGGCTAATACCGCATATGACCGTGAGGTCAAAGGGGGGTCGCA

File 2:
Code:
292183	k__Bacteria
551166	k__Bacteria; p__Acidobacteria

The desired output:

Code:
>292183 k__Bacteria
AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGACAGGCTTAACACATGCAAGTCGAGGGGCAGCGGGGAGGAAGCTTGCTTTCTCTGCCGGCGACCGGCGCACGGGTGAGT
>551166 k__Bacteria; p__Acidobacteria
GTCGAGCGGCGAACGGGTGAGTAACGCGTGGATTATCTGCCCCGAGGTGGGGGATAACCCGGGGAAACTCGGGCTAATACCGCATATGACCGTGAGGTCAAAGGGGGGTCGCA

I tried to use awk and perl for this, but I could not get the lines merged into one file.

I appreciate any help,
Best Regards,
Mohamed
# 2  
Old 06-07-2013
Code:
awk 'NR==FNR {a[">"$1] = ">"$0; next} a[$1] {$0 = a[$1]} 1' file2 file1

# 3  
Old 06-07-2013
Code:
awk 'FILENAME=="file2" {arr[$1]=$0}
     FILENAME=="file1" { if (index($0, ">")==1)
                            {print ">" arr[substr($0,2)]; next}
                         print $0 }' file2 file1 > newfile

Note: the files must be given in that order (file2, then file1). This code is just a less compact form of balajesuri's post.
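
Why the order matters can be seen with a small test. This is only a sketch: the sequences are shortened stand-ins for the data in post #1, the scratch directory and the variable names `prog`, `right_order` and `wrong_order` are made up for the illustration, and the awk program is the one above with its quoting and braces balanced.

```shell
# Work in a scratch directory so the literal names file1/file2 are safe to use.
cd "$(mktemp -d)" || exit 1

# Sample data modelled on post #1 (sequences shortened; file2 fields are tab-separated).
printf '>292183\nAGAGTTTGATCC\n>551166\nGTCGAGCGGCGA\n' > file1
printf '292183\tk__Bacteria\n551166\tk__Bacteria; p__Acidobacteria\n' > file2

# The same program is run twice; only the order of the file arguments changes.
prog='FILENAME=="file2" {arr[$1]=$0}
      FILENAME=="file1" { if (index($0, ">")==1) {print ">" arr[substr($0,2)]; next}
                          print $0 }'

right_order=$(awk "$prog" file2 file1)   # table is filled before the headers are read
wrong_order=$(awk "$prog" file1 file2)   # headers are read while arr is still empty

printf 'file2 file1:\n%s\n\nfile1 file2:\n%s\n' "$right_order" "$wrong_order"
```

With file1 first, every header is looked up in an array that has not been filled yet, so the headers come out as a bare ">".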
# 4  
Old 06-07-2013
Quote:
Originally Posted by balajesuri
Code:
awk 'NR==FNR {a[">"$1] = ">"$0; next} a[$1] {$0 = a[$1]} 1' file2 file1

Thanks, but I would like the connector to be a space, not a tab:

>292183<\s>k__Bacteria
AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGACAGGCTTAACACATGCAAGTCGAGGGGCAGCGGGGAGGAAGCTTGCTTTCTCTGCCGGCGACCGGCGCACGGGTGAGT


---------- Post updated at 06:31 AM ---------- Previous update was at 06:29 AM ----------

Quote:
Originally Posted by jim mcnamara
Code:
awk 'FILENAME=="file2" {arr[$1]=$0}
     FILENAME=="file1" { if (index($0, ">")==1)
                            {print ">" arr[substr($0,2)]; next}
                         print $0 }' file2 file1 > newfile

Note: the files must be given in that order (file2, then file1). This code is just a less compact form of balajesuri's post.

Thanks, it worked very well for me, except that I would like a space instead of a tab:

>292183<\s>k__Bacteria
AGAGTTTGATCCTGGCTCAGGATGAACGCTAGCGACAGGCTTAACACATGCAAGTCGAGGGGCAGCGGGGAGGAAGCTTGCTTTCTCTGCCGGCGACCGGCGCACGGGTGAGT
# 5  
Old 06-07-2013
Try:
Code:
awk 'NR==FNR{S=""; for(i=1;i<=NF;i++){S=S?S" "$i:">"$i}; A[">"$1]=S; next}{print A[$0]?A[$0]:$0}' file_2 file_1
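
For comparison, another way to get a space as the connector is to replace the first tab with a space while the lookup table is being built. This is a sketch that was not posted in the thread: it assumes each line of file2 holds the ID, one tab, then the taxonomy, and the sequences and the names `line` and `merged` are made up for the illustration.

```shell
# Work in a scratch directory so the sample files do not clobber anything.
cd "$(mktemp -d)" || exit 1

# Sample data modelled on post #1 (sequences shortened; file2 fields are tab-separated).
printf '>292183\nAGAGTTTGATCC\n>551166\nGTCGAGCGGCGA\n' > file1
printf '292183\tk__Bacteria\n551166\tk__Bacteria; p__Acidobacteria\n' > file2

merged=$(awk '
  NR==FNR {                   # first file on the command line (file2)
      line = $0
      sub(/\t/, " ", line)    # turn the first tab into a single space
      a[">" $1] = ">" line    # key matches the fasta header, e.g. ">292183"
      next
  }
  ($1 in a) { $0 = a[$1] }    # fasta header with a known ID: swap in the merged line
  1                           # print every line of file1
' file2 file1)

printf '%s\n' "$merged"
```

Only the first tab is replaced, so spaces that are already part of the taxonomy string (as in "k__Bacteria; p__Acidobacteria") are left alone.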

# 6  
Old 06-07-2013
You probably have tabs in your original fasta file; they get "carried over" to the new file by default.
Add this line at the top of the awk code block:

Code:
BEGIN{OFS=" "}

# 7  
Old 06-07-2013
I don't think OFS is relevant since none of the solutions rebuild $0 or use a print statement with multiple args.

Regards,
Alister
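
A short check of the OFS point, as a sketch with a made-up sample line and hypothetical variable names: awk applies OFS only when it rebuilds $0 (after a field assignment) or when print is given several comma-separated arguments. A record that is printed untouched keeps its original tab. Note also that OFS already defaults to a single space, so the BEGIN block changes nothing by itself.

```shell
# One tab-separated line like those in file2.
line=$(printf '292183\tk__Bacteria')

# $0 is printed as read, so OFS is never consulted and the tab survives.
untouched=$(printf '%s\n' "$line" | awk 'BEGIN{OFS=" "} {print}')

# Assigning to a field (even to itself) makes awk rebuild $0 with OFS.
rebuilt=$(printf '%s\n' "$line" | awk 'BEGIN{OFS=" "} {$1=$1; print}')

# print with comma-separated arguments also joins them with OFS.
joined=$(printf '%s\n' "$line" | awk 'BEGIN{OFS=" "} {print $1, $2}')

printf '%s\n%s\n%s\n' "$untouched" "$rebuilt" "$joined"
```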