Matching column search in two files


 
# 1  
Old 09-19-2019

Hi,

I have a tab delimited file1:
Code:
NC_013499.1    3180    3269    GQ342961.1      
NC_030295.1    5925    6014    FN398100.2      
NC_007915.1    6307    6396    KU529284.1        
NC_013499.1    5033    5122    GQ342961.1

And a second file2:
Code:
NC_030295.1     RefSeq  gene    136     5115    .       +       .
NC_007915.1     RefSeq  CDS     6227    7596    .       +       0
NC_030295.1     RefSeq  sequence_feature        1050    1074    .
NC_030295.1     RefSeq  sequence_feature        1533    1557    .       +       .
NC_030295.1     RefSeq  gene    5520    7733    .       +       0


I want to print the combined lines of both files where column 1 of file1 matches column 1 of file2, column 2 of file1 is >= column 4 of file2, and column 3 of file1 is <= column 5 of file2.

My expected output for the example is:
Code:
NC_030295.1    5925    6014    FN398100.2         NC_030295.1     RefSeq  gene    5520    7733    .       +       0
NC_007915.1    6307    6396    KU529284.1          NC_007915.1     RefSeq  CDS     6227    7596    .       +       0

Your help is much appreciated.

# 2  
Old 09-19-2019
Any attempts / ideas from your side?

You might want to check out the bottom of this page for related discussions.

Regards
Peasant.
# 3  
Old 09-19-2019
I tried this, but it does not give me the desired result:
Code:
awk '{if (NR==FNR) {l[NR]=$0;a[NR]=$2;b[NR]=$3} else if (a[FNR]>=$4 && b[FNR]<=$5) {print l[FNR],$0}}' file1 file2 > file3

# 4  
Old 09-19-2019
Try:-

Code:
awk -F'\t' '
        NR == FNR {
                A[$1] = $0
                next
        }
        $1 in A {
                split(A[$1], T)
                if ( T[2] >= $4 && T[3] <= $5 )
                        print A[$1], $0
        }
' OFS='\t' file1 file2

# 5  
Old 09-19-2019
The code did not give any output.
# 6  
Old 09-19-2019
I suppose then that your input is not tab-delimited. Try:-
Code:
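awk '
        # First file: remember each line under its ID (column 1).
        # A later line with the same ID replaces an earlier one.
        NR == FNR {
                A[$1] = $0
                next
        }
        # Second file: only consider IDs that were seen in file1.
        $1 in A {
                split(A[$1], T)
                # Print when the file1 range lies within the file2 range.
                if ( T[2] >= $4 && T[3] <= $5 )
                        print A[$1], $0
        }
'  file1 file2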
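
One caveat with the approach above: A[$1] holds a single line per ID, so when an ID occurs more than once in file1 (NC_013499.1 does in the sample), only its last range is checked. Below is a minimal sketch that keeps every file1 line per ID instead. The array names cnt, line, lo and hi, the temporary directory, and the sixth file2 line are assumptions added for illustration and are not from the posts above; that extra line exists only so that the duplicated ID has something to match.

```shell
# Work in a scratch directory so no existing files are touched.
tmp=$(mktemp -d)

# file1 as posted in #1 (whitespace-separated).
cat > "$tmp/file1" <<'EOF'
NC_013499.1    3180    3269    GQ342961.1
NC_030295.1    5925    6014    FN398100.2
NC_007915.1    6307    6396    KU529284.1
NC_013499.1    5033    5122    GQ342961.1
EOF

# file2 as posted in #1, plus one HYPOTHETICAL last line (not in the
# original post) that covers both NC_013499.1 ranges from file1.
cat > "$tmp/file2" <<'EOF'
NC_030295.1     RefSeq  gene    136     5115    .       +       .
NC_007915.1     RefSeq  CDS     6227    7596    .       +       0
NC_030295.1     RefSeq  sequence_feature        1050    1074    .
NC_030295.1     RefSeq  sequence_feature        1533    1557    .       +       .
NC_030295.1     RefSeq  gene    5520    7733    .       +       0
NC_013499.1     RefSeq  gene    3000    6000    .       +       .
EOF

result=$(awk '
        # First file: store every line, numbered per ID.
        NR == FNR {
                n = ++cnt[$1]
                line[$1, n] = $0
                lo[$1, n] = $2
                hi[$1, n] = $3
                next
        }
        # Second file: test each stored range for this ID.
        $1 in cnt {
                for (i = 1; i <= cnt[$1]; i++)
                        # "+ 0" forces a numeric comparison.
                        if (lo[$1, i] + 0 >= $4 + 0 && hi[$1, i] + 0 <= $5 + 0)
                                print line[$1, i], $0
        }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the extra line included this prints four lines: the two from the expected output in post #1, plus one for each NC_013499.1 range. As with the code above, matches come out in file2 order, because file2 is the file being read when lines are printed.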

# 7  
Old 09-20-2019
Thanks Yoda. Works perfectly.