Matching two files per column


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Matching two files per column
# 1  
Old 09-22-2014
Matching two files per column

Hi,

I hope somebody can help me with this problem, since I would like to solve this problem using awk, but im not experienced enough with this.

I have two files which i want to match, and output the matching column name and row number.

One file contains 4 columns like this:
FILE1:
Code:
a         b         c         d
10.5    20.2    50.4    100.2
200.1  500.4   20.6    10.4

and the second file just one column with entries:
FILE2:
Code:
12
50
200

I want for the entries in FILE2, to find the value they match to in FILE1, within a certain range of 2 (so +/- 2). And output the column name and the row number

So the output would be:
Code:
12   a   1
50   c   1
200 a   2

I can do this using loops, but this takes long because of the size of the files.
Is there a way of doing this using AWK?

Sorry I don't have any code of what I tried yet. I was trying code from previous questions, but didn't really get close to what I need..

Last edited by Corona688; 09-22-2014 at 12:30 PM..
# 2  
Old 09-22-2014
I don't see an advantage with AWK, you have to examine every value of both files; moreover, what if the value matches in more than one place?
# 3  
Old 09-22-2014
Hi blackrageous,

Yes you are right. It can match at multiple places.
I was just hoping there was a faster way then using for loops..

Thanks for your reply Smilie
# 4  
Old 09-22-2014
I sometimes will use a RAM disk in an effort to speed up scripts where I have to process each line a number of times.
# 5  
Old 09-22-2014
awk needs two loops but is pretty fast.
Code:
awk -v range=2 '
function adiff(a,b) {if (a>b) return a-b; else return b-a}
NR==FNR {val[$1]; next}
FNR>1 {
  for (f=1; f<=NF; f++)
    for (i in val)
      if (adiff($f,i)<=range) print i, f, FNR-1
}
' FILE2 FILE1


Last edited by MadeInGermany; 09-22-2014 at 07:24 PM.. Reason: inspired by Yoda: store the values as keys!
# 6  
Old 09-22-2014
Another awk approach:-
Code:
awk '
        NR == FNR {
                A[$1]
                next
        }
        FNR == 1 {
                for ( i = 1; i <= NF; i++ )
                        R[i] = $i
                next
        }
        {
                for ( i = 1; i <= NF; i++ )
                {
                        for ( k in A )
                        {
                                d = ( k > $i ? k - $i : $i - k )
                                if ( d >=0 && d <=2 )
                                        print k, R[i], FNR-1
                        }
                }
        }
' file2 file1

# 7  
Old 09-23-2014
Thanks a lot! That worked!!
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Matching column search in two files

Hi, I have a tab delimited file1: NC_013499.1 3180 3269 GQ342961.1 NC_030295.1 5925 6014 FN398100.2 NC_007915.1 6307 6396 KU529284.1 NC_013499.1 5033 5122 GQ342961.1 And a second file2: NC_030295.1 RefSeq gene 136 5115 ... (6 Replies)
Discussion started by: Ibk
6 Replies

2. Shell Programming and Scripting

Matching column value from 2 different file using awk and append value from different column

Hi, I have 2 csv files. a.csv HUAWEI,20LMG011_DEKET_1296_RTN-980_IDU-1-11-ISV3-1(to LAMONGAN_M),East_Java,20LMG011_DEKET_1296_RTN-980_IDU-1,20LMG011,20LMG 027_1287_LAMONGAN_RTN980_IDU1,20LMG027,1+1(HSB),195.675,20LMG011-20LMG027,99.9995,202.6952012... (7 Replies)
Discussion started by: tententen
7 Replies

3. Shell Programming and Scripting

Comparing same column from two files, printing whole row with matching values

First I'd like to apologize if I opened a thread which is already open somewhere. I did a bit of searching but could quite find what I was looking for, so I will try to explaing what I need. I'm writing a script on our server, got to a point where I have two files with results. Example: File1... (6 Replies)
Discussion started by: mitabrev83
6 Replies

4. UNIX for Beginners Questions & Answers

Concatenate column values when header is Matching from multiple files

there can be n number of columns but the number of columns and header name will remain same in all 3 files. Files are tab Delimited. a.txt Name 9/1 9/2 X 1 7 y 2 8 z 3 9 a 4 10 b 5 11 c 6 12 b.xt Name 9/1 9/2 X 13 19 y 14 20 z 15 21 a 16 22 b 17 23 c 18 24 c.txt Name 9/1 9/2... (14 Replies)
Discussion started by: Nina2910
14 Replies

5. Shell Programming and Scripting

Matching column then append to existing File as new column

Good evening I have the below requirements, as I am not an experts in Linux/Unix and am looking for your ideas how I can do this. I have file called file1 and file2. I need to get the second column which is text1_random_alphabets and find that in file 2, if it's exists then print the 3rd... (4 Replies)
Discussion started by: mychbears
4 Replies

6. Shell Programming and Scripting

Compare and matching column entries in 2 files and

I have 2 files. File 1 has more columns (6 columns but the last column has spaces) than file 2 (file 2 has 4 columns). The entries in file 1 do not change but column 4 in file 2 can be different from the the entry in file 1. I want to create a script that reads in file 1 and then uses column 1 2... (5 Replies)
Discussion started by: kieranfoley
5 Replies

7. Shell Programming and Scripting

How to merge two or more fields from two different files where there is non matching column?

Hi, Please excuse for often requesting queries and making R&D, I am trying to work out a possibility where i have two files field separated by pipe and another file containing only one field where there is no matching columns, Could you please advise how to merge two files. $more... (3 Replies)
Discussion started by: karthikram
3 Replies

8. Shell Programming and Scripting

Find lines with matching column 1 value, retain only the one with highest value in column 2

I have a file like: I would like to find lines lines with duplicate values in column 1, and retain only one based on two conditions: 1) keep line with highest value in column 3, 2) if column 3 values are equal, retain the line with the highest value in column 4. Desired output: I was able to... (3 Replies)
Discussion started by: pathunkathunk
3 Replies

9. Shell Programming and Scripting

Join 3 or more files using matching column

Dear Forum, Full title of the topic would be: "Join 3 or more files using matching column without full list in any of these columns" I have several, typically 3 or 4 files which I need to join, something like FULL JOIN in slq scripts, all combinations of matches should be printed into an... (3 Replies)
Discussion started by: cyz700
3 Replies

10. Shell Programming and Scripting

Matching 2 files based on one column

Hi, On a similar subject, the following. I have two files: file1.txt dbSNP_rsID,Chromosome,Position,Gene rs10399749,chr. 01,45162,? rs4030303,chr. 01,72434,? rs4030300,chr. 01,72515,? rs940550,chr. 01,78032,? rs13328714,chr. 01,81468,? rs11490937,chr. 01,222077,? rs6683466,chr.... (5 Replies)
Discussion started by: swvanderlaan
5 Replies
Login or Register to Ask a Question