Match ids and print original file


 
# 1  
Old 03-08-2013

Hello,

I have two files

Original: ( 5000 entries)
Chr Position
chr1 879108
chr1 881918
chr1 896874 ...

and a file with allele freq ( 2000 entries)
Chr Position MAF
chr1 881918 0.007
chr1 979748 0.007
chr1 1120377 0.007
chr1 1178925 0.036

I would like to match the original file against the allele freq file and print an output file with all 5000 entries, with NULL where there is no MAF value:
Chr Position MAF
chr1 879108 NULL
chr1 881918 0.007
chr1 896874 NULL
...

Any help is appreciated. Thank you.

Last edited by nans; 03-08-2013 at 04:59 AM..
# 2  
Old 03-08-2013
What is the matching key between the two files? Your post doesn't make the requirement clear. Can you please state what is actually needed?
# 3  
Old 03-08-2013
The column common to both files is "Position", which is the second column.
# 4  
Old 03-08-2013
If you are looking for a match between the two files based on the 2nd column:

Code:
awk 'FNR==NR &&  NR>2 { a[$2]=$2; next } { if( $2 in a) { print  } }' original allelefreq
chr1 881918 0.007

# 5  
Old 03-08-2013
Thank you, but that only prints the positions that match the original file.
The desired output is all 5000 entries from the original file, whether or not an entry has a 3rd value.

Eg:
chr1 12345 0.07
chr1 6789 NULL
chr1 13456 0.78
.....
chr22 465546 0.12
chr22 6757657 NULL
# 6  
Old 03-08-2013
Reverse the file name order, then:

Code:
awk 'FNR==NR &&  NR>2 { a[$2]=$2; next } { if( $2 in a) { print  } }' allelefreq original

and let me know if this is what you wanted.
# 7  
Old 03-08-2013
Well, this gives me only the entries common to the original and allele freq files, without the MAF values:
chr1 979748
chr1 1120377
chr1 1178925
chr1 1222958
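
For reference, one way to get the output requested in post #1 is a left join on the original file: read allelefreq first and remember its MAF values, then walk through original and print every row with either the stored MAF or NULL. The following is only a minimal sketch under stated assumptions, not something tested on the real 5000-entry files. The sample data is just the rows quoted earlier in this thread, the temporary directory and the names `tmpdir`, `result` and `maf` are made up for illustration, and keying on both Chr and Position (rather than Position alone) is an assumption, made so that identical positions on different chromosomes are not confused.

```shell
# Sample inputs built from the rows quoted in this thread (the real files are longer).
# The file names "original" and "allelefreq" follow the earlier posts.
tmpdir=$(mktemp -d)
cat > "$tmpdir/original" <<'EOF'
Chr Position
chr1 879108
chr1 881918
chr1 896874
EOF
cat > "$tmpdir/allelefreq" <<'EOF'
Chr Position MAF
chr1 881918 0.007
chr1 979748 0.007
chr1 1120377 0.007
chr1 1178925 0.036
EOF

# Pass 1 (allelefreq): store MAF keyed on Chr + Position (assumed key), skipping the header.
# Pass 2 (original): print every row, appending the stored MAF or NULL.
result=$(awk '
    FNR == NR { if (FNR > 1) maf[$1, $2] = $3; next }
    FNR == 1  { print $0, "MAF"; next }
    { m = (($1, $2) in maf) ? maf[$1, $2] : "NULL"; print $0, m }
' "$tmpdir/allelefreq" "$tmpdir/original")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

On the sample rows this prints:
Chr Position MAF
chr1 879108 NULL
chr1 881918 0.007
chr1 896874 NULL

To key on Position alone, as described in post #3, `maf[$2]` and `$2 in maf` could be used instead. Separately, the `NR>2` condition in the commands in posts #4 and #6 skips the first data row of the first file as well as its header, which would explain why chr1 881918 is missing from the output in post #7; `FNR>1` skips only the header.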