AWK filter from file and print


 
# 1  
Old 01-06-2012

Dear all,

I am using awk to filter some data like this:-

Code:
 awk 'NR==FNR{a[$2];next}($1 in a)' FS=":" filter.dat data.dat >! out.dat

where the filter and input data look like this:-

filter.dat...
Code:
n_o00j_1900_40_007195350_0:n_o00j_1940_40_007308526
n_o00m_1900_40_007195353_1:n_o00m_1940_40_007264509
n_o011_1900_40_007195368_0:n_o011_1940_40_007398635
n_o011_1900_40_007195368_2:n_t011_1940_40_007423081

data.dat...
Code:
n_o00j_1940_40_007308526:_2 n_o00j_1980_40_007412841
n_o00m_1940_40_007264509:_0 n_o00m_1980_40_007425752
n_o011_1940_40_007398635:_0 n_o011_1980_40_007535417
n_o013_1940_40_007451863:_2 n_o013_1980_40_007537133
n_o01h_1940_40_007267748:_0 n_o01h_1980_40_007436626
n_o01q_1940_40_007444906:_4 n_o01q_1980_40_007539529
n_o01z_1940_40_007264516:_2 n_o01z_1980_40_007456727
n_o01z_1940_40_007423659:_1 n_o01z_1980_40_007461267
n_o021_1940_40_007301516:_2 n_o021_1980_40_007423548
n_o027_1940_40_007301512:_1 n_o027_1980_40_007426427

At the moment, the command above only prints the lines in data.dat where column 1 in data.dat has a match in column 2 of filter.dat.

What I would like it to do is print column 1 of filter.dat followed by the line from data.dat whose column 1 matched column 2 of filter.dat, so the output from the above example would be:-

Code:
n_o00j_1900_40_007195350_0:n_o00j_1940_40_007308526:_2 n_o00j_1980_40_007412841
n_o00m_1900_40_007195353_1:n_o00m_1940_40_007264509:_0 n_o00m_1980_40_007425752
n_o011_1900_40_007195368_0:n_o011_1940_40_007398635:_0 n_o011_1980_40_007535417

I am sure this must be possible and the solution simple, but so far it has me beat! Any help much appreciated!

Last edited by vgersh99; 01-06-2012 at 01:39 PM.. Reason: code tags, please!
# 2  
Old 01-06-2012
Code:

awk 'NR==FNR{a[$2]=$1;next}$1 in a {print a[$1], $0}' FS=":" OFS=: filter.dat data.dat >! out.dat
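For readers following along, here is the same one-liner spread over several lines with comments. It is only a sketch: the sample files are a shortened copy of the data in the first post, it runs in a scratch directory, and it uses a plain `>` redirect (the `>!` above is csh/tcsh-style syntax and will not work in a POSIX sh).

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)" || exit 1

# Shortened copies of the sample files from the first post.
cat > filter.dat <<'EOF'
n_o00j_1900_40_007195350_0:n_o00j_1940_40_007308526
n_o00m_1900_40_007195353_1:n_o00m_1940_40_007264509
EOF
cat > data.dat <<'EOF'
n_o00j_1940_40_007308526:_2 n_o00j_1980_40_007412841
n_o013_1940_40_007451863:_2 n_o013_1980_40_007537133
n_o00m_1940_40_007264509:_0 n_o00m_1980_40_007425752
EOF

awk -F: -v OFS=: '
    # While reading the first file (filter.dat), NR equals FNR:
    # remember column 1, indexed by column 2, and skip to the next line.
    NR == FNR { a[$2] = $1; next }

    # While reading the second file (data.dat): if column 1 is a known
    # key, print the remembered filter column 1, then the whole line.
    $1 in a   { print a[$1], $0 }
' filter.dat data.dat > out.dat

cat out.dat
```

The only change from the original command is that `a[$2]` now stores `$1` rather than being an empty entry, so the matching filter column is available when the data line is printed.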

This User Gave Thanks to vgersh99 For This Post:
# 3  
Old 01-06-2012
Note that vgersh99's code assumes that the $2 values in the filter file are all distinct; if a value repeats, a[$2]=$1 keeps only the $1 from its last occurrence.

(This is currently the case in the given example, but if your real filter file is bigger, you should check that it also follows that rule.)
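One possible way to run that check is sketched below. The file name `filter_dup.dat` and its third row are made up for illustration so that there is a duplicate to find; point the awk command at the real filter file instead.

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical filter file: the third row is invented and reuses
# the column 2 value of the first row.
cat > filter_dup.dat <<'EOF'
n_o00j_1900_40_007195350_0:n_o00j_1940_40_007308526
n_o00m_1900_40_007195353_1:n_o00m_1940_40_007264509
n_o00j_1900_40_009999999_9:n_o00j_1940_40_007308526
EOF

# seen[$2]++ yields the number of earlier occurrences of this key,
# so the test is true exactly once per key: on its second occurrence.
awk -F: 'seen[$2]++ == 1 { print $2 }' filter_dup.dat > dups.txt

cat dups.txt
```

If `dups.txt` comes out empty, every $2 in the filter file is unique and the a[$2]=$1 lookup is safe to use as posted.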
This User Gave Thanks to ctsgnb For This Post:
# 4  
Old 01-09-2012
Thanks vgersh99! Now that I can see how you did it, I can apply it elsewhere too.

And thanks for the caveat, ctsgnb. My real filter file is far, far bigger, but $2 is always unique.