awk print only select records from file2


 
# 1  
Old 08-02-2011

I want to print only the records from file 2 that do not match a record in file 1, where "match" means that both column 1 and column 6 are equal.

I tried playing around with the following code, which I found in other threads, but without much success.

Code:
awk 'NR==FNR{p=$1;$1=x;A[p]=$0;next}{$2=$2(A[$1]?A[$1]:",,,")}1' FS=~ OFS=~ file1 FS="[ \t]*" file2

NR==FNR true while we are reading the first file (NR and FNR are equal only while reading the first file)
A[$1]=$2 store the second field in array A, using the first field as the index
next skip to the next record
A[$1] true if A[$1] exists (looked up with $1 of the second file)
$2=A[$1] FS $2;print then set $2 to A[$1], followed by FS and the old $2, and print the record
FS=~ set the input field separator to "~"
OFS=~ set the output field separator to "~"

Code:
awk 'NR==FNR{a[$1]=$2;next} # save first file "small_file.txt" into array a. column 1 as array index, column 2 as the array value. 
{print $1,a[$1]}' small_file.txt huge_file.txt # print each line of huge_file.txt, and the related value in array a


Last edited by radoulov; 08-02-2011 at 03:02 PM.. Reason: Code tags!
# 2  
Old 08-02-2011
Maybe this example can help:
Code:
% cat file1
a
b
% cat file2
1
a
2
b
% awk '
  NR==FNR {a[$1]++}
  NR!=FNR && a[$1]' file1 file2
a
b

# 3  
Old 08-02-2011
But this only deals with records that match exactly.
My records are not exact duplicates; they only have the same values in certain columns. So how can I compare the files and skip the records in the second file that match on column 1 and column 6?
# 4  
Old 08-02-2011
I gave you my example because you didn't give yours. If you want real help, always give real examples of your input and desired output.

And as you can see, I used $1 rather than $0, so this example is not about identical records.
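
For the opposite case (printing the file2 lines whose first field does not appear in file1), the same pattern can be negated. This is only a sketch on the toy files from post #2. The `$1 in a` test is swapped in for `a[$1]` so that the lookup does not create empty array entries.

```shell
# Recreate the toy files from post #2.
printf 'a\nb\n' > file1
printf '1\na\n2\nb\n' > file2

# First file: remember every $1. Second file: print lines whose $1 was not seen.
awk 'NR==FNR { a[$1]; next } !($1 in a)' file1 file2
```

On these two files it should print the lines `1` and `2`.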
# 5  
Old 08-02-2011
file 1
Code:
 
123~blah~static~abc~static~7/29/2010
456~blah~static~def~static~1/13/2011
789~blah~static~ghi~static~9/10/2009
012~blah~static~jkl~static~11/10/2010
345~blah~static~mno~static~5/27/2014
678~blah~static~pqr~static~3/30/2010
901~blah~static~stu~static~10/14/2011

file 2

Code:
 
123~xxx~xxx~xxx~xxx~7/29/2010
123~xxx~xxx~xxx~xxx~5/11/2011
456~xxx~xxx~xxx~xxx~1/13/2011
456~xxx~xxx~xxx~xxx~2/15/2013
456~xxx~xxx~xxx~xxx~12/22/2017
789~xxx~xxx~xxx~xxx~9/10/2009
012~xxx~xxx~xxx~xxx~11/10/2010
012~xxx~xxx~xxx~xxx~01/17/2015
345~xxx~xxx~xxx~xxx~5/27/2014
678~xxx~xxx~xxx~xxx~3/30/2010
901~xxx~xxx~xxx~xxx~10/14/2011

output
Code:
 
123~xxx~xxx~xxx~xxx~5/11/2011
456~xxx~xxx~xxx~xxx~2/15/2013
456~xxx~xxx~xxx~xxx~12/22/2017
012~xxx~xxx~xxx~xxx~01/17/2015

Moderator's Comments:
Mod Comment
Please use code tags when posting data and code samples!

Last edited by sigh2010; 08-02-2011 at 01:18 PM.. Reason: code tags, please!
# 6  
Old 08-02-2011
This:
Code:
awk -F'~' '
  NR==FNR {a[$1 $6]++;}     
  NR!=FNR && a[$1 $6]' file1 file2

gives the following output:

Code:
123~xxx~xxx~xxx~xxx~7/29/2010
456~xxx~xxx~xxx~xxx~1/13/2011
789~xxx~xxx~xxx~xxx~9/10/2009
345~xxx~xxx~xxx~xxx~5/27/2014
678~xxx~xxx~xxx~xxx~3/30/2010
901~xxx~xxx~xxx~xxx~10/14/2011

I don't understand why you don't want the 789..., 345..., 678... and 901... lines but do want the line with 456...12/22/2017.
Also, the lines with 012 have 7 fields, not 6. It's possible to work around this, but maybe it is a mistake?
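
A quick way to spot records with an unexpected number of fields is to test NF. This is a small sketch that assumes six `~`-separated fields are expected. The file name `sample.txt` and its two records are made up for illustration and are not the original data.

```shell
# Hypothetical sample: the second record has one field too many.
printf '%s\n' '123~a~b~c~d~7/29/2010' '012~a~b~c~d~e~11/10/2010' > sample.txt

# Report every record whose field count is not 6.
awk -F'~' 'NF != 6 { print FILENAME ": line " FNR " has " NF " fields" }' sample.txt
```

Here it should report `sample.txt: line 2 has 7 fields`.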
# 7  
Old 08-02-2011
Lines 789, 345, 678 and 901 have matching values in column 6 and therefore should not be in file 3.

The lines with 012 have been corrected to have 6 fields; it was a typo.
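
With the data corrected, one possible way to get the requested output is to negate the lookup from post #6, so that only the file2 records whose column 1 and column 6 pair is absent from file1 are printed. This is a sketch, not a tested production script. The array name `seen` is my own choice, and writing the result to `file3` follows the "file 3" mentioned above.

```shell
# Recreate the sample data from post #5 (012 lines have six fields).
cat > file1 <<'EOF'
123~blah~static~abc~static~7/29/2010
456~blah~static~def~static~1/13/2011
789~blah~static~ghi~static~9/10/2009
012~blah~static~jkl~static~11/10/2010
345~blah~static~mno~static~5/27/2014
678~blah~static~pqr~static~3/30/2010
901~blah~static~stu~static~10/14/2011
EOF
cat > file2 <<'EOF'
123~xxx~xxx~xxx~xxx~7/29/2010
123~xxx~xxx~xxx~xxx~5/11/2011
456~xxx~xxx~xxx~xxx~1/13/2011
456~xxx~xxx~xxx~xxx~2/15/2013
456~xxx~xxx~xxx~xxx~12/22/2017
789~xxx~xxx~xxx~xxx~9/10/2009
012~xxx~xxx~xxx~xxx~11/10/2010
012~xxx~xxx~xxx~xxx~01/17/2015
345~xxx~xxx~xxx~xxx~5/27/2014
678~xxx~xxx~xxx~xxx~3/30/2010
901~xxx~xxx~xxx~xxx~10/14/2011
EOF

# file1: remember each (column 1, column 6) pair.
# file2: print only the records whose pair was not seen in file1.
awk -F'~' '
  NR==FNR { seen[$1,$6]; next }
  !(($1,$6) in seen)
' file1 file2 > file3

cat file3
```

On the sample data this should leave the four records listed under "output" in post #5 in file3. The comma in `seen[$1,$6]` joins the two fields with SUBSEP, which avoids the accidental key collisions that plain concatenation (`$1 $6`) can produce. Note that the `NR==FNR` test assumes file1 is not empty.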