Looping over a file to count common fields from another file


 
# 1  
Old 06-28-2012

Hi,

I would like to know how I can get, for each row of file1, the number of rows in file2 where:
- the 1st and 2nd fields are the same (text)
- the 3rd field is less than or equal to the 3rd field of the file1 row (numeric)

This is just a small example. In reality, my files have millions of rows and more columns (tab-separated fields).

Example of file1
Code:
A AB 3.7
B AB 2.5

Example of file2
Code:
A AB 3.5
A AB 3.7

Desired output file
Code:
A AB 3.7 2
B AB 2.5 0

Many thanks!

Moderator's Comments: Please use code tags next time for your code and data.
# 2  
Old 06-28-2012
Could this help you?
Code:
awk 'NR==FNR{a[$1$2]++;next} {if (a[$1$2]){print $0,a[$1$2]}else {print $0,"0"}}' File2 File1
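
Note that the one-liner above counts rows whose first two fields match but never compares field 3. A minimal sketch of one way to add that condition, assuming tab-separated input and that the file2 values for each key fit in memory; the temporary files and the names vals and result are made up for illustration:

```shell
# Sample data from the first post, written as tab-separated temp files
# (these paths exist only for this illustration).
tmp=$(mktemp -d)
printf 'A\tAB\t3.7\nB\tAB\t2.5\n' > "$tmp/file1"
printf 'A\tAB\t3.5\nA\tAB\t3.7\n' > "$tmp/file2"

result=$(awk -F '\t' -v OFS='\t' '
  NR == FNR {                        # first file on the command line: file2
    k = $1 SUBSEP $2
    vals[k] = vals[k] SUBSEP $3      # collect every field-3 value seen for this key
    next
  }
  {                                  # second file: file1, one output line per row
    k = $1 SUBSEP $2
    c = 0
    # v[1] is always empty because each stored value is preceded by SUBSEP
    n = (k in vals) ? split(vals[k], v, SUBSEP) : 0
    for (j = 2; j <= n; j++)
      if (v[j] + 0 <= $3 + 0) c++    # +0 forces a numeric comparison
    print $0, c
  }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

This keeps the rows of file1 in their original order and prints a count even when file1 repeats a key. The inner loop revisits every stored value for the key, so keys with very many rows in both files could still be slow.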

# 3  
Old 06-28-2012
It would be preferable to use some form of field separation to prevent unexpected mixing of the two index fields.
Code:
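awk -F '\t' '
  NR==FNR{                        # first file (file1): remember field 3 per key
    A[$1 OFS $2]=$3
    C[$1 OFS $2]=0
    next
  }
  # second file (file2): the "in" test avoids creating entries for keys absent from file1
  ($1 OFS $2) in A && $3<=A[$1 OFS $2]{
    C[$1 OFS $2]++
  }
  END{
    for(i in A)print i,A[i],C[i]  # output order is unspecified
  }
' file1 file2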
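
To illustrate the field-mixing concern, here is a small sketch with two made-up rows whose first two fields differ but concatenate to the same string. It uses SUBSEP as the separator on the assumption that this character never occurs in the data; the variable names are illustrative only:

```shell
# Two made-up rows: fields ("A", "BC") and ("AB", "C").
rows=$(printf 'A\tBC\nAB\tC\n')

# Keys built as $1 $2: both rows become "ABC", so only one distinct key remains.
plain=$(printf '%s\n' "$rows" |
  awk -F '\t' '{ k[$1 $2] } END { n = 0; for (i in k) n++; print n }')

# Keys built as $1 SUBSEP $2: the separator keeps the two rows apart.
separated=$(printf '%s\n' "$rows" |
  awk -F '\t' '{ k[$1 SUBSEP $2] } END { n = 0; for (i in k) n++; print n }')

echo "distinct keys without separator: $plain"
echo "distinct keys with separator:    $separated"
```

A separator that can also appear inside a field (for example a space when the fields themselves contain spaces) would reintroduce the same ambiguity.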


Last edited by Scrutinizer; 06-28-2012 at 06:10 AM..
# 4  
Old 06-28-2012
Code:
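awk 'BEGIN { SUBSEP = FS }
FNR == NR { a[$1,$2] = $3; next }   # file1: store field 3 under the key (field 1, field 2)
{
  for (i in a) {                    # file2: scan the stored keys for this row
    if ($1 SUBSEP $2 == i) {
      if ($3 <= a[i]) { b[i]++; break }
    }
  }
}
END { for (i in a) printf("%s %s %d\n", i, a[i], b[i]) }' file1 file2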

# 5  
Old 06-28-2012

Hi Scrut,

Can you please explain what the value of the variable i is in the for loop?

There are two arrays:
Code:
A[A AB]=3.7 etc
C[A AB]=0
Here the index is built from fields 1 and 2.

When printing in the for loop, how does awk decide the value of i: is it a variable, or the string [A AB]?

Please explain.
# 6  
Old 06-28-2012
Hi, the content of the i variable in the loop is determined by for(i in A), which enumerates the indexes of array A, so i holds each index string in turn... I am not sure if I understood your question correctly...
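
A small sketch of what the loop variable holds, using the two arrays from the question with their values typed in by hand (the variable out and the output format are only for illustration):

```shell
out=$(awk 'BEGIN {
  A["A" OFS "AB"] = 3.7     # the index is the string "A AB" (OFS defaults to one space)
  C["A" OFS "AB"] = 0
  for (i in A)              # i is set to each index of A in turn
    printf "index=[%s] A=%s C=%s\n", i, A[i], C[i]
}')
echo "$out"
```

So i is an ordinary variable whose value is the index string, and because A and C were filled with the same indexes, that string can be used to look up both arrays.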