Computing the ratio of similar columns in the two files using awk script


 
# 1  
Old 02-13-2012

Thanks, Bartus11, for your help with the following code, which compares the two files "t1" and "t2".

Code:
awk 'NR==FNR{a[$2]=1;next}$2 in a{print $2}' t1 t2



First, can anyone explain the purpose of assigning a[$2]=1?

Second, the current script prints the second-column values that match between the first and second files, "t1" and "t2", but I want to print only the ratio of matched columns, i.e. how many of them match. In other words, if only one column matches between "t1" and "t2", it should print "1" instead of the matching value "real_name".

Can anyone please suggest how the above code should be amended to produce the desired output?


Input t1:
Code:
     7 real_name
     8 pa_name
     9 make_server_info_pw
     9 passon
    11 mapped_name
    11 nt_status
    13 passon
    15 p
    17 server_info
    18 p

Input t2:
Code:
1 CHECK_DECLS   
1 True   
1 conf   
1 headers   
1 reverse   
1 real_name

Current output:
Code:
real_name

Desired output:
Code:
1


Last edited by coder83; 02-13-2012 at 01:21 PM..
# 2  
Old 02-14-2012
Assigning 1 to a[$2] serves no particular purpose here: any reference to a[$2] creates the array entry, and the script only tests whether the key exists. The following works just as well (the only difference is that each array element holds an empty value):

Code:
awk 'NR==FNR{a[$2];next}$2 in a{print $2}' t1 t2
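
To illustrate the point about references creating entries, here is a minimal sketch. The two input lines are an abbreviated copy of t1 piped in through printf, and the variable name result is only for illustration:

```shell
# A bare reference to a[$2] is enough to create the key; no value is assigned.
result=$(printf '7 real_name\n8 pa_name\n' |
  awk '{ a[$2] }   # reference only: creates a[$2] with an empty value
       END { if ("real_name" in a) print "present"; else print "absent" }')
echo "$result"
```

The `in` test checks only for the key, which is why the stored value (1 or empty) makes no difference to the matching script.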

The following displays the number of matches between the two files:

Code:
awk 'NR==FNR{a[$2];next}$2 in a{b[$2]}END{print length(b)}' t1 t2

It populates b[] only if $2 from t2 was inserted into a[] while t1 was being processed.
length(b) is the total number of elements in b[] (i.e. the count of distinct matches).
# 3  
Old 02-14-2012
Thanks Chubler XL

The script is not working and gives the following error message:

Code:
illegal reference to array b
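
This error most likely means the awk in use does not accept an array as the argument to length(), which some implementations lack. As a hedged sketch of a workaround that avoids length() on an array, the elements of b can be counted with a for-in loop. The temporary directory and the shortened sample files below are assumptions for the sake of a self-contained example:

```shell
# Count distinct matches without calling length() on an array.
tmp=$(mktemp -d)
printf '7 real_name\n8 pa_name\n9 passon\n' > "$tmp/t1"
printf '1 conf\n1 real_name\n1 real_name\n' > "$tmp/t2"

result=$(awk 'NR==FNR { a[$2]; next }     # first file: remember every $2
     $2 in a { b[$2] }                    # second file: record each match once
     END { n = 0; for (k in b) n++; print n }' "$tmp/t1" "$tmp/t2")
echo "$result"
rm -rf "$tmp"
```

Because b[] is keyed by the matched value, a value that appears several times in t2 is still counted once.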

# 4  
Old 02-14-2012
You could try gawk if you have it or, as you only need the count, just sum the matches as you go:

Code:
awk 'NR==FNR{a[$2];next}$2 in a{c++}END{print c}' t1 t2
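
Two details of the counter approach may be worth noting. It counts every matching line of t2, so repeated values are counted more than once, and an unset c prints as an empty line when nothing matches. The sketch below shows one possible way to handle both; the seen[] array, the variable names, and the shortened sample files are illustrative assumptions, not part of the original answer:

```shell
tmp=$(mktemp -d)
printf '7 real_name\n9 passon\n' > "$tmp/t1"
printf '1 real_name\n1 real_name\n1 conf\n' > "$tmp/t2"

# Every matching line in t2 is counted; c+0 forces a numeric 0 when c is unset.
total=$(awk 'NR==FNR { a[$2]; next } $2 in a { c++ } END { print c+0 }' "$tmp/t1" "$tmp/t2")

# Each matching value is counted once: seen[$2]++ is 0 only on its first occurrence.
distinct=$(awk 'NR==FNR { a[$2]; next } ($2 in a) && !seen[$2]++ { c++ } END { print c+0 }' "$tmp/t1" "$tmp/t2")

echo "total=$total distinct=$distinct"
rm -rf "$tmp"
```

With the sample t1 and t2 from the first post, both variants give 1, since real_name appears only once in t2.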

# 5  
Old 02-15-2012
Thanks Chubler XL,

It worked :)