Comparing two files with numbers and taking difference in third file


 
# 1  
Old 10-21-2013

Hi All,

I have two files in the following format, with numbers arranged under columns (described by a set of header rows) and in rows (each identified by a set of identifiers):


Code:
              2013  2013
              Make200  Make201
              Merc  BMW
              Jpur  Del
              PT  PT
              Aug  Aug
G73 V_C A 647369.3318 22055477.65
G73 V_C B 696906.5564 3455161.86
G73 V_C C 14564593.23 401494.0013
G73 V_C D 123849.9316 12708.25062
G73 V_C E 16524002.8 14764.98271
G73 V_C F 22423896.42 16736206.74
G73 V_C G 6705506.228 15559.77916
G73 V_C H 34358539.26 -47408.00907
G73 V_C I 498941.6802 82365308.39 

I have a similar second file with identical row and column identifiers; only the numbers differ. I need the difference between the numbers under the same row/column identifiers written to a third file.

Thanks in advance.
# 2  
Old 10-21-2013
It would be easier for us to help you if you showed the second file as well as the expected output.
# 3  
Old 10-21-2013
1st File

Code:
    2013  2013
    Make200  Make201
    Merc  BMW
    Jpur  Del
    PT  PT
    Aug  Aug
G73 V_C A 647369.3318 22055477.65
G73 V_C B 696906.5564 3455161.86
G73 V_C C 14564593.23 401494.0013
G73 V_C D 123849.9316 12708.25062
G73 V_C E 16524002.8 14764.98271
G73 V_C F 22423896.42 16736206.74
G73 V_C G 6705506.228 15559.77916
G73 V_C H 34358539.26 -47408.00907
G73 V_C I 498941.6802 82365308.39

File 2

Code:
    2013  2013
    Make200  Make201
    Merc  BMW
    Jpur  Del
    PT  PT
   Aug Aug
G73 V_C A 647371.3318 125009280.6
G73 V_C B 696908.5564 102953803
G73 V_C C 14564595.23 99498641.14
G73 V_C D 123851.9316 99097147.14
G73 V_C E 16524004.8 99084438.89
G73 V_C F 22423898.42 99069673.9
G73 V_C G 6705508.228 82333467.16
G73 V_C H 34358541.26 82317907.38
G73 V_C I 498943.6802 82365315.39

Output :

Code:
 
    2013  2013
    Make200  Make201
    Merc  BMW
    Jpur  Del
    PT  PT
G73 V_C A 2 7
G73 V_C B 2 7
G73 V_C C 2 7
G73 V_C D 2 7
G73 V_C E 2 7
G73 V_C F 2 7
G73 V_C G 2 7
G73 V_C H 2 7
G73 V_C I 2 7

PS: The files may have more than 2 columns. The numbers in the files are aligned to the 2 header columns respectively.
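For the "more than 2 columns" case mentioned in the PS, here is a minimal sketch that loops over every field from the fourth onward instead of naming $4 and $5. It is not from the thread: the file names old.txt and new.txt, the fmt variable (four fixed decimals), and the NA placeholder for rows missing from the first file are all assumptions made for illustration, and the sample data is shortened from the files above.

```shell
# Shortened sample inputs; old.txt and new.txt are made-up names.
tmp=$(mktemp -d)
cat > "$tmp/old.txt" <<'EOF'
    2013  2013
    Make200  Make201
G73 V_C A 647369.3318 22055477.65
G73 V_C I 498941.6802 82365308.39
EOF
cat > "$tmp/new.txt" <<'EOF'
    2013  2013
    Make200  Make201
G73 V_C A 647371.3318 125009280.6
G73 V_C I 498943.6802 82365315.39
EOF

# fmt is an assumed output format; adjust the number of decimals as needed.
diff_out=$(awk -v fmt='%.4f' '
  # Header rows start with whitespace: print them once, from the second file.
  /^[ \t]/ { if (FNR != NR) print; next }
  # First file: store each numeric field under "row identifiers + column index".
  FNR == NR { for (i = 4; i <= NF; i++) old[$1, $2, $3, i] = $i; next }
  # Second file: print the identifiers, then (new - old) for every column.
  {
    line = $1 " " $2 " " $3
    for (i = 4; i <= NF; i++) {
      if (($1, $2, $3, i) in old)
        line = line " " sprintf(fmt, $i - old[$1, $2, $3, i])
      else
        line = line " NA"   # assumed placeholder for rows absent from old.txt
    }
    print line
  }
' "$tmp/old.txt" "$tmp/new.txt")
printf '%s\n' "$diff_out"
```

Because the key is built from all three identifier fields, nothing depends on the first field being G73.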
# 4  
Old 10-21-2013
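Try this. It subtracts the values in file1 from the matching rows of file2.

I don't see how the second column comes out as 7 in every row of your expected output; with the data you posted, only row I does.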

Code:
$ awk 'FNR==NR{A[$2 FS $3]=$4 FS $5;next}($1=="G73" && $2 FS $3 in A){split(A[$2 FS $3],m);$4-=m[1];$5-=m[2]}1' file1 file2 >3rd_file

Resulting

Code:
$ cat 3rd_file
 2013  2013
    Make200  Make201
    Merc  BMW
    Jpur  Del
    PT  PT
   Aug Aug
G73 V_C A 2 1.02954e+08
G73 V_C B 2 9.94986e+07
G73 V_C C 2 9.90971e+07
G73 V_C D 2 9.90844e+07
G73 V_C E 2 9.90697e+07
G73 V_C F 2 8.23335e+07
G73 V_C G 2 8.23179e+07
G73 V_C H 2 8.23653e+07
G73 V_C I 2 7
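The large differences above appear as 1.02954e+08 because awk's default number-to-string format is %.6g, i.e. six significant digits. A hedged sketch of one way around it: format each difference explicitly with sprintf before storing it back in the field. The two-decimal format and the single-row sample files are assumptions for illustration only.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'G73 V_C A 647369.3318 22055477.65' > "$tmp/file1"
printf '%s\n' 'G73 V_C A 647371.3318 125009280.6' > "$tmp/file2"

# Default conversion: the large difference is cut to six significant digits.
short=$(awk 'FNR==NR{A[$2 FS $3]=$4 FS $5;next}
  ($2 FS $3) in A {split(A[$2 FS $3],m); $4-=m[1]; $5-=m[2]} 1' "$tmp/file1" "$tmp/file2")

# Explicit format: sprintf puts a ready-made string into the field.
full=$(awk 'FNR==NR{A[$2 FS $3]=$4 FS $5;next}
  ($2 FS $3) in A {split(A[$2 FS $3],m)
                   $4=sprintf("%.2f",$4-m[1]); $5=sprintf("%.2f",$5-m[2])} 1' "$tmp/file1" "$tmp/file2")

printf '%s\n%s\n' "$short" "$full"
```

The first line printed is `G73 V_C A 2 1.02954e+08` and the second is `G73 V_C A 2.00 102953802.95`.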


Last edited by Akshay Hegde; 10-21-2013 at 09:14 AM.. Reason: Bold..:)
# 5  
Old 10-23-2013
Thanks for the reply, Akshay, but these are the issues I am facing:

1) The first is probably a syntax error:

Code:
$ awk 'FNR==NR{A[$2 FS $3]=$4 FS $5;next}($1=="G73" && $2 FS $3 in A){split(A[$2 FS $3],m);$4-=m[1];$5-=m[2]}1' ABC.txt XYZ.txt >result.txt
awk: syntax error near line 1
awk: bailing out near line 1


2) It seems you are assuming that the columns and rows are fixed in length as well as value, when in fact they keep changing.
E.g. the 'G73' column is not fixed; it can change in the next instance.

Thanks again for your expert help!
# 6  
Old 10-23-2013
Try this then

Code:
$ awk 'FNR==NR{A[$1 FS $2 FS $3]=$4 FS $5;next}(!/^[ \t]+/ && ($1 FS $2 FS $3) in A){split(A[$1 FS $2 FS $3],m);$4-=m[1];$5-=m[2]}1'  file1 file2

I think you are working on Solaris and using the default awk.

If so, use /usr/xpg4/bin/awk instead, which is the POSIX awk, or nawk if that is not available.
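If the same script has to run on more than one system, the choice of awk can be made at run time. The following is only a sketch: the variable name AWK is made up, and the probe merely checks that the chosen awk accepts the `in` operator, which the one-liner above relies on.

```shell
# Prefer the POSIX awk on Solaris, then nawk, then whatever "awk" is on PATH.
if [ -x /usr/xpg4/bin/awk ]; then
  AWK=/usr/xpg4/bin/awk
elif command -v nawk >/dev/null 2>&1; then
  AWK=nawk
else
  AWK=awk
fi

# Smoke test: make sure the chosen awk understands the "in" operator.
probe=$("$AWK" 'BEGIN { a["x"] = 1; if ("x" in a) print "ok" }')
echo "using $AWK: $probe"
```

The one-liner can then be run as "$AWK" '...' file1 file2.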
# 7  
Old 10-23-2013
Thanks again, Akshay!

The code works now that I am using 'nawk' instead of awk; as you correctly guessed, the system is Solaris.

One last thing I need help with. Starting from the output obtained in the previous post:

Code:
  2013  2013  
    Make200  Make201  
    Merc  BMW  
    Jpur  Del  
    PT  PT  
   Aug Aug  
G73 V_C A 2 7
G73 V_C B 2 7
G73 V_C C 2 7
G73 V_C D 2 7
G73 V_C E 2 7
G73 V_C F 2 7
G73 V_C G 2 7
G73 V_C H 2 7
G73 V_C I 2 7

I need to use another file, whose second column holds the values to multiply by:

Code:
 VAL
A 0.5
B 1.5
C 2.5
D 0.005
E 2
F 0.34
G 0.332
H 0.43
I 0.12


I need to multiply the values in this second file with the values in the first file (the second file will always have 2 columns, the first file 'n' columns), to give the following result:

Code:
  2013  2013  
    Make200  Make201  
    Merc  BMW  
    Jpur  Del  
    PT  PT  
   Aug Aug  
G73 V_C A 1 3.5
G73 V_C B 3 10.5
G73 V_C C 5 17.5
G73 V_C D 0.01 0.035
G73 V_C E 4 14
G73 V_C F 0.68 2.38
G73 V_C G 0.664 2.324
G73 V_C H 0.86 3.01
G73 V_C I 0.24 0.84

Thanks again for your expert help!
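The thread has no reply to this last request, so what follows is only a sketch of one possible approach, not an answer given by any poster. It reads the factor file first, keyed on the row letter, then multiplies every field from the fourth onward in the difference file. The file names diff.txt and val.txt are made up, and the sample is shortened to three rows.

```shell
tmp=$(mktemp -d)
cat > "$tmp/diff.txt" <<'EOF'
    2013  2013
    Make200  Make201
G73 V_C A 2 7
G73 V_C D 2 7
G73 V_C F 2 7
EOF
cat > "$tmp/val.txt" <<'EOF'
 VAL
A 0.5
D 0.005
F 0.34
EOF

scaled=$(awk '
  # Factor file (read first): two-field rows map a row letter to its factor.
  # The one-field "VAL" header is skipped by the NF test.
  FNR == NR { if (NF == 2) factor[$1] = $2; next }
  # Data rows whose third field has a factor: scale every numeric column.
  !/^[ \t]/ && ($3 in factor) { for (i = 4; i <= NF; i++) $i = $i * factor[$3] }
  # Print every line of the second file, header rows included.
  { print }
' "$tmp/val.txt" "$tmp/diff.txt")
printf '%s\n' "$scaled"
```

For these three rows the output agrees with the expected result above: 1 3.5 for A, 0.01 0.035 for D and 0.68 2.38 for F. On Solaris the same nawk or /usr/xpg4/bin/awk caveat applies.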