awk_Compare two files with a loop


 
# 1  
08-14-2012

Hi,
I have a question about comparing two files with awk. I have a small piece of code that I run from the shell:
Code:
 
gawk -F"," 'NR == FNR { A[$1","$2","$3","$4]=1; next } \!A[$1","$2","$3","$4]' OFS="," 2011.csv 2012.csv > diff_2012.csv

The code works fine (note that I had to escape the ! as \! to run it in my shell). What I want to do is add a loop to this code. For the array key I want to keep columns $1, $2 and $3 on every pass and step the last key field from $4 to $5, then $6, and so on up to $33. On each pass of the loop I want to write the differences to a new csv file. An example of what I want is:
Code:
 
gawk -F"," 'NR == FNR { A[$1","$2","$3","$5]=1; next } \!A[$1","$2","$3","$5]' OFS="," 2011.csv 2012.csv > diff_2012_5.csv

Then
Code:
 
gawk -F"," 'NR == FNR { A[$1","$2","$3","$6]=1; next } \!A[$1","$2","$3","$6]' OFS="," 2011.csv 2012.csv > diff_2012_6.csv

and so on, only I want the above in a loop.

Thanks in advance for your help.
# 2  
08-14-2012
Show the input you have and the output you want. Code which doesn't do what you want really doesn't tell us what you do want.
# 3  
08-14-2012
Ok,
I can't seem to get the tags to work, so I've attached two files.

Desired output based on array $1,$2,$3,$5 would be:
Code:
 
RD158,SR25509,501,1.3

Desired output based on array $1,$2,$3,$6 would be:
Code:
 
RD164,SR24504,441,33

Desired output based on array $1,$2,$3,$7 would be:
Code:
 
RD164,SR24505,442,90.1

Rather than run three separate lines of code, I want to change the array key using a loop and output what's different in 2012.csv to separate files.

Sorry for the confusion.

Last edited by theflamingmoe; 08-14-2012 at 11:56 PM.. Reason: Tags didn't work as I intended
# 4  
08-15-2012
Code:
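awk -F, 'NR==FNR {
        # First file: remember fields 1-3 together with each value column
        A[$1,$2,$3,$5]=1
        B[$1,$2,$3,$6]=1
        C[$1,$2,$3,$7]=1
        next }

        # Second file: print rows whose combination was not in the first file
        !A[$1,$2,$3,$5] { print > "file1" }
        !B[$1,$2,$3,$6] { print > "file2" }
        !C[$1,$2,$3,$7] { print > "file3" }' input1 input2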

# 5  
08-15-2012
Thanks for your time, Corona688. I guess my explanation and example were about as clear as mud, partly because I'm not too sure how the arrays work. Is it possible to index an array by multiple values, in this case field values? I want to grab fields from the two master files (2011.csv and 2012.csv), compare them, and find the differences. I have a workaround (written for tcsh), as follows. I tested it and it seems to do what I want.
Code:
 
foreach n (`seq -s" " -f "%0g" 5 1 7`)
gawk -F"," -v i=$n '{print $1","$2","$3","$i}' OFS="," 2011.csv >  2011_temp.csv
gawk -F"," -v i=$n '{print $1","$2","$3","$i}' OFS="," 2012.csv >  2012_temp.csv
gawk -F"," 'NR == FNR {A[$0]=$0; next } \!A[$0]' OFS="," 2011_temp.csv 2012_temp.csv >> diff_2012_${n}.csv
end

Is there a simpler, more elegant way than the above code? Preferably awk, grep or perhaps perl. Thanks in advance,
theflamingmoe
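
One possible tidier version, offered as a sketch rather than a drop-in: let awk build the key from fields 1-3 plus the column number passed in with -v, so the two temp files are not needed. This assumes a POSIX sh (bash, ksh, dash) instead of tcsh, and the two-row sample files are invented stand-ins, because the real 2011.csv and 2012.csv are not shown in the thread.

```shell
# Invented sample data; the real 2011.csv and 2012.csv are not shown in the thread.
printf '%s\n' 'RD158,SR25509,501,7,1.1,20,80' 'RD164,SR24504,441,9,2.5,30,85' > 2011.csv
printf '%s\n' 'RD158,SR25509,501,7,1.3,20,80' 'RD164,SR24504,441,9,2.5,33,85' > 2012.csv

for n in 5 6 7; do
    # One awk run per column: key on fields 1-3 plus field n, no temp files.
    awk -F, -v OFS=, -v n="$n" '
        { key = $1 OFS $2 OFS $3 OFS $n }
        NR == FNR { seen[key] = 1; next }   # rows of 2011.csv
        !(key in seen) { print key }        # rows of 2012.csv with no match
    ' 2011.csv 2012.csv > "diff_2012_$n.csv"
done
```

With the real files, drop the two printf lines and widen the list after `in` to cover the columns of interest.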
# 6  
08-16-2012
Show the output you want for the given input. What you want to do will then be clear.
# 7  
08-16-2012
One last try,

Let's say I have two comma-separated lists of fruit:

fruit1.txt
Code:
apples,red,2,32,8
pears,green,4,8,20
grapes,black,150,200,160
bannas, yellow,20,15,12
mangos,yellow,30,40,60

fruit2.txt
Code:
apples,red,2,32,10
pears,green,4,8,20
grapes,black,150,300,160
bannas, yellow,20,15,12
mangos,yellow,50,40,60

If I use the code:
Code:
 
awk -F"," 'NR == FNR {A[$0]=$0; next } !A[$0]' OFS="," fruit1.txt fruit2.txt >> diff_fruit2.txt

The resulting file, diff_fruit2.txt (the difference between the two files), should look like this:

diff_fruit2.txt
Code:
apples,red,2,32,10
grapes,black,150,300,160
mangos,yellow,50,40,60

Where row 1, field 5 has changed from 8 to 10. Row 3, field 4 has changed from 200 to 300. Row 5, field 3 has changed from 30 to 50.
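
The three changes listed above can also be reported mechanically. A minimal sketch, assuming both files hold the same rows in the same order (rows are paired by line number, which is an assumption and not something guaranteed for the real data); changes.txt is a made-up output name:

```shell
cat > fruit1.txt <<'EOF'
apples,red,2,32,8
pears,green,4,8,20
grapes,black,150,200,160
bannas, yellow,20,15,12
mangos,yellow,30,40,60
EOF
cat > fruit2.txt <<'EOF'
apples,red,2,32,10
pears,green,4,8,20
grapes,black,150,300,160
bannas, yellow,20,15,12
mangos,yellow,50,40,60
EOF

# Pair rows by line number and report every field that differs.
awk -F, '
NR == FNR { old[FNR] = $0; next }       # keep each row of fruit1.txt by line number
{
    split(old[FNR], f, FS)              # fields of the same row in fruit1.txt
    for (c = 1; c <= NF; c++)
        if ($c != f[c])
            print "row " FNR ", field " c ": " f[c] " -> " $c
}' fruit1.txt fruit2.txt > changes.txt
cat changes.txt
```

If rows can be added, removed or reordered between the two files, line-number pairing breaks down and a key-based lookup is the safer choice.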

What I want to know is: rather than indexing the array by the whole row, $0, can I index it by selected fields? Can I index by fields $1, $2 and $5 to get the output:

diff_fruit.txt
Code:
apples,red,10

Or index by fields $1, $2 and $4 to get the output:

diff_fruit.txt
Code:
grapes,black,300

Or index fields $1, $2, $3 to an array to get output:

diff_fruit.txt
Code:
mangos,yellow,50

The last part of my question is: if the above is possible, can I put it in a loop and output to different files? For example, name the files using the field number.

diff_fruit_5.txt
Code:
apples,red,10

diff_fruit_4.txt
Code:
grapes,black,300

diff_fruit_3.txt
Code:
mangos,yellow,50

The files I'm using are very big, with lots of fields. I need a practical way to spot differences between the two files. Printing out the whole row wherever there is a difference is just not feasible in my case; I would be there for a month of Sundays trying to decipher the output.

Hope my example is clearer. Thanks in advance,
theflamingmoe
Moderator's Comments:
Please view this code tag video for how to use code tags when posting code and data.

Last edited by Corona688; 08-17-2012 at 02:40 PM.. Reason: Small mistake.
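
A possible answer to the last question, as a sketch only: an awk array can be indexed by several fields at once, and the loop over field numbers can live inside awk, so both files are read just once. The field range 3 to 5 and the array name seen are illustrative choices; the file names and data are the ones from the post above.

```shell
cat > fruit1.txt <<'EOF'
apples,red,2,32,8
pears,green,4,8,20
grapes,black,150,200,160
bannas, yellow,20,15,12
mangos,yellow,30,40,60
EOF
cat > fruit2.txt <<'EOF'
apples,red,2,32,10
pears,green,4,8,20
grapes,black,150,300,160
bannas, yellow,20,15,12
mangos,yellow,50,40,60
EOF

rm -f diff_fruit_[0-9]*.txt     # outputs are appended to, so clear old runs

awk -F, -v first=3 -v last=5 '
NR == FNR {                     # fruit1.txt: remember name, colour and each value
    for (c = first; c <= last; c++) seen[c, $1, $2, $c] = 1
    next
}
{                               # fruit2.txt: write combinations not seen before
    for (c = first; c <= last; c++)
        if (!((c, $1, $2, $c) in seen)) {
            out = "diff_fruit_" c ".txt"
            print $1 "," $2 "," $c >> out
            close(out)          # keeps the number of open files small
        }
}' fruit1.txt fruit2.txt
```

For the 2011.csv and 2012.csv case, the same pattern should carry over by adding $3 to the key and setting first and last to the wanted column range. A column with no differences produces no file at all.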