Help with awk script csv compare


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Help with awk script csv compare
# 1  
Old 04-24-2014
Wrench Help with awk script csv compare

Need to compare 2 csv files.
Found similar problem solved here in forum but needs explanation and modification:

Im not allowed to post urls so here is the code from post.

Code:
awk -F\| '{
   k=$1FS$2
   for (i=1;i<=NF;i++) {
      if (FNR!=NR) {
         if (k in x && i > 2) {
            y[k]++
            x[k]=x[k]"|"$i
         }
      } else x[k]=x[k]?x[k]"|"$i:$i
   }
} END {
   for (i in y) print x[i]
}' File2.txt File1.txt

First, I do not understand
Quote:
else x[k]=x[k]?x[k]"|"$i:$i
is this a bug? to me it looks like the same as x[k]=x[k]"|"$i
My problem is i need to modify this script to:
1. Script should print lines that differ(not print lines that match.)
2. Script should take array containing column numbers to exclude from compare
3. If lines differ script should output 2 lines, one from each file and prefix with line nr and the filename it originates from.
4. If more lines in 1 file it should print that line and prefix with line number and the filename it originates from.

Ex(without excluded columns):
File2.txt:
Code:
ID,version,cost
1000,1,30
2000,2,40
3000,3,50
4000,4,60

File1.txt:
Code:
ID,version,cost
1000,1,30
2000,2,45
3000,4,55
5000,5,70
6000,6,80

output:
Code:
3,File2.txt,2000,2,40                                                    
3,File1.txt,2000,2,45
4,File2.txt,3000,3,50                                                    
4,File1.txt,3000,4,55
5,File2.txt,4000,4,60                                                    
5,File1.txt,5000,5,70
6,File1.txt,6000,6,80


Last edited by Franklin52; 04-25-2014 at 09:18 AM.. Reason: Additional code tags
# 2  
Old 04-25-2014
Code:
x[k]=x[k]?x[k]"|"$i:$i

If x[k] exists, append a pipe and $i. If it doesn't, set it to $i without a pipe in front. (Essentially, add a pipe separator before every field except the first one)

Conditional Exp - The GNU Awk User's Guide
# 3  
Old 04-28-2014
Tanks CarloM here is my solution

I came to this solution. I could be simplified i guess and I would be happy if someone posted a shortened version.

Input file1:
Code:
ID,version,cost
1000,1,30,a
2000,2,40,b
3000,3,50,c
4000,4,60,d

Input file2:
Code:
ID,version,cost
1000,1,30,a
2000,2,45,b
3000,4,55,c
5000,5,70,d
6000,6,80,e


Code:
paste ./file1.txt ./file2.txt | awk -F '\t' -v INDEX="2,3" '
{
fnum1 = split($1,linea,",")
fnum2 = split($2,lineb,",")
numexcl = split(INDEX, excluded,",")
linediff=0
lineamissing=0
linebmissing=0
for (f=1; f <= fnum1 || f <= fnum2; f++) {
        for(x=1; x <= numexcl; x++){
                if( f == excluded[x] ){
                        if(f in linea)
                                linea[f]="Skipped"
                        if(f in lineb)
                                lineb[f]="Skipped"
                }
        }

        if( f in linea && f in lineb ) {
                if ( linea[f] != lineb[f] )
                        ++linediff

        }
        else {
                if( !(f in linea) )
                        ++lineamissing
                if( !(f in lineb) )
                        ++linebmissing
        }
}
if( linediff > 0 ) {
 printf("%i,%s",NR,"file1")
 for(i=1; i <= fnum1; i++)
        printf(",%s", linea[i])
 printf("\n")
 printf("%i,%s",NR,"file2")
 for(j=1; j <= fnum2; j++)
        printf(",%s", lineb[j])
 printf("\n")
}
if ( lineamissing > 0){
 printf("%i,%s",NR,"file2")
 for(k=1; k <= fnum2; k++)
        printf(",%s", lineb[k])
 printf("\n")
}
if ( linebmissing > 0){
 printf("%i,%s",NR,"file1")
 for(m=1; m <= fnum1; m++)
        printf(",%s", linea[m])
 printf("\n")
}
}'

Output:
Code:
5,file1,4000,Skipped,Skipped,d
5,file2,5000,Skipped,Skipped,d
6,file2,6000,Skipped,Skipped,e

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Need awk or Shell script to compare Column-1 of two different CSV files and print if column-1 matche

Example: I have files in below format file 1: zxc,133,joe@example.com cst,222,xyz@example1.com File 2 Contains: hxd hcd jws zxc cst File 1 has 50000 lines and file 2 has around 30000 lines : Expected Output has to be : hxd hcd jws (5 Replies)
Discussion started by: TestPractice
5 Replies

2. Shell Programming and Scripting

Compare 2 files of csv file and match column data and create a new csv file of them

Hi, I am newbie in shell script. I need your help to solve my problem. Firstly, I have 2 files of csv and i want to compare of the contents then the output will be written in a new csv file. File1: SourceFile,DateTimeOriginal /home/intannf/foto/IMG_0713.JPG,2015:02:17 11:14:07... (8 Replies)
Discussion started by: refrain
8 Replies

3. Shell Programming and Scripting

Script to compare count of two csv files

Hi Guys, I need to write a script to compare the count of two csv files each having 5 columns. Everyday a csv file is recived. Now we need to compare the count of todays csv file with yesterday's csv file and if the total count of records is same in todays csv file and yesterday csv file out... (3 Replies)
Discussion started by: Vivekit82
3 Replies

4. Shell Programming and Scripting

How to compare two lines in a csv file using shell script?

I have a file lets say input.csv having two columns like- Name,Mobile No A,111 B,222 A,333 A,123 B,213 I would like to find result in a new file lets say output.csv as- Name,Mobile No A,111 B,222 means short the file on the basis of first column and first value corresponding to the... (5 Replies)
Discussion started by: Ashish Singhal
5 Replies

5. Shell Programming and Scripting

Compare 2 csv files in ksh and o/p the difference in a new csv file

(say) I have 2 csv files - file1.csv & file2.csv as mentioned below: file1.csv ID,version,cost 1000,1,30 2000,2,40 3000,3,50 4000,4,60 file2.csv ID,version,cost 1000,1,30 2000,2,45 3000,4,55 6000,5,70 The... (7 Replies)
Discussion started by: Naresh101
7 Replies

6. Shell Programming and Scripting

Need Perl script to compare two CSV files

Need perl script to compare the two CSV files and and give out put in CSV format File MsPMTP.csv File ProfileNames.csv MsPMTP.csv is having lines like below JBL_VIJ_A_A962/r01sr4sl12/port#01-#13-Au4P-4c-TMi-PMNETR15 JBL_VIJ_A_A962/r01sr4sl12/port#01-#13-Au4P-4c-TMi-PMFETR15... (9 Replies)
Discussion started by: sreedhargouda
9 Replies

7. Shell Programming and Scripting

Help with script to open and compare csv files

We are testing an application that accesses two tables: A and B. I am to write a script to validate the ouput files of this application.The application marks any account that has become overdue as per rule. When it runs, it updates the overdue flag in the A table according to the following rules: ... (1 Reply)
Discussion started by: inkyponky
1 Replies

8. Shell Programming and Scripting

Need to compare two csv files values and write into another csv file

Hi all, Am new to scripting. So i just need your ideas to help me out. Here goes my requirement. I have two csv files 1.csv 2.csv abc,1.24 abc,1 def,2.13 def,1 I need to compare the first column of 1.csv with 2.csv and if matches then need to compare... (2 Replies)
Discussion started by: chinnahyd
2 Replies

9. Shell Programming and Scripting

convert this into csv using awk/shell script

Hi Scripting gurus, I need to convert following text snippet into csv. please help Input heading1 = data1 heading2 = data2 .. .. heading n = data n heading 1 = data1 .. .. Output data1,data2,....,data n (3 Replies)
Discussion started by: azs0309
3 Replies

10. UNIX for Dummies Questions & Answers

generate CSV file using AWK script

Hi guys I have a text report that consists of text in some parts and data in some parts. e.g Report for changes in cashflows No changes were found Report for changes in Bills deal_num deal_date trader maturity log_creator DF_234 20-5-2008 tman 20-5-2009 tman... (2 Replies)
Discussion started by: magikminox
2 Replies
Login or Register to Ask a Question