#1
Difference between two huge .csv files
Hi all, I need help getting the difference between two .csv files. I have two large .csv files with an equal number of columns. I need to compare them and write only the differences to a new file. E.g.
File1.csv
Code:
Name, Date, age,number
Sakshi, 16-12-2011, 22, 56
Akash, 14-12-2011, 23, 76
File2.csv
Code:
Name,Date,age,number
Sakshi, 14-12-2011,22,56
Akash,18-12-2011,23,76
Then the output should be like this:
Code:
16-12-2011 14-12-2011
14-12-2011 18-12-2011
It's just an example. What I am trying to say is that I should get only the values of the columns that differ, not the whole line. Assume the files are sorted. There can be any number of columns, but both files always have the same columns. If the values are different, those values should appear in the output. A comma-separated result would also work, with a blank wherever the values match between the two files, e.g. ,16-12-2011,,
Hope I have been able to explain the issue.
Last edited by Franklin52; 10-08-2012 at 03:07 AM. Reason: Please use code tags for data and code samples
#2
Code:
awk -F, 'FNR==NR{a[$1]=$2;next}{if(a[$1]!=$2){print a[$1],$2}}' file1 file2
#3
I think I did not explain the issue properly.
In the example given, only the 2nd column differs, but other columns can differ as well. This command reports differences in the 2nd column only. Can you give me a command that produces the result in comma-separated format? That way I will know wherever the values do not match between the files. It is not necessary to get the values from both files. Say the 3rd and 7th columns differ; then my result should look like ,,17-12-2011,,,,10,,,,,,,,, Please help.
#4
Try this:
Code:
paste file1 file2 | awk -F "[,\t]" '{for(i=1;i<=(NF/2);i++){if($i != $(NF/2+i)){printf $i}else{printf ","}}}{print ""}'
#5
Thanks for the help. But I still have one issue: if two consecutive columns differ, the output shows no separator between them. E.g.
File1
Rahul, 1203,113,11
File2
Malik, 121,113,11
Output:
Rahul1203,,
Expected output:
Rahul,1203,,
#6
Code:
paste file1 file2 | awk -F "[,\t]" '{for(i=1;i<=(NF/2);i++){if($i != $(NF/2+i)){
if(s){s=s";"$i}else{s=$i}}else{if(s){s=s";,"}else{s=","}}}}{ print s;s=""}'
#7
Thanks for your help. It's working exactly as I want. If possible, can you please explain the code? Thanks
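Since the thread ends without an explanation, here is a rough, commented sketch of the paste-and-awk idea used in the replies above. It is not the poster's exact command. The function name csv_diff and the variable names half, left, right and out are made up for illustration. It also assumes the two files have the same number of lines and columns, that the data contains no tabs or quoted commas, and that spaces around a value do not count as a difference (the trimming step is an assumption based on the expected output Rahul,1203,,).

```shell
# csv_diff FILE1 FILE2   (hypothetical helper name, not from the thread)
# Prints one comma-separated line per input line: the FILE1 value in every
# column that differs, and an empty field in every column that matches.
csv_diff() {
    # paste joins line N of FILE1 and line N of FILE2 with a tab, so awk
    # sees both versions of a record on a single line.
    paste "$1" "$2" | awk -F '[,\t]' '
    {
        # The separator is "comma or tab", so fields 1..half come from
        # FILE1 and fields half+1..NF come from FILE2.
        half = NF / 2
        out = ""
        for (i = 1; i <= half; i++) {
            left = $i
            right = $(half + i)
            # Assumption: leading and trailing spaces are not a difference.
            gsub(/^ +| +$/, "", left)
            gsub(/^ +| +$/, "", right)
            # Always emit one separator per column boundary, so two
            # consecutive differing columns stay separated.
            if (i > 1) out = out ","
            # Force a string comparison and keep the FILE1 value on mismatch.
            if ((left "") != (right "")) out = out left
        }
        print out
    }'
}
```

Called as csv_diff File1.csv File2.csv > diff.csv, this should give Rahul,1203,, for the Rahul/Malik example and ,16-12-2011,, for the Sakshi line of the first post. Both paste and awk read their input line by line, so memory use should not grow with file size.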