csv 4 columns values comparison!


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting csv 4 columns values comparison!
# 8  
Old 03-13-2011
Paste the code commented.
Code:
use warnings;
use strict;

my (%line, @f, $f, $c, $i, @d, $e);
## Read file line by line.
while (<>) {
        ## Remove last 'break line' character and get a string with fields which                                                                                                                                  
        ## are the same (1º, 2º and 4º).                                                                                                                                                                          
        ## Example:                                                                                                                                                                                               
        ## @f = ('294', 'F03', '2000', '40441')                                                                                                                                                                   
        ## $f = "294;F03;40441"                                                                                                                                                                                   
        chomp;                                                                                                                                                                                                    
        @f = split /;/;                                                                                                                                                                                           
        $f = join ";", @f[0..1], $f[3];                                                                                                                                                                           
        ## Save in array same lines with different third field.                                                                                                                                                   
        ## Example:                                                                                                                                                                                               
        ## $line{ "294;F03;40441" } = ('2000')                                                                                                                                                                    
        push @{ $line{$f} }, $f[2] if not $f[2] ~~ @{ $line{$f} };                                                                                                                                                
        ## For example, at end of the while loop, one position of the array should                                                                                                                                
        ## be:                                                                                                                                                                                                    
        ## $line{ "294;F03;40443" } = ('2000', '2001')                                                                                                                                                            
}                                                                                                                                                                                                                 
                                                                                                                                                                                                                  
## Go through the array. If one position has more than one value, get the                                                                                                                                         
## complete line joining fields and print them.                                                                                                                                                                   
for $c ( keys %line ) {                                                                                                                                                                                           
        if (@{ $line{$c} } > 1) {                                                                                                                                                                                 
                for $i ( 0 .. $#{ $line{$c} } ) {                                                                                                                                                                 
                        @d = split /;/, $c;                                                                                                                                                                       
                        $e = join ";", @d[0..1], $line{$c}[$i], $d[2];                                                                                                                                            
                        print "$e\n";                                                                                                                                                                             
                }                                                                                                                                                                                                 
        }                                                                                                                                                                                                         
                                                                                                                                                                                                                  
}

Regards,
Birei
# 9  
Old 03-13-2011
Not sure if this one line awk is exactly what you want:

Code:
ArchT60::/tmp
kent$ cat t.txt                                                                              
294;F03;2000;40441
294;F03;2000;40443
284;F01;5400;44051
284;F01;5700;45666
294;F03;2001;40443



ArchT60::/tmp
kent$  awk -F';' '{key=$1 FS $2 FS $4; a[key]=(a[key] && a[key]!= $3)?a[key]","$3:$3;} END { for (k in a ){ if( split(a[k],t,",")>1 ){ split(k,h,FS); for (i in t) print h[1] FS h[2] FS t[i] FS h[3] } } }'  t.txt


294;F03;2000;40443
294;F03;2001;40443


Last edited by sk1418; 03-13-2011 at 06:21 PM..
# 10  
Old 03-13-2011
Code:
awk -F';' '{B[$1$2$4]++}B[$1$2$4]==2{print A[$1$2$4]}B[$1$2$4]==1{A[$1$2$4]=$0}B[$1$2$4]>1' infile

Code:
[ctsgnb@shell ~/sand]$ cat tst
294;F03;2000;40441
294;F03;2000;40443
284;F01;5400;44051
284;F01;5700;45666
294;F03;2001;40443
[ctsgnb@shell ~/sand]$ awk -F';' '{B[$1$2$4]++}B[$1$2$4]==2{print A[$1$2$4]}B[$1$2$4]==1{A[$1$2$4]=$0}B[$1$2$4]>1' tst
294;F03;2000;40443
294;F03;2001;40443
[ctsgnb@shell ~/sand]$

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Check for null values in a columns. I have dozen of CSV files in a directory.

Hi Folks, I'm trying to write a simple file sanity check script. I have a directory with dozen CSV files containing id,edname,firstname,lastname,suffix,email. I like to write a awk script to check if first field contain a number and is not empty. and fields number 3,4 & 6 are not empty and... (3 Replies)
Discussion started by: dc34684
3 Replies

2. Shell Programming and Scripting

Match columns from two csv files and update field in one of the csv file

Hi, I have a file of csv data, which looks like this: file1: 1AA,LGV_PONCEY_LES_ATHEE,1,\N,1,00020460E1,0,\N,\N,\N,\N,2,00.22335321,0.00466628 2BB,LES_POUGES_ASF,\N,200,200,00006298G1,0,\N,\N,\N,\N,1,00.30887539,0.00050312... (10 Replies)
Discussion started by: djoseph
10 Replies

3. Shell Programming and Scripting

Compare 2 csv files by columns, then extract certain columns of matcing rows

Hi all, I'm pretty much a newbie to UNIX. I would appreciate any help with UNIX coding on comparing two large csv files (greater than 10 GB in size), and output a file with matching columns. I want to compare file1 and file2 by 'id' and 'chain' columns, then extract exact matching rows'... (5 Replies)
Discussion started by: bkane3
5 Replies

4. Shell Programming and Scripting

How to delete a columns of a CSV file which has cell values with a string enclosed in " , "?

Hi How can I delete a columns from a CSV file which has comma separated value with a string enclosed in double quotes or square bracket and a comma in between? I have a csv file with below format. Template,Target Server,Target Component,Rule Group,Rule,Rule Reference Number,Rule... (7 Replies)
Discussion started by: Litu19
7 Replies

5. Linux

To get all the columns in a CSV file based on unique values of particular column

cat sample.csv ID,Name,no 1,AAA,1 2,BBB,1 3,AAA,1 4,BBB,1 cut -d',' -f2 sample.csv | sort | uniq this gives only the 2nd column values Name AAA BBB How to I get all the columns of CSV along with this? (1 Reply)
Discussion started by: sanvel
1 Replies

6. Shell Programming and Scripting

How to delete a column/columns of a CSV file which has cell values with a string enclosed in " , "?

How can I delete a column from a CSV file which has comma separated value with a string enclosed in double quotes and a comma in between? I have a file 44.csv with 4 lines including the header like the below format: column1, column2, column3, column 4, column5, column6 12,455,"string with... (6 Replies)
Discussion started by: dhruuv369
6 Replies

7. Shell Programming and Scripting

Deleting all the fields(columns) from a .csv file if all rows in that columns are blanks

Hi Friends, I have come across some files where some of the columns don not have data. Key, Data1,Data2,Data3,Data4,Data5 A,5,6,,10,, A,3,4,,3,, B,1,,4,5,, B,2,,3,4,, If we see the above data on Data5 column do not have any row got filled. So remove only that column(Here Data5) and... (4 Replies)
Discussion started by: ks_reddy
4 Replies

8. Shell Programming and Scripting

CSV with commas in field values, remove duplicates, cut columns

Hi Description of input file I have: ------------------------- 1) CSV with double quotes for string fields. 2) Some string fields have Comma as part of field value. 3) Have Duplicate lines 4) Have 200 columns/fields 5) File size is more than 10GB Description of output file I need:... (4 Replies)
Discussion started by: krishnix
4 Replies

9. UNIX for Advanced & Expert Users

Unix Bash: substitute columns in .csv using other .csv columns

Hi All, I have two .csv's input.csv having values as (7 columns) ABC,A19907103,ABC DEV YUNG,2.17,1000,2157,07/07/2006 XYZ,H00213850,MM TRUP HILL,38.38,580,23308,31/08/2010 output.csv having (25 columns) A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y... (4 Replies)
Discussion started by: abhivyas
4 Replies

10. Shell Programming and Scripting

Need to compare two csv files values and write into another csv file

Hi all, Am new to scripting. So i just need your ideas to help me out. Here goes my requirement. I have two csv files 1.csv 2.csv abc,1.24 abc,1 def,2.13 def,1 I need to compare the first column of 1.csv with 2.csv and if matches then need to compare... (2 Replies)
Discussion started by: chinnahyd
2 Replies
Login or Register to Ask a Question