csv 4 columns values comparison!


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting csv 4 columns values comparison!
# 1  
Old 03-12-2011
csv 4 columns values comparison!

Hi all,
i have a csv file which as the following data:
Code:
294;F03;2000;40441
294;F03;2000;40443
284;F01;5400;44051
284;F01;5700;45666

the file holds 11689 lines.

I was trying to get a script running to output results from this file that for each line with the condition:
if a line is found in the whole file ($1 == $1 && $2 == $2 && $4 == $4 && $3 != $3)
Or translated, if the first value from the first column is equal to the value of the first column and the second is equal to the second and the fourth is equal to the forth and the third value is different, output the repeated lines.
Tried the above with awk but lack the knowledge and was looking forward for some help if possible.
It should be capable of finding the below condition, were the 3rd column value difers and all the other are equal to at least another within the file.
Example
:
Code:
294;F03;2000;40441
294;F03;2000;40443
284;F01;5400;44051
284;F01;5700;45666
294;F03;2001;40443

Moderator's Comments:
Mod Comment
Please use code tags when posting data and code samples!

Last edited by vgersh99; 03-12-2011 at 03:11 PM.. Reason: code tags, please!
# 2  
Old 03-12-2011
Quote:
Originally Posted by stryng
Or translated, if the first value from the first column is equal to the value of the first column and the second is equal to the second and the fourth is equal to the forth and the third value is different, output the repeated lines.
Apply only for consecutive records ?
# 3  
Old 03-12-2011
Hi and thanks Danmero,
well the condition is to be applied for all the lines of the file. so every line that is duplicated with the same condition values should be outputed to the stdout

---------- Post updated at 01:09 PM ---------- Previous update was at 12:58 PM ----------

tried the following but think it lacks a function for the condition that would allow me to print if the condition is meet. What do you guys think?
Code:
root@ubuntu:~# awk -F";" '
{ f1[NR] = $1 ; f2[NR] = $2 ; f3[NR] = $3 ; f4[NR] = $4 }
END {
   for (i=1; i<=NR; i++) {
      if (f1[i] == f1[i] && f2[i] == f2[i] && f4[i] == f4[i] && f3[i] != f3[i]) {
print f1[i], f2[i], f3[i], f4[i]
}
 }
   }
    ' c1.csv

Moderator's Comments:
Mod Comment
Please use code tags when posting data and code samples!

Last edited by vgersh99; 03-12-2011 at 03:11 PM.. Reason: code tags, PLEASE!
# 4  
Old 03-12-2011
What should be the output for the following recordset?
Code:
294;F03;2000;40441
294;F03;2000;40443
284;F01;5400;44051
284;F01;5700;45666
294;F03;2001;40443
294;F03;2000;40444
294;F03;2000;40443

PS. Please use [code] tags when you post code or data sample.
# 5  
Old 03-12-2011
it should be:
Found 2 rows within the file with the 1st, 2nd, 4th identical column value and the 3rd diferent:

Code:
294;F03;2001;40443
294;F03;2000;40443

As you can see im searching for equal values in the 1st, 2nd and 4th but with diferent values in the 3rd.
Kindest regards
tHe_stRyNg
Moderator's Comments:
Mod Comment
Please use code tags when posting data and code samples!

Last edited by vgersh99; 03-12-2011 at 04:51 PM.. Reason: once AGAIN - please use code tags!
# 6  
Old 03-12-2011
Hi,

Try next 'perl' script:
Code:
$ cat script.pl
use warnings;
use strict;

my (%line, @f, $f, $c, $i, @d, $e);
while (<>) {
        chomp;
        @f = split /;/;
        $f = join ";", @f[0..1], $f[3];
        push @{ $line{$f} }, $f[2] if not $f[2] ~~ @{ $line{$f} };
}

for $c ( keys %line ) {
        if (@{ $line{$c} } > 1) {
                for $i ( 0 .. $#{ $line{$c} } ) {
                        @d = split /;/, $c;
                        $e = join ";", @d[0..1], $line{$c}[$i], $d[2];
                        print "$e\n";
                }
        }
                 
}
$ perl script.pl infile
(your desired output)

Regards,
Birei
# 7  
Old 03-12-2011
Thanks, it worked out perfectly.
Can you comment a bit the perl script?

Regards
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Check for null values in a columns. I have dozen of CSV files in a directory.

Hi Folks, I'm trying to write a simple file sanity check script. I have a directory with dozen CSV files containing id,edname,firstname,lastname,suffix,email. I like to write a awk script to check if first field contain a number and is not empty. and fields number 3,4 & 6 are not empty and... (3 Replies)
Discussion started by: dc34684
3 Replies

2. Shell Programming and Scripting

Match columns from two csv files and update field in one of the csv file

Hi, I have a file of csv data, which looks like this: file1: 1AA,LGV_PONCEY_LES_ATHEE,1,\N,1,00020460E1,0,\N,\N,\N,\N,2,00.22335321,0.00466628 2BB,LES_POUGES_ASF,\N,200,200,00006298G1,0,\N,\N,\N,\N,1,00.30887539,0.00050312... (10 Replies)
Discussion started by: djoseph
10 Replies

3. Shell Programming and Scripting

Compare 2 csv files by columns, then extract certain columns of matcing rows

Hi all, I'm pretty much a newbie to UNIX. I would appreciate any help with UNIX coding on comparing two large csv files (greater than 10 GB in size), and output a file with matching columns. I want to compare file1 and file2 by 'id' and 'chain' columns, then extract exact matching rows'... (5 Replies)
Discussion started by: bkane3
5 Replies

4. Shell Programming and Scripting

How to delete a columns of a CSV file which has cell values with a string enclosed in " , "?

Hi How can I delete a columns from a CSV file which has comma separated value with a string enclosed in double quotes or square bracket and a comma in between? I have a csv file with below format. Template,Target Server,Target Component,Rule Group,Rule,Rule Reference Number,Rule... (7 Replies)
Discussion started by: Litu19
7 Replies

5. Linux

To get all the columns in a CSV file based on unique values of particular column

cat sample.csv ID,Name,no 1,AAA,1 2,BBB,1 3,AAA,1 4,BBB,1 cut -d',' -f2 sample.csv | sort | uniq this gives only the 2nd column values Name AAA BBB How to I get all the columns of CSV along with this? (1 Reply)
Discussion started by: sanvel
1 Replies

6. Shell Programming and Scripting

How to delete a column/columns of a CSV file which has cell values with a string enclosed in " , "?

How can I delete a column from a CSV file which has comma separated value with a string enclosed in double quotes and a comma in between? I have a file 44.csv with 4 lines including the header like the below format: column1, column2, column3, column 4, column5, column6 12,455,"string with... (6 Replies)
Discussion started by: dhruuv369
6 Replies

7. Shell Programming and Scripting

Deleting all the fields(columns) from a .csv file if all rows in that columns are blanks

Hi Friends, I have come across some files where some of the columns don not have data. Key, Data1,Data2,Data3,Data4,Data5 A,5,6,,10,, A,3,4,,3,, B,1,,4,5,, B,2,,3,4,, If we see the above data on Data5 column do not have any row got filled. So remove only that column(Here Data5) and... (4 Replies)
Discussion started by: ks_reddy
4 Replies

8. Shell Programming and Scripting

CSV with commas in field values, remove duplicates, cut columns

Hi Description of input file I have: ------------------------- 1) CSV with double quotes for string fields. 2) Some string fields have Comma as part of field value. 3) Have Duplicate lines 4) Have 200 columns/fields 5) File size is more than 10GB Description of output file I need:... (4 Replies)
Discussion started by: krishnix
4 Replies

9. UNIX for Advanced & Expert Users

Unix Bash: substitute columns in .csv using other .csv columns

Hi All, I have two .csv's input.csv having values as (7 columns) ABC,A19907103,ABC DEV YUNG,2.17,1000,2157,07/07/2006 XYZ,H00213850,MM TRUP HILL,38.38,580,23308,31/08/2010 output.csv having (25 columns) A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y... (4 Replies)
Discussion started by: abhivyas
4 Replies

10. Shell Programming and Scripting

Need to compare two csv files values and write into another csv file

Hi all, Am new to scripting. So i just need your ideas to help me out. Here goes my requirement. I have two csv files 1.csv 2.csv abc,1.24 abc,1 def,2.13 def,1 I need to compare the first column of 1.csv with 2.csv and if matches then need to compare... (2 Replies)
Discussion started by: chinnahyd
2 Replies
Login or Register to Ask a Question