Parse a File ColumnWise & Delete the Rows


 
# 1  
Old 02-08-2011

Hi,

I have a CSV file with around 50K records in it. For each value in column4, I have to delete all the duplicate rows from the file except one.
Let's say there are 3 rows for 'column4data' in the file. I have to retain only the line that has the smallest date value in column2.
The file is already sorted on column4.

E.g., for the lines below:
Code:
           bcd,20090101,20100101,column4data ,4,5,6,7
           a_b,20080101,20090101,column4data ,1,2,3,4
           xyzz,20100101,20110101,column4data ,1,2,6,7

Expected result (since 20080101 is the smallest date of the three):
Code:
           a_b,20080101,20090101,column4data ,1,2,3,4

The goal is to read every row of the file and look at the column4 value in each one. If the value is unique (does not occur twice), move on to the next row, and so on. Once a duplicate occurrence is found, read column2 from all the rows with that value, compare the dates, and delete every row except the one with the smallest date.

Can somebody please help me with a solution using 'awk' or 'sed'?

TIA

Last edited by Franklin52; 02-08-2011 at 04:28 AM.. Reason: Please use code tags, thank you
# 2  
Old 02-08-2011
Try this:

Code:
awk -F"," '{if(! a[$4] ) {a[$4]=$2;b[++i]=$0} else if( $2 < a[$4]){a[$4]=$2;b[i]=$0}} END {for(j=1;j<=i;j++) {print b[j]}}' inputfile
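Note that the one-liner above overwrites `b[i]`, the most recently stored row, so it depends on the file being sorted on column 4 as the question states. A more spread-out sketch of the same array-based idea, which does not rely on the sort order, might look like the following. The sample rows with `otherkey` and the array names (`min`, `line`, `order`) are made up for illustration and are not from the original post.

```shell
# Hypothetical sample input written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
bcd,20090101,20100101,column4data ,4,5,6,7
a_b,20080101,20090101,column4data ,1,2,3,4
xyzz,20100101,20110101,column4data ,1,2,6,7
one,20050101,20060101,otherkey ,9,9,9,9
EOF

result=$(awk -F, '
    # First time this column-4 key is seen: record its position, date and row.
    !($4 in min) { order[++n] = $4; min[$4] = $2 + 0; line[$4] = $0; next }
    # Key seen before: keep this row only if its column-2 date is smaller.
    $2 + 0 < min[$4] { min[$4] = $2 + 0; line[$4] = $0 }
    # Print one row per key, in the order the keys first appeared.
    END { for (i = 1; i <= n; i++) print line[order[i]] }
' "$input")
rm -f "$input"

printf '%s\n' "$result"
```

This holds one row per distinct column-4 value in memory, which should be fine for a file of around 50K records.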

# 3  
Old 02-08-2011
Since the file is sorted on column 4:
Code:
awk -F, 'p!=$4{if(s)print s;min=$2;p=$4}min>=$2{s=$0;min=$2}END{print s}' infile
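Because the input is sorted on column 4, the one-liner above only has to remember the best row of the current group and print it when the key changes. Here is a commented sketch of that streaming approach, with hypothetical sample data and variable names (`prev`, `best`). Unlike the one-liner, which uses `min>=$2` and therefore keeps the last row on a tie, this version keeps the first row when two dates are equal.

```shell
# Hypothetical sample input, already sorted on column 4.
input=$(mktemp)
cat > "$input" <<'EOF'
k1,20090101,x,alpha
k2,20080101,x,alpha
k3,20070101,x,beta
k4,20100101,x,beta
EOF

result=$(awk -F, '
    # First row, or a new key in column 4: emit the previous group winner
    # (if any) and start tracking the new group.
    NR == 1 || $4 != prev {
        if (NR > 1) print best
        prev = $4; min = $2 + 0; best = $0
        next
    }
    # Same key as the previous row: keep this row if its date is smaller.
    $2 + 0 < min { min = $2 + 0; best = $0 }
    # Emit the winner of the last group, unless the input was empty.
    END { if (NR > 0) print best }
' "$input")
rm -f "$input"

printf '%s\n' "$result"
```

Only one candidate row is held at a time, so memory use does not grow with the file size.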

# 4  
Old 02-08-2011
Code:
sort -t, -k4,4 -k2,2n infile |awk -F, '!a[$4]++'
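The pipeline above sorts by column 4 and then numerically by column 2, so the first row of every group is the one with the smallest date, and `!a[$4]++` prints only that first row. Since it sorts the data itself, it should also work on input that is not pre-sorted, at the cost of reordering the output by column 4. A small sketch with made-up rows (the `LC_ALL=C` setting is an addition here, used only to make the sort order predictable across locales):

```shell
# Hypothetical sample input, deliberately NOT sorted on column 4.
input=$(mktemp)
cat > "$input" <<'EOF'
r1,20100101,x,beta
r2,20090101,x,alpha
r3,20080101,x,beta
r4,20070101,x,alpha
EOF

# Sort by column 4, then numerically by column 2; keep the first row per key.
result=$(LC_ALL=C sort -t, -k4,4 -k2,2n "$input" | awk -F, '!seen[$4]++')
rm -f "$input"

printf '%s\n' "$result"
```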
