Sponsored Content
Top Forums Shell Programming and Scripting Remove duplicate lines based on field and sort Post 302608559 by pravin27 on Saturday 17th of March 2012 11:23:21 PM
Old 03-18-2012
using Perl


Code:
#!/usr/bin/perl

use strict;
my %seen=();
my @flds;
while (<DATA>){
chomp;
@flds=split /,/;
print $_,"\n" if !$seen{$flds[0]}++;
}

__DATA__
55,I,like,cookies,2,8,9
44,I,like,cookies,2,8,9
88,I,like,cookies,5,7,8
88,I,like,cookies,2,8,9
99,I,like,cookies,5,7,8
99,I,like,cookies,2,8,9
77,I,like,cookies,5,7,8
77,I,like,cookies,2,8,9
66,I,like,cookies,5,7,8
66,I,like,cookies,2,8,9
55,I,like,cookies,5,7,8
44,I,like,cookies,5,7,8

 

10 More Discussions You Might Find Interesting

1. Solaris

How to remove duplicate records with out sort

Can any one give me command How to delete duplicate records with out sort. Suppose if the records like below: 345,bcd,789 123,abc,456 234,abc,456 712,bcd,789 out tput should be 345,bcd,789 123,abc,456 Key for the records is 2nd and 3rd fields.fields are seperated by colon(,). (2 Replies)
Discussion started by: svenkatareddy
2 Replies

2. Shell Programming and Scripting

Remove lines, Sorted with Time based columns using AWK & SORT

Hi having a file as follows MediaErr.log 84 Server1 Policy1 Schedule1 master1 05/08/2008 02:12:16 84 Server1 Policy1 Schedule1 master1 05/08/2008 02:22:47 84 Server1 Policy1 Schedule1 master1 05/08/2008 03:41:26 84 Server1 Policy1 ... (1 Reply)
Discussion started by: karthikn7974
1 Replies

3. Shell Programming and Scripting

How to remove duplicate records with out sort

Can any one give me command How to delete duplicate records with out sort. Suppose if the records like below: 345,bcd,789 123,abc,456 234,abc,456 712,bcd,789 out tput should be 345,bcd,789 123,abc,456 Key for the records is 2nd and 3rd fields.fields are seperated by colon(,). (19 Replies)
Discussion started by: svenkatareddy
19 Replies

4. Shell Programming and Scripting

Remove duplicate lines (the first matching line by field criteria)

Hello to all, I have this file 2002 1 23 0 0 2435.60 131.70 5.60 20.99 0.89 0.00 285.80 2303.90 2002 1 23 15 0 2436.60 132.90 6.45 21.19 1.03 0.00 285.80 2303.70 2002 1 23 ... (6 Replies)
Discussion started by: joggdial3000
6 Replies

5. Shell Programming and Scripting

Sort and Remove Duplicate on file

How do we sort and remove duplicate on column 1,2 retaining the record with maximum date (in feild 3) for the file with following format. aaa|1234|2010-12-31 aaa|1234|2010-11-10 bbb|345|2011-01-01 ccc|346|2011-02-01 bbb|345|2011-03-10 aaa|1234|2010-01-01 Required Output ... (5 Replies)
Discussion started by: mabarif16
5 Replies

6. UNIX for Dummies Questions & Answers

remove duplicate lines based on two columns and judging from a third one

hello all, I have an input file with four columns like this with a lot of lines and for example, line 1 and line 5 match because the first 4 characters match and the fourth column matches too. I want to keep the line that has the lowest number in the third column. So I discard line 5.... (5 Replies)
Discussion started by: TheTransporter
5 Replies

7. Shell Programming and Scripting

Remove lines with duplicate first field

Trying to cut down the size of some log files. Now that I write this out it looks more dificult than i thought it would be. Need a bash script or command that goes sequentially through all lines of a file, and does this: if field1 (space separated) is the number 2012 print the entire line. Do... (7 Replies)
Discussion started by: ajp7701
7 Replies

8. Shell Programming and Scripting

Remove duplicate value based on two field $4 and $5

Hi All, i have input file like below... CA009156;20091003;M;AWBKCA72;123;;CANADIAN WESTERN BANK;EDMONTON;;2300, 10303, JASPER AVENUE;;T5J 3X6;; CA009156;20091003;M;AWBKCA72;321;;CANADIAN WESTERN BANK;EDMONTON;;2300, 10303, JASPER AVENUE;;T5J 3X6;; CA009156;20091003;M;AWBKCA72;231;;CANADIAN... (2 Replies)
Discussion started by: mohan sharma
2 Replies

9. Shell Programming and Scripting

Remove duplicate lines from file based on fields

Dear community, I have to remove duplicate lines from a file contains a very big ammount of rows (milions?) based on 1st and 3rd columns The data are like this: Region 23/11/2014 09:11:36 41752 Medio 23/11/2014 03:11:38 4132 Info 23/11/2014 05:11:09 4323... (2 Replies)
Discussion started by: Lord Spectre
2 Replies

10. Shell Programming and Scripting

Remove duplicate lines, sort it and save it as file itself

Hi, all I have a csv file that I would like to remove duplicate lines based on 1st field and sort them by the 1st field. If there are more than 1 line which is same on the 1st field, I want to keep the first line of them and remove the rest. I think I have to use uniq or something, but I still... (8 Replies)
Discussion started by: refrain
8 Replies
DPKG-AWK(1)						      General Commands Manual						       DPKG-AWK(1)

NAME
dpkg-awk - Utility to read a dpkg style db file SYNOPSIS
dpkg-awk [(-f|--file) filename] [(-d|--debug) ##] [(-s|--sort) list] [(-rs|--rec_sep) ??] '<fieldname>:<regex>' ... -- <out_fieldname> .. DESCRIPTION
dpkg-awk Parses a dpkg status file (or other similarly formatted file) and outputs the resulting records. It can use regex on the field values to limit the returned records, it can also be told which fields to output, and it can sort the matched fields. OPTIONS
-f filename --file filename The file to parse. The default is /var/lib/dpkg/status. -d [#] --debug [#] Each time this is specified, it increased the debug level. -s field(s) --sort field(s) A space or comma separated list of fields to sort on. -n field(s) --numeric field(s) A space or comma separated list of fields that should be interpreted as numeric in value. -rs ?? --rec_sep ?? Output this string at the end of each output paragraph. -h --help Display some help. fieldname The fields from the file, that are matched with the regex given. The fieldnames are case insensitive. out_fieldname The fields from the file, that are output for each record. If the first field listed begins with ^, then the list of fields that follows will NOT be output. BUGS
Be warned that the author has only a shallow understanding of the dpkg packaging system, so there are probably tons of bugs in this pro- gram. This program comes with no warranties. If running this program causes fire and brimstone to rain down upon the earth, you will be on your own. This program accesses the dpkg database directly in places, querying for data that cannot be gotten via dpkg. AUTHOR
Adam Heath <doogie@debian.org> DEBIAN
Debian Utilities DPKG-AWK(1)
All times are GMT -4. The time now is 07:20 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy