Remove duplicate lines based on field and sort
Post 302608559 by pravin27 on Saturday 17th of March 2012 11:23:21 PM
03-18-2012
Using Perl, print only the first line seen for each value of the first field:


Code:
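#!/usr/bin/perl

use strict;
use warnings;

my %seen;    # counts how often each first-field value has appeared

while (my $line = <DATA>) {
    chomp $line;
    next unless length $line;        # skip empty lines
    my ($key) = split /,/, $line;    # first comma-separated field
    # Print a line only the first time its key appears; input order is kept.
    print "$line\n" unless $seen{$key}++;
}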

__DATA__
55,I,like,cookies,2,8,9
44,I,like,cookies,2,8,9
88,I,like,cookies,5,7,8
88,I,like,cookies,2,8,9
99,I,like,cookies,5,7,8
99,I,like,cookies,2,8,9
77,I,like,cookies,5,7,8
77,I,like,cookies,2,8,9
66,I,like,cookies,5,7,8
66,I,like,cookies,2,8,9
55,I,like,cookies,5,7,8
44,I,like,cookies,5,7,8
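
The Perl above keeps the first line for each value of field 1 but leaves the output in input order, while the thread title also asks for sorting. One way to get both is sketched below in shell, using awk for the same first-occurrence filter and sort for a numeric sort on field 1. The function name dedupe_and_sort is made up for illustration, and the sketch assumes the data sits in a file with no commas inside fields.

```shell
# Hypothetical helper: $1 is the path to a comma-separated file.
dedupe_and_sort() {
    # awk keeps a line only the first time its field 1 value appears;
    # sort then orders the surviving lines numerically on field 1.
    awk -F, '!seen[$1]++' "$1" | sort -t, -k1,1n
}
```

Called as dedupe_and_sort data.csv (data.csv being a placeholder name), this should print one line per distinct first field, lowest key first. The awk stage does the same job as the %seen hash in the Perl version.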

 
