fastest way to remove duplicates. Post 76033 by vino on Friday 24th of June 2005 06:27:51 AM
Quote:
Originally Posted by amit_sapre
Try out this one...

sed '$!N; /^\(.*\)\n\1$/!P; D'

# Only the first of the duplicate lines is kept; the rest are deleted.

Hope this will work faster than the sort command.

I haven't tried on large files.
I haven't tried your sed, but doesn't it assume that all the entries are already sorted, and only then remove the duplicates?

and/or

If the file is unsorted, then only duplicate entries that directly follow their first occurrence are removed, since sed makes just one pass through the file and compares each line only with the next one.

Or did I get it wrong?
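A quick way to check this is to run the quoted one-liner on a small unsorted sample. The sketch below is only an illustration: the sample data and the variable names (input, sed_out, sorted_out, awk_out) are made up for the test, and the awk line is a commonly used alternative that was not part of the quoted post.

```shell
#!/bin/sh
# Hypothetical sample: "a" appears both adjacently (lines 1-2) and again later (line 4).
input=$(printf '%s\n' a a b a c c)

# The quoted sed one-liner on its own: each line is compared only with the next one.
sed_out=$(printf '%s\n' "$input" | sed '$!N; /^\(.*\)\n\1$/!P; D')

# Sorting first makes every duplicate adjacent, but the original order is lost.
sorted_out=$(printf '%s\n' "$input" | sort | sed '$!N; /^\(.*\)\n\1$/!P; D')

# Alternative (not from the quoted post): awk remembers every line it has seen,
# so it needs no sorting and keeps the original order, at the cost of memory.
awk_out=$(printf '%s\n' "$input" | awk '!seen[$0]++')

printf 'sed only:\n%s\n\nsort + sed:\n%s\n\nawk:\n%s\n' \
    "$sed_out" "$sorted_out" "$awk_out"
```

On this sample the sed-only run prints a, b, a, c, so the later "a" survives, while the other two runs print a, b, c. That suggests the one-liner behaves like uniq and needs sorted input to remove every duplicate.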

vino
 
