04-03-2008
Please post a sample of the file if this doesn't work:
Look into the -k option to sort. It will allow you to sort by specified fields.
However, I have never tried -k together with the -u flag, so I'm curious whether it will not only sort on the -k fields but also uniq on them...
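A quick way to check is to try it on a few made-up records. This is only a minimal sketch: the sample lines, the temporary file, and the variable names are invented for illustration, and it assumes a sort that accepts -t, -k and -u together (POSIX and GNU sort do).

```shell
#!/bin/sh
# Hypothetical sample records; the key is fields 2 and 3, separated by commas.
sample=$(mktemp)
printf '%s\n' '345,bcd,789' '123,abc,456' '234,abc,456' '712,bcd,789' > "$sample"

# -t,    use a comma as the field separator
# -k2,3  sort key starts at field 2 and ends at field 3
# -u     keep only one line for each distinct key
deduped=$(LC_ALL=C sort -t, -k2,3 -u "$sample")
printf '%s\n' "$deduped"
rm -f "$sample"
```

With -u, lines are compared on the -k key only, so one line per distinct key comes out. POSIX does not specify which line of each group is kept, and the output is sorted, so the original order is not preserved.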
ShawnMilo
SORT(1) General Commands Manual SORT(1)
NAME
sort - sort a file of ASCII lines
SYNOPSIS
sort [-bcdfimnru] [-tc] [-o name] [+pos1] [-pos2] file ...
OPTIONS
-b Skip leading blanks when making comparisons
-c Check to see if a file is sorted
-d Dictionary order: ignore punctuation
-f Fold upper case onto lower case
-i Ignore nonASCII characters
-m Merge presorted files
-n Numeric sort order
-o Next argument is output file
-r Reverse the sort order
-t Following character is field separator
-u Unique mode (delete duplicate lines)
EXAMPLES
sort -nr file # Sort keys numerically, reversed
sort +2 -4 file # Sort using fields 2 and 3 as key (fields count from 0)
sort +2 -t: -o out # Field separator is :
sort +.3 -.6 # Characters 3 through 5 form the key (characters count from 0)
DESCRIPTION
Sort sorts one or more files. If no files are specified, stdin is sorted. Output is written on standard output, unless -o is specified.
The options +pos1 -pos2 use only fields pos1 up to but not including pos2 as the sort key, where a field is a string of characters
delimited by spaces and tabs, unless a different field delimiter is specified with -t. Both pos1 and pos2 have the form m.n where m tells
the number of fields to skip and n tells the number of characters to skip after that, so both count from 0. Either m or n may be omitted.
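The +pos1 -pos2 examples above can be rewritten with the -k option, which numbers fields and characters from 1 instead of 0. The following is a minimal sketch, assuming a sort that supports -k (this page documents only the +pos form); the sample data and variable names are hypothetical.

```shell
#!/bin/sh
# Hypothetical whitespace-separated sample; -k numbers fields from 1.
data=$(mktemp)
printf '%s\n' 'x y b 2 z' 'x y a 9 z' 'x y b 1 z' > "$data"

# Old style:  sort +2 -4 file   (skip 2 fields, stop after field 4)
# -k style:   sort -k3,4 file   (key runs from field 3 through field 4)
by_fields=$(LC_ALL=C sort -k3,4 "$data")

# Old style:  sort +.3 -.6      (skip 3 characters, stop after character 6)
# -k style:   sort -k1.4,1.6    (characters 4 through 6 of field 1)
by_chars=$(printf '%s\n' 'aaaZZZ1' 'bbbAAA2' | LC_ALL=C sort -k1.4,1.6)

printf '%s\n' "$by_fields" "$by_chars"
rm -f "$data"
```

Many current sort implementations reject the +pos form by default, so the -k spelling is the safer one to use in scripts.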
SEE ALSO
comm(1), grep(1), uniq(1).
SORT(1)