Remove duplicate lines based on field and sort
Post 302608473 by ahamed101 on Saturday 17th of March 2012 03:48:14 PM
man sort

Quote:
-t, --field-separator=SEP
       use SEP instead of non-blank to blank transition

-n, --numeric-sort
       compare according to string numerical value

-u, --unique
       with -c, check for strict ordering; without -c, output only the
       first of an equal run

-k, --key=POS1[,POS2]
       start a key at POS1 (origin 1), end it at POS2 (default end of
       line). See POS syntax below

So basically, it sorts numerically (-n) on a key that starts at the first field (-k1), with fields separated by a comma (-t,), and outputs only the first line of each run of lines whose keys compare equal (-u).

--ahamed
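
The option combination described above can be sketched as follows. This is a minimal illustration, not the command from the original thread: the helper name `dedupe_by_first_field` and the sample records are made up for the example. It also writes the key as `-k1,1` rather than `-k1`, since `-k1` alone lets the key run to the end of the line; with `-n` the two usually behave the same because only the leading number is read, but `-k1,1` states the intent explicitly.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): sort comma-separated input
# numerically on the first field and keep one line per distinct key.
#   -t,    use a comma as the field separator
#   -k1,1  limit the sort key to field 1 only
#   -n     compare the key as a number
#   -u     output only the first line of each run of equal keys
# LC_ALL=C avoids locale-dependent ordering, as the man page warns.
dedupe_by_first_field() {
    LC_ALL=C sort -t, -k1,1 -n -u "$@"
}

# Made-up sample records: keys 10 and 2 each appear twice.
printf '%s\n' '10,apple' '2,pear' '10,banana' '2,plum' '1,fig' |
    dedupe_by_first_field
```

The result has one line per key, in the numeric order 1, 2, 10. Without `-n` the keys would be compared as strings and 10 would sort before 2.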

SORT(1)                          User Commands                         SORT(1)

NAME
       sort - sort lines of text files

SYNOPSIS
       sort [OPTION]... [FILE]...
       sort [OPTION]... --files0-from=F

DESCRIPTION
       Write sorted concatenation of all FILE(s) to standard output.

       Mandatory arguments to long options are mandatory for short options
       too.

       Ordering options:

       -b, --ignore-leading-blanks
              ignore leading blanks

       -d, --dictionary-order
              consider only blanks and alphanumeric characters

       -f, --ignore-case
              fold lower case to upper case characters

       -g, --general-numeric-sort
              compare according to general numerical value

       -i, --ignore-nonprinting
              consider only printable characters

       -M, --month-sort
              compare (unknown) < `JAN' < ... < `DEC'

       -h, --human-numeric-sort
              compare human readable numbers (e.g., 2K 1G)

       -n, --numeric-sort
              compare according to string numerical value

       -R, --random-sort
              sort by random hash of keys

       --random-source=FILE
              get random bytes from FILE

       -r, --reverse
              reverse the result of comparisons

       --sort=WORD
              sort according to WORD: general-numeric -g, human-numeric -h,
              month -M, numeric -n, random -R, version -V

       -V, --version-sort
              natural sort of (version) numbers within text

       Other options:

       --batch-size=NMERGE
              merge at most NMERGE inputs at once; for more use temp files

       -c, --check, --check=diagnose-first
              check for sorted input; do not sort

       -C, --check=quiet, --check=silent
              like -c, but do not report first bad line

       --compress-program=PROG
              compress temporaries with PROG; decompress them with PROG -d

       --debug
              annotate the part of the line used to sort, and warn about
              questionable usage to stderr

       --files0-from=F
              read input from the files specified by NUL-terminated names in
              file F; If F is - then read names from standard input

       -k, --key=POS1[,POS2]
              start a key at POS1 (origin 1), end it at POS2 (default end of
              line). See POS syntax below

       -m, --merge
              merge already sorted files; do not sort

       -o, --output=FILE
              write result to FILE instead of standard output

       -s, --stable
              stabilize sort by disabling last-resort comparison

       -S, --buffer-size=SIZE
              use SIZE for main memory buffer

       -t, --field-separator=SEP
              use SEP instead of non-blank to blank transition

       -T, --temporary-directory=DIR
              use DIR for temporaries, not $TMPDIR or /tmp; multiple options
              specify multiple directories

       --parallel=N
              change the number of sorts run concurrently to N

       -u, --unique
              with -c, check for strict ordering; without -c, output only the
              first of an equal run

       -z, --zero-terminated
              end lines with 0 byte, not newline

       --help display this help and exit

       --version
              output version information and exit

       POS is F[.C][OPTS], where F is the field number and C the character
       position in the field; both are origin 1. If neither -t nor -b is in
       effect, characters in a field are counted from the beginning of the
       preceding whitespace. OPTS is one or more single-letter ordering
       options, which override global ordering options for that key. If no
       key is given, use the entire line as the key.

       SIZE may be followed by the following multiplicative suffixes: % 1% of
       memory, b 1, K 1024 (default), and so on for M, G, T, P, E, Z, Y.

       With no FILE, or when FILE is -, read standard input.

       *** WARNING *** The locale specified by the environment affects sort
       order. Set LC_ALL=C to get the traditional sort order that uses native
       byte values.

AUTHOR
       Written by Mike Haertel and Paul Eggert.

REPORTING BUGS
       Report sort bugs to bug-coreutils@gnu.org
       GNU coreutils home page: <http://www.gnu.org/software/coreutils/>
       General help using GNU software: <http://www.gnu.org/gethelp/>
       Report sort translation bugs to <http://translationproject.org/team/>

COPYRIGHT
       Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU
       GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
       This is free software: you are free to change and redistribute it.
       There is NO WARRANTY, to the extent permitted by law.

SEE ALSO
       The full documentation for sort is maintained as a Texinfo manual. If
       the info and sort programs are properly installed at your site, the
       command

              info coreutils 'sort invocation'

       should give you access to the complete manual.

GNU coreutils 8.12.197-032bb      September 2011                       SORT(1)
All times are GMT -4.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.