Can anyone give me a command to delete duplicate records without sorting?
Suppose the records look like this:
345,bcd,789
123,abc,456
234,abc,456
712,bcd,789
The output should be:
345,bcd,789
123,abc,456
The key for the records is the 2nd and 3rd fields. Fields are separated by a comma (,). (2 Replies)
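One common way to do this without sorting is an awk "seen" array keyed on the fields that define a duplicate. The sketch below assumes comma-separated input and that the first record for each key is the one to keep; the function name dedupe_by_key is only illustrative.

```shell
# Print a record only the first time its (field 2, field 3) key appears.
# Input order is preserved, so no sort is needed.
dedupe_by_key() {
    awk -F, '!seen[$2,$3]++' "$@"
}

# Usage with the sample records from the question:
printf '%s\n' '345,bcd,789' '123,abc,456' '234,abc,456' '712,bcd,789' | dedupe_by_key
```

The array holds one entry per distinct key, so memory use grows with the number of unique keys rather than with the file size.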
I have a file which consists of 1000 entries. Out of the 1000 entries, 500 are duplicates. I want to remove the first duplicate entry (i.e., the entire line) in the file.
An example of the file is shown below:
8244100010143276|MARISOL CARO||MORALES|HSD768|CARR 430 KM 1.7 ... (1 Reply)
I have a pipe delimited file. Key is field 2, date is field 5 (as example, my real file is more complicated of course, but the KEY and DATE are accurate)
There can be duplicate rows for a key with different dates.
I need to keep only rows with latest date in this case.
Example data: ... (4 Replies)
How do we sort and remove duplicates on columns 1 and 2, retaining the record with the maximum date (in field 3), for a file with the following format?
aaa|1234|2010-12-31
aaa|1234|2010-11-10
bbb|345|2011-01-01
ccc|346|2011-02-01
bbb|345|2011-03-10
aaa|1234|2010-01-01
Required Output
... (5 Replies)
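A possible approach, sketched here under the assumption that the dates are always in YYYY-MM-DD form (so plain string comparison orders them correctly): remember the best row per key in awk and print the survivors at the end. The function name latest_per_key is hypothetical, and the output comes out in first-seen key order; pipe it through sort if a sorted result is needed.

```shell
# Keep, for each (field 1, field 2) key, the row with the greatest field 3.
latest_per_key() {
    awk -F'|' '
        {
            key = $1 FS $2
            if (!(key in best)) order[++n] = key      # remember first-seen order of keys
            if (!(key in best) || $3 > date[key]) {   # string compare works for ISO dates
                date[key] = $3
                best[key] = $0
            }
        }
        END { for (i = 1; i <= n; i++) print best[order[i]] }
    ' "$@"
}

# Usage: latest_per_key input.txt | sort -t'|' -k1,1 -k2,2
```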
I'm looking to remove duplicate rows from a CSV file, with a twist.
The first row is a header.
There are 31 columns. I want to remove duplicates when the first 29 columns are identical, ignoring columns 30 and 31, BUT the duplicate that is kept should have the shortest total character length in columns 30... (6 Replies)
I have a 5 GB input file which contains duplicate records, and I have to remove the duplicates while retaining the first instance of each record.
The duplicates have to be removed based on 5 fields.
Kindly help me write a Unix script.
Thanks
Asim (11 Replies)
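The question does not say which 5 fields or which delimiter, so the following is only a sketch with both passed in as parameters (the function name dedupe_on_fields and its argument layout are assumptions). It streams the file once and keeps the first record for each key; awk must hold one array entry per distinct key, which may matter for a 5 GB input.

```shell
# $1: field delimiter, $2: comma-separated list of key field numbers,
# remaining arguments: input files (stdin if none).
dedupe_on_fields() {
    delim=$1
    fields=$2
    shift 2
    awk -F"$delim" -v fields="$fields" '
        BEGIN { nf = split(fields, f, ",") }
        {
            key = ""
            for (i = 1; i <= nf; i++) key = key SUBSEP $(f[i])   # build composite key
            if (!(key in seen)) { seen[key] = 1; print }         # first instance only
        }
    ' "$@"
}

# Usage (hypothetical file name): dedupe_on_fields , 1,2,3,4,5 input.csv > deduped.csv
```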
I want to delete partial duplicates from a file:
>gma-miR156d Gm01,PACID=26323927 150.00 -18.28 2 18 17 35 16 75.00% 81.25%
>>gma-miR156d Gm01,PACID=26323927 150.00 -18.28 150.00 -18.28 1 21 119 17
I want to order by the second column and delete the... (1 Reply)
I have a script that builds a database of ~30 million lines, a ~3.7 GB .csv file. After multiple optimizations it takes about 62 min to bring in and parse all the files, and it used to take 10 min to remove duplicates until I was asked to add another column. I am using this highly optimized awk code:
awk... (34 Replies)
I am using DB2 v9 and trying to get country values in comma-separated format using the query below:
SELECT distinct LISTAGG(COUNTRIES, ',') WITHIN GROUP(ORDER BY EMPLOYEE)
FROM LOCATION ;
Output Achieved
MEXICO,UNITED STATES,INDIA,JAPAN,UNITED KINGDOM,MEXICO,UNITED STATES
The table... (4 Replies)
Discussion started by: Perlbaby
lsort
lsort(1T)                      Tcl Built-In Commands                      lsort(1T)
NAME
lsort - Sort the elements of a list
SYNOPSIS
lsort ?options? list
DESCRIPTION
This command sorts the elements of list, returning a new list in sorted order. The implementation of the lsort command uses the merge-sort
algorithm which is a stable sort that has O(n log n) performance characteristics.
By default ASCII sorting is used with the result returned in increasing order. However, any of the following options may be specified
before list to control the sorting process (unique abbreviations are accepted):
-ascii Use string comparison with Unicode code-point collation order (the name is for backward-compatibility reasons.) This
is the default.
-dictionary Use dictionary-style comparison. This is the same as -ascii except (a) case is ignored except as a tie-breaker and (b)
if two strings contain embedded numbers, the numbers compare as integers, not characters. For example, in -dictionary
mode, bigBoy sorts between bigbang and bigboy, and x10y sorts between x9y and x11y.
-integer Convert list elements to integers and use integer comparison.
-real Convert list elements to floating-point values and use floating comparison.
-command command Use command as a comparison command. To compare two elements, evaluate a Tcl script consisting of command with the two
elements appended as additional arguments. The script should return an integer less than, equal to, or greater than
zero if the first element is to be considered less than, equal to, or greater than the second, respectively.
-increasing Sort the list in increasing order (``smallest'' items first). This is the default.
-decreasing Sort the list in decreasing order (``largest'' items first).
-index index If this option is specified, each of the elements of list must itself be a proper Tcl sublist. Instead of sorting
based on whole sublists, lsort will extract the index'th element from each sublist and sort based on the given element.
The keyword end is allowed for the index to sort on the last sublist element, and end-index sorts on a sublist element
offset from the end. For example,
lsort -integer -index 1 {{First 24} {Second 18} {Third 30}}
returns {Second 18} {First 24} {Third 30}, and
lsort -index end-1 {{a 1 e i} {b 2 3 f g} {c 4 5 6 d h}}
returns {c 4 5 6 d h} {a 1 e i} {b 2 3 f g}. This option is much more efficient than using -command to achieve the
same effect.
-unique If this option is specified, then only the last set of duplicate elements found in the list will be retained. Note
that duplicates are determined relative to the comparison used in the sort. Thus if -index 0 is used, {1 a} and {1 b}
would be considered duplicates and only the second element, {1 b}, would be retained.
NOTES
The options to lsort only control what sort of comparison is used, and do not necessarily constrain what the values themselves actually
are. This distinction is only noticeable when the list to be sorted has fewer than two elements.
The lsort command is reentrant, meaning it is safe to use as part of the implementation of a command used in the -command option.
EXAMPLES
Sorting a list using ASCII sorting:
% lsort {a10 B2 b1 a1 a2}
B2 a1 a10 a2 b1
Sorting a list using Dictionary sorting:
% lsort -dictionary {a10 B2 b1 a1 a2}
a1 a2 a10 b1 B2
Sorting lists of integers:
% lsort -integer {5 3 1 2 11 4}
1 2 3 4 5 11
% lsort -integer {1 2 0x5 7 0 4 -1}
-1 0 1 2 4 0x5 7
Sorting lists of floating-point numbers:
% lsort -real {5 3 1 2 11 4}
1 2 3 4 5 11
% lsort -real {.5 0.07e1 0.4 6e-1}
0.4 .5 6e-1 0.07e1
Sorting using indices:
% # Note the space character before the c
% lsort {{a 5} { c 3} {b 4} {e 1} {d 2}}
{ c 3} {a 5} {b 4} {d 2} {e 1}
% lsort -index 0 {{a 5} { c 3} {b 4} {e 1} {d 2}}
{a 5} {b 4} { c 3} {d 2} {e 1}
% lsort -index 1 {{a 5} { c 3} {b 4} {e 1} {d 2}}
{e 1} {d 2} { c 3} {b 4} {a 5}
Stripping duplicate values using sorting:
% lsort -unique {a b c a b c a b c}
a b c
More complex sorting using a comparison function:
% proc compare {a b} {
    set a0 [lindex $a 0]
    set b0 [lindex $b 0]
    if {$a0 < $b0} {
        return -1
    } elseif {$a0 > $b0} {
        return 1
    }
    return [string compare [lindex $a 1] [lindex $b 1]]
}
% lsort -command compare \
        {{3 apple} {0x2 carrot} {1 dingo} {2 banana}}
{1 dingo} {2 banana} {0x2 carrot} {3 apple}
SEE ALSO list(1T), lappend(1T), lindex(1T), linsert(1T), llength(1T), lsearch(1T), lset(1T), lrange(1T), lreplace(1T)
KEYWORDS
element, list, order, sort
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+--------------------+-----------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+--------------------+-----------------+
|Availability | SUNWTcl |
+--------------------+-----------------+
|Interface Stability | Uncommitted |
+--------------------+-----------------+
NOTES
Source for Tcl is available on http://opensolaris.org.
Tcl 8.3 lsort(1T)