01-08-2009
sort and semi-duplicate row - keep latest only
I have a pipe-delimited file. The key is field 2 and the date is field 5 (this is a simplified example; my real file is more complicated, but the key and date positions are accurate).
There can be several rows for the same key with different dates.
In that case I need to keep only the row with the latest date.
Example data:
W|AAA|DD|D|20080101
W|BBB|CC|C|20080101
W|AAA|BB|B|20080201
W|CCC|DD|D|20080701
W|CCC|EE|E|20080801
W|AAA|DD|D|20081231
I would want to see:
W|AAA|DD|D|20081231
W|BBB|CC|C|20080101
W|CCC|EE|E|20080801
I want to use sort for this but am open to other options. I'm guessing awk could be involved, but I'm bad at writing awk.
I've searched but didn't find anything that seemed to match.
Ideas?
edit to add a little more info:
It could be that I have rows with same key and same date. In that case, I'd prefer to take the one that is last in the file (because of differences in the other fields of the rows and how the file gets built) -- but if that is not possible I understand.
Last edited by LisaS; 01-08-2009 at 04:36 AM.
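One possible approach to the question above, as a minimal sketch rather than a definitive answer: a single awk pass that remembers, for each key in field 2, the row with the highest date in field 5. It assumes the dates are always in YYYYMMDD form so they compare correctly as numbers. The function name latest_per_key and the array names are made up for illustration. Using >= in the comparison means that when two rows share a key and a date, the one later in the file wins, which matches the tie-breaking preference in the edit.

```shell
#!/bin/sh
# Sketch: for each key (field 2) keep the row with the latest date (field 5).
# Assumes pipe-delimited input and YYYYMMDD dates; names are illustrative only.
latest_per_key() {
    awk -F'|' '
        # Remember the order in which each key first appears.
        !($2 in best) { order[++n] = $2 }
        # Store the row if the key is new, or if its date is the same or later
        # (>= lets a later row in the file win when the dates are equal).
        !($2 in best) || $5 + 0 >= date[$2] + 0 { best[$2] = $0; date[$2] = $5 }
        # Print one row per key, in order of first appearance.
        END { for (i = 1; i <= n; i++) print best[order[i]] }
    ' "$@"
}
```

Called as latest_per_key yourfile (or with the data on standard input), it prints one row per key in the order the keys first appear in the file. If the output should be ordered by key instead, it can be piped through sort -t'|' -k2,2.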