Sort data by date first and then remove duplicates


 
# 1  
Old 05-21-2013

Hi,
I have the data below in a file named ref.psv. I want to create a shell script that does the following two things:
(1) First, sort the file content by the latest date, which is the last column in the file (in the actual file it is the 175th column).
(2) After sorting the file by latest date, remove the duplicates based on the first column only.

15277105||Common Stick|ESHR||Common Stock|CYRO AB|2013-05-14T00:52:31.662-04:00

16111278||Common Stick|ESHR||Common Stock|STANDARD REGISTER CO|2013-05-14T00:52:31.672-04:00

15277105||Common Stick|ESHR||Common Stock|CYRO AB|2013-05-15T00:52:31.672-04:00

39693766||Common Stick|ESHR||Common Stock|HS AG|2013-05-15T00:52:31.672-04:00

Any help with the script would be appreciated.
Thanks,
Sam
# 2  
Old 05-21-2013
Can you provide a breakdown of the last date and time field...
# 3  
Old 05-21-2013

Quote:
Originally Posted by shamrock
Can you provide a breakdown of the last date and time field...
Hi Shamrock,
The timestamp breaks down as follows:

2013-05-14T00:52:31.662-04:00 is YYYY-MM-DDThh:mm:ss.[milliseconds]-[UTC offset, here GMT-4].

For me, sorting on YYYY-MM-DDThh:mm:ss is fine.
# 4  
Old 05-21-2013
Quote:
Originally Posted by samrat dutta
Hi Shamrock,
The timestamp breaks down as follows:

2013-05-14T00:52:31.662-04:00 is YYYY-MM-DDThh:mm:ss.[milliseconds]-[UTC offset, here GMT-4].

For me, sorting on YYYY-MM-DDThh:mm:ss is fine.
This sort of thing is best done in Perl...
Code:
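#!/usr/bin/perl

use strict;
use warnings;
use Time::Local qw(timegm);

my (%rec, %seen);

while (my $line = <>) {
    chomp $line;
    next unless $line =~ /\S/;                 # skip blank lines
    my @f = split /\|/, $line, -1;             # -1 keeps trailing empty fields
    # The timestamp is the last field, e.g. 2013-05-14T00:52:31.662-04:00
    my ($yr, $mo, $dy, $hr, $mi, $se, $ms, $sign, $oh, $om) = $f[-1] =~
        /^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)\.(\d+)([+-])(\d\d):(\d\d)$/
        or do { warn "line $.: unrecognised timestamp, skipped\n"; next };
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    # timegm expects a 0-based month; subtracting the offset yields UTC epoch seconds
    my $ep = timegm($se, $mi, $hr, $dy, $mo - 1, $yr) - $offset + "0.$ms";
    push @{ $rec{$ep} }, $line;
}

# Assumption: "latest date" means newest first, so the first record
# seen for each first-column key is the one to keep.
foreach my $t (sort { $b <=> $a } keys %rec) {
    foreach my $line (@{ $rec{$t} }) {
        my ($key) = split /\|/, $line;
        print "$line\n" unless $seen{$key}++;
    }
}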

Save the above Perl script in a file (my_perl_script here) and run it with your input file as an argument...
Code:
perl my_perl_script ref.psv
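
Since the original request was for a shell script, here is a minimal sketch of the same idea with sort and awk. It relies on one assumption that is not stated in the thread: every timestamp in the file carries the same UTC offset (as in the sample, -04:00), so plain byte-wise string comparison of the field matches chronological order. The function name latest_per_key and its two arguments (file, timestamp column number) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the newest record for each first-column key.
#   $1 = pipe-separated input file
#   $2 = number of the timestamp column (8 in the sample, 175 in the real file)
# Assumes all timestamps share one UTC offset, so string order = time order.
latest_per_key() {
    file=$1
    col=$2
    # Sort descending on the timestamp column only (byte-wise, locale-independent),
    # then keep the first line seen per key. NF drops blank lines.
    LC_ALL=C sort -t'|' -k"$col,$col"r "$file" |
        awk -F'|' 'NF && !seen[$1]++'
}
```

It would be called as latest_per_key ref.psv 8 on the sample data. If the file mixes UTC offsets, the string comparison no longer matches time order, and converting to epoch seconds as in the Perl script is the safer route.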
