Sort and remove duplicates in a directory based on the first 5 columns
I have a /tmp directory with file names such as:
I want to sort these files on the first 5 columns and then remove the duplicates based on those same 5 columns.
I tried the code below:
Later I realized there is no need to sort the files. I only need the unique names and the order doesn't matter, so I tried this:
I got:
If you look closely, I am missing one file, i.e.:
Please note the field separator in the first 5 columns.
So my desired output should be:
Please help me out with this. I also want to run a for loop over the desired result set, so should I delete the duplicate file names, or store the unique file names in another directory and then run the for loop? I need some advice.
Can someone provide me with a shell script?
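The file names and the separator from the original post are not shown here, so the following is only a minimal sketch under stated assumptions: the fields are separated by "_" (held in the hypothetical variable SEP), the helper name unique_by_prefix is made up, and the demo file names are invented for illustration. It keeps the first name seen for each combination of the first five fields and feeds the survivors straight into a loop.

```shell
#!/bin/sh
# Assumption: fields in the file names are separated by "_".
# Change SEP to match the real separator.
SEP='_'

# Read one file name per line on stdin and print the first name seen
# for each distinct combination of the first five fields.
unique_by_prefix() {
    awk -F"$SEP" '!seen[$1 FS $2 FS $3 FS $4 FS $5]++'
}

# Demo with made-up names in a scratch directory (not the poster's files).
demo_dir=$(mktemp -d)
touch "$demo_dir/a_b_c_d_e_20240101.txt" \
      "$demo_dir/a_b_c_d_e_20240102.txt" \
      "$demo_dir/a_b_c_d_f_20240101.txt"

# Loop over the unique names without deleting or moving anything.
ls "$demo_dir" | unique_by_prefix | while IFS= read -r name; do
    printf 'processing %s\n' "$demo_dir/$name"
done

rm -rf "$demo_dir"
```

With this approach there may be no need to delete the duplicates or copy the unique files elsewhere, since the loop only ever sees one name per key. Parsing ls output breaks on file names that contain newlines, so this assumes ordinary names.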
I have a file with many columns and many rows. I need to sort on the first column and then remove duplicate records, if any exist, and finally print the full data with the first column unique.
Sort BASED ON FIRST FIELD and remove the duplicates if they exist... (2 Replies)
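One possible way to do what this teaser describes, assuming whitespace-separated columns: sort on the first field, then let awk keep the first full row for each value of column 1. The function name first_col_unique is hypothetical.

```shell
#!/bin/sh
# Assumption: columns are separated by blanks (the default for sort and awk).
# first_col_unique FILE: print full rows, one per distinct first column.
first_col_unique() {
    LC_ALL=C sort -k1,1 "$1" | awk '!seen[$1]++'
}
```

Because sort falls back to comparing whole lines when the keys are equal, the row kept for each key is the one that sorts first in the C locale.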
Hi All,
I need to fetch unique records based on a key column (i.e., the first column), and I also need to get the records that have the max value in column 2, in sorted order. The duplicates have to be stored in another output file.
Input :
Input.txt
1234,0,x
1234,1,y
5678,10,z
9999,10,k... (7 Replies)
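The rest of that thread is truncated, so the following is only a sketch of one reading of the request: keep, per key in column 1, the row with the largest numeric value in column 2, and write every other row to a second file. The function name split_max_per_key and the output file names are assumptions.

```shell
#!/bin/sh
# Assumptions: comma-separated input, key in column 1, numeric value in column 2.
# split_max_per_key INPUT UNIQUE_OUT DUP_OUT
split_max_per_key() {
    : > "$3"    # create the duplicates file even if it ends up empty
    # Sort by key, then by column 2 descending, so the max row comes first per key.
    LC_ALL=C sort -t, -k1,1 -k2,2nr "$1" |
        awk -F, -v dup="$3" '!seen[$1]++ { print; next } { print > dup }' > "$2"
}

# Demo using the four sample rows visible in the post.
tmp=$(mktemp -d)
printf '%s\n' '1234,0,x' '1234,1,y' '5678,10,z' '9999,10,k' > "$tmp/Input.txt"
split_max_per_key "$tmp/Input.txt" "$tmp/unique.txt" "$tmp/duplicates.txt"
cat "$tmp/unique.txt"
cat "$tmp/duplicates.txt"
rm -rf "$tmp"
```

For the sample rows, 1234,1,y is kept and 1234,0,x goes to the duplicates file.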
Hi,
I am unable to find the duplicates in a file based on the 1st, 2nd, 4th and 5th columns, and also remove those duplicates from the same file.
Source filename: Filename.csv
"1","ccc","information","5000","temp","concept","new"
"1","ddd","information","6000","temp","concept","new"... (2 Replies)
Hi,
I'm using the command below to sort and remove duplicates in a file, but I need it to apply to the same file instead of redirecting the output to another.
Thanks (6 Replies)
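The poster's command is not shown, so this is a sketch rather than a fix of their code. POSIX sort allows the -o output file to be the same as an input file, which avoids the usual problem of a shell redirection truncating the file before it is read. The wrapper name sort_unique_in_place is hypothetical.

```shell
#!/bin/sh
# sort_unique_in_place FILE: sort FILE and drop duplicate lines, writing back to FILE.
sort_unique_in_place() {
    sort -u -o "$1" "$1"
}
```

By contrast, `sort -u file > file` would empty the file, because the shell opens the redirection target before sort starts reading.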
Hi,
I have the data below in a file named ref.psv. I want to create a shell script that does the following 2 things:
(1) sort the file content based on the latest date, which is the last column in the file (in the actual file it's the 175th column)
(2) after sorting the file based on latest date... (3 Replies)
I need to use bash to remove duplicates without using sort first.
I can not use:
cat file | sort | uniq
But when I use only
cat file | uniq
some duplicates are not removed. (4 Replies)
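uniq only collapses adjacent identical lines, which is why it misses duplicates in unsorted input. A common alternative that needs no sort and keeps the original order is an awk associative array; the sketch below wraps it in a hypothetical function named dedupe_keep_order.

```shell
#!/bin/sh
# dedupe_keep_order [FILE...]: print each distinct line once, in first-seen order.
# seen[$0]++ is 0 (false) the first time a line appears, so !seen[$0]++ is true
# only for the first occurrence.
dedupe_keep_order() {
    awk '!seen[$0]++' "$@"
}
```

This holds every distinct line in memory, so it assumes the file fits comfortably in RAM.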
Hi guys, I'm in a bit of a bind. I'm looking to remove duplicates from a pipe-delimited file, but do so based on 2 columns. Sounds easy enough, but here's the kicker...
Column #1 is a simple ID, which is used to identify the duplicate.
Once dups are identified, I need to only keep the one... (2 Replies)
Here is my task:
I need to sort two input files and remove duplicates in the output files:
Sort by 13 characters from 97 Ascending
Sort by 1 character from 96 Ascending
If duplicates are found retain the first value in the file
the input files are variable length, convert... (4 Replies)
Following is the input. The 1st and 3rd blocks are the same (a block starts with '*' and ends before a blank line), and the 2nd and 4th blocks are also the same:
cat <file>
* Wed Feb 24 2016 Tariq Saeed <tariq.x.saeed@mail.com> 2.0.7-1.0.7
- add vmcore dump support for ocfs2
* Mon Jun 8 2015 Brian Maly... (4 Replies)
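The thread is cut off, so this is only a sketch for the case described: blocks separated by blank lines, where repeated blocks should be printed once. Setting RS to the empty string puts awk in paragraph mode, so each block becomes one record. The function name dedupe_blocks is an assumption.

```shell
#!/bin/sh
# dedupe_blocks FILE: print each blank-line-separated block once, in first-seen order.
dedupe_blocks() {
    # RS="" makes each run of non-blank lines one record;
    # ORS="\n\n" restores the blank line between blocks on output.
    awk 'BEGIN { RS = ""; ORS = "\n\n" } !seen[$0]++' "$1"
}
```

Blocks must match exactly, including whitespace, to count as duplicates.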
Discussion started by: Paras Pandey
4 Replies
Sort::Key::Maker(3pm) User Contributed Perl Documentation Sort::Key::Maker(3pm)
NAME
Sort::Key::Maker - multikey sorter creator
SYNOPSIS
# create a function that sorts strings by length:
use Sort::Key::Maker sort_by_length => sub { length $_}, qw(integer);
# create a multikey sort function;
# first key is integer sorted in descending order,
# second key is a string in default (ascending) order:
use Sort::Key::Maker ri_s_keysort => qw(-integer string);
# some sample data...
my @foo = qw(foo bar t too tood mama);
# and now, use the sorter functions previously made:
# get the values on @foo sorted by length:
my @sorted = sort_by_length @foo;
# sort @foo inplace by its length and then by its value:
ri_s_keysort_inplace { length $_, $_ } @foo;
DESCRIPTION
Sort::Key::Maker is a pragmatic module that provides an easy to use interface to Sort::Key multikey sorting functionality.
It creates multikey sorting functions on the fly for any key type combination and exports them to the caller package.
The key types natively accepted are:
string, str, locale, loc, integer, int,
unsigned_integer, uint, number, num
and support for other types can be added via Sort::Key::Register (or also via Sort::Key::register_type()).
USAGE
use Sort::Key::Maker foo_sort => @keys;
exports two subroutines to the caller package: "foo_sort (&@)" and "foo_sort_inplace (&@)".
Those two subroutines require a sub reference as their first argument and then respectively, the list to be sorted or an array.
For instance:
use Sort::Key::Maker bar_sort => qw(int int str);
@bar=qw(doo tomo 45s tio);
@sorted = bar_sort { unpack "CCs", $_ } @bar;
# or sorting @bar inplace
bar_sort_inplace { unpack "CCs", $_ } @bar;
use Sort::Key::Maker foo_sort => \&genmultikey, @keys;
when the first argument after the sorter name is a reference to a subroutine, it is used as the multikey extraction function. The generated sorter functions neither require nor accept one, i.e.:
use Sort::Key::Maker sort_by_length => sub { length $_ }, 'int';
my @sorted = sort_by_length qw(foo goo h mama picasso);
SEE ALSO
Sort::Key, Sort::Key::Register.
Sort::Maker also available from CPAN provides similar functionality.
AUTHOR
Salvador Fandiño, <sfandino@yahoo.com>
COPYRIGHT AND LICENSE
Copyright (C) 2005 by Salvador Fandiño
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or,
at your option, any later version of Perl 5 you may have available.
perl v5.14.2 2010-04-16 Sort::Key::Maker(3pm)