Find lines with duplicate values in a particular column
I have a file with 5 columns. I want to pull out all records where the value in column 4 is not unique. For example, in the sample below, I would want it to print all lines except the last two.
Code:
40991764 2419 724 47182 Cand A
40992936 3591 724 47182 Cand B
40993016 3671 724 47182 Cand C
40993876 4531 724 10154 Strep A
40993878 4533 724 10154 Strep B
40993990 4645 724 58899 Cala A
40993991 4646 724 63849 Myco A
I tried this:
Code:
awk -F '\t' 'a=x[$4]{print a"\n"$0;} {x[$4]=$0;}'
It works well if a value is duplicated only once (10154 above), but if a value occurs more than twice (47182 above), it prints one of the matching lines twice (Cand B):
Code:
40991764 2419 724 47182 Cand A
40992936 3591 724 47182 Cand B
40992936 3591 724 47182 Cand B
40993016 3671 724 47182 Cand C
40993876 4531 724 10154 Strep A
40993878 4533 724 10154 Strep B
How can I get it to print each matching line only once?
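One possible approach, offered as a sketch rather than the definitive answer: the attempt above always prints the previous line together with the current one, so the middle line of a run of three is printed twice. Counting occurrences instead, and releasing the held first line only when the second occurrence arrives, avoids that. The array names `seen` and `first` and the temporary sample file are illustrative choices, and the file is assumed to be tab-separated as in the original command.

```shell
# Sample data from the question, written tab-separated to a temporary file.
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\n' \
    40991764 2419 724 47182 'Cand A' \
    40992936 3591 724 47182 'Cand B' \
    40993016 3671 724 47182 'Cand C' \
    40993876 4531 724 10154 'Strep A' \
    40993878 4533 724 10154 'Strep B' \
    40993990 4645 724 58899 'Cala A' \
    40993991 4646 724 63849 'Myco A' > "$sample"

dups=$(awk -F '\t' '
    {
        n = ++seen[$4]                    # occurrences of this column-4 value so far
        if (n == 1) {
            first[$4] = $0                # hold the first occurrence; it may turn out to be unique
        } else {
            if (n == 2) print first[$4]   # second occurrence: release the held first line once
            print                         # each repeated line is printed exactly once
        }
    }
' "$sample")

printf '%s\n' "$dups"
rm -f "$sample"
```

On the sample this prints the three 47182 lines and the two 10154 lines, each once. A held first line is only printed when its second occurrence is reached, so if duplicates are not adjacent the output order can differ from the input order. If input order matters, reading the file twice is an alternative: `awk -F '\t' 'NR==FNR {c[$4]++; next} c[$4] > 1' file file`.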
I have a text file named test2 with 3 columns, as below. We have to retrieve the distinct (non-duplicate) values from the 2nd column and display them. I have used the command below, but it gives an error.
NS3303 NS CRAFT LTD
NS3303 NS CHIRON VACCINES LTD
NS3303 NS ALLIED MEDICARE LTD
NS3303 NS... (16 Replies)
I have a file which has 12 columns and values like this
1,2,3,4,5
a,b,c,d,e
b,c,a,e,f
a,b,e,a,h
if you see the first column has duplicate values, I need to identify (print it to console) the duplicate value (which is 'a') and also remove duplicate values like below. I could be in two... (5 Replies)
Hi I have a file like this. I need to eliminate lines with first column having the same value 10 times.
13 18 1 + chromosome 1, 122638287 AGAGTATGGTCGCGGTTG
13 18 1 + chromosome 1, 128904080 AGAGTATGGTCGCGGTTG
13 18 1 - chromosome 14, 13627938 CAACCGCGACCATACTCT
13 18 1 + chromosome 1,... (5 Replies)
Hi, I've got a file that I'd like to uniquely sort based on column 2 (values in column 2 begin with "comp").
I tried sort -t -nuk2,3 file.txt
But got:
sort: multi-character tab `-nuk2,3'
"man sort" did not help me out
Any pointers?
Input:
Output: (5 Replies)
Hello experts,
I have a requirement where I have to implement two checks on a csv file:
1. Check whether any value in the first column is duplicated; if any value is duplicated, the script should exit.
2. Check that the value in the second column is either "yes" or "no"; if it is anything else... (4 Replies)
Dear Experts,
Kindly help me please,
I have a big file with duplicate values in col 11 till col 23; a new number appears every 2 rows, but each row has different x and y coordinates in col 57 till col 74.
Please, I would like to get a single value and average of the x and y... (8 Replies)
Input
1,ABCD,no
2,system,yes
3,ABCD,yes
4,XYZ,no
5,XYZ,yes
6,pc,no
Code used to find duplicate with regard to 2nd column
awk 'NR == 1 {p=$2; next} p == $2 { print "Line" NR "$2 is duplicated"} {p=$2}' FS="," ./input.csv
Now is there a wise way to de-duplicate the entire line (remove... (4 Replies)
Hello,
I have a script that is generating a tab delimited output file.
num Name PCA_A1 PCA_A2 PCA_A3
0 compound_00 -3.5054 -1.1207 -2.4372
1 compound_01 -2.2641 0.4287 -1.6120
3 compound_03 -1.3053 1.8495 ... (3 Replies)
Hi Gurus,
I have a file(weblog) as below
abc|xyz|123|agentcode=sample code abcdeeess,agentcode=sample code abcdeeess,agentcode=sample code abcdeeess|agentadd=abcd stereet 23343,agentadd=abcd stereet 23343
sss|wwq|999|agentcode=sample1 code wqwdeeess,gentcode=sample1 code... (4 Replies)
Dear folks
I have a map file of around 54K lines and some of the values in the second column have the same value and I want to find them and delete all of the same values. I looked over duplicate commands but my case is not to keep one of the duplicate values. I want to remove all of the same... (4 Replies)
Discussion started by: sajmar
4 Replies