Find duplicate values in specific column and delete all the duplicate values
Dear folks
I have a map file of around 54K lines. Some values in the second column occur more than once, and I want to find those values and delete every occurrence of them. I have looked at the usual duplicate-removal commands, but they keep one copy of each duplicate, which is not what I need. I want to remove all of the repeated values in that specific column, without keeping any of them.
Say part of my input data is like this example:
My desired output is:
Thanks in advance
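One possible approach, as a minimal sketch: read the file twice with awk, counting the column values on the first pass and printing only the lines whose value was counted once on the second. The function name drop_all_dups is made up for illustration, and the sketch assumes the columns are separated by blanks or tabs (awk's default splitting) and that "deleting the values" means dropping the whole line.

```shell
# drop_all_dups COL FILE   (hypothetical helper name)
# Print only the lines of FILE whose value in column COL occurs exactly once.
# Assumes whitespace-separated columns and a regular file (it is read twice).
drop_all_dups() {
    col=$1
    file=$2
    # Pass 1 (NR == FNR): count every value seen in the chosen column.
    # Pass 2: print a line only if its value was counted exactly once.
    awk -v c="$col" 'NR == FNR { seen[$c]++; next } seen[$c] == 1' "$file" "$file"
}
```

It could be called as drop_all_dups 2 input.map > output.map, where both file names are placeholders. Because the file is passed twice, this does not work on piped input.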
I have a text file named test2 with 3 columns, as below. We have to retrieve the distinct (non-duplicate) values from the 2nd column and display them. I have used the command below, but it gives an error.
NS3303 NS CRAFT LTD
NS3303 NS CHIRON VACCINES LTD
NS3303 NS ALLIED MEDICARE LTD
NS3303 NS... (16 Replies)
I have a file which has 12 columns and values like this
1,2,3,4,5
a,b,c,d,e
b,c,a,e,f
a,b,e,a,h
As you can see, the first column has duplicate values. I need to identify the duplicate value (which is 'a'), print it to the console, and also remove the duplicate values like below. I could be in two... (5 Replies)
Hi, I have a file like this. I need to eliminate the lines where the first column has the same value 10 times.
13 18 1 + chromosome 1, 122638287 AGAGTATGGTCGCGGTTG
13 18 1 + chromosome 1, 128904080 AGAGTATGGTCGCGGTTG
13 18 1 - chromosome 14, 13627938 CAACCGCGACCATACTCT
13 18 1 + chromosome 1,... (5 Replies)
Hi, I've got a file that I'd like to uniquely sort based on column 2 (values in column 2 begin with "comp").
I tried sort -t -nuk2,3 file.txt
But got:
sort: multi-character tab `-nuk2,3'
"man sort" did not help me out
Any pointers?
Input:
Output: (5 Replies)
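The error message comes from -t, which takes the next argument as the field separator, so sort treated "-nuk2,3" as a (multi-character) separator. A minimal sketch of a corrected call follows, assuming the file is tab-separated, which the post does not state. The function name uniq_by_col2 is made up. The -n flag is dropped because values beginning with "comp" are not numbers, and the key is limited to column 2 with -k2,2.

```shell
# uniq_by_col2 FILE   (hypothetical helper name)
# Sort FILE on column 2 and keep one line per distinct column-2 value.
# Assumes tab-separated columns; for blank-separated input drop the -t option.
uniq_by_col2() {
    tab=$(printf '\t')            # a literal tab character for -t
    sort -t "$tab" -u -k2,2 "$1"  # -u compares on the key only, so one line per key survives
}
```

Which of several lines sharing a key is kept should not be relied on. If a particular one must survive, a different tool such as awk may be a better fit.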
Hello experts,
I have a requirement where I have to implement two checks on a csv file:
1. Check whether any value in the first column is duplicated; if any value is a duplicate, the script should exit.
2. Check that the value in the second column is either "yes" or "no"; if it is anything else... (4 Replies)
Dear Experts,
Kindly help me please,
I have a big file where there are duplicate values in col 11 to col 23. Every 2 rows a new number appears, but each row has different x and y coordinates in col 57 to col 74.
I would like to get a single value and the average of the x and y... (8 Replies)
Input
1,ABCD,no
2,system,yes
3,ABCD,yes
4,XYZ,no
5,XYZ,yes
6,pc,no
Code used to find duplicate with regard to 2nd column
awk 'NR == 1 {p=$2; next} p == $2 { print "Line" NR "$2 is duplicated"} {p=$2}' FS="," ./input.csv
Now is there a wise way to de-duplicate the entire line (remove... (4 Replies)
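Note that the posted awk compares each line only with the one directly before it, so on the unsorted sample it would flag XYZ (lines 4 and 5) but miss ABCD (lines 1 and 3). A sketch that remembers every value seen so far is given below. The function names are made up, and the second function assumes that "de-duplicate" means keeping the first line for each column-2 value, which the truncated post does not confirm.

```shell
# report_dups_col2 FILE   (hypothetical helper name)
# Report every line whose 2nd comma-separated field was already seen earlier.
report_dups_col2() {
    awk -F, 'seen[$2]++ { print "Line " NR ": " $2 " is duplicated" }' "$1"
}

# keep_first_col2 FILE   (hypothetical helper name)
# Print only the first line for each distinct value of field 2.
keep_first_col2() {
    # seen[$2]++ is 0 (false) the first time a value appears, so !seen[$2]++ is true.
    awk -F, '!seen[$2]++' "$1"
}
```

On the sample above, the first function should report lines 3 and 5, and the second should keep lines 1, 2, 4 and 6.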
Hello,
I have a script that is generating a tab delimited output file.
num Name PCA_A1 PCA_A2 PCA_A3
0 compound_00 -3.5054 -1.1207 -2.4372
1 compound_01 -2.2641 0.4287 -1.6120
3 compound_03 -1.3053 1.8495 ... (3 Replies)
Hi Gurus,
I have a file (weblog) as below
abc|xyz|123|agentcode=sample code abcdeeess,agentcode=sample code abcdeeess,agentcode=sample code abcdeeess|agentadd=abcd stereet 23343,agentadd=abcd stereet 23343
sss|wwq|999|agentcode=sample1 code wqwdeeess,gentcode=sample1 code... (4 Replies)
I have a file with 5 columns. I want to pull out all records where the value in column 4 is not unique. For example in the sample below, I would want it to print out all lines except for the last two.
40991764 2419 724 47182 Cand A
40992936 3591 724 47182 Cand B
40993016 3671 724 47182 Cand C... (5 Replies)
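For this kind of request, one single-pass option is to buffer the lines in awk, count the column-4 values as they arrive, and print the lines with a count above one at the end. This is only a sketch: the function name nonunique_col4 is made up, whitespace-separated columns are assumed, and the whole input is held in memory.

```shell
# nonunique_col4 [FILE...]   (hypothetical helper name)
# Print, in input order, every line whose 4th field occurs more than once.
# Reads standard input when no file is given.
nonunique_col4() {
    awk '{ line[NR] = $0; key[NR] = $4; count[$4]++ }   # buffer each line and count its key
         END {
             for (i = 1; i <= NR; i++)
                 if (count[key[i]] > 1) print line[i]   # key shared with another line
         }' "$@"
}
```

Unlike a two-pass solution, this also works on piped input, at the cost of memory proportional to the file size.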
Discussion started by: kaktus
5 Replies
LEARN ABOUT CENTOS
comm
COMM(1) User Commands COMM(1)
NAME
comm - compare two sorted files line by line
SYNOPSIS
comm [OPTION]... FILE1 FILE2
DESCRIPTION
Compare sorted files FILE1 and FILE2 line by line.
With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and
column three contains lines common to both files.
-1 suppress column 1 (lines unique to FILE1)
-2 suppress column 2 (lines unique to FILE2)
-3 suppress column 3 (lines that appear in both files)
--check-order
check that the input is correctly sorted, even if all input lines are pairable
--nocheck-order
do not check that the input is correctly sorted
--output-delimiter=STR
separate columns with STR
--help display this help and exit
--version
output version information and exit
Note, comparisons honor the rules specified by 'LC_COLLATE'.
EXAMPLES
comm -12 file1 file2
Print only lines present in both file1 and file2.
comm -3 file1 file2
Print lines in file1 not in file2, and vice versa.
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Report comm translation bugs to <http://translationproject.org/team/>
AUTHOR
Written by Richard M. Stallman and David MacKenzie.
COPYRIGHT
Copyright (C) 2013 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
SEE ALSO
join(1), uniq(1)
The full documentation for comm is maintained as a Texinfo manual. If the info and comm programs are properly installed at your site, the
command
info coreutils 'comm invocation'
should give you access to the complete manual.
GNU coreutils 8.22 June 2014 COMM(1)