01-16-2012
Remove duplicate rows when >10 based on single column value
Hello, I'm trying to delete every row whose first-column value occurs more than 10 times in the file.
e.g. (shortened, with 3 rows standing in for more than 10)
a 1
a 2
a 3
b 1
c 1
gives
b 1
c 1
but in the real data a value must occur at least 11 times before its rows are deleted.
Thanks for the help
Last edited by informaticist; 01-17-2012 at 03:53 PM.
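One way this could be approached is a two-pass awk command: count how often each first-column value occurs, then print only the rows whose value stays within the limit. The following is a minimal sketch, assuming whitespace-separated columns and a regular file that can be read twice (not a pipe). The function name drop_frequent_keys and the temporary sample file are made up for illustration, and the demo lowers the limit to 2 so the short sample from the post triggers it.

```shell
#!/bin/sh
# drop_frequent_keys MAX FILE   (hypothetical helper name)
# Print the lines of FILE whose first-column value occurs at most MAX times.
# Every line of a group with more than MAX members is dropped.
drop_frequent_keys() {
    # The file is passed to awk twice:
    #   pass 1 (NR == FNR) counts each first-column value,
    #   pass 2 prints the lines whose count is within the limit.
    awk -v max="$1" '
        NR == FNR { count[$1]++; next }
        count[$1] <= max
    ' "$2" "$2"
}

# Demo on the sample data from the post, with the limit lowered to 2
# so that the three "a" rows exceed it.
sample=$(mktemp)
printf '%s\n' 'a 1' 'a 2' 'a 3' 'b 1' 'c 1' > "$sample"
drop_frequent_keys 2 "$sample"
rm -f "$sample"
```

For the actual requirement the call would be drop_frequent_keys 10 yourfile, which keeps groups of up to 10 rows and drops groups of 11 or more. Reading the file twice keeps the original row order and avoids sorting, at the cost of holding one counter per distinct first-column value in memory.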