02-08-2002
Filtering duplicate lines
Does anybody know a command that filters duplicate lines out of a file? I am looking for something similar to the uniq command, but one that handles duplicate lines no matter where they occur in the file.
UNIQ(1) User Commands UNIQ(1)
NAME
uniq - report or omit repeated lines
SYNOPSIS
uniq [OPTION]... [INPUT [OUTPUT]]
DESCRIPTION
Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output).
With no options, matching lines are merged to the first occurrence.
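The "adjacent" restriction is the behavior the question above runs into. A minimal sketch, assuming some made-up sample data (the fruit names and the variable names are illustrative, not from the man page), shows that a repeated line separated from its twin is left alone:

```shell
# Hypothetical sample data: "apple" appears three times, two of them adjacent.
input=$(printf 'apple\napple\nbanana\napple\n')

# uniq merges only the adjacent pair; the later, non-adjacent "apple" is kept.
adjacent_only=$(printf '%s\n' "$input" | uniq)

printf '%s\n' "$adjacent_only"
# prints:
# apple
# banana
# apple
```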
Mandatory arguments to long options are mandatory for short options too.
-c, --count
       prefix lines by the number of occurrences
-d, --repeated
       only print duplicate lines, one for each group
-D
       print all duplicate lines
--all-repeated[=METHOD]
       like -D, but allow separating groups with an empty line; METHOD={none(default),prepend,separate}
-f, --skip-fields=N
       avoid comparing the first N fields
--group[=METHOD]
       show all items, separating groups with an empty line; METHOD={separate(default),prepend,append,both}
-i, --ignore-case
       ignore differences in case when comparing
-s, --skip-chars=N
       avoid comparing the first N characters
-u, --unique
       only print unique lines
-z, --zero-terminated
       line delimiter is NUL, not newline
-w, --check-chars=N
       compare no more than N characters in lines
--help
       display this help and exit
--version
       output version information and exit
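As a rough illustration of the -c, -d and -u options listed above, the sketch below runs them over a small sorted sample. The sample letters and variable names are assumptions made for the example; the input is sorted first so that equal lines become adjacent.

```shell
# Hypothetical sample: sort first so equal lines end up next to each other.
sorted=$(printf 'b\na\nb\nc\na\nb\n' | sort)

# -c prefixes each distinct line with its occurrence count
# (the count column is padded with leading blanks).
counts=$(printf '%s\n' "$sorted" | uniq -c)

# -d prints one copy of each line that occurs more than once.
repeated=$(printf '%s\n' "$sorted" | uniq -d)

# -u prints only the lines that occur exactly once.
singles=$(printf '%s\n' "$sorted" | uniq -u)

printf 'counts:\n%s\nrepeated:\n%s\nsingles:\n%s\n' "$counts" "$repeated" "$singles"
```

With this sample, -d reports a and b, and -u reports only c.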
A field is a run of blanks (usually spaces and/or TABs), then non-blank characters. Fields are skipped before chars.
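Field and character skipping can be sketched on log-style lines whose first field is a timestamp. The log text below is invented for the example, and the variable names are hypothetical; the point is only that -f 1 ignores the first blank-delimited field, while -s 9 ignores the first nine characters (here the 8-character time plus one space).

```shell
# Hypothetical log lines: the same message repeated with different timestamps.
log=$(printf '%s\n' \
  '11:29:19 NFS write failed' \
  '11:29:20 NFS write failed' \
  '11:29:21 disk full')

# Skip the first field (the timestamp) when comparing adjacent lines.
by_field=$(printf '%s\n' "$log" | uniq -f 1)

# Skip the first 9 characters instead; same effect for this fixed-width prefix.
by_chars=$(printf '%s\n' "$log" | uniq -s 9)

printf '%s\n' "$by_field"
# prints:
# 11:29:19 NFS write failed
# 11:29:21 disk full
```

In both cases the first line of each group is the one that is kept.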
Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use 'sort -u' without 'uniq'. Also, comparisons honor the rules specified by 'LC_COLLATE'.
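For the original question (duplicates anywhere in the file), the note suggests 'sort -u'. The sketch below shows that, plus a widely used awk idiom that is not part of this man page: it keeps the first occurrence of each line and preserves the original order, at the cost of holding every distinct line in memory. The sample data and variable names are assumptions for illustration.

```shell
# Hypothetical sample with non-adjacent duplicates.
input=$(printf 'pear\napple\npear\nbanana\napple\n')

# sort -u removes all duplicates, but the output is in sorted order.
sorted_unique=$(printf '%s\n' "$input" | sort -u)

# awk idiom: print a line only the first time it is seen; input order is kept.
# seen[] is an associative array keyed by the whole line ($0).
first_seen=$(printf '%s\n' "$input" | awk '!seen[$0]++')

printf 'sort -u:\n%s\nawk:\n%s\n' "$sorted_unique" "$first_seen"
```

Here 'sort -u' yields apple, banana, pear, while the awk version yields pear, apple, banana. Which one fits depends on whether the original line order matters and on how much memory the distinct lines would need.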
AUTHOR
Written by Richard M. Stallman and David MacKenzie.
REPORTING BUGS
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Report uniq translation bugs to <http://translationproject.org/team/>
COPYRIGHT
Copyright (C) 2017 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
SEE ALSO
comm(1), join(1), sort(1)
Full documentation at: <http://www.gnu.org/software/coreutils/uniq>
or available locally via: info '(coreutils) uniq invocation'
GNU coreutils 8.28 January 2018 UNIQ(1)