For some reason I am having difficulty performing what should be a fairly easy task. I would like to print lines of a file that have a unique value in the first field. For example, I have a large data-set with the following excerpt:
The output I desire is this:
I have tried 'sort' with flags that I expected to work, but I cannot get the result I want. For example:
I have also tried an 'awk' solution:
Both attempts give me the first of the two lines that share a repeated value in $1, such as:
However, this is not correct. Any help would be greatly appreciated.
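A filter such as awk '!seen[$1]++' or 'sort -u -k1,1' keeps one line per key, which matches the behaviour described above. If the goal is instead to print only lines whose first field occurs exactly once in the whole file, a two-pass awk is one possible approach. The sketch below assumes whitespace-separated fields; the helper name unique_first_field and the sample data are made up for illustration, since the original excerpt is not shown.

```shell
# Print only the lines whose first field occurs exactly once in the file.
# unique_first_field is a hypothetical helper name.
unique_first_field() {
    # The file is read twice.
    # Pass 1 (NR == FNR): count every value of $1.
    # Pass 2: print the lines whose $1 was counted exactly once.
    awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$1" "$1"
}

# Made-up sample data, not the poster's data set.
sample=$(mktemp)
printf '%s\n' 'alpha 10' 'beta 20' 'alpha 30' 'gamma 40' > "$sample"

unique_first_field "$sample"
# beta 20
# gamma 40
```

Reading the file twice avoids holding every line in memory and keeps the original line order.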
I want to find the top N entries for a certain field based on the values of another field.
For example if N=3, we want the 3 best values for each entry:
Entry1 ||| 100
Entry1 ||| 95
Entry1 ||| 30
Entry1 ||| 80
Entry1 ||| 50
Entry2 ||| 40
Entry2 ||| 20
Entry2 ||| 10
Entry2 ||| 50... (1 Reply)
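One way this could be sketched is to sort by entry and then by value in descending numeric order, and let awk keep the first N lines per entry. The helper name top_n_per_key is hypothetical, and the ' ||| ' separator is assumed from the excerpt above.

```shell
# Keep the n largest values for each entry; input is read from stdin.
# top_n_per_key is a hypothetical helper name.
top_n_per_key() {
    # With '|' as the sort delimiter, "Entry1 ||| 100" splits into
    # "Entry1 ", "", "", " 100", so the value is field 4.
    LC_ALL=C sort -t'|' -k1,1 -k4,4nr |
        awk -F' [|][|][|] ' -v n="$1" 'count[$1]++ < n'
}

printf '%s\n' \
    'Entry1 ||| 100' 'Entry1 ||| 95' 'Entry1 ||| 30' 'Entry1 ||| 80' \
    'Entry1 ||| 50' 'Entry2 ||| 40' 'Entry2 ||| 20' 'Entry2 ||| 10' \
    'Entry2 ||| 50' | top_n_per_key 3
# Entry1 ||| 100
# Entry1 ||| 95
# Entry1 ||| 80
# Entry2 ||| 50
# Entry2 ||| 40
# Entry2 ||| 20
```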
Hi.
I have a tab separated file that has a couple nearly identical lines. When doing:
sort file | uniq > file.new
It passes through the nearly identical lines because, well, they still are unique.
a)
I want to look only at field x for uniqueness and if the content in field x is the... (1 Reply)
Hi,
Is there any short method to print from a particular field till another field using awk?
Example File:
File1
====
1|2|acv|vbc|......|100|342
2|3|afg|nhj|.......|100|346
Expected output:
File2
====
acv|vbc|.....|100
afg|nhj|.....|100 (8 Replies)
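A possible sketch for this kind of request loops over the wanted fields in awk and joins them with the input separator. The helper name print_field_range and its two parameters are hypothetical, and the sample lines are shortened stand-ins for the elided data above.

```shell
# Print fields from position $1 up to (NF - $2); input is read from stdin.
# print_field_range is a hypothetical helper name.
print_field_range() {
    awk -F'|' -v first="$1" -v drop="$2" '{
        last = NF - drop
        out = ""
        # Rebuild the line from the selected fields, joined with "|".
        for (i = first; i <= last; i++)
            out = out (i > first ? FS : "") $i
        print out
    }'
}

# Made-up sample lines: keep field 3 through the second-to-last field.
printf '%s\n' '1|2|acv|vbc|100|342' '2|3|afg|nhj|100|346' | print_field_range 3 1
# acv|vbc|100
# afg|nhj|100
```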
Hey,
I'm sure this is answered somewhere but my Googling has turned up nothing. I have a file with data in the following format:
<description of event> at <time and date>
The description of the event is variable length and hence when the list is displayed it is hard to easily see the date (and... (8 Replies)
I have a need to print nth field based on the parameter passed. Suppose I have 3 fields in a file, passing 1 to the function should print 1st field and so on.
I have attempted the function below, but it throws an error due to incorrect awk syntax.
function calcmaxlen
{
FIELDMAXLEN=0
... (5 Replies)
I would like to print unique lines without sort or unique. Unfortunately the server I am working on does not have sort or unique. I have not been able to contact the administrator of the server to ask him to add it for several weeks. (7 Replies)
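If awk happens to be available on such a server (an assumption, since the post does not say), one common idiom keeps the first occurrence of each line without any sorting. The helper name dedupe_lines is hypothetical.

```shell
# Drop repeated lines while keeping the first occurrence and the input order.
# dedupe_lines is a hypothetical helper name; input is read from stdin.
dedupe_lines() {
    # seen[$0]++ is 0 (false) the first time a line appears, so the
    # negation is true and awk prints the line; later copies are skipped.
    awk '!seen[$0]++'
}

printf '%s\n' b a b c a | dedupe_lines
# b
# a
# c
```

The array holds every distinct line in memory, so this may be unsuitable for very large inputs.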
I am trying to use awk to print the unique entries in $2
So in the example below there are 3 lines but 2 of the lines match in $2 so only one is used in the output.
File.txt
chr17:29667512-29667673 NF1:exon.1;NF1:exon.2;NF1:exon.38;NF1:exon.4;NF1:exon.46;NF1:exon.47 703.807... (5 Replies)
Trying to print the unique values in $2 before the -, currently the count is displayed. Hopefully, the below is close. Thank you :).
file
chr2:46603668-46603902 EPAS1-902|gc=54.3 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 195.8... (3 Replies)
Hello experts,
I am converting a number into its binary output as :
read n
echo "obase=2;$n" | bc
I wish to count the maximum continuous occurrences of the digit 1.
Example :
1. The binary equivalent of 5 = 101. Hence the output must be 1.
2. The binary... (3 Replies)
In the awk below I am trying to print the entire line, along with the header row, if $2 is SNV or MNV or INDEL. If that condition is met or is true, and $3 is less than or equal to 0.05, then in $7 the sub pattern :GMAF= is found and the value after the = sign is checked. If that value is less than... (0 Replies)
Discussion started by: cmccabe
uniq
UNIQ(1) User Commands UNIQ(1)
NAME
uniq - report or omit repeated lines
SYNOPSIS
uniq [OPTION]... [INPUT [OUTPUT]]
DESCRIPTION
Discard all but one of successive identical lines from INPUT (or standard input), writing to OUTPUT (or standard output).
Mandatory arguments to long options are mandatory for short options too.
-c, --count
prefix lines by the number of occurrences
-d, --repeated
only print duplicate lines
-D, --all-repeated[=delimit-method]
print all duplicate lines; delimit-method={none(default),prepend,separate}
Delimiting is done with blank lines.
-f, --skip-fields=N
avoid comparing the first N fields
-i, --ignore-case
ignore differences in case when comparing
-s, --skip-chars=N
avoid comparing the first N characters
-u, --unique
only print unique lines
-z, --zero-terminated
end lines with 0 byte, not newline
-w, --check-chars=N
compare no more than N characters in lines
--help
display this help and exit
--version
output version information and exit
A field is a run of blanks (usually spaces and/or TABs), then non-blank characters. Fields are skipped before chars.
Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use `sort -u' without `uniq'.
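The adjacency rule can be illustrated with a small made-up input. This is only a sketch; it uses the -u options of sort and uniq.

```shell
# Made-up input: the two "b" lines are duplicates but not adjacent.
lines=$(mktemp)
printf '%s\n' b a b > "$lines"

uniq "$lines"            # prints b, a, b: nothing is dropped
sort "$lines" | uniq     # prints a, b
sort -u "$lines"         # prints a, b
sort "$lines" | uniq -u  # prints a: the only line that never repeats
```

Note the difference between the last two commands: 'sort -u' keeps one copy of every line, while 'uniq -u' drops every line that has a duplicate.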
AUTHOR
Written by Richard M. Stallman and David MacKenzie.
REPORTING BUGS
Report uniq bugs to bug-coreutils@gnu.org
GNU coreutils home page: <http://www.gnu.org/software/coreutils/>
General help using GNU software: <http://www.gnu.org/gethelp/>
COPYRIGHT
Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
SEE ALSO
The full documentation for uniq is maintained as a Texinfo manual. If the info and uniq programs are properly installed at your site, the command
info coreutils 'uniq invocation'
should give you access to the complete manual.
GNU coreutils 7.1 July 2010 UNIQ(1)