When I use the below awk to count the unique lines in $4 for the input it seems to work. The answer is 3 because $4 is only unique 3 times in all the entries. However, when I use the same on actual data I get 56,536 and I know the answer should be 56,548. My question is there a better way to count the unique lines? Thank you .
Hi.
I have a tab separated file that has a couple nearly identical lines. When doing:
sort file | uniq > file.new
It passes through the nearly identical lines because, well, they still are unique.
a)
I want to look only at field x for uniqueness and if the content in field x is the... (1 Reply)
I have this input file
tilenet_test:clar_r5_performance:server_2:4.80762:0%:APM00083103999-009E,APM00083103999-009F
tilenet_int:clar_r5_performance:server_2:4.80762:0%:APM00083103999-00C4... (3 Replies)
I would like to print unique lines without sort or unique. Unfortunately the server I am working on does not have sort or unique. I have not been able to contact the administrator of the server to ask him to add it for several weeks. (7 Replies)
Im looking for an awk script that will take the unique values in column 5, then print and count the unique values in column 6.
CA001011500 11111 11111 -9999 201301 AAA
CA001012040 11111 11111 -9999 201301 AAA
CA001012573 11111 11111 -9999 201301 BBB
CA001012710 11111 11111 -9999 201301... (4 Replies)
Hello Team,
I need your help on the following:
My input file a.txt is as below:
3330690|373846|108471
3330690|373846|108471
0640829|459725|100001
0640829|459725|100001
3330690|373847|108471
Here row 1 and row 2 of column 1 are identical but corresponding column 2 value are... (4 Replies)
Hi Folks,
I have a file with fields as follows which has last field in multiple lines. I would like to combine a line which has three fields with single field line for as shown in expected output. Please help.
INPUT
hname01 windows appnamec1eda_p1, ... (5 Replies)
I am trying to remove all the lines and spaces where the count in $4 or $5 is greater than 1 (more than 1 letter). The file and the output are tab-delimited. Thank you :).
file
X 5811530 . G C NLGN4X
17 10544696 . GA G MYH3
9 96439004 . C ... (1 Reply)
Hello,
I am trying to count unique rows in my file based on 4 columns (2-5) and to output its frequency in a sixth column. My file is tab delimited
My input file looks like this:
Colum1 Colum2 Colum3 Colum4 Coulmn5
1.1 100 100 a b
1.1 100 100 a c
1.2 200 205 a d
1.3 300 301 a y
1.3 300... (6 Replies)
For some reason I am having difficulty performing what should be a fairly easy task. I would like to print lines of a file that have a unique value in the first field. For example, I have a large data-set with the following excerpt:
PS003,001 MZMWR/ L-DWD// *
PS003,001... (4 Replies)
Hi,
Sure it's an easy one, but it drives me insane.
input ("|" separated):
1|A,B,C,A
2|A,D,D
3|A,B,B
I would like to count the occurence of each capital letters in $2 across the entire file, knowing that duplicates in each record count as 1.
I am trying to get this output... (5 Replies)
Discussion started by: beca123456
5 Replies
LEARN ABOUT NETBSD
uniq
UNIQ(1) BSD General Commands Manual UNIQ(1)NAME
uniq -- report or filter out repeated lines in a file
SYNOPSIS
uniq [-cdu] [-f fields] [-s chars] [input_file [output_file]]
DESCRIPTION
The uniq utility reads the standard input comparing adjacent lines, and writes a copy of each unique input line to the standard output. The
second and succeeding copies of identical adjacent input lines are not written. Repeated lines in the input will not be detected if they are
not adjacent, so it may be necessary to sort the files first.
The following options are available:
-c Precede each output line with the count of the number of times the line occurred in the input, followed by a single space.
-d Don't output lines that are not repeated in the input.
-f fields
Ignore the first fields in each input line when doing comparisons. A field is a string of non-blank characters separated from adja-
cent fields by blanks. Field numbers are one based, i.e. the first field is field one.
-s chars
Ignore the first chars characters in each input line when doing comparisons. If specified in conjunction with the -f option, the
first chars characters after the first fields fields will be ignored. Character numbers are one based, i.e. the first character is
character one.
-u Don't output lines that are repeated in the input.
If additional arguments are specified on the command line, the first such argument is used as the name of an input file, the second is used
as the name of an output file.
The uniq utility exits 0 on success, and >0 if an error occurs.
COMPATIBILITY
The historic +number and -number options have been deprecated but are still supported in this implementation.
SEE ALSO sort(1)STANDARDS
The uniq utility is expected to be IEEE Std 1003.2 (``POSIX.2'') compatible.
BSD January 6, 2007 BSD