Count unique column Post: 302997595

Sponsored Content

Top Forums UNIX for Beginners Questions & Answers Count unique column Post 302997595 by Don Cragun on Wednesday 17th of May 2017 05:40:38 AM

05-17-2017

Registered User

Quote:

Originally Posted by nans

Ah yes, thank you. Though the output looks

Code:

Colum1 Colum2 Colum3 Colum4 Column5 Column6
1.1 100 100 a b^M 1
1.1 100 100 a c^M 1
1.2 200 205 a d^M 1
1.3 300 301 a y^M 2
1.4 400 410 a b^M 1
1.5 500 510 a c^M  1
1.5 500 500 a d^M  1
1.5 500 500 a y^M  2

But that should be okay, I can always use sed to remove the ^M characters. Thank you.

I don't see how this code prints out the heading line, but you can get rid of the carriage return characters in the awk script without needing to also invoke sed:

Code:

awk '{gsub(/\r/,"")}FNR>1 && FNR==NR{A[$2,$3,$4,$5]++;next} (($2,$3,$4,$5) in A){print $0,A[$2,$3,$4,$5];delete A[$2,$3,$4,$5];next}'   Input_file  Input_file

If you want the augmented header line, you might try the following (in a formatI find it a little bit easier to read):

Code:

awk '
{	gsub(/\r/, "")
}
NR==1 {	print $0, "Column6"
	next
}
FNR>1 && FNR==NR {
	A[$2, $3, $4, $5]++
	next
}
(($2, $3, $4, $5) in A) {
	print $0, A[$2, $3, $4, $5]
	delete A[$2, $3, $4, $5]
}'   OFS='\t' Input_file  Input_file

Note that the sample input and output you provided used <space> as a field delimiter but you said your files were <tab> delimited. I specified <tab> as the output field separator here assuming that your real data is <tab> delimited.

This User Gave Thanks to Don Cragun For This Post:

Don Cragun

View Public Profile for Don Cragun

Find all posts by Don Cragun

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to count unique strings

How do I count the total number of unique strings from a file using Perl? Any help is appreciated..

2. Shell Programming and Scripting

Unique count from flat file

Hello Guys I have a flat file with '|~|' delimited When I use to record count using below command awk -FS"+" ' {print $colno}' filename | wc -l the count is fine But when I am trying to find the unique number of record the o/p is always 1 awk -FS"+" ' {print $colno}'...

3. Shell Programming and Scripting

awk pattern match and count unique in column

Hi all I have a need of searching some pattern in file by month and then count unique records D11 G11 R11 -------> Pattern available in file S11 Jan$1 to $5 column contains some records in which I want to find unique for this purpose I have written script like below awk '/Jan/ ||...

4. Shell Programming and Scripting

Count frequency of unique values in specific column

Hi, I have tab-deliminated data similar to the following: dot is-big 2 dot is-round 3 dot is-gray 4 cat is-big 3 hot in-summer 5 I want to count the frequency of each individual "unique" value in the 1st column. Thus, the desired output would be as follows: dot 3 cat 1 hot 1 is...

5. Shell Programming and Scripting

awk to count using each unique value

Im looking for an awk script that will take the unique values in column 5, then print and count the unique values in column 6. CA001011500 11111 11111 -9999 201301 AAA CA001012040 11111 11111 -9999 201301 AAA CA001012573 11111 11111 -9999 201301 BBB CA001012710 11111 11111 -9999 201301...

6. Shell Programming and Scripting

Count of unique lines in field 4

When I use the below awk to count the unique lines in $4 for the input it seems to work. The answer is 3 because $4 is only unique 3 times in all the entries. However, when I use the same on actual data I get 56,536 and I know the answer should be 56,548. My question is there a better way to...

7. Shell Programming and Scripting

Count occurrence of column one unique value having unique second column value

Hello Team, I need your help on the following: My input file a.txt is as below: 3330690|373846|108471 3330690|373846|108471 0640829|459725|100001 0640829|459725|100001 3330690|373847|108471 Here row 1 and row 2 of column 1 are identical but corresponding column 2 value are...

8. Shell Programming and Scripting

Print count of unique values

Hello experts, I am converting a number into its binary output as : read n echo "obase=2;$n" | bc I wish to count the maximum continuous occurrences of the digit 1. Example : 1. The binary equivalent of 5 = 101. Hence the output must be 1. 2. The binary...

9. UNIX for Beginners Questions & Answers

Count unique words

Dear all, I would like to know how to list and count unique words in thousands number of text files. Please help me out thanks in advance

10. Shell Programming and Scripting

Count number of unique values in each column of array

What is an efficient way of counting the number of unique values in a 400 column by 1000 row array and outputting the counts per column, assuming the unique values in the array are: A, B, C, D In other words the output should look like: Value COL1 COL2 COL3 A 50 51 52...

LEARN ABOUT CENTOS

gsl-histogram

GSL-HISTOGRAM(1)					      General Commands Manual						  GSL-HISTOGRAM(1)

NAME

       gsl-histogram - compute histogram of data on stdin

SYNOPSYS

       gsl-histogram xmin xmax [n]

DESCRIPTION

       gsl-histogram  is  a demonstration program for the GNU Scientific Library.  It takes three arguments, specifying the upper and lower bounds
       of the histogram and the number of bins.  It then reads numbers from `stdin', one line at a time, and adds them	to  the  histogram.   When
       there  is no more data to read it prints out the accumulated histogram using gsl_histogram_fprintf.  If n is unspecified then bins of inte-
       ger width are used.

EXAMPLE

       Here is an example.  We generate 10000 random samples from a Cauchy distribution with a width of 30 and histogram them over the range  -100
       to 100, using 200 bins.

	    gsl-randist 0 10000 cauchy 30 | gsl-histogram -100 100 200 > histogram.dat

       A  plot	of  the  resulting histogram will show the familiar shape of the Cauchy distribution with fluctuations caused by the finite sample
       size.

	    awk '{print $1, $3 ; print $2, $3}' histogram.dat | graph -T X

SEE ALSO

       gsl(3), gsl-randist(1).

AUTHOR

       gsl-histogram was written by Brian Gough.  Copyright 1996-2000; for copying conditions see the GNU General Public Licence.

       This manual page was added by the Dirk Eddelbuettel <edd@debian.org>, the Debian GNU/Linux maintainer for GSL.

GNU
																  GSL-HISTOGRAM(1)

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to count unique strings

Discussion started by: my_Perl

2. Shell Programming and Scripting

Unique count from flat file

Discussion started by: Pratik4891

3. Shell Programming and Scripting

awk pattern match and count unique in column

Discussion started by: nex_asp

4. Shell Programming and Scripting

Count frequency of unique values in specific column

Discussion started by: owwow14

5. Shell Programming and Scripting

awk to count using each unique value

Discussion started by: ncwxpanther

6. Shell Programming and Scripting

Count of unique lines in field 4

Discussion started by: cmccabe

7. Shell Programming and Scripting

Count occurrence of column one unique value having unique second column value

Discussion started by: angshuman

8. Shell Programming and Scripting

Print count of unique values

Discussion started by: H squared

9. UNIX for Beginners Questions & Answers

Count unique words

Discussion started by: imranrasheedamu

10. Shell Programming and Scripting

Count number of unique values in each column of array

Discussion started by: Geneanalyst

LEARN ABOUT CENTOS

gsl-histogram