count number of distinct values in each column with awk


 
# 1  
08-24-2012

Hi!

Input:
Code:
A|B|C|D
A|F|C|E
A|B|I|C
A|T|I|B

As the thread title says, I need the number of distinct values in each column, so the output should be:
Code:
1|3|2|4

I tried several variants of this command, but none of them gives me what I need:
Code:
gawk 'BEGIN{FS=OFS="|"}{for(i=1; i<=NF; i++) a[$i]++} END {for (b in a) print b}' input

Please help!
# 2  
08-24-2012
A little bit lengthy, but it works:

Code:
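gawk '
BEGIN {
  FS = OFS = "|"
}

{
  # Key on (column, value) so equal values in different columns stay separate.
  for (i = 1; i <= NF; i++)
    a[i, $i]++
}

END {
  # NF and the fields still hold the last record here (true in gawk);
  # each field is overwritten with the distinct count for its column.
  for (i = 1; i <= NF; i++) {
    $i = 0
    for (j in a) {
      split(j, k, SUBSEP)      # k[1] = column number, k[2] = value
      if (i == k[1]) $i++
    }
  }
  print
}
' input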
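
The key difference from the attempt in the first post is the array subscript: a[$i] is indexed by the value alone, so equal values in different columns collapse into one entry, while a[i, $i] keeps one entry per (column, value) pair. awk stores such a pair as a single string with SUBSEP between the two parts, which is why split(j, k, SUBSEP) can recover the column number in END. A minimal sketch of that mechanism, where show_subsep_key is a made-up helper name and not something from the thread:

```shell
# show_subsep_key is a hypothetical name used only for this illustration.
show_subsep_key() {
  awk 'BEGIN {
    a[2, "B"] = 1                    # stored under the single string key 2 SUBSEP "B"
    for (key in a) {
      n = split(key, part, SUBSEP)   # split the key back into its two subscripts
      print n, part[1], part[2]
    }
  }'
}

show_subsep_key
# prints: 2 2 B
```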

# 3  
08-24-2012
Code:
awk 'BEGIN{FS="|"}
{
 for(i=1;i<=NF;i++)
  c[i,$i]++
 if(NF>max)
  max=NF
}
END{
 for(i in c)
 {
  split(i,b,SUBSEP)
  d[b[1]]++
 }
 for(i=1;i<=max;i++)
  printf "%s%s", d[i], (i<max ? "|" : "\n")
}' file
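
Both replies collect the (column, value) keys first and count them in END. A possible variant, not taken from the thread, keeps a running count while reading: the counter for a column is bumped only the first time a value is seen there, so END only has to print. This is a sketch under the same assumptions as above ("|" as separator, sample data from the first post); count_distinct_per_column is a hypothetical name.

```shell
# count_distinct_per_column is a hypothetical helper name, not from the thread.
count_distinct_per_column() {
  awk -F'|' '
  {
    for (i = 1; i <= NF; i++)
      if (!seen[i, $i]++)   # true only the first time this value shows up in column i
        n[i]++
    if (NF > max) max = NF  # remember the widest row
  }
  END {
    for (i = 1; i <= max; i++)
      printf "%s%s", n[i] + 0, (i < max ? "|" : "\n")
  }' "$@"
}

# Sample data from the first post, fed on standard input.
printf 'A|B|C|D\nA|F|C|E\nA|B|I|C\nA|T|I|B\n' | count_distinct_per_column
# prints 1|3|2|4
```

It does not rely on NF or the fields being available in END, so it should not need gawk specifically, and a file name can be passed instead of piping the data in.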
