Count number of unique values in each column of array
Post 303011252 by Geneanalyst on Monday 15th of January 2018 07:01:25 PM
Old 01-15-2018
Quote:
Originally Posted by Don Cragun
Copying text containing <tab> characters into a bash shell with filename completion enabled can cause all sorts of strange things to happen. It may fail for other reasons as well. (I usually use ksh instead of bash.)

Have you tried running the code I suggested in post #8 which seems to provide output in a format closer to what you now seem to want? The code Yoda and Chubler_XL suggested in posts #2 and post #3 is in the format you specified in post #1 (which is totally different from your latest output format).
Again thanks for all you do to help out all the folks out there like myself. Great forum BTW.

I am sure many people in addition to geneticists will find this code useful.

Yes, I like your latest code. However, I need the actual sample names printed instead of a header consisting of COL1, COL2, ....

To do this I am grabbing the header with sample names (columns 10 thru N) from a different file. I use:

Code:
awk 'FNR==28 {for(i=9;i<=NF;i++){printf "%s ", $i}; printf "\n"}' a.txt

in the shell right before your code to print a header with the actual sample names, which are stored in a.txt. This is the last half of my code:

Code:
# Strip columns 1-9 and write to a2.txt
awk '{for(i=10;i<=NF;i++){printf "%s ", $i}; printf "\n"}' a1.txt > a2.txt
# PRINT HEADER
awk 'FNR==28 {for(i=9;i<=NF;i++){printf "%s ", $i}; printf "\n"}' a.txt
# Print count of 0/0 0/1 1/1 ./. for each sample in the run
awk '
{ 
   mc = NF > mc ? NF : mc
   for(i=NF; i; i--) {
      T[$i]
      C[i FS $i]++
   }
}
END {
  for(v in T) {
     printf "\n%s", v
     for(i=1; i<=mc;i++) printf "\t%d",C[i FS v]
  }
  printf "\n"
}' a2.txt
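The counts above are tab-separated, while the header is printed with spaces, so the columns may not line up. A minimal sketch of a tab-separated header follows. It assumes the sample names sit in fields 10 through NF of line 28; `print_header` and the `GT` label for the first (value) column are placeholder names, not part of the original code.

```shell
# Hypothetical helper: print_header FILE
# Assumes line 28 of FILE holds the sample names in fields 10..NF.
# "GT" is a placeholder label for the first (value) column.
print_header() {
  awk 'FNR == 28 {
    printf "GT"
    for (i = 10; i <= NF; i++) printf "\t%s", $i
    printf "\n"
    exit    # nothing after line 28 is needed
  }' "$1"
}
```

Calling `print_header a.txt` in place of the header command would put one name above each count column.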


If I use your latest code, which transposes the output, I would have to modify my header code above to transpose the header containing the sample names as well. Any ideas?
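One possible approach, sketched under assumptions rather than taken from the code in post #8: read the sample names and the genotype columns in a single awk run, then print one row per sample. The function name `count_by_sample`, the `SAMPLE` label, and the `COLn` fallback are placeholders. The sketch assumes the names are in fields 10 through NF of line 28 of the first file, that the first file is not empty, and that the second file has one column per sample, like a2.txt.

```shell
# Hypothetical helper: count_by_sample HEADER_FILE DATA_FILE
# Assumes line 28 of HEADER_FILE holds the sample names in fields 10..NF
# and DATA_FILE holds one genotype column per sample (like a2.txt).
count_by_sample() {
  awk '
  # First file: remember the sample names from header line 28.
  FNR == NR {
    if (FNR == 28) for (i = 10; i <= NF; i++) name[i - 9] = $i
    next
  }
  # Second file: count each value per column.
  {
    if (NF > mc) mc = NF
    for (i = 1; i <= NF; i++) {
      if (!($i in T)) order[++nv] = $i   # keep first-seen order of values
      T[$i]
      C[i, $i]++
    }
  }
  END {
    # Header row: one column per distinct value.
    printf "SAMPLE"
    for (k = 1; k <= nv; k++) printf "\t%s", order[k]
    printf "\n"
    # One row per sample; fall back to COLn when no name was found.
    for (i = 1; i <= mc; i++) {
      printf "%s", ((i in name) ? name[i] : "COL" i)
      for (k = 1; k <= nv; k++) printf "\t%d", C[i, order[k]]
      printf "\n"
    }
  }' "$1" "$2"
}
```

With the files from this post it would be called as `count_by_sample a.txt a2.txt`. Values are listed in the order they are first seen, because `for (v in T)` does not guarantee any particular order.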
 
