Read first column and count lines in second column using awk


 
# 1  
Old 10-24-2015

Hello all,

I would like to ask for your help.

I have a huge file with two columns. Part of it is shown below:

sorted.txt:

Code:
kss23 rml.67lkj
kss23 zhh.6gf
kss23 nhd.09.fdd
kss23 hp.767.88.89
fl67 nmdsfs.56.df.67
fl67 kk.fgf.98.56.n
fl67 bgdgdfg.hjj.879.d
fl66 kl..hfh.76.ghg
fl66 loedg.fdgdfg.hdfh
fl66 pi.ccxb.879..fh
fl66 jy.dggdg.8.76.436.dgdf
fl66 rt.dgd.577
nk45 uyfdhgfh.36.65
nk45 ihddfsdg.346

I want to count the number of lines for each distinct value in the first column. The output should look like this:

Code:
kss23 4
fl67 3
fl66 5
nk45 2

So I would like an awk command that reads all the first-column values dynamically, counts the lines for each one, and prints the result as shown above.

Please help. Thanks in advance!

Last edited by Scrutinizer; 10-24-2015 at 03:17 PM.. Reason: code tags
# 2  
Old 10-24-2015
Hi, try something like:
Code:
awk '{T[$1]++} END{for(i in T) print i,T[i]}' file

or
Code:
awk '$1!=p{if(NR>1)print p,t; t=0; p=$1}{t++} END{print p,t}' file
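
Note that `for(i in T)` visits the keys in an unspecified order, so the first command may not print the groups in the order they appear in the file, while the second command relies on the file being grouped by the first column. A minimal sketch that keeps first-seen order without needing grouped input, assuming whitespace-separated columns and an awk that supports the `in` operator; the `order` and `count` arrays and the temporary sample file are illustrative names, not part of the suggestions above:

```shell
# Build a small sample file (a hypothetical subset of the data in the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
kss23 rml.67lkj
kss23 zhh.6gf
fl67 nmdsfs.56.df.67
fl66 kl..hfh.76.ghg
fl66 loedg.fdgdfg.hdfh
fl66 pi.ccxb.879..fh
EOF

# Count lines per first-column value, remembering the order in which
# each key is first seen so the output follows the input order.
result=$(awk '
    !($1 in count) { order[++n] = $1 }   # first time this key appears
    { count[$1]++ }                      # tally every line for the key
    END { for (i = 1; i <= n; i++) print order[i], count[order[i]] }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

The extra array costs one entry per distinct key, which should be small compared with the file itself.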

# 3  
Old 10-24-2015
Hi Scrutinizer,

Unfortunately, none of your solution works. Any other suggestion?

Thank you in advance for your kind help.
# 4  
Old 10-24-2015
Code:
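count=0
#gt=0
first_sw="Y"
prev_code=""
# Read the two whitespace-separated columns; the input must be grouped by column 1.
while read -r col1 col2
do
    # On the very first line, remember the first key.
    if [ "$first_sw" = "Y" ]
    then
        prev_code=$col1
        first_sw="N"
    fi
    # The key changed: print the finished group and start a new count.
    if [ "$col1" != "$prev_code" ]
    then
        echo "$prev_code $count"
        #gt=`expr $gt + $count`
        count=0
        prev_code=$col1
    fi
    count=`expr $count + 1`
done <sorted.txt
# Print the last group, which the loop never flushes.
echo "$prev_code $count"
#gt=`expr $gt + $count`
#echo total records processed $gt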

Elapsed time 3 minutes.
# 5  
Old 10-24-2015
Quote:
Originally Posted by Padavan
Hi Scrutinizer,

Unfortunately, none of your solution works. Any other suggestion?

Thank you in advance for your kind help.
Telling us "none of your solution works" without telling us how they don't work gives us no way to help resolve your problem.

If Scrutinizer's suggestions printed diagnostic messages, what were they?

If Scrutinizer's suggestions produced no diagnostics and produced output different from what you wanted, what output did they produce?

Whenever you ask for help in these forums, it helps us help you if you tell us what operating system and shell you're using. If you're using a Solaris/SunOS system, Scrutinizer would have told you to change awk in both of his suggestions to /usr/xpg4/bin/awk or nawk. What operating system and shell are you using?
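
As a rough sketch of both points, assuming a POSIX shell: the snippet below prints the operating system and login shell, then picks an awk in the order mentioned above (/usr/xpg4/bin/awk, then nawk, then plain awk) and runs the first suggestion with it. The `AWK` variable and the two-line sample input are illustrative only.

```shell
# Report the operating system and login shell, the details asked for above.
echo "OS: $(uname -s), shell: ${SHELL:-unknown}"

# Pick an awk: prefer the XPG4 awk found on Solaris, then nawk, then plain awk.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
elif command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# Run the first suggestion with the chosen awk on two sample lines.
out=$(printf 'a x\na y\n' | "$AWK" '{T[$1]++} END{for(i in T) print i,T[i]}')
echo "$out"
```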

Last edited by Don Cragun; 10-24-2015 at 07:02 PM.. Reason: Fix typos.
# 6  
Old 10-25-2015
Would this help?
Code:
cut -d" " -f1 file | sort | uniq -c
      5 fl66
      3 fl67
      4 kss23
      2 nk45
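
The output above is sorted by key and puts the count first. A sketch that matches the requested layout, assuming the file is already grouped by the first column (as sorted.txt is): dropping `sort` keeps the groups in file order, and a short awk step swaps the two fields. The sample file here is a hypothetical stand-in for the real data.

```shell
# Build a small sample file, grouped by the first column.
sample=$(mktemp)
cat > "$sample" <<'EOF'
kss23 rml.67lkj
kss23 zhh.6gf
fl67 nmdsfs.56.df.67
nk45 uyfdhgfh.36.65
nk45 ihddfsdg.346
EOF

# uniq -c counts runs of adjacent identical keys;
# awk then turns "count key" into "key count".
result=$(cut -d" " -f1 "$sample" | uniq -c | awk '{print $2, $1}')
printf '%s\n' "$result"
rm -f "$sample"
```

If the file were not grouped, the same key would be reported once per run, so `sort` would have to stay in the pipeline.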
