    #1  
07-03-2012
Count the lines with the same values in a column and write the output to a file

Hey everyone!

I have a tab-delimited data set, and I want to create an output file that counts the lines sharing the same values in the 2nd and 3rd columns.

My input file looks like this:

Code:
ID1   1   10M   AAATTTCCGG
ID2   5    4M    ACGT
ID3   5    8M    ACCTTGGA
ID4   5    8M    ACCTTGGA
ID5   5    8M    ACCTTGGA
ID6   20   3M   TCG
ID7   20   3M   TCG
ID8   20   12M   AACCTTGGCCTT
ID9   20   12M   AACCTTGGCCTT
ID10   20   12M   AACCTTGGCCTT

I want my output to be like this:

Code:
1    10M    1    AAATTTCCGG
5    4M     1    ACGT
5    8M     3    ACCTTGGA
20   3M    2    TCG
20   12M   3   AACCTTGGCCTT

Thanks in advance!
    #2  
07-03-2012
Try the following:

Code:
cut -f2- FileName | sort | uniq -c

The first column of the output gives the number of occurrences of each distinct line.
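
A note on this approach: cut splits on tabs by default, and a line that contains no tab at all is printed unchanged. So if the columns are really separated by spaces, the ID column survives, every line stays unique, and every count comes out as 1. A minimal sketch of the difference, using two made-up files (spaces.txt and tabs.txt are hypothetical names and contents, not the poster's data):

```shell
# Two hypothetical three-line samples: one separated by spaces, one by tabs.
printf '%s %s %s %s\n'    ID3 5 8M ACCTTGGA ID4 5 8M ACCTTGGA ID6 20 3M TCG > spaces.txt
printf '%s\t%s\t%s\t%s\n' ID3 5 8M ACCTTGGA ID4 5 8M ACCTTGGA ID6 20 3M TCG > tabs.txt

# No tab on any line: cut passes each line through whole, ID included,
# so every line is unique and every count is 1.
cut -f2- spaces.txt | sort | uniq -c

# Real tabs: the ID column is dropped, so the two identical rows collapse
# into one line with a count of 2.
cut -f2- tabs.txt | sort | uniq -c
```

If the first command's behaviour matches what you see, the file is probably not tab-delimited, and a tool that splits on any whitespace (such as awk) is the safer choice.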
    #3  
07-03-2012
radoulov
Let us know if the order of the output matters (if it does, just append | sort -nk1 -k2 to the command).

Code:
awk 'END {
  for (k in c) 
    print k, c[k], d[k] 
  }
{
  k = $2 OFS $3
  c[k]++; d[k] = $NF
  }' infile
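
For readers new to awk, here is a commented sketch of the same idea run end to end on the sample data from post #1. The rules are reordered (main rule first) purely for readability, the file name outfile is a hypothetical choice, and the sort step is only needed because awk does not guarantee any order for "for (k in c)".

```shell
# Recreate the sample input from post #1 as a tab-delimited file.
# printf reuses the format for every group of four arguments.
printf '%s\t%s\t%s\t%s\n' \
  ID1 1 10M AAATTTCCGG \
  ID2 5 4M ACGT \
  ID3 5 8M ACCTTGGA \
  ID4 5 8M ACCTTGGA \
  ID5 5 8M ACCTTGGA \
  ID6 20 3M TCG \
  ID7 20 3M TCG \
  ID8 20 12M AACCTTGGCCTT \
  ID9 20 12M AACCTTGGCCTT \
  ID10 20 12M AACCTTGGCCTT > infile

awk '
{
  k = $2 OFS $3      # key: columns 2 and 3 joined by a space
  c[k]++             # how many lines carried this key
  d[k] = $NF         # last column of the most recent line with this key
}
END {
  for (k in c)       # iteration order is unspecified
    print k, c[k], d[k]
}' infile | sort -nk1 -k2 > outfile   # outfile is a hypothetical name

cat outfile
```

The "> outfile" redirection is what writes the result to a file; drop it to see the result in the terminal instead.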

    #4  
07-03-2012
Thanks, Athix. I tried this code, but it didn't work. It gives a count of 1 for every line in the first column, which is not correct, and it keeps repeating the lines that have the same values in the 2nd, 3rd and 4th columns, which I don't want.

---------- Post updated at 03:49 PM ---------- Previous update was at 03:46 PM ----------

Thanks, radoulov. I'm a real newbie and I need more explanation. I tried to copy and paste your command into the terminal, but I couldn't work out how to give the path to the input file. Let me know why, although I know it is a stupid question! BTW, I have already sorted my file using this command:

Code:
sort -n -k1 -k2 <filename>

    #5  
07-03-2012
radoulov

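Replace <filename>, angle brackets included, with the path to your input file (for example /path/to/infile):

Code:
awk 'END {
  for (k in c)
    print k, c[k], d[k]
  }
{
  k = $2 OFS $3
  c[k]++; d[k] = $NF
  }' <filename>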

    #6  
07-03-2012
itkamaraj

Code:
$ awk '{print $3,$2,$4}' input.txt | sort | uniq -c | awk '{print $3,$2,$1,$4}' | sort -n
1 10M 1 AAATTTCCGG
5 4M 1 ACGT
5 8M 3 ACCTTGGA
20 12M 3 AACCTTGGCCTT
20 3M 2 TCG
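
Note that the last two rows above come out as 12M before 3M, while the requested output has 3M first. A plain sort -n compares the leading number of the whole line, and ties (20 vs 20) fall back to a character-by-character comparison, where "1" sorts before "3". A small sketch of one way around it, assuming the counted rows are already in a file (counted.txt and sorted.txt are hypothetical names):

```shell
# Stand-in for the counted rows from this post, deliberately out of order.
printf '%s\n' \
  '20 12M 3 AACCTTGGCCTT' \
  '5 8M 3 ACCTTGGA' \
  '20 3M 2 TCG' \
  '1 10M 1 AAATTTCCGG' \
  '5 4M 1 ACGT' > counted.txt

# Sort numerically on column 1 only, then numerically on column 2 only.
# The ",1" and ",2" end each key at its own field, and the trailing "n"
# makes sort read the leading digits of values such as 12M as the number 12.
sort -k1,1n -k2,2n counted.txt > sorted.txt

cat sorted.txt
```

In the pipeline above, the same key options could take the place of the final sort -n.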

    #7  
07-03-2012
Thanks, radoulov! It works perfectly! The only thing is that it prints the last column of my data set as the last column of the output, whereas in my real data I want the 10th column there instead. Thanks a lot!
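
For the 10th column, the likely change is to store $10 instead of $NF ($NF always means the last field of the line, whatever its position). A minimal sketch, assuming the key is still columns 2 and 3 and that every line has at least 10 whitespace-separated columns; wide.txt, wide_counts.txt and the filler values are made up for illustration:

```shell
# Hypothetical 11-column rows; only columns 2, 3 and 10 matter here.
printf '%s\n' \
  'ID3 5 8M a b c d e f ACCTTGGA x' \
  'ID4 5 8M a b c d e f ACCTTGGA y' \
  'ID6 20 3M a b c d e f TCG z' > wide.txt

awk '
{
  k = $2 OFS $3      # key: columns 2 and 3
  c[k]++             # count lines per key
  d[k] = $10         # keep the 10th column rather than $NF (the last one)
}
END {
  for (k in c)
    print k, c[k], d[k]
}' wide.txt | sort -nk1 -k2 > wide_counts.txt

cat wide_counts.txt
```

With $NF the last output column would have been x, y or z here; with $10 it is the sequence column.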