Sponsored Content
Top Forums Shell Programming and Scripting count the unique records based on certain columns Post 302691067 by miclow on Thursday 23rd of August 2012 08:34:39 PM
Old 08-23-2012
count the unique records based on certain columns

Hi everyone,

I have a file result.txt with records as following and another file mirna.txt with a list of miRNAs e.g. miR22, miR123, miR13 etc.

Gene Transcript miRNA

Gar Nm_111233 miR22
Gar Nm_123440 miR22
Gar Nm_129939 miR22
Hel Nm_233900 miR13
Hel Nm_678900 miR13
Bart Nm_178181 miR22
Gar Nm_789999 miR43

Now I want to count the number of gene for each miRNA in mirna.txt


e.g.
miR22 2
miR13 1
miR15 0
miR43 1



Previously, I used the following command but it counts every occurence of miRNA.

for gene in `cat mirna.txt`; do awk -v gene=$gene '{for(i=1; i<=NF; i++) if ($i==gene) c++} END {print c}' result.txt>>output.txt; done;


Any help is appreciated. Thanks in advance.


Mic

Last edited by miclow; 08-23-2012 at 10:00 PM..
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Record count based on a keyword in the records

Hi, Am having files with many records, i need to count and display the number of records based on the keyword in one of the column of the records. for e.g THE FILE CONTAINS TWO RECORDS LIKE. 200903031143150 0 1236060795054357lessrv1 BSNLSERVICE1 BSNLSERVICE1 ... (4 Replies)
Discussion started by: aemunathan
4 Replies

2. Shell Programming and Scripting

using awk to count no of records based on conditions

Hi I am having files with date and time stamp as the folder names like 200906051400,200906051500,200906051600 .....hence everyday 24 files will be generated i need to do certain things on this 24 files daily file contains the data like 200906050016370 0 1244141195225298lessrv3 ... (13 Replies)
Discussion started by: aemunathan
13 Replies

3. Shell Programming and Scripting

awk : extracting unique lines based on columns

Hi, snp.txt CHR_A SNP_A BP_A_st BP_A_End CHR_B BP_B SNP_B R2 p-SNP_A p-SNP_B 5 rs1988728 74904317 74904318 5 74960646 rs1427924 0.377333 0.000740085 0.013930081 5 ... (12 Replies)
Discussion started by: genehunter
12 Replies

4. UNIX for Dummies Questions & Answers

How to count specific columns and merge with unique ones?

Hi. I am not sure the title gives an optimal description of what I want to do. I have several text files that contain data in many columns. All the files are organized the same way, but the data in the columns might differ. I want to count the number of times data occur in specific columns,... (0 Replies)
Discussion started by: JamesT
0 Replies

5. Shell Programming and Scripting

Print unique records in 2 columns using awk

Is it possible to print the records that has only 1 value in 2nd column. Ex: input awex1 1 awex1 2 awex1 3 assww 1 ader34 1 ader34 2 output assww 1 (5 Replies)
Discussion started by: quincyjones
5 Replies

6. Shell Programming and Scripting

Find and count unique date values in a file based on position

Hello, I need some sort of way to extract every date contained in a file, and count how many of those dates there are. Here are the specifics: The date format I'm looking for is mm/dd/yyyy I only need to look after line 45 in the file (that's where the data begins) The columns of... (2 Replies)
Discussion started by: ronan1219
2 Replies

7. Linux

To get all the columns in a CSV file based on unique values of particular column

cat sample.csv ID,Name,no 1,AAA,1 2,BBB,1 3,AAA,1 4,BBB,1 cut -d',' -f2 sample.csv | sort | uniq this gives only the 2nd column values Name AAA BBB How to I get all the columns of CSV along with this? (1 Reply)
Discussion started by: sanvel
1 Replies

8. Shell Programming and Scripting

Merge records based on multiple columns

Hi, I have a file with 16 columns and out of these 16 columns 14 are key columns, 15 th is order column and 16th column is having information. I need to concate the 16th column based on value of 1-14th column as key in order of 15th column. Here are the example file Input File (multiple... (3 Replies)
Discussion started by: Ravi Agrawal
3 Replies

9. Shell Programming and Scripting

Insert Columns before the last Column based on the Count of Delimiters

Hi, I have a requirement where in I need to insert delimiters before the last column of the total delimiters is less than a specified number. Say if the delimiters is less than 139, I need to insert 2 columns ( with blanks) before the last field awk -F 'Ç' '{ if (NF-1 < 139)} END { "Insert 2... (5 Replies)
Discussion started by: arunkesi
5 Replies
Bio::LiveSeq::Gene(3pm) 				User Contributed Perl Documentation				   Bio::LiveSeq::Gene(3pm)

NAME
Bio::LiveSeq::Gene - Range abstract class for LiveSeq SYNOPSIS
# documentation needed DESCRIPTION
This is used as storage for all object references concerning a particular gene. AUTHOR - Joseph A.L. Insana Email: Insana@ebi.ac.uk, jinsana@gmx.net APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _ new Title : new Usage : $gene = Bio::LiveSeq::Gene->new(-name => "name", -features => $hashref -upbound => $min -downbound => $max); Function: generates a new Bio::LiveSeq::Gene Returns : reference to a new object of class Gene Errorcode -1 Args : one string and one hashreference containing all features defined for the Gene and the references to the LiveSeq objects for those features. Two labels for defining boundaries of the gene. Usually the boundaries will reflect max span of transcript, exon... features, while the DNA sequence will be created with some flanking regions (e.g. with the EMBL_SRS::gene2liveseq routine). If these two labels are not given, they will default to the start and end of the DNA object. Note : the format of the hash has to be like DNA => reference to LiveSeq::DNA object Transcripts => reference to array of transcripts objrefs Transclations => reference to array of transcripts objrefs Exons => .... Introns => .... Prim_Transcripts => .... Repeat_Units => .... Repeat_Regions => .... Only DNA and Transcripts are mandatory verbose Title : verbose Usage : $self->verbose(0) Function: Sets verbose level for how ->warn behaves -1 = silent: no warning 0 = reduced: minimal warnings 1 = default: all warnings 2 = extended: all warnings + stack trace dump 3 = paranoid: a warning becomes a throw and the program dies Note: a quick way to set all LiveSeq objects at the same verbosity level is to change the DNA level object, since they all look to that one if their verbosity_level attribute is not set. But the method offers fine tuning possibility by changing the verbose level of each object in a different way. So for example, after $loader= and $gene= have been retrieved by a program, the command $gene->verbose(0); would set the default verbosity level to 0 for all objects. Returns : the current verbosity level Args : -1,0,1,2 or 3 perl v5.14.2 2012-03-02 Bio::LiveSeq::Gene(3pm)
All times are GMT -4. The time now is 07:36 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy