Post 302691067 by miclow on Thursday 23rd of August 2012 08:34:39 PM
count the unique records based on certain columns

Hi everyone,

I have a file result.txt with records like the following, and another file mirna.txt with a list of miRNAs, e.g. miR22, miR123, miR13, etc.

Gene Transcript miRNA

Gar Nm_111233 miR22
Gar Nm_123440 miR22
Gar Nm_129939 miR22
Hel Nm_233900 miR13
Hel Nm_678900 miR13
Bart Nm_178181 miR22
Gar Nm_789999 miR43

Now I want to count the number of distinct genes for each miRNA listed in mirna.txt, so that a gene appearing with several transcripts is counted only once.


e.g.
miR22 2
miR13 1
miR15 0
miR43 1



Previously, I used the following command, but it counts every occurrence of each miRNA instead of each distinct gene.

for gene in `cat mirna.txt`; do awk -v gene=$gene '{for(i=1; i<=NF; i++) if ($i==gene) c++} END {print c}' result.txt>>output.txt; done;


Any help is appreciated. Thanks in advance.


Mic

Last edited by miclow; 08-23-2012 at 10:00 PM..
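
One possible approach, offered as a sketch rather than a definitive answer: let a single awk pass over result.txt remember each (gene, miRNA) pair the first time it appears, then read mirna.txt and print the tally for every listed miRNA. The layout of mirna.txt (one name per line) and its contents here (the four names from the expected output) are assumptions, as are the array names `seen` and `count`. The heredoc only recreates the sample data from the post so the example runs on its own.

```shell
# Recreate the sample data from the post (header line included).
cat > result.txt <<'EOF'
Gene Transcript miRNA
Gar Nm_111233 miR22
Gar Nm_123440 miR22
Gar Nm_129939 miR22
Hel Nm_233900 miR13
Hel Nm_678900 miR13
Bart Nm_178181 miR22
Gar Nm_789999 miR43
EOF

# Assumption: mirna.txt holds one miRNA name per line.
printf '%s\n' miR22 miR13 miR15 miR43 > mirna.txt

awk '
    # First file (result.txt): skip the header (assumed to be line 1) and
    # any blank lines, then count a (gene, miRNA) pair only the first
    # time it is seen.
    NR == FNR {
        if (FNR > 1 && NF >= 3 && !seen[$1, $3]++) count[$3]++
        next
    }
    # Second file (mirna.txt): print every listed miRNA; "+ 0" turns a
    # missing entry into 0.
    NF { print $1, count[$1] + 0 }
' result.txt mirna.txt > output.txt

cat output.txt
```

Reading result.txt once also avoids rescanning it for every miRNA, which the `for` loop above does. If result.txt has no header line, the `FNR > 1` condition should be dropped, otherwise the first record is skipped.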
 
