08-23-2012
count the unique records based on certain columns
Hi everyone,
I have a file result.txt with records as follows, and another file mirna.txt with a list of miRNAs, e.g. miR22, miR123, miR13, etc.
Gene Transcript miRNA
Gar Nm_111233 miR22
Gar Nm_123440 miR22
Gar Nm_129939 miR22
Hel Nm_233900 miR13
Hel Nm_678900 miR13
Bart Nm_178181 miR22
Gar Nm_789999 miR43
Now I want to count the number of unique genes for each miRNA in mirna.txt,
e.g.
miR22 2
miR13 1
miR15 0
miR43 1
Previously, I used the following command, but it counts every occurrence of the miRNA instead of counting each gene once.
for gene in `cat mirna.txt`; do awk -v gene=$gene '{for(i=1; i<=NF; i++) if ($i==gene) c++} END {print c}' result.txt>>output.txt; done;
Any help is appreciated. Thanks in advance.
Mic
Last edited by miclow; 08-23-2012 at 10:00 PM..
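One possible approach, as a sketch rather than a definitive answer: read result.txt once, remember each (gene, miRNA) pair the first time it appears, and then print a count for every name in mirna.txt. This assumes the columns are whitespace-separated with the gene in column 1 and the miRNA in column 3, that result.txt has a single header line, and that mirna.txt holds one miRNA per line. The contents of mirna.txt below are made up to match the expected output in the post.

```shell
# Recreate the sample data from the post.
cat > result.txt <<'EOF'
Gene Transcript miRNA
Gar Nm_111233 miR22
Gar Nm_123440 miR22
Gar Nm_129939 miR22
Hel Nm_233900 miR13
Hel Nm_678900 miR13
Bart Nm_178181 miR22
Gar Nm_789999 miR43
EOF

# Assumed layout of mirna.txt: one miRNA name per line.
printf '%s\n' miR22 miR13 miR15 miR43 > mirna.txt

awk '
    # First file (result.txt): skip the header, then count each
    # gene/miRNA pair only the first time it is seen.
    NR == FNR {
        if (FNR > 1 && !seen[$1, $3]++) count[$3]++
        next
    }
    # Second file (mirna.txt): print the name and its gene count;
    # adding 0 turns a missing entry into 0 instead of an empty string.
    { print $1, count[$1] + 0 }
' result.txt mirna.txt > output.txt

cat output.txt
```

Because result.txt is read only once, this also avoids rescanning the file for every miRNA, which the loop in the original command does. miRNAs that never appear in result.txt (miR15 here) are printed with a count of 0.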
LEARN ABOUT DEBIAN
bio::tools::prediction::gene
Bio::Tools::Prediction::Gene(3pm) User Contributed Perl Documentation Bio::Tools::Prediction::Gene(3pm)
NAME
Bio::Tools::Prediction::Gene - a predicted gene structure feature
SYNOPSIS
#See documentation of methods.
DESCRIPTION
A feature representing a predicted gene structure. This class actually inherits from Bio::SeqFeature::Gene::Transcript and therefore has all
that functionality, plus a few methods supporting predicted sequence features, like a predicted CDS and a predicted translation.
Exons held by an instance of this class will usually be instances of Bio::Tools::Prediction::Exon, although they do not have to be. Refer
to the documentation of the class that produced the instance.
Normally, you will not want to create an instance of this class yourself. Instead, classes representing the results of gene structure
prediction programs will do that.
FEEDBACK
Mailing Lists
User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to one
of the Bioperl mailing lists. Your participation is much appreciated.
bioperl-l@bioperl.org - General discussion
http://bioperl.org/wiki/Mailing_lists - About the mailing lists
Support
Please direct usage questions or support issues to the mailing list:
bioperl-l@bioperl.org
rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address
it. Please include a thorough description of the problem with code and data examples if at all possible.
Reporting Bugs
Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the
web:
https://redmine.open-bio.org/projects/bioperl/
AUTHOR - Hilmar Lapp
Email hlapp-at-gmx.net or hilmar.lapp-at-pharma.novartis.com
APPENDIX
The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _
predicted_cds
Title : predicted_cds
Usage : $predicted_cds_dna = $gene->predicted_cds();
$gene->predicted_cds($predicted_cds_dna);
Function: Get/Set the CDS (coding sequence) as predicted by a program.
This method is independent of an attached_seq. There is no
guarantee whatsoever that the returned CDS has anything to do
(e.g., matches) with the sequence covered by the exons as annotated
through this object.
Example :
Returns : A Bio::PrimarySeqI implementing object holding the DNA sequence
defined as coding by a prediction of a program.
Args : On set, a Bio::PrimarySeqI implementing object holding the DNA
sequence defined as coding by a prediction of a program.
predicted_protein
Title : predicted_protein
Usage : $predicted_protein_seq = $gene->predicted_protein();
$gene->predicted_protein($predicted_protein_seq);
Function: Get/Set the protein translation as predicted by a program.
This method is independent of an attached_seq. There is no
guarantee whatsoever that the returned translation has anything to
do with the sequence covered by the exons as annotated
through this object, or the sequence returned by predicted_cds(),
although it should usually be just the standard translation.
Example :
Returns : A Bio::PrimarySeqI implementing object holding the protein
translation as predicted by a program.
Args : On set, a Bio::PrimarySeqI implementing object holding the protein
translation as predicted by a program.
perl v5.14.2 2012-03-02 Bio::Tools::Prediction::Gene(3pm)