Sponsored Content
Top Forums Shell Programming and Scripting count the unique records based on certain columns Post 302691067 by miclow on Thursday 23rd of August 2012 08:34:39 PM
Old 08-23-2012
count the unique records based on certain columns

Hi everyone,

I have a file result.txt with records as following and another file mirna.txt with a list of miRNAs e.g. miR22, miR123, miR13 etc.

Gene Transcript miRNA

Gar Nm_111233 miR22
Gar Nm_123440 miR22
Gar Nm_129939 miR22
Hel Nm_233900 miR13
Hel Nm_678900 miR13
Bart Nm_178181 miR22
Gar Nm_789999 miR43

Now I want to count the number of gene for each miRNA in mirna.txt


e.g.
miR22 2
miR13 1
miR15 0
miR43 1



Previously, I used the following command but it counts every occurence of miRNA.

for gene in `cat mirna.txt`; do awk -v gene=$gene '{for(i=1; i<=NF; i++) if ($i==gene) c++} END {print c}' result.txt>>output.txt; done;


Any help is appreciated. Thanks in advance.


Mic

Last edited by miclow; 08-23-2012 at 10:00 PM..
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Record count based on a keyword in the records

Hi, Am having files with many records, i need to count and display the number of records based on the keyword in one of the column of the records. for e.g THE FILE CONTAINS TWO RECORDS LIKE. 200903031143150 0 1236060795054357lessrv1 BSNLSERVICE1 BSNLSERVICE1 ... (4 Replies)
Discussion started by: aemunathan
4 Replies

2. Shell Programming and Scripting

using awk to count no of records based on conditions

Hi I am having files with date and time stamp as the folder names like 200906051400,200906051500,200906051600 .....hence everyday 24 files will be generated i need to do certain things on this 24 files daily file contains the data like 200906050016370 0 1244141195225298lessrv3 ... (13 Replies)
Discussion started by: aemunathan
13 Replies

3. Shell Programming and Scripting

awk : extracting unique lines based on columns

Hi, snp.txt CHR_A SNP_A BP_A_st BP_A_End CHR_B BP_B SNP_B R2 p-SNP_A p-SNP_B 5 rs1988728 74904317 74904318 5 74960646 rs1427924 0.377333 0.000740085 0.013930081 5 ... (12 Replies)
Discussion started by: genehunter
12 Replies

4. UNIX for Dummies Questions & Answers

How to count specific columns and merge with unique ones?

Hi. I am not sure the title gives an optimal description of what I want to do. I have several text files that contain data in many columns. All the files are organized the same way, but the data in the columns might differ. I want to count the number of times data occur in specific columns,... (0 Replies)
Discussion started by: JamesT
0 Replies

5. Shell Programming and Scripting

Print unique records in 2 columns using awk

Is it possible to print the records that has only 1 value in 2nd column. Ex: input awex1 1 awex1 2 awex1 3 assww 1 ader34 1 ader34 2 output assww 1 (5 Replies)
Discussion started by: quincyjones
5 Replies

6. Shell Programming and Scripting

Find and count unique date values in a file based on position

Hello, I need some sort of way to extract every date contained in a file, and count how many of those dates there are. Here are the specifics: The date format I'm looking for is mm/dd/yyyy I only need to look after line 45 in the file (that's where the data begins) The columns of... (2 Replies)
Discussion started by: ronan1219
2 Replies

7. Linux

To get all the columns in a CSV file based on unique values of particular column

cat sample.csv ID,Name,no 1,AAA,1 2,BBB,1 3,AAA,1 4,BBB,1 cut -d',' -f2 sample.csv | sort | uniq this gives only the 2nd column values Name AAA BBB How to I get all the columns of CSV along with this? (1 Reply)
Discussion started by: sanvel
1 Replies

8. Shell Programming and Scripting

Merge records based on multiple columns

Hi, I have a file with 16 columns and out of these 16 columns 14 are key columns, 15 th is order column and 16th column is having information. I need to concate the 16th column based on value of 1-14th column as key in order of 15th column. Here are the example file Input File (multiple... (3 Replies)
Discussion started by: Ravi Agrawal
3 Replies

9. Shell Programming and Scripting

Insert Columns before the last Column based on the Count of Delimiters

Hi, I have a requirement where in I need to insert delimiters before the last column of the total delimiters is less than a specified number. Say if the delimiters is less than 139, I need to insert 2 columns ( with blanks) before the last field awk -F 'Ç' '{ if (NF-1 < 139)} END { "Insert 2... (5 Replies)
Discussion started by: arunkesi
5 Replies
Ace::Sequence::GappedAlignment(3pm)			User Contributed Perl Documentation		       Ace::Sequence::GappedAlignment(3pm)

NAME
Ace::Sequence::GappedAlignment - Gapped alignment object SYNOPSIS
# open database connection and get an Ace::Sequence object use Ace::Sequence; # get a megabase from the middle of chromosome I $seq = Ace::Sequence->new(-name => 'CHROMOSOME_I, -db => $db, -offset => 3_000_000, -length => 1_000_000); # get all the gapped alignments @alignments = $seq->alignments('EST_GENOME'); # get the aligned segments from the first one @segs = $alignments[0]->segments; # get the position of the first aligned segment on the # source sequence: ($s_start,$s_end) = ($segs[0]->start,$segs[0]->end); # get the target position for the first aligned segment ($t_start,$t_end) = ($segs[0]->target->start,$segs[0]->target->end); DESCRIPTION
Ace::Sequence::GappedAlignment is a subclass of Ace::Sequence::Feature. It inherits all the methods of Ace::Sequence::Feature, but adds the ability to retrieve the positions of the aligned segments. Each segment is an Ace::Sequence::Feature, from which you can retrieve the source and target coordinates. OBJECT CREATION
You will not ordinarily create an Ace::Sequence::GappedAlignment object directly. Instead, objects will be created in response to a alignments() call to an Ace::Sequence object. OBJECT METHODS
Most methods are inherited from Ace::Sequence::Feature. The following methods are also supported: segments() @segments = $gene->segments; Return a list of Ace::Sequence::Feature objects corresponding to similar segments. relative() $relative = $gene->relative; $gene->relative(1); This turns on and off relative coordinates. By default, the exons and intron features will be returned in the coordinate system used by the gene. If relative() is set to a true value, then coordinates will be expressed as relative to the start of the gene. The first exon will (usually) be 1. SEE ALSO
Ace, Ace::Object, Ace::Sequence,Ace::Sequence::Homol, Ace::Sequence::Feature, Ace::Sequence::FeatureList, GFF AUTHOR
Lincoln Stein <lstein@cshl.org> with extensive help from Jean Thierry-Mieg <mieg@kaa.crbm.cnrs-mop.fr> Copyright (c) 1999, Lincoln D. Stein This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See DISCLAIMER.txt for disclaimers of warranty. POD ERRORS
Hey! The above document had some coding errors, which are explained below: Around line 166: You forgot a '=back' before '=head1' perl v5.14.2 2001-11-10 Ace::Sequence::GappedAlignment(3pm)
All times are GMT -4. The time now is 11:05 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy