The number of characters will be the same under each subheader, and the only possible characters are A, T, G, C and -.
I want to count the number of A's, T's, G's, C's and -'s vertically at every position, so that I get the following output.
Please let me know the best way to do this using awk.
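One way the per-position count could be sketched in awk is shown here. The sample sequences, the `count_columns` wrapper name and the tab-separated layout are assumptions for illustration, since the real input and desired output are not reproduced in this thread. Sequence lines that wrap under a header are joined before counting.

```shell
# Hypothetical wrapper; reads the named files, or stdin if none are given.
count_columns() {
    awk '
    # Tally one complete sequence, character by character.
    function tally(s,    i, c) {
        for (i = 1; i <= length(s); i++) {
            c = substr(s, i, 1)
            count[i, c]++
        }
        if (length(s) > width) width = length(s)
    }
    /^>/ { if (seq != "") tally(seq); seq = ""; next }  # header: flush previous sequence
         { seq = seq $0 }                               # sequence line (may wrap)
    END {
        if (seq != "") tally(seq)
        n = split("A T G C -", sym, " ")
        # Header row, then one row of counts per position.
        printf "pos"
        for (k = 1; k <= n; k++) printf "\t%s", sym[k]
        printf "\n"
        for (i = 1; i <= width; i++) {
            printf "%d", i
            for (k = 1; k <= n; k++) printf "\t%d", count[i, sym[k]]
            printf "\n"
        }
    }' "$@"
}

# Made-up sample data in the sub-header/sequence layout described above.
count_columns <<'EOF'
>A1
ATG-C
>A2
ATGCC
>A3
TTG-C
EOF
```

With this sample, the row for position 4 comes out as 0 A, 0 T, 0 G, 1 C and 2 gaps. The `count[i, c]` array is keyed on position and character together, so one pass over the file is enough.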
I have another problem as well, which occurred to me after seeing this output.
This could be an entirely different question.
I have a file in the same format as above, but now with 4 positions. Each position can hold one of two characters: a type 1 character or a type 2 character. For example, position 1 should be either T (type 1) or C (type 2); similarly, position 2 is C (type 1) or T (type 2), position 3 is A (type 1) or G (type 2), and position 4 is T (type 1) or C (type 2).
Below is the input file.
Based on this, I want to characterize all the sub-headers (>A1, A2, A3, A4, A5) in the above file so that I know which type each one is.
The desired output (no need for the part after #; it is only there to make things clearer):
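A minimal sketch of one way to do this typing follows, keeping the position table in its own file and loading it with the usual `NR == FNR` two-file idiom. The file names, the sample sequences, the `classify` wrapper and the output layout (one type number per position, `?` for any other character) are all assumptions, because the original input and output are not shown here. It also assumes a single sequence line under each sub-header.

```shell
workdir=$(mktemp -d)

# Hypothetical lookup table: position, type 1 character, type 2 character.
cat > "$workdir/types.txt" <<'EOF'
1 T C
2 C T
3 A G
4 T C
EOF

# Made-up input in the same sub-header/sequence layout.
cat > "$workdir/sites.txt" <<'EOF'
>A1
TCAT
>A2
CTGC
>A3
TTAC
EOF

# Hypothetical wrapper: classify TABLE_FILE SEQUENCE_FILE
classify() {
    awk '
    # First file: load the table, keyed by position and character.
    NR == FNR { type[$1, $2] = 1; type[$1, $3] = 2; next }
    /^>/      { name = $0; next }      # remember the sub-header
    {
        line = name
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            # Look up the type; anything not in the table prints as "?".
            line = line " " (((i, c) in type) ? type[i, c] : "?")
        }
        print line
    }' "$1" "$2"
}

classify "$workdir/types.txt" "$workdir/sites.txt"
```

For the sample data this prints `>A1 1 1 1 1`, `>A2 2 2 2 2` and `>A3 1 2 1 2`. Changing the typing rules then only means editing `types.txt`, not the awk program.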
If you really wanted to, you could embed it into awk itself, like
in the BEGIN section instead, but when there are more than three lines of it, I tend to put that in files. It is just better organization, with far less chance of typos than doing fiddly [] operations over and over.