#1
Count specific characters at specific column positions
Hi all, I need help. I have an input text file (input.txt) like this:
Code:
21 GTGCAACACCGTCTTGAGAGG 50
21 GACCGAGACAGAATGAAAATC 73
21 CGGGTCTGTAGTAGCAAACGC 108
21 CGAAAAATGAACCCCTTTATC 220
21 CGTGATCCTGTTGAAGGGTCG 259
Now I need to count how many times each of A, T, G, and C occurs at each character position in column 2. In this case column 2 is always 21 characters long, but the length can vary. The output (output.txt) needs to be:
Code:
A 0 1 1 1 3 3 1 2 0 3 1 1 2 1 1 2 3 2 3 0 0
T 0 0 1 0 1 1 1 1 2 0 1 2 0 1 0 1 1 1 1 2 0
G 2 3 2 2 1 0 1 1 1 1 3 0 1 1 1 2 1 2 0 2 2
C 3 0 1 2 0 1 2 1 2 1 0 1 2 1 2 0 0 0 1 1 3
I can do this in Excel, but my file is far bigger than Excel can handle. Thanks!
Last edited by Scott; 12-04-2012 at 12:07 PM. Reason: Please use code tags
#2
awk -f thie.awk myFile
where thie.awk is:
Code:
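BEGIN {
  # Characters to count; the default applies when chars is not set
  if (!chars) chars="A T G C"
  nchars=split(chars, charsA, FS)
}
{
  # Tally each character of column 2 by its position
  len=length($2)
  if (len>width) width=len   # remember the longest sequence seen
  for(i=1;i<=len;i++)
    arr[substr($2,i,1),i]++
}
END {
  # One output row per character, one count per position
  for(i=1;i<=nchars;i++) {
    printf("%s", charsA[i])
    for(j=1;j<=width;j++)
      printf("%s%d", OFS, arr[charsA[i],j])
    printf("%s", ORS)
  }
}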
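A minimal usage sketch for the script above, assuming a small made-up sample file. The names sample.txt and count_positions are hypothetical and not from the thread. The awk body follows the same counting approach, inlined so the snippet runs on its own. It keeps the longest sequence length seen, so rows of different lengths are still covered, and it takes the character list through awk's standard -v option.

```shell
#!/bin/sh
# Hypothetical three-row sample in the same layout as input.txt:
# column 2 holds the sequence. sample.txt is an assumed file name.
printf '%s\n' '3 GAT 10' '3 GTT 20' '3 CAT 30' > sample.txt

# count_positions FILE [CHARS]: per-position character counts for column 2.
count_positions() {
  awk -v chars="${2:-A T G C}" '
    BEGIN { nchars = split(chars, charsA, " ") }
    {
      len = length($2)
      if (len > width) width = len          # longest sequence seen so far
      for (i = 1; i <= len; i++)
        arr[substr($2, i, 1), i]++          # tally the character at position i
    }
    END {
      for (i = 1; i <= nchars; i++) {
        printf("%s", charsA[i])
        for (j = 1; j <= width; j++)
          printf("%s%d", OFS, arr[charsA[i], j])
        printf("%s", ORS)
      }
    }' "$1"
}

count_positions sample.txt                 # default characters: A T G C
count_positions sample.txt "A T G C N"     # additionally report N
```

For the sample, the first call prints the rows `A 0 2 0`, `T 0 1 3`, `G 2 0 0`, and `C 1 0 0`. The second call adds a row `N 0 0 0`.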
The Following User Says Thank You to vgersh99 For This Useful Post: thienxho (12-04-2012)
#3
You solved my problem. I tested your code and compared its output to my Excel counts on a file of 800K rows. Both gave the same output. I really appreciate your help.
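For files too large to cross-check in Excel, one possible sanity check is to sum each position's column in the count table. The sketch below assumes a table in the output.txt layout. The names counts.txt and column_sums are hypothetical. The sums equal the number of input rows only if every sequence has the same length and contains only the counted characters.

```shell
#!/bin/sh
# Hypothetical count table in the output.txt layout:
# a character label, then one count per position.
printf '%s\n' 'A 0 2 0' 'T 0 1 3' 'G 2 0 0' 'C 1 0 0' > counts.txt

# column_sums FILE: add up the counts at each position across all rows.
column_sums() {
  awk '
    {
      for (j = 2; j <= NF; j++) sum[j] += $j   # skip the label in field 1
      if (NF > maxnf) maxnf = NF
    }
    END {
      for (j = 2; j <= maxnf; j++)
        printf("%d%s", sum[j], (j == maxnf) ? ORS : OFS)
    }' "$1"
}

column_sums counts.txt   # compare each value with the row count, e.g. wc -l
```

For this table the function prints `3 3 3`, which matches a three-row input.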