Sponsored Content
Top Forums Shell Programming and Scripting Count specific characters at specific column positions Post 302739575 by thienxho on Tuesday 4th of December 2012 12:06:27 PM
Old 12-04-2012
Count specific characters at specific column positions

Hi all, I need help.

I have an input text file (input.txt) like this:

Code:
21	GTGCAACACCGTCTTGAGAGG	50
21	GACCGAGACAGAATGAAAATC	73
21	CGGGTCTGTAGTAGCAAACGC	108
21	CGAAAAATGAACCCCTTTATC	220
21	CGTGATCCTGTTGAAGGGTCG	259

Now I need to count A/T/G/C numbers at each character location in column 2, in this case is always 21 characters, but can be variable.

Output (output.txt) will need to be:

Code:
A	0	1	1	1	3	3	1	2	0	3	1	1	2	1	1	2	3	2	3	0	0
T	0	0	1	0	1	1	1	1	2	0	1	2	0	1	0	1	1	1	1	2	0
G	2	3	2	2	1	0	1	1	1	1	3	0	1	1	1	2	1	2	0	2	2
C	3	0	1	2	0	1	2	1	2	1	0	1	2	1	2	0	0	0	1	1	3

I can do this in Excel, but my file is way bigger than Excel can handle.

Thanks!

Last edited by Scott; 12-04-2012 at 01:07 PM.. Reason: Please use code tags
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

count characters in specific records

I have a text file which represents a http flow like this: HTTP/1.1 200 OK Date: Fri, 23 Jan 2009 17:16:24 GMT Server: Apache Last-Modified: Fri, 23 Jan 2009 17:08:03 GMT Accept-Ranges: bytes Cache-Control: max-age=540 Expires: Fri, 23 Jan 2009 17:21:31 GMT Vary: Accept-Encoding ... (1 Reply)
Discussion started by: littleboyblu
1 Replies

2. Shell Programming and Scripting

Insert a text from a specific row into a specific column using SED or AWK

Hi, I am having trouble converting a text file. I have been working for this whole day now, still i couldn't make it. Here is how the text file looks: _______________________________________________________ DEVICE STATUS INFORMATION FOR LOCATION 1: OPER STATES: Disabled E:Enabled ... (5 Replies)
Discussion started by: Issemael
5 Replies

3. Shell Programming and Scripting

Assigning a specific format to a specific column in a text file using awk and printf

Hi, I have the following text file: 8 T1mapping_flip02 ok 128 108 30 1 665000-000008-000001.dcm 9 T1mapping_flip05 ok 128 108 30 1 665000-000009-000001.dcm 10 T1mapping_flip10 ok 128 108 30 1 665000-000010-000001.dcm 11 T1mapping_flip15 ok 128 108 30... (2 Replies)
Discussion started by: goodbenito
2 Replies

4. UNIX for Dummies Questions & Answers

Unix command to count the number of files with specific characters in name

Hey all, I'm looking for a command that will search a directory (and all subdirectories) and give me a file count for the number of files that contain specific characters within its filename. e.g. I want to find the number of files that contain "-a.jpg" in their name. All the searching I've... (6 Replies)
Discussion started by: murphysm
6 Replies

5. Shell Programming and Scripting

How to count occurrences in a specific column

Hi, I need help to count the number of occurrences in $3 of file1.txt. I only know how to count by checking one by one and the code is like this: awk '$3 ~ /aku hanya poyo/ {++c} END {print c}' FS="\t" file1.txt But this is not wise to do as i have hundreds of different occurrences in that... (10 Replies)
Discussion started by: redse171
10 Replies

6. Shell Programming and Scripting

Can't figure out how to find specific characters in specific columns

I am trying to find a specific set of characters in a long file. I only want to find the characters in column 265 for 4 bytes. Is there a search for that? I tried cut but couldn't get it to work. Ex. I want to find '9999' in column 265 for 4 bytes. If it is in there, I want it to print... (12 Replies)
Discussion started by: Drenhead
12 Replies

7. Shell Programming and Scripting

Count specific column values

Hi all: quick question! I have the following data that resembles some thing like this: i am tired tired am i what is up hello people cool I want to count (or at least isolate) all of the unique elements in the 2nd column. I have tried this: cut -f 2 | uniq 'input' which does... (3 Replies)
Discussion started by: owwow14
3 Replies

8. Shell Programming and Scripting

Count frequency of unique values in specific column

Hi, I have tab-deliminated data similar to the following: dot is-big 2 dot is-round 3 dot is-gray 4 cat is-big 3 hot in-summer 5 I want to count the frequency of each individual "unique" value in the 1st column. Thus, the desired output would be as follows: dot 3 cat 1 hot 1 is... (5 Replies)
Discussion started by: owwow14
5 Replies

9. Shell Programming and Scripting

Overwrite specific column in xml file with the specific column from adjacent line

I have an xml file dumped from rrd file, that I want to "patch" so the xml file doesn't contain any blank hole in the resulting graph of the rrd file. Here is the file. <!-- 2015-10-12 14:00:00 WIB / 1444633200 --> <row><v> 4.0419731265e+07 </v><v> 4.5045912770e+06... (2 Replies)
Discussion started by: rk4k
2 Replies

10. Shell Programming and Scripting

Search and replace specific positions of specific lines

Hi, I have a file with hundreds of lines. I want to search for particular lines starting with 4000, search and replace the 137-139 position characters; which will be '000', with '036'. Can all of this be done without opening a temp file and then moving that temp file to the original file name. ... (7 Replies)
Discussion started by: dsid
7 Replies
TEXT2PS(L)																TEXT2PS(L)

NAME
text2ps - convert text files to PostScript SYNOPSIS
text2ps [ options ] [ files ] DESCRIPTION
Text2ps reads the input files (standard input if none are specified) and produces PostScript code which, when fed to a PostScript printer, will print the files. With text2ps it is possible to select any font, point size and number of columns. Options and files can be inter- mixed on the command line. Options are effective for all following files until they are overridden. Options Here follows a list of options that text2ps recognizes. Most numeric arguments are significant to one decimal place. Options are evalu- ated from left to right. Later options override earlier ones. -# n Print n copies of each page. (Default 1.) -c n Print in n columns. (Default 1.) -f font Print using font font. (Default Courier.) -p n Print with point size n. (Default 9.) -v n Use a vertical spacing of n points. If the vertical spacing is set to 0, the spacing will be 1.2 times the point size. (Default 0.) -l n Print n lines per column. When the line count is 0, print as many lines as will fit. (Default 0.) -r [p|l] Set the orientation to either portrait mode (p) or landscape mode (l). (Default p.) -b [+|-] Set page break mode. An argument + will force new files to be always printed on a new page (this is the default). After - new files will be put on the same page if there are still empty columns and the number of columns, the orientation or the number of copies didn't change. New files always start new columns. (Default -.) -mt n The top margin is n points. (Default 63.) -mb n The bottom margin is n points. (Default 63.) -ml n The left margin is n points. (Default 59.) -mr n The right margin is n points. (Default 59.) -mg n The inter-column gap is n points. (Default 25.) -t [+|-] If the argument is + the name of the file being printed will be printed on each page. If the argument is - the file name will not be printed. -t + implies -b +. -T text Print text as title on each page. This implies -t - and -b +. This option can be switched off by specifying -t - or -t +. (Default no title.) -F font Set the title font to font. (Default Helvetica.) -P n Set the title point size to n. (Default 12.) -B n Draw borders around each page. The number n specifies how to draw borders. N can have any of the following values or-ed in: 1 Draw a line along the left of the page. 2 Draw a line along the bottom of the page. 4 Draw a line along the right of the page. 8 Draw a line along the top of the page. 16 Draw a line between columns. This line does not connect to the lines along the top or bottom. 32 Draw a connecting line between the line between columns and the line along the top. 64 Draw a connecting line between the line between columns and the line along the bottom. When n is 0, no border lines are drawn. (Default no bordering lines.) -w n Tab stops are set every n spaces. Set the width of the TAB character. (Default 8.) -1 Sets up options to print in one column in portrait mode with the Courier font, so that you get 66 lines on a page. Equivalent to specifying the options -c 1 -f Courier -p 9 -v 0 -r p -l 0 -mt 63 -mb 63 -ml 59 -mr 59. This is the default. -2 Sets up options to print in two columns in landscape mode with the Courier font, so that you get two 66-line columns on a page. Equivalent to specifying the options -c 2 -f Courier -p 6 -v 0 -r l -l 0 -mt 63 -mb 63 -ml 59 -mr 59 -mg 25. Together with the -1 option, this is probably the most useful option. The name - means standard input. BUGS
Too many options. There is no way to specify where the title will be placed. If the font being used is not a constant width font and there are other characters than just tabs and spaces in front of a tab, the next character may not align properly. TEXT2PS(L)
All times are GMT -4. The time now is 10:46 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy