Sponsored Content
Top Forums Shell Programming and Scripting Count specific characters at specific column positions Post 302739575 by thienxho on Tuesday 4th of December 2012 12:06:27 PM
Old 12-04-2012
Count specific characters at specific column positions

Hi all, I need help.

I have an input text file (input.txt) like this:

Code:
21	GTGCAACACCGTCTTGAGAGG	50
21	GACCGAGACAGAATGAAAATC	73
21	CGGGTCTGTAGTAGCAAACGC	108
21	CGAAAAATGAACCCCTTTATC	220
21	CGTGATCCTGTTGAAGGGTCG	259

Now I need to count A/T/G/C numbers at each character location in column 2, in this case is always 21 characters, but can be variable.

Output (output.txt) will need to be:

Code:
A	0	1	1	1	3	3	1	2	0	3	1	1	2	1	1	2	3	2	3	0	0
T	0	0	1	0	1	1	1	1	2	0	1	2	0	1	0	1	1	1	1	2	0
G	2	3	2	2	1	0	1	1	1	1	3	0	1	1	1	2	1	2	0	2	2
C	3	0	1	2	0	1	2	1	2	1	0	1	2	1	2	0	0	0	1	1	3

I can do this in Excel, but my file is way bigger than Excel can handle.

Thanks!

Last edited by Scott; 12-04-2012 at 01:07 PM.. Reason: Please use code tags
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

count characters in specific records

I have a text file which represents a http flow like this: HTTP/1.1 200 OK Date: Fri, 23 Jan 2009 17:16:24 GMT Server: Apache Last-Modified: Fri, 23 Jan 2009 17:08:03 GMT Accept-Ranges: bytes Cache-Control: max-age=540 Expires: Fri, 23 Jan 2009 17:21:31 GMT Vary: Accept-Encoding ... (1 Reply)
Discussion started by: littleboyblu
1 Replies

2. Shell Programming and Scripting

Insert a text from a specific row into a specific column using SED or AWK

Hi, I am having trouble converting a text file. I have been working for this whole day now, still i couldn't make it. Here is how the text file looks: _______________________________________________________ DEVICE STATUS INFORMATION FOR LOCATION 1: OPER STATES: Disabled E:Enabled ... (5 Replies)
Discussion started by: Issemael
5 Replies

3. Shell Programming and Scripting

Assigning a specific format to a specific column in a text file using awk and printf

Hi, I have the following text file: 8 T1mapping_flip02 ok 128 108 30 1 665000-000008-000001.dcm 9 T1mapping_flip05 ok 128 108 30 1 665000-000009-000001.dcm 10 T1mapping_flip10 ok 128 108 30 1 665000-000010-000001.dcm 11 T1mapping_flip15 ok 128 108 30... (2 Replies)
Discussion started by: goodbenito
2 Replies

4. UNIX for Dummies Questions & Answers

Unix command to count the number of files with specific characters in name

Hey all, I'm looking for a command that will search a directory (and all subdirectories) and give me a file count for the number of files that contain specific characters within its filename. e.g. I want to find the number of files that contain "-a.jpg" in their name. All the searching I've... (6 Replies)
Discussion started by: murphysm
6 Replies

5. Shell Programming and Scripting

How to count occurrences in a specific column

Hi, I need help to count the number of occurrences in $3 of file1.txt. I only know how to count by checking one by one and the code is like this: awk '$3 ~ /aku hanya poyo/ {++c} END {print c}' FS="\t" file1.txt But this is not wise to do as i have hundreds of different occurrences in that... (10 Replies)
Discussion started by: redse171
10 Replies

6. Shell Programming and Scripting

Can't figure out how to find specific characters in specific columns

I am trying to find a specific set of characters in a long file. I only want to find the characters in column 265 for 4 bytes. Is there a search for that? I tried cut but couldn't get it to work. Ex. I want to find '9999' in column 265 for 4 bytes. If it is in there, I want it to print... (12 Replies)
Discussion started by: Drenhead
12 Replies

7. Shell Programming and Scripting

Count specific column values

Hi all: quick question! I have the following data that resembles some thing like this: i am tired tired am i what is up hello people cool I want to count (or at least isolate) all of the unique elements in the 2nd column. I have tried this: cut -f 2 | uniq 'input' which does... (3 Replies)
Discussion started by: owwow14
3 Replies

8. Shell Programming and Scripting

Count frequency of unique values in specific column

Hi, I have tab-deliminated data similar to the following: dot is-big 2 dot is-round 3 dot is-gray 4 cat is-big 3 hot in-summer 5 I want to count the frequency of each individual "unique" value in the 1st column. Thus, the desired output would be as follows: dot 3 cat 1 hot 1 is... (5 Replies)
Discussion started by: owwow14
5 Replies

9. Shell Programming and Scripting

Overwrite specific column in xml file with the specific column from adjacent line

I have an xml file dumped from rrd file, that I want to "patch" so the xml file doesn't contain any blank hole in the resulting graph of the rrd file. Here is the file. <!-- 2015-10-12 14:00:00 WIB / 1444633200 --> <row><v> 4.0419731265e+07 </v><v> 4.5045912770e+06... (2 Replies)
Discussion started by: rk4k
2 Replies

10. Shell Programming and Scripting

Search and replace specific positions of specific lines

Hi, I have a file with hundreds of lines. I want to search for particular lines starting with 4000, search and replace the 137-139 position characters; which will be '000', with '036'. Can all of this be done without opening a temp file and then moving that temp file to the original file name. ... (7 Replies)
Discussion started by: dsid
7 Replies
PR(1)							      General Commands Manual							     PR(1)

NAME
pr - print file SYNOPSIS
pr [ option ... ] [ file ... ] DESCRIPTION
Pr produces a printed listing of one or more files on its standard output. The output is separated into pages headed by a date, the name of the file or a specified header, and the page number. With no file arguments, pr prints its standard input. Options apply to all following files but may be reset between files: -n Produce n-column output. +n Begin printing with page n. -b Balance columns on last page, in case of multi-column output. -d Double space. -en Set the tab stops for input text every n spaces. -h Take the next argument as a page header (file by default). -in Replace sequences of blanks in the output by tabs, using tab stops set every n spaces. -f Use formfeeds to separate pages. -ln Take the length of the page to be n lines instead of the default 66. -m Print all files simultaneously, each in one column. -n Number the lines of each file. -on Offset the left margin n character positions. -sc Separate columns by the single character c instead of aligning them with white space. A missing c is taken to be a tab. -t Do not print the 5-line header or the 5-line trailer normally supplied for each page. -wn For multi-column output, take the width of the page to be n characters instead of the default 72. SOURCE
/sys/src/cmd/pr.c SEE ALSO
cat(1), lp(1) PR(1)
All times are GMT -4. The time now is 11:28 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy