Extract text between two character positions


 
# 1  
Old 01-31-2012

Greetings.

I need to extract the text between two character positions, e.g. all text between characters 4921 and 6534.

The text blocks are FASTA-format sequences of whole chromosomes, so basically a million combinations of A, T, G, and C. For example:
Code:
>Chr_1
ACCTGTTCAACTCTCAGGACTCTCAGGTCAACTCTCAG
CAACTCTCAGGAACTCTCAGGTCAACTCTCACTCTCAG
GTCAACTCTCCAGGAACTCTCCACTCTCAGAGGTCAAC
.......

I need to extract a region of genes, and I know the character positions that form its boundaries.

I need the equivalent of what this does for lines:
Code:
sed -n 'line1,line2p' > new_file.txt

But for character positions.

Thanks!
Moderator's Comments:
Next time, please use code tags for your code and data.

Last edited by vbe; 01-31-2012 at 11:39 AM.
# 2  
Old 01-31-2012
See if this works:
Code:
awk 'NR>1{p=$0;sub($1 ORS,x,p);gsub(ORS,x,p); print RS $1 ORS substr(p, 4921,6534-4921+1)}' RS=\> OFS= infile
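
The approach in this reply (drop the header line, join the sequence lines, then take a substring) can also be spelled out as a small shell function. This is only a sketch under some assumptions: `extract_region` and its argument order are made-up names for illustration, positions are taken to be 1-based and inclusive, and only sequence characters are counted, not the header or the line breaks. It also holds each whole sequence in memory, which may matter for very large chromosomes.

```shell
# extract_region FILE FIRST LAST   (hypothetical helper, not from the thread)
# For every FASTA record in FILE, print the header line followed by the
# sequence characters FIRST..LAST (1-based, inclusive). Newlines inside
# the sequence are not counted as characters.
extract_region() {
    awk -v first="$2" -v last="$3" '
        function print_region() {
            print header
            print substr(seq, first, last - first + 1)
        }
        /^>/ {                          # a header line starts a new record
            if (header != "") print_region()
            header = $0
            seq = ""
            next
        }
        { seq = seq $0 }                # join sequence lines, dropping newlines
        END { if (header != "") print_region() }
    ' "$1"
}
```

With the positions from the question, a call would look like `extract_region infile 4921 6534 > new_file.txt`.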

This User Gave Thanks to Scrutinizer For This Post:
# 3  
Old 01-31-2012
Perfect! It works! Thanks!

Moderator's Comments:
Please use only English. Thank you.

Last edited by DukeNuke2; 01-31-2012 at 12:17 PM.
# 4  
Old 01-31-2012
Nothing to thank :)