extracting substrings

# 1  
Old 12-01-2008

Hi guys,
I am stuck on this problem. Please help.

I have two files.
FILE1 (records start with '>'):
>TC1723_3 similar to Scific_A7Q9Q3
EMSPSQDYCDDYFKLTYPCTAGAQYYGRGALPVYWNYNYGAIGEALKLDLLNHPEYIEQN
ATMAFQAAIWRWMNPMKKGQPSAHDAFVGNWKP
>TC214_2 similar to Quiet_Ref100_Q8W2B2 Cluster; Capsule catabar holesome, partial (58%)
S**ELSSCY*QRRKMRYSFLIFLTLALLLTTSSAQQCGKQAGGRVCANKLCCSQYGFCGS
SRNYCGAGCQSNCRSVASGNTESEAANAHRKNLPGHSN*SCYSF*FTMNIIMFHVC*LLR
TTNKN

FILE2 (three columns: col1 is the ID, col2 and col3 are the substring coordinates). The file is single-space separated; it is shown here with '-' for clarity.
TC1723_3 - 10 - 40
TC214_2 - 5 - 115

I need the OUTPUT FILE to look like this:
>TC1723_3 similar to Scific_A7Q9Q3 (Region 10 - 40 of 95)
DYFKLTYPCTAGAQYYGRGALPVYWNYNYGA
>TC214_2 similar to Quiet_Ref100_Q8W2B2 Cluster; n=1; Capsule catabar holesome, partial (58%) (Region 5 - 115 of 125)
SSCY*QRRKMRYSFLIFLTLALLLTTSSAQQCGKQAGGRVCANKLCCSQYGFCGSSRNYC
GAGCQSNCRSVASGNTESEAANAHRKNLPGHSN*SCYSF*FTMNIIMFHV
where (Region 10 - 40 of 95) gives the region of the substring, and 95 is the total length of the sequence following the line beginning with '>'.

Thanks in advance.

Last edited by smriti_shridhar; 12-01-2008 at 02:38 AM.. Reason: formatting
# 2  
Old 12-05-2008
Code:
awk '/^>/ { last=$0; next } !/^>/ { print last, substr($0,10,30) }' FILE1

In the unlikely case that awk reports that it doesn't know about "substr", use "nawk", "mawk", or "gawk".

Also, you said 10 - 40. I'm assuming that means starting at the 10th character and stopping at, and including, the 39th character.
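
For the full task (coordinates read from FILE2, sequence lines joined before slicing, region appended to the header), something along these lines might work. It is only a sketch under assumptions: the names extract_regions and emit are made up for illustration, and the coordinates are treated as 0-based and inclusive, because that reading reproduces the 31-character sample output for TC1723_3. If your coordinates are 1-based, drop the "+ 1" on the start. The "of N" figure is simply the computed length of the joined sequence, and the region is wrapped at 60 characters per line as in the sample output.

Code:
```shell
# Hypothetical helper: extract_regions COORDS_FILE FASTA_FILE
extract_regions() {
    awk '
        # First file (FILE2): remember start and end for each ID.
        NR == FNR { start[$1] = $2; end[$1] = $3; next }

        # Header line in FILE1: print the previous record, then start a new one.
        /^>/ { emit(); header = $0; id = substr($1, 2); seq = ""; next }

        # Any other line is sequence data; join it onto the current record.
        { seq = seq $0 }

        END { emit() }

        function emit(    region, i) {
            # Skip if there is no record yet or the ID has no coordinates.
            if (header == "" || !(id in start)) return
            # ASSUMPTION: coordinates are 0-based and inclusive.
            region = substr(seq, start[id] + 1, end[id] - start[id] + 1)
            printf "%s (Region %d - %d of %d)\n", header, start[id], end[id], length(seq)
            # Wrap the extracted region at 60 characters per line.
            for (i = 1; i <= length(region); i += 60)
                print substr(region, i, 60)
        }
    ' "$1" "$2"
}
```

Usage would be something like: extract_regions FILE2 FILE1 > OUTPUT. Records whose ID does not appear in FILE2 are silently skipped.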