Help with finding matching position on strings


 
# 8  
Old 03-23-2011
Hi Corona - nice code, but Pawan needs these lines concatenated because this is DNA: it is actually one long, single string of information that he merely broke into 3 lines. So if "GAT" is split across two lines (starting at the end of one line and continuing at the beginning of the next), it has to be found too.
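To illustrate the point about matches that straddle a line break, here is a minimal awk sketch. It is not anyone's posted solution, and `find_motif` is a made-up helper name. It assumes stdin holds only sequence lines (no header), joins them into one string, and reports 1-based start and end positions over the whole sequence.

```shell
# find_motif is a hypothetical helper name, not something from this thread.
# Usage: find_motif MOTIF < file_with_sequence_lines_only
find_motif() {
    awk -v motif="$1" '
        { seq = seq $0 }                 # join all input lines into one string
        END {
            len = length(motif)
            if (len == 0) exit           # nothing to search for
            offset = 0                   # characters of seq already skipped
            rest = seq
            while ((p = index(rest, motif)) > 0) {
                start = offset + p       # position within the whole sequence
                print start " to " (start + len - 1)
                offset += p              # step one character past the match start,
                rest = substr(rest, p + 1)   # so overlapping matches are found too
            }
        }'
}
```

For example, `printf 'CAGG\nATTGAT\n' | find_motif GAT` should report `4 to 6` and `8 to 10`; the first of those matches spans the line break. Holding the whole sequence in memory and re-slicing it is fine for small files but may be slow on very long sequences.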
# 9  
Old 03-24-2011
Code:
# str="GAT"
# awk -F"$str" -v V="$str"  'NF>1{for(i=1;i<NF;i++){match($0,".*"$i);print "Line:"NR,"Pos:",RSTART+RLENGTH,"to",RSTART-1+RLENGTH+length(V)}}' infile

Code:
# cat tst
CAGCAGCAAGATTTGCAGCAACAGCAACAAGTAGTGACTACAGTTGCCTCGCAAAGTCCT
CATGCAACTGCAACGGAAAAGGAGCCAGTACCCGCCGTGGTTGACGACCCACTGGAGAAC
ATGTTCGGAGATTATTCCAATGAGCCGTTCAACACCAATTTCGACGATGAATTTGGAGAT
# str="GAT"
# awk -F"$str" -v V="$str" 'NF>1{for(i=1;i<NF;i++){match($0,".*"$i);print "Line:"NR,"Pos:",RSTART+RLENGTH,"to",RSTART-1+RLENGTH+length(V)}}' tst
Line:1 Pos: 10 to 12
Line:3 Pos: 10 to 12
Line:3 Pos: 46 to 48
Line:3 Pos: 58 to 60
#

---------- Post updated at 06:59 PM ---------- Previous update was at 06:53 PM ----------

@Corona688:

What was the content of your input file?

(I ask because it looks as if you were missing the third "GAT" on the last line.)

---------- Post updated 2011-03-24 at 10:06 AM ---------- Previous update was 2011-03-23 at 06:59 PM ----------

This could easily be adapted to print one line of output per input line:

Code:
# str="GAT"
# awk -F"$str" -v V="$str" 'NF>1{for(i=1;i<NF;i++){match($0,".*"$i);x=RSTART+RLENGTH;y=y" : "x"-"x-1+length(V)};print "Line "NR y;x=y=z}' tst
Line 1 : 10-12
Line 3 : 10-12 : 46-48 : 58-60
#

On SunOS/Solaris, use nawk instead of awk.
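One caveat with the one-liners above: they pass each field's text to match() as a regular expression behind a greedy ".*". That gives the right answer on the sample file, but it can report wrong positions when the same stretch of sequence occurs again later in the line or when two matches sit directly next to each other (an empty field). A hedged alternative that scans with index() instead is sketched below. The name `motif_by_line` is made up for illustration, and positions are still per line, not over the whole sequence.

```shell
# motif_by_line is a hypothetical helper name, not part of the original posts.
# Usage: motif_by_line MOTIF < file
motif_by_line() {
    awk -v motif="$1" '
        {
            out = ""
            offset = 0                   # characters of this line already skipped
            rest = $0
            while ((p = index(rest, motif)) > 0) {
                start = offset + p       # 1-based position within the line
                out = out " : " start "-" (start + length(motif) - 1)
                offset += p              # continue one character after the match start
                rest = substr(rest, p + 1)
            }
            if (out != "") print "Line " NR out   # only lines with a match
        }'
}
```

Run as `motif_by_line GAT < tst`, it should produce the same two lines as the output shown above for the sample file, and an input line such as `GATGAT` should give `Line 1 : 1-3 : 4-6`.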

Last edited by ctsgnb; 03-24-2011 at 06:12 AM..
# 10  
Old 03-24-2011
@ctsgnb:

Thanks for the code, but as sgruenwald mentioned, I'm working with a DNA sequence, so all lines are part of the same string. I want positions measured not within each line but over the entire length of the string (made up of the letters A, T, G and C), obviously excluding the header line.

Cheers and thanks for the input, but I think Bartus11 has already provided a solution.
I'll be back with more questions.

Have a nice day, all.

---------- Post updated at 04:26 AM ---------- Previous update was at 04:23 AM ----------

By the way, can someone explain in layman's terms what the $_, $. and $& variables are? I have a hard time understanding this concept.

Cheers
# 11  
Old 03-24-2011
Hi Pawan:

Maybe you should concatenate your DNA part. Use this code:
Code:
head -n 1 <dna >dna_concat; tail -n +2 <dna | tr -d '\n' >>dna_concat

It copies the first line (the description/name of your sequence) together with its line feed, then appends the DNA sequence that follows as one concatenated line.
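The same idea can be sketched as a single awk pass. This is only an illustration under the assumption that the first line is the header and everything after it is sequence; `join_sequence` is a made-up name. Unlike the tr approach, it also ends the joined sequence with a newline.

```shell
# join_sequence is a hypothetical helper name, not something from this thread.
# Usage: join_sequence < dna > dna_concat
join_sequence() {
    awk '
        NR == 1 { print; next }          # header line: pass through unchanged
        { printf "%s", $0 }              # sequence lines: print without a newline
        END { if (NR > 1) print "" }     # terminate the joined sequence line
    '
}
```

With a small made-up sample, `printf '>header\nCAG\nGAT\n' | join_sequence` should print the header followed by the single line `CAGGAT`.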

Last edited by Franklin52; 03-25-2011 at 06:31 AM.. Reason: Please use code tags, thank you
# 12  
Old 03-24-2011
Quote:
Originally Posted by pawannoel
By the way, can someone explain in layman's terms what the $_, $. and $& variables are? I have a hard time understanding this concept.

Cheers
Check: http://www.catonmat.net/download/per....variables.pdf
# 13  
Old 03-24-2011
I did have a look at that already... thanks, though.
 