find if a position is between a given start and end position


 
# 8  
Old 12-04-2008

Well, it is pretty obvious that you will need to read through each file at least once; the challenge is to read through each file only once.

My thought would be to do something like the following (in loose coding):
Notes:
In the first pass, since you said each range was 36 long, I simply wrote out all thirty-six assignments by hand. You could instead loop from the first field, $1, to the second field, $2.
In the second pass, I check whether the array entry for the second field has already been set. If not, I print the two fields of the second file.

Code:
awk {
  FILENAME=file1
  valid[$1]=1
  valid[$1+1]=1
  ...
  valid[$1+35]=1

  FILENAME=file2
  if valid[$2]!=1
    then
       print $1 $2
  fi
} file1 file2
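
The loose code above is not valid awk as written. A runnable rendering might look like the sketch below. It assumes a layout the thread does not spell out: the first file has a range start in column 1, and the second file has the position to test in column 2. The function name print_unmatched and the width variable are made up for illustration.

```shell
# Hypothetical wrapper around the two-pass idea (names are illustrative).
# Assumed layout: first file = range start in column 1,
#                 second file = position to test in column 2.
print_unmatched() {
  awk -v width=36 '
    # First file (NR == FNR only while reading it): mark the 36 positions
    # covered by each range, starting at the value in column 1.
    NR == FNR {
      for (i = 0; i < width; i++) valid[$1 + i] = 1
      next
    }
    # Second file: print both fields when the position was never marked.
    !(($2 + 0) in valid) { print $1, $2 }
  ' "$1" "$2"
}
```

It would be called as print_unmatched file1 file2. The awk program sits inside single quotes so that the shell hands it to awk untouched instead of trying to run each line itself. Note that the array holds one entry per covered position, so memory use grows with the total length of all ranges.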

# 9  
Old 12-04-2008
Hi,

Thanks for the reply, but I can't seem to get it to work. Here is what I did and the error messages I got:

[jofa@em64node02 ~]$ awk {
FILENAME=/mgs/projects/sequencing/joao/cow_Solexa_16.31SDM/Solexa_data/NovoCraft/cow31_perfectReads_trimmed_start_end.txt
valid[$1]=1
awk: cmd. line:1: valid[$1+1]=1
{
awk: cmd. line:1: ^ unexpected newline or end of string
valid[$1+3]=1
[jofa@em64node02 ~]$ FILENAME=/mgs/projects/sequencing/joao/cow_Solexa_16.31SDM/Solexa_data/NovoCraft/cow31_perfectReads_trimmed_start_end.txt
[jofa@em64node02 ~]$ valid[$1]=1
[jofa@em64node02 ~]$ valid[$1+1]=1
[jofa@em64node02 ~]$ valid[$1+2]=1
[jofa@em64node02 ~]$ valid[$1+3]=1
[jofa@em64node02 ~]$ valid[$1+4]=1
[jofa@em64node02 ~]$ valid[$1+5]=1
[jofa@em64node02 ~]$ valid[$1+6]=1
[jofa@em64node02 ~]$ valid[$1+7]=1
[jofa@em64node02 ~]$ valid[$1+8]=1
[jofa@em64node02 ~]$ valid[$1+9]=1
[jofa@em64node02 ~]$ valid[$1+10]=1
[jofa@em64node02 ~]$ valid[$1+11]=1
[jofa@em64node02 ~]$ valid[$1+12]=1
[jofa@em64node02 ~]$ valid[$1+13]=1
[jofa@em64node02 ~]$ valid[$1+14]=1
[jofa@em64node02 ~]$ valid[$1+15]=1
[jofa@em64node02 ~]$ valid[$1+16]=1
[jofa@em64node02 ~]$ valid[$1+17]=1
[jofa@em64node02 ~]$ valid[$1+18]=1
[jofa@em64node02 ~]$ valid[$1+19]=1
[jofa@em64node02 ~]$ valid[$1+20]=1
[jofa@em64node02 ~]$ valid[$1+21]=1
[jofa@em64node02 ~]$ valid[$1+22]=1
[jofa@em64node02 ~]$ valid[$1+23]=1
[jofa@em64node02 ~]$ valid[$1+24]=1
[jofa@em64node02 ~]$ valid[$1+25]=1
[jofa@em64node02 ~]$ valid[$1+26]=1
[jofa@em64node02 ~]$ valid[$1+27]=1
[jofa@em64node02 ~]$ valid[$1+28]=1
[jofa@em64node02 ~]$ valid[$1+29]=1
[jofa@em64node02 ~]$ valid[$1+30]=1
[jofa@em64node02 ~]$ valid[$1+31]=1
[jofa@em64node02 ~]$ valid[$1+32]=1
[jofa@em64node02 ~]$ valid[$1+33]=1
[jofa@em64node02 ~]$ valid[$1+34]=1
[jofa@em64node02 ~]$ valid[$1+35]=1
[jofa@em64node02 ~]$ FILENAME=/mgs/projects/sequencing/joao/cow_Solexa_16.31SDM/Solexa_data/NovoCraft/cow31_var_Readpos.txt
[jofa@em64node02 ~]$ if valid[$2]!=1
> then
> print $1 $2
> fi
-bash: valid[]!=1: command not found
[jofa@em64node02 ~]$ } /mgs/projects/sequencing/joao/cow_Solexa_16.31SDM/Solexa_data/NovoCraft/cow31_perfectReads_trimmed_start_end.txt /mgs/projects/sequencing/joao/cow_Solexa_16.31SDM/Solexa_data/NovoCraft/cow31_var_Readpos.txt > /mgs/projects/sequencing/joao/cow_Solexa_16.31SDM/Solexa_data/NovoCraft/cow31_realVar.txt
-bash: syntax error near unexpected token `}'
[jofa@em64node02 ~]$
# 10  
Old 12-04-2008
What I wrote was "pseudo-code"

I was just proposing a thought process to solve the problem using awk and an array within awk. I was not trying to deal with all of the syntax of awk; hence my "in loose coding".

If this approach seems reasonable, then take a look at the syntax of awk (or maybe someone else will propose the correct syntax) and see what you come up with. I would suggest you start with a couple of small test files to verify that the program works, rather than running the first test against those two very large files.
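
A small-scale check along these lines could look like the sketch below. The file names and contents are invented for the test, and it assumes the range file holds a start and an end position in columns 1 and 2, so the first pass loops from $1 to $2 rather than writing out 36 assignments.

```shell
# Build two tiny, hand-checkable input files in a scratch directory.
# File names and contents are invented for this sketch.
workdir=$(mktemp -d)
printf '100 135\n500 510\n' > "$workdir/small_ranges.txt"      # start end
printf 'r1 100\nr2 135\nr3 136\nr4 499\nr5 505\n' > "$workdir/small_positions.txt"

# First file: mark every position from start ($1) through end ($2).
# Second file: keep records whose position ($2) was never marked.
result=$(awk '
  NR == FNR {
    for (p = $1 + 0; p <= $2 + 0; p++) valid[p] = 1
    next
  }
  !(($2 + 0) in valid) { print $1, $2 }
' "$workdir/small_ranges.txt" "$workdir/small_positions.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

With these inputs only r3 (136) and r4 (499) fall outside both ranges, so those are the two records printed. Once the output on small files matches what you worked out by hand, the same awk program can be pointed at the large files.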
 