awk to print out lines that do not fall between range in file


 
# 1  
Old 06-01-2017

In the awk below I am trying to print out those lines in file2 that are not between $2 and $3 in file1. Both files are
tab-delimited. I think it's close, but currently it is printing out the matches. The ---- annotations are not part of the files; they just show which lines match or fall into
the range and don't need to be printed. Thank you :).

file1
Code:
chr1 948953 948956 chr1:948953-948956 . ISG15
chr1 949363 949858 chr1:949363-949858 . ISG15
chr19 42373737 42373856 chr19:42373737-42373856 . RPS19

file2
Code:
chr1 948796 949006 chr1:948796-949006 . ISG15                  ---- line1 of file1
chr1 949313 949969 chr1:949313-949969 . ISG15                  ---- line2 of file1
chr19 42363937 42364409 chr19:42363937-42364409 . RPS19
chr19 42364286 42364565 chr19:42364286-42364565 . RPS19
chr19 42364465 42364614 chr19:42364465-42364614 . RPS19
chr19 42364794 42364965 chr19:42364794-42364965 . RPS19
chr19 42365130 42365331 chr19:42365130-42365331 . RPS19
chr19 42373050 42373334 chr19:42373050-42373334 . RPS19
chr19 42373718 42373873 chr19:42373718-42373873 . RPS19    ---- line3 of file1
chr19 42375368 42375534 chr19:42375368-42375534 . RPS19

awk
Code:
awk '
    NR==FNR{for(i=$2;i<=$3;++i) d[$1,i] = $6; next}
    d[$1,$2]{print $0}' file1 file2

current output
Code:
chr1 948953 948956 chr1:948953-948956 . ISG15
chr1 949363 949858 chr1:949363-949858 . ISG15
chr19 42373737 42373856 chr19:42373737-42373856 . RPS19

desired output
Code:
chr19 42363937 42364409 chr19:42363937-42364409 . RPS19
chr19 42364286 42364565 chr19:42364286-42364565 . RPS19
chr19 42364465 42364614 chr19:42364465-42364614 . RPS19
chr19 42364794 42364965 chr19:42364794-42364965 . RPS19
chr19 42365130 42365331 chr19:42365130-42365331 . RPS19
chr19 42373050 42373334 chr19:42373050-42373334 . RPS19
chr19 42375368 42375534 chr19:42375368-42375534 . RPS19


Last edited by cmccabe; 06-01-2017 at 01:16 PM.. Reason: fixed format
# 2  
Old 06-01-2017
Your requirements aren't clear. Are you trying to:
  1. print all lines where $1, $2 in file2 does appear in the range $1, [$2-$3] in file1 (which is what your code is currently doing),
  2. print all lines where $1, $2 in file2 does NOT appear in the range $1, [$2-$3] in file1,
  3. print all lines where no element in the range $1, [$2-$3] in file2 appears in the range $1, [$2-$3] in file1, or
  4. print all lines where at least one element in the range $1, [$2-$3] in file2 does not appear in the range $1, [$2-$3] in file1?
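
For comparison, interpretation 3 can be sketched as an interval-overlap test. This is only an illustration, not code from the thread: the array names `n`, `s`, `e` and the output file `out3.txt` are made up here, and the data is trimmed from post #1 to three columns. Checked by hand against the full sample data in post #1, interpretation 3 appears to be the one that reproduces the desired output listed there.

```shell
# Sample data trimmed from post #1 (tab-delimited, first three columns only).
printf 'chr1\t948953\t948956\nchr19\t42373737\t42373856\n' > file1
printf 'chr1\t948796\t949006\nchr19\t42373050\t42373334\nchr19\t42373718\t42373873\n' > file2

# Interpretation 3: print file2 lines whose range [$2,$3] overlaps no
# file1 range with the same $1. Each file1 range is stored as a
# start/end pair instead of one array element per position.
awk -F'\t' '
    NR==FNR { n[$1]++; s[$1,n[$1]] = $2 + 0; e[$1,n[$1]] = $3 + 0; next }
    {
        hit = 0
        for (i = 1; i <= n[$1]; i++)
            if ($2 + 0 <= e[$1,i] && $3 + 0 >= s[$1,i]) { hit = 1; break }
        if (!hit) print
    }' file1 file2 > out3.txt
cat out3.txt
```

With this trimmed input only the `chr19 42373050 42373334` line is printed; the other two file2 lines each overlap a file1 range.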
This User Gave Thanks to Don Cragun For This Post:
# 3  
Old 06-01-2017
Quote:
print all lines where $1, $2 in file2 does NOT appear in the range $1, [$2-$3] in file1,
The above is what I am trying to do, as each element is treated as a pair; that is, each $2 is combined with a $3. Basically, the opposite of my code. I can seem to print the lines in the range, but not the lines not in the range. Thank you :).

Last edited by cmccabe; 06-01-2017 at 07:04 PM.. Reason: added details
# 4  
Old 06-01-2017
Quote:
Originally Posted by cmccabe
The above is what I am trying to do, as each element is treated as a pair; that is, each $2 is combined with a $3. Basically, the opposite of my code. I can seem to print the lines in the range, but not the lines not in the range. Thank you :).
You have confused the matter more. You are not looking at $3 in file2 so it can't possibly affect the output produced by your script. If you just want to reverse the output produced by your script change it to:
Code:
awk '
    NR==FNR{for(i=$2;i<=$3;++i) d[$1,i] = $6; next}
    !d[$1,$2]{print $0}' file1 file2

or, using the default action when a condition is met:
Code:
awk '
    NR==FNR{for(i=$2;i<=$3;++i) d[$1,i] = $6; next}
    !d[$1,$2]' file1 file2

or, to take less space in memory:
Code:
awk '
    NR==FNR{for(i=$2;i<=$3;++i) d[$1,i]; next}
    !(($1,$2) in d)' file1 file2

This User Gave Thanks to Don Cragun For This Post:
# 5  
Old 06-01-2017
Sorry for the typo. How does the shorter, less-space awk work? Thank you :).
# 6  
Old 06-01-2017
When reading the 1st input file, it creates empty array elements instead of assigning values to them (so you don't need space to store the strings you were assigning to those elements). When reading the 2nd input file, it checks whether an element with the given index has been created, instead of checking whether the array element with that index holds a non-empty string or non-zero value.
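
A small sketch of that difference, using made-up keys (`a`, `b`, `c`, `x`) and a made-up output file name rather than the thread's data: `in` tests whether an element exists, while a plain `d[key]` test looks at the element's value and also creates the element as a side effect.

```shell
awk 'BEGIN {
    d["a"]            # bare reference: creates d["a"] with an empty value
    d["b"] = 0        # exists, but the value is zero
    d["c"] = "ISG15"  # exists with a non-empty string

    n = split("a b c x", k, " ")
    for (i = 1; i <= n; i++) {
        has = (k[i] in d)          # existence test, does not create anything
        val = (d[k[i]] ? 1 : 0)    # value test, creates d[k[i]] if missing
        printf "%s in=%d true=%d\n", k[i], has, val
    }
    printf "x in=%d after the value test\n", ("x" in d)
}' > in_vs_value.txt
cat in_vs_value.txt
```

Keys `a` and `b` exist but test false by value, `c` passes both tests, and `x` fails both; the last line then reports `x in=1` because the value test itself created `d["x"]`. That side effect is one more reason the `in` form keeps the array smaller.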
This User Gave Thanks to Don Cragun For This Post:
# 7  
Old 06-02-2017
Thank you very much :).