Extracting range of characters if pattern matches


 
# 1  
Old 04-15-2014

I'm trying to compare values between files and, where they match, extract the characters that lie between those values. I need to do this for many files. The files are in two directories and each pair shares the same base name, except that one of them ends in .enr. They look like this:

Code:
 cat bat.1.enr

Code:
name,start,end
bat.1,231, 234

and another that looks like this.

Code:
head bat.1

Code:
  V1 V2  V3 V4 V5    V6 V7 V8
  0 230 -77 -1  D     xx  0  0
  1 231 -77  0  R      tt  0  0
  2 232 -77  1  T     yy  0  0
  3 233 -76 -1  Y     uu  0  0
  4 234 -76  0  U     re  0  0
  5 235 -76  1  I      dd  0  0

Essentially, I'm trying to extract the characters in column V5 of bat.1 for the rows whose V2 value lies between the start and end values given in bat.1.enr (inclusive). For this example the output should look like this:

Code:
bat.1.out

Code:
bat.1
RTYU

# 2  
Old 04-15-2014
something along these lines:
Code:
#!/bin/ksh
#set -x

for enr in *.enr
do
   # drop the header line, then read name,start,end from each remaining row
   sed 1d "$enr" | while IFS=', ' read file start end junk
   do
      awk -v start="$start" -v end="$end" '
        FNR==1 {next}                            # skip the column headers
        $2==start,$2==end {out=out $5}           # collect V5 from start row to end row
        END { print out > (FILENAME ".out") }
      ' "$file"
   done
done

or a long-winded awk one-liner 'feeding on itself':
Code:
 awk -F'[,]' -v q="'" 'FNR==1 {next} {print "awk -v start=" q $2 q " -v end=" q $3 q " -v ext=.out " q "FNR==1{next};$2==int(start),$2==int(end) {out=(!out)?$5:out $5} END{print out > (FILENAME ext)}" q " "$1}' *enr | sh
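
To see what the one-liner hands to sh, the trailing "| sh" can be dropped so that the generated command is printed rather than run. The following is only a sketch of such a dry run: the temporary directory and the recreated sample bat.1.enr are assumptions for illustration, and the awk program itself is copied unchanged from the one-liner.

```shell
#!/bin/sh
# Dry run of the generator stage. The temp dir and sample .enr file are
# illustrative assumptions; the awk program is the one-liner from above.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'name,start,end\nbat.1,231, 234\n' > bat.1.enr

# Without the trailing "| sh" the generated command is captured, not executed.
generated=$(awk -F'[,]' -v q="'" 'FNR==1 {next} {print "awk -v start=" q $2 q " -v end=" q $3 q " -v ext=.out " q "FNR==1{next};$2==int(start),$2==int(end) {out=(!out)?$5:out $5} END{print out > (FILENAME ext)}" q " "$1}' *enr)
printf '%s\n' "$generated"
```

Each data row of a .enr file produces one complete awk command line, with the start and end values quoted and the data file name as the last argument, so piping the result to sh runs one awk per row.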


Last edited by vgersh99; 04-15-2014 at 05:15 PM..
# 3  
Old 04-15-2014
try also:
Code:
# dir1 : dir with .enr files
# dir2 : dir with data files
# odir : output dir
[[ ! -d odir ]] && mkdir odir
for f in dir1/*.enr
do
   infile=${f##*/}
   infile="${infile%[.]enr}"
   awk '
   NR==2 { b=$2; e=$3;}
   NR>FNR {$0=" " $0; if (($3>=b) && ($3<=e)) o=o $6;}
   END {of=FILENAME; sub(".*/", "", of) ;  print of; print o}
   ' FS=, "$f" FS="[ \t]+" dir2/"$infile" > odir/"$infile".out
done


Last edited by rdrtx1; 04-15-2014 at 05:57 PM.. Reason: include location of files directories
# 4  
Old 04-15-2014
Hey there,

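This short loop should work for you:

Code:
for i in bat.[0-9]; do
  awk -F, '
    FNR==NR { if (FNR>1) { for (x=$2; x<=$3; x++) val[x]; file=$1 }; next }  # .enr file: mark every wanted V2 value
    FNR==1  { FS=" "; next }                                               # data file: skip header, then split on blanks
    ($2 in val) { out = out $5 }                                           # collect V5 in file order
    END { print file; print out }
  ' "$i.enr" "$i" > "$i.out"
done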

Hope this helps.
# 5  
Old 04-16-2014
You might want to try this:
Code:
awk -F, 'FNR>1  {FN=$1; LL=$2; UL=$3              # skip the header of every .enr file
                 FS=" "
                 while ((getline < FN) > 0)       # stop at EOF or on a read error
                   if ($2>=LL && $2<=UL) printf "%s", $5 > (FN ".out")
                 printf "\n" > (FN ".out")
                 close(FN)
                 FS=","
                }
        ' *.enr
cat bat.1.out 
RTYU

# 6  
Old 05-22-2014
Thanks, these worked. If a .enr file has many input rows, how can I generate a separate output for each row with the appropriate matches?
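
One possible way to handle several rows per .enr file is to store every row's range while reading the .enr file, then build one string per row while scanning the data file. This is a sketch only: the second sample row, the temporary directory, and the "<name>.<row>.out" output naming are assumptions made up for illustration and do not come from the thread.

```shell
#!/bin/sh
# Sketch: one output file per .enr row. The second .enr row and the
# "<name>.<row>.out" naming are assumptions, not from the thread.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

cat > bat.1.enr <<'EOF'
name,start,end
bat.1,231, 234
bat.1,233, 235
EOF

cat > bat.1 <<'EOF'
  V1 V2  V3 V4 V5    V6 V7 V8
  0 230 -77 -1  D     xx  0  0
  1 231 -77  0  R      tt  0  0
  2 232 -77  1  T     yy  0  0
  3 233 -76 -1  Y     uu  0  0
  4 234 -76  0  U     re  0  0
  5 235 -76  1  I      dd  0  0
EOF

for enr in *.enr; do
  data=${enr%.enr}
  awk '
    # First file (.enr): split on commas by hand and remember each row range.
    FNR==NR { if (FNR>1) { split($0, f, ","); n++; lo[n]=f[2]+0; hi[n]=f[3]+0 }; next }
    # Second file (data): skip the header, append V5 to every row whose range covers V2.
    FNR>1   { for (i=1; i<=n; i++) if ($2>=lo[i] && $2<=hi[i]) out[i]=out[i] $5 }
    # Write the data file name and the collected characters, one file per row.
    END     { for (i=1; i<=n; i++) { o=FILENAME "." i ".out"; print FILENAME > o; print out[i] > o; close(o) } }
  ' "$enr" "$data"
done

cat bat.1.1.out bat.1.2.out
```

Comparing V2 against each stored range, instead of using an awk range pattern, lets overlapping rows each collect their own characters in a single pass over the data file.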
 