Perl: Printing Multiple Lines after pattern match


 
# 1  
Old 07-12-2009

Hello People,
Need some assistance/guidance.
OUTLINE:
Two files (File1 and File2)
File1 has some ids such as
009463_3922_1827
897654_8764_5432
File2 has things along the lines of:
Query= 009463_3922_1827 length=252
(252 letters)

More stufff here

Query= 009525_3967_2963 length=249 uaccno=FIFOXZ216JYL81
(249 letters)
AND MORE STUFF HERE
-----------
PROBLEM:
Finding the IDs stored in File1 in File2 is trivial.
What I also need to capture is the rest of the record that follows each match.
For example:
This part of the code gives me the line where the match was found: Query= 009463_3922_1827 length=252 uaccno=FIFOXZ216JUM5H
while ($line2 = <INFILE2>)
{
    # $line1 holds one ID from File1 (assumed chomped);
    # \Q...\E makes the ID match literally
    if ($line2 =~ /\Q$line1\E/)
    {
        print $line2;
    }
}

Now, how can I get the other lines below this line (Query= 009463_3922_1827 length=252 uaccno=FIFOXZ216JUM5H)?
That is, everything up to the next record header:
Query= 009525_3967_2963 length=249 uaccno=FIFOXZ216JYL81
(249 letters)
>>>>>>>>>>>>>>>
a) One idea I can think of is using seek/tell.
Would this be an efficient way? How far should I seek? The while loop reads one line at a time, so I would somehow have to buffer everything until I see the next Query= pattern.
How do I find the number of bytes up to that point?

b) Using read()
How do I find the number of bytes after the pattern match?

c) Using a lookahead to read ahead after the pattern match: /ID (?=SOMETHING)/
I tried this together with until, but it is not working. Maybe my regex is incorrect.


If anyone can give me a push in the right direction (pseudocode etc.), it would be much appreciated.

I am not going to slurp the files or copy the contents of a file into an array, as the files are big.

Last edited by Deep9000; 07-17-2009 at 09:57 AM..
# 2  
Old 07-12-2009
Quote:
Originally Posted by Deep9000

...
PROBLEM:
Capture/finding the Ids stored in File1 from file2 is trivial.
What I need to capture "also" is the remaining part.
For example:
This part of the code gives me the line when it has found the match: Query= 009463_3922_1827 length=252 uaccno=FIFOXZ216JUM5H
...
...
Now how can I get to the other lines below this (Query= 009463_3922_1827 length=252 uaccno=FIFOXZ216JUM5H) line.
For example, everything until
Query= 009525_3967_2963 length=249 uaccno=FIFOXZ216JYL81
(249 letters)
...
Since you've solved the first part of the problem, given below is one way to solve the second part.

Assuming that you *know* your ID (from file1), the Perl one-liner below fetches the multiline record associated with it:

Code:
$ 
$ cat file2
Query= 009463_3922_1827 length=252
(252 letters)

Sequences producing significant alignments: (bits) Value

ref|NZ_ACCL02000008.1| Bryantella formatexigens DSM 14469 B_form... 153 2e-37

Query: 243 cccgcacacg 252
|||||||||
Sbjct: 89219 accgcacacg 89228
More stufff here

Query= 009525_3967_2963 length=249 uaccno=FIFOXZ216JYL81
(249 letters)
AND MORE STUFF HERE 
$ 
$ # try the first ID
$ perl -ne 'BEGIN{undef $/; $x="009463_3922_1827"}{while (/(Query= $x.*?)Query=|(Query= $x.*)/msg){ print $1 eq "" ? $2 : $1 }}' file2
Query= 009463_3922_1827 length=252
(252 letters)

Sequences producing significant alignments: (bits) Value

ref|NZ_ACCL02000008.1| Bryantella formatexigens DSM 14469 B_form... 153 2e-37

Query: 243 cccgcacacg 252
|||||||||
Sbjct: 89219 accgcacacg 89228
More stufff here

$ 
$ # another ID
$ perl -ne 'BEGIN{undef $/; $x="009525_3967_2963"}{while (/(Query= $x.*?)Query=|(Query= $x.*)/msg){ print $1 eq "" ? $2 : $1 }}' file2
Query= 009525_3967_2963 length=249 uaccno=FIFOXZ216JYL81
(249 letters)
AND MORE STUFF HERE 
$ 
$

tyler_durden
# 3  
Old 07-12-2009
Hi Tyler,
I did not quite get the regex you posted. Why is there a "bit" operator (|) in the regular expression (Query= $x.*)Query=|(Query= $x.*)?
The idea was: IDs in File1 and lots of stuff in File2.
Based on the IDs in File1, find the IDs in File2 and then print the whole record until the next ID starts.

Last edited by Deep9000; 07-12-2009 at 08:17 PM..
# 4  
Old 07-12-2009
Quote:
Originally Posted by Deep9000
...
Why is there a "bit" operator (|) in the regular expression
...
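Inside a regex, | is alternation ("match either branch"), not a bitwise operator. It is there to handle both cases: a record somewhere in the middle of the file, which is followed by another "Query=", and the last record, which runs to the end of the file.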
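
The two branches can be tried on a small in-memory string. This is only a sketch on made-up data: the sub name extract_record and the IDs A_1, B_2, C_3 are hypothetical and not from the thread. The sketch uses a non-greedy .*? in the first branch so that a middle record stops at the next "Query=" rather than at the last one in the input.

```perl
use strict;
use warnings;

# Hypothetical helper: return the record for $id from $text, or undef.
sub extract_record {
    my ($text, $id) = @_;
    # Branch 1: the record is followed by another "Query=" (middle of input).
    # Branch 2: the record runs to the end of the input (last record).
    if ($text =~ /(Query= \Q$id\E\b.*?)Query=|(Query= \Q$id\E\b.*)/s) {
        return defined $1 ? $1 : $2;
    }
    return undef;
}

# Made-up sample with three records.
my $text = "Query= A_1 length=1\nfoo\n"
         . "Query= B_2 length=2\nbar\n"
         . "Query= C_3 length=3\nbaz\n";

print extract_record($text, 'B_2');   # middle record, matched by branch 1
print extract_record($text, 'C_3');   # last record, matched by branch 2
```

Without the second branch, the last record would never match, because no further "Query=" follows it.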

Quote:
...
The idea was IDs in File1 and lots of Stuff in File2 ^^
Based on the IDs in File1 find the IDs in File2 and then print the whole stuff until other IDs start.
Right, but this "lots of stuff" is not structured as it does not have a true record separator. Things would've been simpler otherwise.
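
One way to approximate a record separator anyway, offered as a sketch and not as something posted in this thread: Perl's input record separator $/ may be any string, so setting it to "\nQuery= " makes each read return one whole record without slurping the file. The helper name print_matching_records is hypothetical, and the sketch assumes every record starts with "Query= " at the beginning of a line.

```perl
use strict;
use warnings;

# Hypothetical helper: read records from $in one at a time and write those
# whose ID is a key of %$wanted to $out. Assumes each record starts with
# "Query= " at the beginning of a line.
sub print_matching_records {
    my ($in, $out, $wanted) = @_;
    local $/ = "\nQuery= ";            # one record per read
    while (my $rec = <$in>) {
        my $had_sep = chomp $rec;      # strip trailing "\nQuery= ", if present
        $rec =~ s/\AQuery= //;         # only the first record still carries it
        $rec .= "\n" if $had_sep;      # restore the newline the separator ate
        my ($id) = $rec =~ /\A(\S+)/;  # the ID is the first token of the record
        next unless defined $id && $wanted->{$id};
        print {$out} "Query= $rec";
    }
}

# Demo on an in-memory copy of the sample from the first post.
my $data = join '',
    "Query= 009463_3922_1827 length=252\n",
    "(252 letters)\n",
    "\n",
    "More stuff here\n",
    "\n",
    "Query= 009525_3967_2963 length=249 uaccno=FIFOXZ216JYL81\n",
    "(249 letters)\n",
    "AND MORE STUFF HERE\n";

open my $in, '<', \$data or die "cannot open in-memory file: $!";
print_matching_records($in, \*STDOUT, { '009463_3922_1827' => 1 });
```

Memory use is bounded by the largest single record rather than by the file size.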

tyler_durden
# 5  
Old 07-12-2009
Thanks, I really appreciate it.
There is one other thing. You have undef $/, which is fine if the file is small. But the files are going to grow, and this is why I am trying to go line by line using while loops. The only trouble is capturing the next "n" lines before the start of the next Query= line.
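
A line-by-line version can be sketched with a simple flag: switch printing on when a "Query=" header carries a wanted ID, and off when a header carries any other ID. This is a minimal sketch, not the solution that was eventually used in this thread. The helper names load_ids and print_wanted_records and the script name extract.pl are hypothetical, and the sketch assumes File1 holds one ID per line.

```perl
use strict;
use warnings;

# Hypothetical helper: load the IDs (one per line) from $ids_fh into a hash.
sub load_ids {
    my ($ids_fh) = @_;
    my %wanted;
    while (my $id = <$ids_fh>) {
        $id =~ s/\s+\z//;              # drop the newline and trailing blanks
        $wanted{$id} = 1 if length $id;
    }
    return \%wanted;
}

# Hypothetical helper: copy every record whose ID is wanted from $in to $out.
# Only the current line is held in memory.
sub print_wanted_records {
    my ($in, $out, $wanted) = @_;
    my $printing = 0;
    while (my $line = <$in>) {
        # A "Query= <id>" header starts a new record: set the flag from its ID.
        if ($line =~ /^Query=\s+(\S+)/) {
            $printing = exists $wanted->{$1};
        }
        print {$out} $line if $printing;
    }
}

# Usage (hypothetical script name): perl extract.pl File1 File2
if (@ARGV == 2) {
    open my $ids_fh, '<', $ARGV[0] or die "$ARGV[0]: $!";
    open my $in,     '<', $ARGV[1] or die "$ARGV[1]: $!";
    print_wanted_records($in, \*STDOUT, load_ids($ids_fh));
}
```

Lines such as "Query: 243 cccgcacacg 252" do not reset the flag, because the header test requires "Query=" at the start of the line. File2 is read once for all IDs, so no seek/tell or byte counting is needed.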
# 6  
Old 07-14-2009
Incomplete solution

I have it working now...

Last edited by Deep9000; 07-17-2009 at 09:55 AM..