Extracting strings from a log file.

Old 09-05-2011

I'm new to all this and I've been fiddling with this problem for HOURS and feel silly that I can't work it out!

I have a .log file that is VERY long and looks like this:
2011-08-31 10:03:34      SUESTART AG Amndmnt Client WebRequest DNU [1-7661VJ]SUEEND Sequence: 600, 
2011-08-31 10:03:34      SUESTART AG Amndmnt Client WebRequest DNU [1-7661VJ]SUEEND Person Candidates: 
2011-08-31 10:03:34      SUESTART AG CRM SR - Email/SR Email Surveys [1-6NUS6A]SUEEND Sequence: 671, 
2011-08-31 10:03:34      SUESTART AG CRM SR - Email/SR Email Surveys [1-6NUS6A]SUEEND Person 
2011-08-31 10:03:34      SUESTART AG CRM SR - ServiceFailure/Complaint/SurveyResponse [1-67MD2A]SUEEND Sequence: 
31 10:03:34      SUESTART AG CRM SR -  ServiceFailure/Complaint/SurveyResponse [1-67MD2A]SUEEND Person  Candidates: From Rule, Organization Candidates: From Rule, Non-Exclusive

Note that this is not exactly the file - I've manipulated it a bit, but you'll get the idea. I've managed to amend the file to put in SUESTART and SUEEND to demarcate the data that I'm interested in. I'd like to produce an output file that contains only the text in between SUESTART and SUEEND, like this:
AG Amndmnt Client WebRequest DNU [1-7661VJ]
AG Amndmnt Client WebRequest DNU [1-7661VJ]
AG CRM SR - Email/SR Email Surveys [1-6NUS6A]
AG CRM SR - Email/SR Email Surveys [1-6NUS6A]
AG CRM SR - ServiceFailure/Complaint/SurveyResponse [1-67MD2A]
AG CRM SR -  ServiceFailure/Complaint/SurveyResponse [1-67MD2A]

This is the closest I've got:

sed -e 's/.*SUESTART //g' -e 's/SUEEND.*$//g' < rules.log > output.log

But this doesn't work because it only returns the first line and then just trims off the rest of the entire log file:
AG Amndmnt Client WebRequest DNU [1-7661VJ]

Any help will be much appreciated!
Thank you.

Old 09-05-2011
$ sed 's/.*SUESTART\(.*\)SUEEND.*/\1/' file
 AG Amndmnt Client WebRequest DNU [1-7661VJ]
 AG Amndmnt Client WebRequest DNU [1-7661VJ]
 AG CRM SR - Email/SR Email Surveys [1-6NUS6A]
 AG CRM SR - Email/SR Email Surveys [1-6NUS6A]
 AG CRM SR - ServiceFailure/Complaint/SurveyResponse [1-67MD2A]
 AG CRM SR - ServiceFailure/Complaint/SurveyResponse [1-67MD2A]

Old 09-05-2011
A Perl solution using lookahead and lookbehind:
perl -e 'while(<>){print "$1\n" if /(?<=SUESTART)(.+)(?=SUEEND)/;}' ~/src/Perl/data.tmp

Old 09-05-2011
perl -nle '/SUESTART (.*)SUEEND/ && print $1' file

Old 09-05-2011

Wow - what a quick response. I've tried all 3 recommendations:

1. sed 's/.*SUESTART\(.*\)SUEEND.*/\1/' rules.log > output.log

returns just the first line:
AG SR Assignment - Unit Trust - Bulk - Private Client [1-6NUS86]

2. perl -e 'while(<>){print "$1\n" if /(?<=SUESTART)(.+)(?=SUEEND)/;}' rules.log > output.log

Interesting output, but not what I need. It returns everything in between the first SUESTART and the last SUEEND:

AG Amndmnt Client WebRequest DNU [1-7661VJ]SUEEND Sequence: 600, 
2011-08-31 10:03:34      SUESTART AG Amndmnt Client WebRequest DNU [1-7661VJ]SUEEND Person Candidates: 
2011-08-31 10:03:34      SUESTART AG CRM SR - Email/SR Email Surveys [1-6NUS6A]SUEEND Sequence: 671, 
2011-08-31 10:03:34      SUESTART AG CRM SR - Email/SR Email Surveys [1-6NUS6A]SUEEND Person 
2011-08-31 10:03:34      SUESTART AG CRM SR - ServiceFailure/Complaint/SurveyResponse [1-67MD2A]SUEEND Sequence: 
31 10:03:34      SUESTART AG CRM SR -  ServiceFailure/Complaint/SurveyResponse [1-67MD2A]

3. perl -nle '/SUESTART (.*)SUEEND/ && print $1' rules.log > output.log

does the same as 2.

Any more suggestions?

Thanks so much!

---------- Post updated at 03:24 PM ---------- Previous update was at 03:22 PM ----------

just to add the icon to show that suggestion didn't work... thanks anyway!

Old 09-05-2011
I think you should post the data that you are using this code on, without any modifications...
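
In the meantime, it may be worth checking whether the tools actually see your file as many lines. The results reported above (sed printing a single record, Perl printing everything from the first SUESTART to the last SUEEND) would be consistent with the whole file being read as one long line, for example if the records are separated by carriage returns only. That is a guess, not a diagnosis. The sketch below builds a small hypothetical sample (the file name sample.log and its contents are made up for illustration) and shows how wc -l and od -c would reveal that situation:

```shell
# Hypothetical sample: two records separated by a bare carriage return (\r),
# with no newline (\n) anywhere. File name and contents are made up.
printf 'x SUESTART A [1]SUEEND y\rx SUESTART B [2]SUEEND z\r' > sample.log

# wc -l counts newline characters. A count of 0 means line-oriented tools
# (sed, perl -n, grep) will treat the whole file as a single line.
line_count=$(wc -l < sample.log | tr -d ' ')
echo "newline count: $line_count"

# od -c prints each byte, so the separators become visible as \r or \n.
od -c sample.log | head -n 5
```

Running the wc and od commands against rules.log instead of sample.log should show which separator, if any, follows each record.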
Old 09-05-2011
attaching file


I attached the rules.txt file (I wasn't able to attach a .log file, but the .txt works in the same way).

Thanks for any responses!
