06-22-2007
Sed/awk gods, I need your help! Fancy log extraction
Hi! I'm trying to find a way to extract a specific set of lines from a log file. This would allow me to "follow" a web user through our log files.
Here is a fake sample log file to explain what I want to accomplish:
[2007-06-22 09:33:15,843][thread-1][BEG_]BEGIN REQUEST sessionID=123456
[2007-06-22 09:33:15,844][thread-1][DEB_]boatload of lines for thread-1 detailing the whereabouts of the customer
[2007-06-22 09:33:15,844][thread-2][DEB_]Here is activity from another customer - we don't need that
[2007-06-22 09:33:15,844][thread-1][DEB_]boatload of lines for thread-1 detailing the whereabouts of the customer
[2007-06-22 09:33:15,844][thread-3][DEB_]more activity from yet another customer- we don't need that
[2007-06-22 09:33:15,844][thread-1][DEB_]boatload of lines for thread-1 detailing the whereabouts of the customer
[2007-06-22 09:33:15,843][thread-1][BEG_]END REQUEST
[2007-06-22 09:33:15,843][thread-34][BEG_]BEGIN REQUEST sessionID=123456
[2007-06-22 09:33:15,844][thread-1][DEB_]Another customer took thread-1! We don't want that log entry either
[2007-06-22 09:33:15,844][thread-34][DEB_]yet more activity from the customer but under a different thread!
[2007-06-22 09:33:15,843][thread-34][BEG_]END REQUEST
What I need is an expression that, given sessionID=123456, will identify the matching thread ID on the BEGIN REQUEST line and extract every line containing that thread ID from the BEGIN REQUEST tag through the END REQUEST tag.
So basically, the result would be:
[2007-06-22 09:33:15,843][thread-1][BEG_]BEGIN REQUEST sessionID=123456
[2007-06-22 09:33:15,844][thread-1][DEB_]boatload of lines for thread-1 detailing the whereabouts of the customer
[2007-06-22 09:33:15,844][thread-1][DEB_]boatload of lines for thread-1 detailing the whereabouts of the customer
[2007-06-22 09:33:15,844][thread-1][DEB_]boatload of lines for thread-1 detailing the whereabouts of the customer
[2007-06-22 09:33:15,843][thread-1][BEG_]END REQUEST
[2007-06-22 09:33:15,843][thread-34][BEG_]BEGIN REQUEST sessionID=123456
[2007-06-22 09:33:15,844][thread-34][DEB_]yet more activity from the customer but under a different thread!
[2007-06-22 09:33:15,843][thread-34][BEG_]END REQUEST
What the expression would need to do:
1 - locate sessionID=123456
2 - grab threadID from the same line
3 - dump all threadID lines up to threadID.*END REQUEST
4 - rinse and repeat
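For reference, here is a minimal sketch of those four steps in Python rather than sed or awk, just to pin down the logic. It assumes every line starts with [timestamp][thread-N], as in the fake log above; the function name follow_session and the file name app.log are made up for illustration.

```python
import re

# Assumed layout (taken from the fake log above): [timestamp][thread-N][LEVEL]message
THREAD_RE = re.compile(r"^\[[^\]]*\]\[(thread-\d+)\]")


def follow_session(lines, session_id):
    """Yield the lines of every request that was opened with the given session ID."""
    # Match the exact ID only, so 123456 does not also match 1234567.
    session_re = re.compile(r"sessionID=" + re.escape(session_id) + r"(?!\w)")
    active = set()  # threads currently serving a request for this session

    for line in lines:
        m = THREAD_RE.match(line)
        if not m:
            continue  # line does not follow the assumed layout
        thread = m.group(1)

        if "BEGIN REQUEST" in line:
            if session_re.search(line):
                # Steps 1 and 2: session found, remember its thread ID.
                active.add(thread)
            else:
                # Another session took over this thread; stop following it.
                active.discard(thread)

        if thread in active:
            # Step 3: dump every line for this thread, END REQUEST included.
            yield line
            if "END REQUEST" in line:
                # Step 4: forget the thread and wait for the next BEGIN REQUEST.
                active.discard(thread)


# Hypothetical usage:
#   with open("app.log") as f:
#       for line in follow_session(f, "123456"):
#           print(line, end="")
```

Keeping a set of active threads rather than a single thread ID is a deliberate choice here, in case two requests for the same session ever overlap on different threads.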
Unfortunately, I'm only a neophyte with sed and awk, so I have no idea how to proceed...
I'm not even sure this can be done. If not, I'll use Perl, but having a nice expression that could do that (and understanding it) would be a big help for me.
If someone can lend me a hand or at least give me pointers, that'd be much appreciated. I hope my question is clear enough!
Thanks
Last edited by gnagus; 06-22-2007 at 04:09 PM..
Reason: Edit : removed references to UniqueID, replaced by sessionID