String Manipulation in a text file | Unix Linux Forums | Programming

    #1  
Old 01-30-2013
John Trevor
String Manipulation in a text file

Hi

I need to write a script but am not sure of the best way to approach it.

I have not worked with sed, but I'm aware that it's robust for extracting data from files.

I already developed the code in Perl, but the script takes almost 2 minutes to run (the input file is almost 5 MB).
I believe this can be done in a better way using awk or sed.
Which is the best way to approach this?

I'm hoping for expert advice from the awk and sed gurus here.


Requirement in Detail:
a) Capture all the Ticket IDs from a text file on a Linux box.
The Ticket ID is the 16-digit string (beginning with TT) after the string "TicketNumber===>" (highlighted in red in the sample input file content below).

b) For each Ticket ID, search the entire file and calculate the response time delays (the difference between the date and time fields at the beginning of each line) for every alert.

e.g. for 000000052605, the requirement is to pull the delay between the Request and Response times for each of the following (highlighted in red in the sample input file content below):

1)acknowledged
2)assigned
3)analyse
4)Task Set

Also, the line containing the Request log text is not necessarily followed by the matching Response log text. Entries for multiple alarms can be interleaved. The Ticket ID is the only unique field.

c) Finally, the delays for each alert should be saved into another file in CSV format.

Let me know if you need any other relevant info.

Sample Input File:


Code:
28 Jan 2013 21:45:22,279: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Create ticket Request===> 1359431122 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:26,710: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Create ticket Response===> 1359431126 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:27,256: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> TicketNumber===> 000000052605 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:27,731: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.assigned 1359431127 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:30,328: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Response===> TT-000000052605 openactive.assigned 1359431130 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:32,633: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.acknowledged 1359431132 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:34,608: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Response===> TT-000000052605 openactive.acknowledged 1359431134 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:35,093: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.analyzed 1359431135 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:37,021: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Response===> TT-000000052605 openactive.analyzed 1359431137 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:37,477: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket - Task Set Request===> TT-000000052605 1359431137 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:39,688: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket - Task Set Response===> TT-000000052605 1359431139 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111

    #2  
Old 01-30-2013
DGPickett
Well, you need to avoid n^2 slowness from scanning the entire file once for every ticket ID. Why not use sed to prefix each line with the ticket ID and time, sort, and then process the result serially? This is a very old-school sort of processing, but robust. Really, you can reduce each line to three fields: ID, time, and stage.
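
A minimal sketch of that reduction, using awk instead of sed purely for brevity (that swap is an assumption, not part of the suggestion). The field positions ($4, $11, $12, $13) are read off the "Ticket State Change" lines in the sample input; "Task Set" lines are laid out differently and are skipped here. The name reduce_lines and the shortened sample lines are made up for illustration.

```shell
# Hypothetical two-line sample in the thread's log format (tails shortened),
# deliberately given out of order.
sample='28 Jan 2013 21:45:30,328: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Response===> TT-000000052605 openactive.assigned 1359431130
28 Jan 2013 21:45:27,731: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.assigned 1359431127'

# Keep only id, time of day, stage and Request/Response, then sort so that
# the events of one ticket end up next to each other in time order.
reduce_lines() {
  awk '$12 ~ /^TT-/ && $11 ~ /^(Request|Response)===>$/ {
         kind = $11; sub(/===>$/, "", kind)   # Request or Response
         tod = $4;   sub(/:$/, "", tod)       # drop the trailing colon
         print $12, tod, $13, kind
       }' | sort
}

reduced=$(printf '%s\n' "$sample" | reduce_lines)
printf '%s\n' "$reduced"
```

Sorting on the time of day alone only works within a single day; a log that spans several days would need the date turned into a sortable form first.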
    #3  
Old 01-30-2013
neutronscott
This may get you started:

Code:
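mute@clt:~/temp/JohnTrevor$ cat script
#!/usr/bin/awk -f

# Fields (whitespace-separated) on "Ticket State Change" lines:
#   $11 = "Request===>" or "Response===>", $12 = ticket ID,
#   $13 = new state, $14 = numeric timestamp that follows the state.
# Remember the Request timestamp per ticket ID for each state.
$11 == "Request===>" && $13 == "openactive.assigned" { assigned[$12]=$14 }
$11 == "Request===>" && $13 == "openactive.acknowledged" { acked[$12]=$14 }
$11 == "Request===>" && $13 == "openactive.analyzed" { analyzed[$12]=$14 }

# Print one line per ticket that has an "assigned" request.
END {
        for (tt in assigned) {
                printf("[%s] ass:%d ack:%d analyse:%d\n", tt, assigned[tt], acked[tt], analyzed[tt]);
        }
}
mute@clt:~/temp/JohnTrevor$ ./script log
[TT-000000052605] ass:1359431127 ack:1359431132 analyse:1359431135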
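
Going one step further, here is a rough single-pass sketch that pairs each Request with its Response per ticket and state and prints the delay as CSV. It is only one possible reading of the requirement: the delay is taken from the time of day at the start of each line, Request and Response are assumed to be at most one midnight apart, and the names delays_csv, ms, start and the stage label "taskset" are invented for this example.

```shell
# Hypothetical excerpt in the thread's log format (line tails shortened).
log='28 Jan 2013 21:45:27,731: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.assigned 1359431127
28 Jan 2013 21:45:30,328: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Response===> TT-000000052605 openactive.assigned 1359431130
28 Jan 2013 21:45:37,477: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket - Task Set Request===> TT-000000052605 1359431137
28 Jan 2013 21:45:39,688: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket - Task Set Response===> TT-000000052605 1359431139'

delays_csv() {
  awk '
    # Milliseconds since midnight from a field such as "21:45:27,731:".
    function ms(t,   p) {
      split(t, p, /[:,]/)
      return ((p[1] * 60 + p[2]) * 60 + p[3]) * 1000 + p[4]
    }
    {
      # Locate the Request/Response marker; the ticket id follows it.
      for (i = 7; i < NF; i++)
        if ($i == "Request===>" || $i == "Response===>") break
      if (i >= NF || $(i + 1) !~ /^TT-/) next

      # Task Set lines carry no state field, so give them a fixed label.
      stage = ($(i - 1) == "Set") ? "taskset" : $(i + 2)
      sub(/^openactive\./, "", stage)
      key = $(i + 1) "," stage

      if ($i == "Request===>") {
        start[key] = ms($4)
      } else if (key in start) {
        d = ms($4) - start[key]
        if (d < 0) d += 86400000   # assumed: at most one midnight crossed
        print key "," d            # ticket,stage,delay in ms
        delete start[key]
      }
    }'
}

printf '%s\n' "$log" | delays_csv
# TT-000000052605,assigned,2597
# TT-000000052605,taskset,2211
```

Because the open requests are kept in an array keyed by ticket and stage, interleaved entries from different tickets do not matter and the file is read only once.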

    #4  
Old 01-31-2013
John Trevor
Thanks.. I'll try the awk code

---------- Post updated at 02:15 AM ---------- Previous update was at 02:05 AM ----------

Yes, Ticket ID, Date/Time and Ticket Status are the only 3 fields that matter, but I'm not sure how to do this prefixing in sed. Can you give me a sample code snippet to start with?
    #5  
Old 01-31-2013
bakunin
This is a perfect example to explain how to tackle such seemingly complex problems. In fact it is simple and straightforward:

Quote:
Originally Posted by John Trevor View Post
but I'm not sure how to do this prefixing in sed. Can you give me a sample code snippet to start with ?
First, we single out all lines with "TicketNumber===>" in them. These are the only lines we need to work on in this step:


Code:
sed -n '/TicketNumber===>/p' /path/to/inputfile | more

This does nothing more than print the lines we want to work on. First check that we found all the lines we want to find and none of the lines we do not want. If the result is OK, we proceed.

Second, we catch the "word" immediately following the string "TicketNumber===>", because this is the ticket number itself. We display this ticket number instead of the original line to make sure we got that right:


Code:
sed -n '/TicketNumber===>/ {
              s/^.*TicketNumber===> \([^ ]*\) .*/\1/p
         }' /path/to/inputfile | more

Check again and compare with your input to make sure this is what you want. If it isn't, adapt the regexp until you get what you want.
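
One quick way to do that check is to feed a single known line through the substitution and look at what the capture group returns. This is just a sketch: the line is a shortened copy of the TicketNumber line from the sample input, and the variable names line and ticket are made up.

```shell
# One TicketNumber line from the sample input (tail shortened).
line='28 Jan 2013 21:45:27,256: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> TicketNumber===> 000000052605 Trap Forwarder on server01'

# \([^ ]*\) captures everything up to the next space after the marker.
ticket=$(printf '%s\n' "$line" | sed -n 's/^.*TicketNumber===> \([^ ]*\) .*$/\1/p')

printf '%s\n' "$ticket"   # 000000052605
```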

Then proceed to the final step: we prefix every line that contains a ticket number with a field holding the ticket number we have just isolated:


Code:
sed '/TicketNumber===>/ {
           s/^.*TicketNumber===> \([^ ]*\) .*$/\1:&/
      }' /path/to/inputfile > resultfile

Some observations: First, as you are only interested in the first and the last line of every transaction (that is, for every ticket number, the opening and the closing line), you could throw away all lines in between, yes? If you search for both of these lines and prefix both with the ticket number, a simple "sort" will provide the ordering so you can process the resulting file line by line:


Code:
sed -n '/TicketNumber===>/ {
              s/^.*TicketNumber===> \([^ ]*\) .*$/\1:&/p
         }
        /Task Set Response===> TT-/ {
              s/^.*Task Set Response===> TT-\([^ ]*\) .*$/\1:&/p
         }' /path/to/inputfile > resultfile
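
To illustrate the "prefix, sort, then process line by line" idea end to end, here is a small sketch under a few assumptions: the three log lines are shortened copies of sample lines, all entries fall on the same day (so a plain sort keeps them in time order), and the awk step that prints the first and last time per ticket is my own addition, not part of the sed recipe.

```shell
# Hypothetical excerpt: opening line, one line in between, closing line.
log='28 Jan 2013 21:45:27,256: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> TicketNumber===> 000000052605 Trap Forwarder on server01
28 Jan 2013 21:45:32,633: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.acknowledged 1359431132 Trap Forwarder on server01
28 Jan 2013 21:45:39,688: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket - Task Set Response===> TT-000000052605 1359431139 Trap Forwarder on server01'

# Prefix opening and closing lines with "ticketnumber:" and sort them.
prefixed=$(printf '%s\n' "$log" | sed -n '
  s/^.*TicketNumber===> \([^ ]*\) .*$/\1:&/p
  s/^.*Task Set Response===> TT-\([^ ]*\) .*$/\1:&/p' | sort)

# Walk the sorted lines once and keep the first and last time per ticket.
summary=$(printf '%s\n' "$prefixed" | awk '
  {
    id = $1; sub(/:.*/, "", id)                    # ticket number before the first colon
    t = $4;  sub(/:$/, "", t); sub(/,/, ".", t)    # 21:45:27,256: becomes 21:45:27.256
    if (!(id in first)) first[id] = t
    last[id] = t
  }
  END { for (id in first) print id "," first[id] "," last[id] }')

printf '%s\n' "$summary"   # 000000052605,21:45:27.256,21:45:39.688
```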


Second: you probably will not need some of the information in the source lines. By adapting the replacement part(s) of the regexps you can further trim down the resultfile to contain only the information you need.

You will probably have to fine-tune this to completely meet your needs, but this should give you a good start.

I hope this helps.

bakunin

Last edited by bakunin; 01-31-2013 at 06:06 AM..
    #6  
Old 01-31-2013
DGPickett
Assuming bash and $flds set up as array:
Code:
sed '
  s/^\(..\) \(...\) \(....\) \([0-9:.,]*\): .*===> Ticket[- ]*\(.*\)===> [-T]*\([0-9]*\).*/\6 \3 \2 \1 \4 \5/
  t mon
  d
  :mon
  s/^\([0-9: ,.]*\) Jan /\1 01 /
  s/^\([0-9: ,.]*\) Feb /\1 02 /
  s/^\([0-9: ,.]*\) Mar /\1 03 /
  s/^\([0-9: ,.]*\) Apr /\1 04 /
  s/^\([0-9: ,.]*\) May /\1 05 /
  s/^\([0-9: ,.]*\) Jun /\1 06 /
  s/^\([0-9: ,.]*\) Jul /\1 07 /
  s/^\([0-9: ,.]*\) Aug /\1 08 /
  s/^\([0-9: ,.]*\) Sep /\1 09 /
  s/^\([0-9: ,.]*\) Oct /\1 10 /
  s/^\([0-9: ,.]*\) Nov /\1 11 /
  s/^\([0-9: ,.]*\) Dec /\1 12 /
 ' in_file | sort | while read -a flds
 do
  ...
 done >out_file
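
What goes inside the loop depends on the report you want, so the following is only a sketch of consuming the sorted fields, not the elided loop body. It uses plain read with named variables instead of a bash array (an assumption made for portability), and the two lines in sorted are hypothetical output of the sed and sort stages (id, year, month, day, time, stage).

```shell
# Hypothetical output of the sed | sort stages: id year month day time stage.
sorted='000000052605 2013 01 28 21:45:27,731 State Change Request
000000052605 2013 01 28 21:45:30,328 State Change Response'

# Read the fields by name; the last variable receives the rest of the line.
csv=$(printf '%s\n' "$sorted" | while read -r id year mon day tod stage
do
  # Turn 21:45:27,731 into 21:45:27.731 so the comma cannot break the CSV.
  printf '%s,%s-%s-%s %s.%s,%s\n' \
    "$id" "$year" "$mon" "$day" "${tod%,*}" "${tod#*,}" "$stage"
done)

printf '%s\n' "$csv"
# 000000052605,2013-01-28 21:45:27.731,State Change Request
# 000000052605,2013-01-28 21:45:30.328,State Change Response
```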


Last edited by DGPickett; 01-31-2013 at 12:41 PM..