String Manipulation in a text file


 
# 1  
Old 01-30-2013

Hi

I have a requirement to write a script, but I'm not sure of the best way to approach it.

I have not worked with sed, but I'm aware that it's robust for file extraction requirements.

I have a scripting task. I already developed the code in Perl, but the script takes almost 2 minutes to execute (the input file is almost 5 MB).
I believe this can be done in a better way using awk or sed.
Which is the best way to approach this?

Expecting expert advice from the awk and sed gurus here.


Requirement in Detail:
a) Capture all the Ticket IDs from a text file in a linux box
Ticket ID will be the 16 digit string (beginning with TT) after the string "TicketNumber===>" in the sample input file content below.

b) Based on each Ticket ID, search the entire file and calculate the response time delays (difference of the date and time fields at the beginning of each line) for every alert.

e.g. for 000000052605, the requirement is to pull the delay between Request and Response time for:

1)acknowledged
2)assigned
3)analyse
4)Task Set

Also, the line containing the Request log text is not necessarily followed by the Response log text. Multiple alarm entries could be mixed up. The Ticket ID is the only unique field.

c) Finally, the delays for each alert should be saved into another file in csv format

Let me know if you need any other relevant info

Sample Input File:

Code:
28 Jan 2013 21:45:22,279: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Create ticket Request===> 1359431122 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:26,710: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Create ticket Response===> 1359431126 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:27,256: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> TicketNumber===> 000000052605 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:27,731: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.assigned 1359431127 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:30,328: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Response===> TT-000000052605 openactive.assigned 1359431130 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:32,633: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.acknowledged 1359431132 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:34,608: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Response===> TT-000000052605 openactive.acknowledged 1359431134 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:35,093: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Request===> TT-000000052605 openactive.analyzed 1359431135 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:37,021: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket State Change Response===> TT-000000052605 openactive.analyzed 1359431137 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:37,477: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket - Task Set Request===> TT-000000052605 1359431137 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111
28 Jan 2013 21:45:39,688: [TEP_CreateTicket][MessageProcessor-Dog#8]Parser log: T3E===> Ticket - Task Set Response===> TT-000000052605 1359431139 Trap Forwarder on server01 HCL-INFRA on 192.168.1.9 RJ741RJ 903 environmentfailureevent Minor 1359431111

# 2  
Old 01-30-2013
Well, you need to avoid n^2 slowness from scanning the entire file for every ticket id, so why not use sed to prefix each line with ticket id and time, sort, and process serially? This is a very old-school sort of processing, but robust. Really, you can reduce each line to three fields: id, time, stage.
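
The reduction described above could look roughly like the sketch below. It is only an illustration: it uses awk instead of sed for picking the fields, reduce_log is a made-up function name, and it assumes the field layout of the sample input (ticket id right after "Request===>" or "Response===>", state as "openactive.<stage>", anything else treated as the Task Set step).

```shell
#!/bin/sh
# reduce_log FILE: hypothetical helper, not from the thread.
# Prints "id date time stage kind" for every ticket Request/Response line, sorted.
reduce_log() {
    awk '
    BEGIN {
        # Map month names to two-digit numbers so a plain sort orders dates.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) mon[names[i]] = sprintf("%02d", i)
    }
    {
        for (i = 8; i < NF; i++) {
            if (($i == "Request===>" || $i == "Response===>") && $(i + 1) ~ /^TT-/) {
                kind = $i
                sub(/===>$/, "", kind)
                stage = $(i + 2)
                if (stage ~ /^openactive\./) sub(/^openactive\./, "", stage)
                else stage = "taskset"      # assumption: no state word means Task Set
                tod = $4
                sub(/:$/, "", tod)          # drop the colon that ends the time field
                print $(i + 1), sprintf("%s-%s-%02d", $3, mon[$2], $1), tod, stage, kind
                break
            }
        }
    }' "$1" | sort
}
```

Called as reduce_log /path/to/inputfile, this leaves all lines of one ticket next to each other in time order, so the later step can read them serially instead of rescanning the file.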
This User Gave Thanks to DGPickett For This Post:
# 3  
Old 01-30-2013
this may get you started?
Code:
mute@clt:~/temp/JohnTrevor$ cat script
#!/usr/bin/awk -f

$11 == "Request===>" && $13 == "openactive.assigned" { ass[$12]=$14 }
$11 == "Request===>" && $13 == "openactive.acknowledged" { ack[$12]=$14 }
$11 == "Request===>" && $13 == "openactive.analyzed" { anal[$12]=$14 }

END {
        for (tt in ass) {
                printf("[%s] ass:%d ack:%d analyse:%d\n", tt, ass[tt], ack[tt], anal[tt]);
        }
}
mute@clt:~/temp/JohnTrevor$ ./script log
[TT-000000052605] ass:1359431127 ack:1359431132 analyse:1359431135
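
Building on the script above, a single pass can also record the Response lines and print the delays as CSV. This is a hedged sketch, not a finished solution: ticket_delays is a hypothetical name, the delay is taken from the epoch-seconds field inside each line (so it is whole seconds, not the millisecond timestamps at the start of the line), and the Task Set lines are assumed to carry the timestamp right after the ticket id as in the sample.

```shell
#!/bin/sh
# ticket_delays FILE: hypothetical helper, not from the thread.
# Prints a CSV header and one row of Request-to-Response delays per ticket.
ticket_delays() {
    awk '
    {
        for (i = 8; i < NF; i++) {
            if ($i != "Request===>" && $i != "Response===>") continue
            id = $(i + 1)
            if (id !~ /^TT-/) break            # e.g. the "Create ticket" lines
            # State Change lines carry "openactive.<stage>" before the timestamp.
            stage = ($(i + 2) ~ /^openactive\./) ? substr($(i + 2), 12) : "taskset"
            ts    = ($(i + 2) ~ /^openactive\./) ? $(i + 3) : $(i + 2)
            if ($i == "Request===>") req[id, stage] = ts
            else if ((id, stage) in req) delay[id, stage] = ts - req[id, stage]
            seen[id] = 1
            break
        }
    }
    END {
        print "ticket,assigned,acknowledged,analyzed,taskset"
        for (id in seen)
            print id "," delay[id, "assigned"] "," delay[id, "acknowledged"] "," delay[id, "analyzed"] "," delay[id, "taskset"]
    }' "$1"
}
```

Because everything is kept in arrays keyed by ticket id and stage, it does not matter if lines of different tickets are mixed up in the file. On the sample input, ticket_delays log should print the header followed by TT-000000052605,3,2,2,2. The row order for several tickets is whatever awk's "for (id in seen)" yields, so pipe through sort if order matters.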

This User Gave Thanks to neutronscott For This Post:
# 4  
Old 01-31-2013
Thanks.. I'll try the awk code

---------- Post updated at 02:15 AM ---------- Previous update was at 02:05 AM ----------

Yes, Ticket ID, Date/Time and Ticket Status are the only 3 fields that matter, but I'm not sure how to do this prefixing in sed. Can you give me a sample code snippet to start with?
# 5  
Old 01-31-2013
This is a perfect example to explain how to tackle such seemingly complex problems. In fact it is simple and straightforward:

Quote:
Originally Posted by John Trevor
but I'm not sure how to do this prefixing in sed. Can you give me a sample code snippet to start with ?
First, we single out all lines with "TicketNumber===>" in them. These are the only lines we need to work on in this step:

Code:
sed -n '/TicketNumber===>/p' /path/to/inputfile | more

This does nothing more than print the lines we want to work on. First check that we found all the lines we want to find and none of the lines we do not want to find. If the result is OK, we proceed.

Second, we catch the "word" immediately following the string "TicketNumber===>", because this is the ticket number itself. We display this ticket number instead of the original line to make sure we got that right:

Code:
sed -n '/TicketNumber===>/ {
              s/^.*TicketNumber===> \([^ ]*\) .*/\1/p
         }' /path/to/inputfile | more

Check again and compare with your input to make sure this is what you want. In case it isn't, adapt the regexp until you finally get what you want.

Then proceed to the final step: we prefix every line that contains a ticket number with a field holding the ticket number we have just isolated:

Code:
sed '/TicketNumber===>/ {
           s/^.*TicketNumber===> \([^ ]*\) .*$/\1:&/
      }' /path/to/inputfile > resultfile

Some observations: First, as you are only interested in the first and the last line of every transaction - that is, for every ticket number the opening and the closing line - you could throw away all lines in between, yes? If you search for both of these lines and prefix both with the ticket number, a simple "sort" will provide the ordering, so you can process the resulting file line by line:

Code:
sed -n '/TicketNumber===>/ {
              s/^.*TicketNumber===> \([^ ]*\) .*$/\1:&/p
         }
        /Task Set Response===> TT-/ {
              s/^.*Task Set Response===> TT-\([^ ]*\) .*$/\1:&/p
         }' /path/to/inputfile > resultfile


Second: you probably will not need some of the information in the source lines. By adapting the replacement part(s) of the regexps you can further trim down the resultfile to contain only the information you need.

You will probably have to fine-tune this to completely meet your needs, but this should give you a good start.

I hope this helps.

bakunin

Last edited by bakunin; 01-31-2013 at 07:06 AM..
# 6  
Old 01-31-2013
Assuming bash and $flds set up as array:
Code:
sed '
  s/^\(..\) \(...\) \(....\) \([0-9:.,]*\): .*===> Ticket[- ]*\(.*\)===> [-T]*\([0-9]*\).*/\6 \3 \2 \1 \4 \5/
  t mon
  d
  :mon
  s/^\([0-9: ,.]*\) Jan /\1 01 /
  s/^\([0-9: ,.]*\) Feb /\1 02 /
  s/^\([0-9: ,.]*\) Mar /\1 03 /
  s/^\([0-9: ,.]*\) Apr /\1 04 /
  s/^\([0-9: ,.]*\) May /\1 05 /
  s/^\([0-9: ,.]*\) Jun /\1 06 /
  s/^\([0-9: ,.]*\) Jul /\1 07 /
  s/^\([0-9: ,.]*\) Aug /\1 08 /
  s/^\([0-9: ,.]*\) Sep /\1 09 /
  s/^\([0-9: ,.]*\) Oct /\1 10 /
  s/^\([0-9: ,.]*\) Nov /\1 11 /
  s/^\([0-9: ,.]*\) Dec /\1 12 /
 ' in_file | sort | while read -a flds
 do
  ...
 done >out_file
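
One way the serial processing of the sorted lines might be done is sketched below, as a rough illustration only. It swaps the bash read loop for awk (which sidesteps bash treating numbers like 08 as octal), pair_delays is a hypothetical name, and it assumes the sorted lines look like "id yyyy mm dd hh:mm:ss,mmm label words" with the last word being Request or Response, and that within one ticket each Request is followed by its own Response before the next Request.

```shell
#!/bin/sh
# pair_delays: hypothetical helper, not from the thread.
# Reads sorted "id yyyy mm dd hh:mm:ss,mmm label..." lines on stdin and
# prints "id,label,delay_ms" for each Request/Response pair.
pair_delays() {
    awk '
    # Milliseconds since midnight for a time like 21:45:27,731
    function ms(t,    p) {
        split(t, p, /[:,]/)
        return ((p[1] * 60 + p[2]) * 60 + p[3]) * 1000 + p[4]
    }
    $NF == "Request" { id = $1; start = ms($5); open = 1; next }
    $NF == "Response" && open && $1 == id {
        d = ms($5) - start
        if (d < 0) d += 86400000      # assumes a pair crosses at most one midnight
        label = $6
        for (i = 7; i < NF; i++) label = label " " $i
        print $1 "," label "," d
        open = 0
    }'
}
```

It would sit at the end of the pipeline, e.g. sed '...' in_file | sort | pair_delays > out_file. Note that the sed expression above keeps only "State Change" or "Task Set" as the label, so the individual states (assigned, acknowledged, analyzed) are not distinguished in this output unless the state word is also captured.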


Last edited by DGPickett; 01-31-2013 at 01:41 PM..