XML Log Parsing


 
# 1  
Old 07-25-2008
XML Log Parsing

I have a log file of around 300 MB containing continuous SOAP responses like the one shown below (I have attached only one sample SOAP response). I need the following fields extracted and written to a new file:

timestamp
WebPartId
bus:block
bus:unblock
endpt:operation

Please help me.

<logRequest xmlns:wsse="http://docs.ddoasis-open.org/wss/2004/01/oasis-200401-wss-1.0.xsd" xmlns:str="http://exslt.org/strings" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:secext="http://schemas.xmlsoap.org/ws/2002/04/secext" xmlns:rrbfunc="urn:schemas:functions:1.0" xmlns:rrbus="urn:schemas:context:1.0" xmlns:regexp="http://exslt.org/regular-expressions" xmlns:metrics20="urn:metrics:2.0" xmlns:metrics10="urn:metrics:1.0" xmlns:exsl="http://exslt.org/common" xmlns:endpt="urn:schemas.bcom/rrbus/1.0/spInfo" xmlns:date="http://exslt.org/dates-and-times" xmlns:common="urn::xslt:common.xsl" xmlns:cam="urn:comsec:authn:1.0"><logHeader><timestamp>2008-07-24T04:17:04.137000-04:00</timestamp><direction>response</direction><logType>SERVICE</logType></logHeader><logPayload><SOAP-ENV:Header xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><Security xmlns="http://schemas.xmlsoap.org/ws/2002/04/secext"><wsse:BinarySecurityToken EncodingType="sentry:Base64Binary" ValueType="sentry:CSK1" cam:Username="639025903" cam:OpaqueId="639025903" xmlns:sentry="urn::schemas:security:1.2">pa1044zV0vIpymMC5uPnnpGlsT-aTye3EX0@</wsse:BinarySecurityToken></Security><context xmlns="urn:schemas:context:1.0"><PilotRollout xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><Region xmlns="">R_SAS</Region></PilotRollout><channel xmlns:i="http://www.w3.org/2001/XMLSchema-instance">IO</channel><properties xmlns:i="http://www.w3.org/2001/XMLSchema-instance"><property name="WebPartId">AcctBalances</property><property name="WebPartAction">Default</property><property name="CorrelatorId">54057-d7d7e997a6c7</property><property name="AsyncCall"/></properties></context><currentCorrelId xmlns="urn:hemas:metrics:1.0">7e36a2fc-2cd3-43af-9ca4ae8f</currentCorrelId><metrics10:point id="873e10cd-3691-4e3b4f190" parent="7e36a29-a31475a4ae8f" node="10.14.56.59" type="srmediary"><metrics10:start>2008-07-24 08:17:03.321000 UTC</metrics10:start><metrics10:block>2008-07-24 08:17:03.520000 UTC</metrics10:block></metrics10:point><bus:point type="us.provider" parent="7e36a2fc-2cd31475a4ae8f" node="C1VTR6" id="003425861VTR6Z30818001" xmlns:rrbus="urn:schemas:metrics:1.0"><bus:start>2008-07-24 08:17:02.978203 UTC</bus:start><bus:block>2008-07-24 08:17:03.242739 UTC</bus:block><bus:unblock>2008-07-24 08:17:03.583753 UTC</bus:unblock><bus:stop>2008-07-24 08:17:03.588468 UTC</bus:stop></bus:point><endpt:spInfo><endpt:tranId>RRXI</endpt:tranId><endpt:operation>GetThirdPartyAcctInfo</endpt:operation><endpt:TORName>C1VTRZ3</endpt:TORName><endpt:AORName>C1VA2Z9</endpt:AORName><endpt:taskNum>000690</endpt:taskNum><endpt:UOWID>C2BF750D1B3783</endpt:UOWID></endpt:spInfo></SOAP-ENV:Header></logPayload></logRequest>
# 2  
Old 07-25-2008
Expected output:


timestamp WebPartId bus:block bus:unblock endpt:operation
2008-07-24T04:17:04.137000-04:00 AcctBalances 2008-07-24 08:17:03.321000 2008-07-24 08:17:03.421000 GetThirdPartyAcctInfo
# 3  
Old 07-25-2008
The best way to handle this is to write an XSLT stylesheet to transform the XML "document" into the desired output. (BTW, the provided XML is not valid: the "bus" namespace prefix is not declared.)

Another way, if you want to stick to using UNIX utilities, is to convert the XML into PYX format (a number of tools are available, e.g. xmlstarlet, xmln, etc.; do a Web search for PYX) and use sed, awk or grep to extract the relevant information.

For example, here is the equivalent PYX for the XML you provided (corrected to make it valid):
Code:
(logRequest
Axmlns:exsl http://exslt.org/common
Axmlns:endpt urn:schemas.bcom/rrbus/1.0/spInfo
Axmlns:date http://exslt.org/dates-and-times
Axmlns:common urn::xslt:common.xsl
Axmlns:bus urn::fpmurphy
Axmlns:cam urn:comsec:authn:1.0
Axmlns:wsse http://docs.ddoasis-open.org/wss/2004/01/oasis-200401-wss-1.0.xsd
Axmlns:str http://exslt.org/strings
Axmlns:soapenv http://schemas.xmlsoap.org/soap/envelope/
Axmlns:secext http://schemas.xmlsoap.org/ws/2002/04/secext
Axmlns:rrbfunc urn:schemas:functions:1.0
Axmlns:rrbus urn:schemas:context:1.0
Axmlns:regexp http://exslt.org/regular-expressions
Axmlns:metrics20 urn:metrics:2.0
Axmlns:metrics10 urn:metrics:1.0
-\n
(logHeader
-\n
(timestamp
-2008-07-24T04:17:04.137000-04:00
)timestamp
-\n
(direction
-response
)direction
-\n
(logType
-SERVICE
)logType
-\n
)logHeader
-\n
(logPayload
-\n
(SOAP-ENV:Header
Axmlns:SOAP-ENV http://schemas.xmlsoap.org/soap/envelope/
Axmlns:s http://schemas.xmlsoap.org/soap/envelope/
-\n
(Security
Axmlns http://schemas.xmlsoap.org/ws/2002/04/secext
-\n
(wsse:BinarySecurityToken
AEncodingType sentry:Base64Binary
AValueType sentry:CSK1
Acam:Username 639025903
Acam:OpaqueId 639025903
Axmlns:sentry urn::schemas:security:1.2
-\n                        pa1044zV0vIpymMC5uPnnpGlsT-aTye3EX0@\n
)wsse:BinarySecurityToken
-\n
)Security
-\n
(context
Axmlns urn:schemas:context:1.0
-\n
(PilotRollout
Axmlns:i http://www.w3.org/2001/XMLSchema-instance
-\n
(Region
Axmlns
-R_SAS
)Region
-\n
)PilotRollout
-\n
(channel
Axmlns:i http://www.w3.org/2001/XMLSchema-instance
-IO
)channel
-\n
(properties
Axmlns:i http://www.w3.org/2001/XMLSchema-instance
-\n
(property
Aname WebPartId
-AcctBalances
)property
-\n
(property
Aname WebPartAction
-Default
)property
-\n
(property
Aname CorrelatorId
-54057-d7d7e997a6c7
)property
-\n
(property
Aname AsyncCall
)property
-\n
)properties
-\n
)context
-\n
(currentCorrelId
Axmlns urn:hemas:metrics:1.0
-7e36a2fc-2cd3-43af-9ca4ae8f
)currentCorrelId
-\n
(metrics10:point
Aid 873e10cd-3691-4e3b4f190
Aparent 7e36a29-a31475a4ae8f
Anode 10.14.56.59
Atype srmediary
-\n
(metrics10:start
-2008-07-24 08:17:03.321000 UTC
)metrics10:start
-\n
(metrics10:block
-2008-07-24 08:17:03.520000 UTC
)metrics10:block
-\n
)metrics10:point
-\n
(bus:point
Aid 003425861VTR6Z30818001
Aparent 7e36a2fc-2cd31475a4ae8f
Anode C1VTR6
Atype us.provider
Axmlns:rrbus urn:schemas:metrics:1.0
-\n
(bus:start
-2008-07-24 08:17:02.978203 UTC
)bus:start
-\n
(bus:block
-2008-07-24 08:17:03.242739 UTC
)bus:block
-\n
(bus:unblock
-2008-07-24 08:17:03.583753 UTC
)bus:unblock
-\n
(bus:stop
-2008-07-24 08:17:03.588468 UTC
)bus:stop
-\n
)bus:point
-\n
(endpt:spInfo
-\n
(endpt:tranId
-RRXI
)endpt:tranId
-\n
(endpt:operation
-GetThirdPartyAcctInfo
)endpt:operation
-\n
(endpt:TORName
-C1VTRZ3
)endpt:TORName
-\n
(endpt:AORName
-C1VA2Z9
)endpt:AORName
-\n
(endpt:taskNum
-000690
)endpt:taskNum
-\n
(endpt:UOWID
-C2BF750D1B3783
)endpt:UOWID
-\n
)endpt:spInfo
-\n
)SOAP-ENV:Header
-\n
)logPayload
-\n
)logRequest

Here is an example of one way to extract the first three pieces of information that you are looking for:
Code:
$ sed -n -e '/(timestamp/{n;s/^-//p;}' -e '/Aname WebPartId/{n;s/^-//p;}' -e '/(bus:block/{n;s/^-//p;}' pyxfile
2008-07-24T04:17:04.137000-04:00
AcctBalances
2008-07-24 08:17:03.242739 UTC
$
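
The same idea can be extended to all five fields with awk, printing one line per record. This is only a sketch under some assumptions: the function name extract_fields is made up for illustration, the PYX looks like the listing above (each value on the text line directly after its marker), every record ends with ")logRequest", the fields are separated by tabs, and the trailing " UTC" is dropped to match the requested output.

```shell
# Hypothetical helper: read PYX from the named files (or stdin) and print
# timestamp, WebPartId, bus:block, bus:unblock and endpt:operation,
# tab-separated, one line per logRequest record.
extract_fields() {
    awk -v OFS='\t' '
        # Remember which field the next text line ("-...") belongs to.
        $0 == "(timestamp"       { want = "ts";      next }
        $0 == "Aname WebPartId"  { want = "part";    next }
        $0 == "(bus:block"       { want = "block";   next }
        $0 == "(bus:unblock"     { want = "unblock"; next }
        $0 == "(endpt:operation" { want = "op";      next }

        # A text line right after a marker carries the value.
        want != "" && /^-/ {
            val = substr($0, 2)        # drop the leading "-"
            sub(/ UTC$/, "", val)      # drop the time zone suffix, if any
            f[want] = val
            want = ""
            next
        }

        # Any other line cancels a pending marker (e.g. an empty property).
        { want = "" }

        # End of a record: print the collected fields and start afresh.
        $0 == ")logRequest" {
            print f["ts"], f["part"], f["block"], f["unblock"], f["op"]
            split("", f)               # portable way to clear the array
        }
    ' "$@"
}
```

With the PYX saved as pyxfile, something like "extract_fields pyxfile > extracted.txt" would write one line per SOAP response. Because awk reads the input line by line, the size of the log should not matter much.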

# 4  
Old 07-25-2008
Can you help me with this using Unix commands?