regex/shell script to Parse through XML Records
Post 302324699 by Jerrad on Thursday 11th of June 2009 11:59:54 AM

Hi All,

I have been working on something that doesn't seem to have a clear regex solution, and I wanted to run it by everyone to get some insight into how to solve it.

I have a flat text file that contains billing records for users. The records are stored as XML, with each record starting at <record> and ending at </record>.

What I am trying to do is search for a user's id and extract the complete record for them.

Sample Data

Quote:
<record>
<recId>xxxxxxxxxxxxxxx</recId>
<created>Wed Dec 17 06:00:16 2008</created>
<userid>jondoe</userid>
<domain>xxxxxxxxxxxxxxxxxxxx</domain>
<type>260</type>
<nasIP>xxxxxxxxxxxxxxxx</nasIP>
<portType>18</portType>
<radIP>xxxxxxxxxxxxxxx</radIP>
<userIP>0.0.0.0</userIP>
<delta>7598</delta>
<gmtOffset>0</gmtOffset>
<bytesIn>3159</bytesIn>
<bytesOut>563</bytesOut>
<packetsIn>52</packetsIn>
<packetsOut>19</packetsOut>
<proxyAuthIPAddr>0</proxyAuthIPAddr>
<proxyAcctIPAddr>xxxxxxxxxxxxxxx</proxyAcctIPAddr>
<proxyAcctAck>1</proxyAcctAck>
<termCause>17</termCause>
<clientIPAddr>xxxxxxxxxxxxxxx</clientIPAddr>
<entityID>955</entityID>
<entityCtxt>1</entityCtxt>
<backupMethod>L</backupMethod>
<sessionCountInfo></sessionCountInfo>
<clientID>xxxxxxxxxxxxxxx</clientID>
<sessionID>xxxxxxxxxxxxxxxxxxxxxx</sessionID>
<nasID>xxxxx</nasID>
<nasVendor>xxxxxx</nasVendor>
<nasModel>xxxxxxxxxxxx</nasModel>
<nasPort>xxxxxxxx</nasPort>
<billingID></billingID>
<startDate>2008/12/17 03:57:06</startDate>
<callingNumber>xxxxxxxxxxxxxxx</callingNumber>
<calledNumber></calledNumber>
<radiusAttr>xxxxxxxxxxxxxxxx</radiusAttr>
<startAttr></startAttr>
<auditID>xxxxxxxxxxxxxxxxxxxxxxxx</auditID>
<seqNum>0</seqNum>
<accountName></accountName>
</record><record>
<recId>xxxxxxxxxxxxxxx</recId>
<created>Wed Dec 17 06:00:16 2008</created>
<userid>janedoe</userid>
<domain>xxxxxxxxxxxxxxxxxxxx</domain>
<type>260</type>
<nasIP>xxxxxxxxxxxxxxxx</nasIP>
<portType>18</portType>
<radIP>xxxxxxxxxxxxxxx</radIP>
<userIP>0.0.0.0</userIP>
<delta>7598</delta>
<gmtOffset>0</gmtOffset>
<bytesIn>3159</bytesIn>
<bytesOut>563</bytesOut>
<packetsIn>52</packetsIn>
<packetsOut>19</packetsOut>
<proxyAuthIPAddr>0</proxyAuthIPAddr>
<proxyAcctIPAddr>xxxxxxxxxxxxxxx</proxyAcctIPAddr>
<proxyAcctAck>1</proxyAcctAck>
<termCause>17</termCause>
<clientIPAddr>xxxxxxxxxxxxxxx</clientIPAddr>
<entityID>955</entityID>
<entityCtxt>1</entityCtxt>
<backupMethod>L</backupMethod>
<sessionCountInfo></sessionCountInfo>
<clientID>xxxxxxxxxxxxxxx</clientID>
<sessionID>xxxxxxxxxxxxxxxxxxxxxx</sessionID>
<nasID>xxxxx</nasID>
<nasVendor>xxxxxx</nasVendor>
<nasModel>xxxxxxxxxxxx</nasModel>
<nasPort>xxxxxxxx</nasPort>
<billingID></billingID>
<startDate>2008/12/17 03:57:06</startDate>
<callingNumber>xxxxxxxxxxxxxxx</callingNumber>
<calledNumber></calledNumber>
<radiusAttr>xxxxxxxxxxxxxxxx</radiusAttr>
<startAttr></startAttr>
<auditID>xxxxxxxxxxxxxxxxxxxxxxxx</auditID>
<seqNum>0</seqNum>
<accountName></accountName>
</record><record>

What I would like to be able to do is search for jondoe and have it spit out all records for jondoe.

There could be multiple records in the file for this user, so it would need to write each record to a text file or standard output as it finds it. For the sample above, the output would be the following:

Quote:
<record>
<recId>xxxxxxxxxxxxxxx</recId>
<created>Wed Dec 17 06:00:16 2008</created>
<userid>jondoe</userid>
<domain>xxxxxxxxxxxxxxxxxxxx</domain>
<type>260</type>
<nasIP>xxxxxxxxxxxxxxxx</nasIP>
<portType>18</portType>
<radIP>xxxxxxxxxxxxxxx</radIP>
<userIP>0.0.0.0</userIP>
<delta>7598</delta>
<gmtOffset>0</gmtOffset>
<bytesIn>3159</bytesIn>
<bytesOut>563</bytesOut>
<packetsIn>52</packetsIn>
<packetsOut>19</packetsOut>
<proxyAuthIPAddr>0</proxyAuthIPAddr>
<proxyAcctIPAddr>xxxxxxxxxxxxxxx</proxyAcctIPAddr>
<proxyAcctAck>1</proxyAcctAck>
<termCause>17</termCause>
<clientIPAddr>xxxxxxxxxxxxxxx</clientIPAddr>
<entityID>955</entityID>
<entityCtxt>1</entityCtxt>
<backupMethod>L</backupMethod>
<sessionCountInfo></sessionCountInfo>
<clientID>xxxxxxxxxxxxxxx</clientID>
<sessionID>xxxxxxxxxxxxxxxxxxxxxx</sessionID>
<nasID>xxxxx</nasID>
<nasVendor>xxxxxx</nasVendor>
<nasModel>xxxxxxxxxxxx</nasModel>
<nasPort>xxxxxxxx</nasPort>
<billingID></billingID>
<startDate>2008/12/17 03:57:06</startDate>
<callingNumber>xxxxxxxxxxxxxxx</callingNumber>
<calledNumber></calledNumber>
<radiusAttr>xxxxxxxxxxxxxxxx</radiusAttr>
<startAttr></startAttr>
<auditID>xxxxxxxxxxxxxxxxxxxxxxxx</auditID>
<seqNum>0</seqNum>
<accountName></accountName>
</record>

I started with a regex that tries to grab <record>, then jondoe, then </record>:

<record>(\s|\S)+jondoe(\s|\S)+</record>

However, this selects all of the records rather than just the matching one. Even if I could extract just the portion I want, I am not sure how to have it remember where it left off and keep chewing through the file without creating duplicates.

Since this is being performed on Solaris 10, I wasn't able to use some of the more advanced grep features like grep -B(x) -A(x).

Thanks in advance for any help you can provide
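
One possible approach to the question above, sketched under some assumptions: since each record is delimited by <record> and </record>, a small awk state machine can buffer lines as it reads and print the buffer only when the record contained the wanted <userid> line. The file is read once from top to bottom, so nothing has to remember where a regex left off, and grep -A/-B is not needed. The function name extract_records and the AWK variable are made up for illustration. The sketch assumes the layout shown in the sample: one tag per line, with a closing </record> possibly followed by the next <record> on the same line. As far as I know, the old /usr/bin/awk on Solaris does not accept -v, so nawk or /usr/xpg4/bin/awk is probably needed there.

```shell
# Hypothetical helper: print every <record>...</record> block whose
# <userid> element equals the given id.
# Usage: extract_records USERID FILE
# Set AWK=nawk (or /usr/xpg4/bin/awk) if the default awk lacks -v.
extract_records() {
    user=$1
    file=$2
    "${AWK:-awk}" -v want="<userid>$user</userid>" '
    {
        line = $0

        # A line may close one record and open the next
        # ("</record><record>"), so handle the closing tag first.
        pos = index(line, "</record>")
        if (pos > 0) {
            if (inrec) {
                chunk = substr(line, 1, pos + 8)   # up to the end of </record>
                if (index(chunk, want) > 0) found = 1
                if (found) printf "%s%s\n", buf, chunk
            }
            inrec = 0; found = 0; buf = ""
            line = substr(line, pos + 9)           # whatever follows </record>
        }

        # Start buffering a new record at its opening tag.
        pos = index(line, "<record>")
        if (pos > 0) {
            inrec = 1; found = 0; buf = ""
            line = substr(line, pos)
        }

        # Inside a record: keep the line and note whether it is the wanted user.
        if (inrec) {
            buf = buf line "\n"
            if (index(line, want) > 0) found = 1
        }
    }' "$file"
}
```

With AWK set as needed, running extract_records jondoe billing.txt > jondoe.txt (billing.txt is a placeholder file name) should write each matching record in file order. Matching on the whole <userid>jondoe</userid> element rather than the bare name avoids picking up jondoe2 or the same string in another field. A user id containing backslashes would need extra care, because awk interprets escape sequences in -v values.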
 
