AWK to Parse XML messages

# 1  
Old 11-08-2011
Hello Guys,

Please help with an AWK problem. I have an XML file that contains a list of messages for subjects.

Example of the messages:

[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><Subject=”English” Status=”P”>
[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><Subject=”Science” Status=”F”>
[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><Subject=”Science” Status=”F”>
[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><Subject=”Science” Status=”NA”>
[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><Subject=”English” Status=”P”>
[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><Subject=”English” Status=”F”>
[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><Subject=”Maths” Status=”P”>
[date+time], message=[DATA= “<?xml version=”1.0?”><data changeMsg><Subject=”Science” Status=”P”>

I want to use AWK to parse these XML messages, but I am new to awk and to programming.

I want the output for these messages to look like this:



Please help.

Thanks all for any help.

Last edited by radoulov; 11-08-2011 at 03:09 PM.. Reason: Code tags!
# 2  
Old 11-08-2011
Is that what the data really looks like, or have you prettied it up for posting? That makes a big difference to awk.
# 3  
Old 11-08-2011
No, this is how the data really looks.
# 4  
Old 11-08-2011
"smart quotes" and all? It looks like it's been through MS Word...
# 5  
Old 11-08-2011
Of course it has been through MS Word. This is only a sample of the messages, and I have removed the date and the time.
So this has been edited in MS Word.

Last edited by James_Owen; 11-08-2011 at 04:29 PM..
# 6  
Old 11-08-2011
Please post a representative sample of your data. Include timestamps. Don't mangle it in Word.

Never use Word for data. Imagine typing up a shell script in Word, and having all your quotes turned into "smart quotes" and all your backticks being forced into grammatical correctness -- these things do matter. I was able to instantly tell what you'd done by how it'd been scrambled, but can only guess what it looked like before.

Computers are fussy. Anything we write to fit this sample won't work for you due to differences in the number of fields and handling of "smart" quotes.
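
To illustrate the point about field counts, here is a minimal sketch. It assumes awk is asked to split each line on the ASCII double quote; `count_quote_fields` is a hypothetical helper name, not something from this thread.

```shell
# Hypothetical helper: print how many fields awk sees on each line
# when the field separator is the ASCII double quote (").
count_quote_fields() {
    awk -F'"' '{ print NF }' "$@"
}

# Usage (reads the named files, or standard input if none are given):
#   printf '%s\n' 'Subject="English" Status="P"' | count_quote_fields
#   printf '%s\n' 'Subject=”English” Status=”P”' | count_quote_fields
```

The first call sees four ASCII quotes and reports 5 fields. The second sees none, because curly quotes are different characters, so the whole line is a single field and any script that relies on field positions would silently pick up the wrong text.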
# 7  
Old 11-08-2011
I can see what MS Word did with the double quotes. I have changed them now and added the date and time stamps.

[08-11-2011 13:40:12], message=[DATA= "<?xml version="1.0?"><data changeMsg><Subject="English" Status="P">
[08-11-2011 13:40:12], message=[DATA= "<?xml version="1.0?"><data changeMsg><Subject="Science" Status="F">
[08-11-2011 13:40:12], message=[DATA= "<?xml version="1.0?"><data changeMsg><Subject="Science" Status="F">
[08-11-2011 13:40:12], message=[DATA= "<?xml version="1.0?"><data changeMsg><Subject="Science" Status="NA">
[08-11-2011 13:40:12], message=[DATA= "<?xml version=”1.0?"><data changeMsg><Subject="English" Status="P">
[08-11-2011 13:40:12], message=[DATA= "<?xml version=”1.0?”><data changeMsg><Subject="English" Status="F">
[08-11-2011 13:40:12], message=[DATA= "<?xml version=”1.0?”><data changeMsg><Subject="Maths"   Status="P">
[08-11-2011 13:40:12], message=[DATA= "<?xml version=”1.0?"><data changeMsg><Subject="Science" Status="P">

I have hundreds of lines of messages, which I will not be able to post, but this is what all the messages look like.

Is there a way to get the output that I am after? Is it possible?

Last edited by vgersh99; 11-08-2011 at 06:16 PM.. Reason: code tags, please!
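
The desired output was never shown in the thread, so the following is only a sketch under an assumption: that the goal is a count of messages per Subject/Status pair. It also assumes the `Subject` and `Status` values are wrapped in plain ASCII double quotes, as they are in the sample with timestamps. `tally_status` is a hypothetical function name.

```shell
# Hypothetical helper: count messages per Subject/Status pair.
# Assumption: the wanted output is a tally; the thread does not say so.
tally_status() {
    awk '
    {
        subject = ""; status = ""
        # match() sets RSTART and RLENGTH; skip the 9-character prefix
        # Subject=" and drop the closing quote to keep only the value.
        if (match($0, /Subject="[^"]*"/))
            subject = substr($0, RSTART + 9, RLENGTH - 10)
        # Same idea for the 8-character prefix Status="
        if (match($0, /Status="[^"]*"/))
            status = substr($0, RSTART + 8, RLENGTH - 9)
        # Lines missing either attribute are skipped.
        if (subject != "" && status != "")
            count[subject " " status]++
    }
    END {
        for (key in count)
            print key, count[key]
    }' "$@" | sort
}

# Usage (the file name is a placeholder):
#   tally_status messages.log
```

The pattern looks only at the two attributes, so the curly quotes that remain in the `version` attribute on a few sample lines, and the extra spaces before `Status` on the Maths line, do not affect it. The result is piped through `sort` because awk does not guarantee the order of `for (key in count)`.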
