How to extract text from xml file


# 1  
08-31-2007

I have some XML files that were created by exporting a website from RedDot. From each file I would like to extract the cost, course number, description, and meeting information.


<?xml version="1.0" encoding="UTF-16" standalone="yes" ?>
<PAG PAG0="3AE6FCFD86D34896A82FCA3B7B76FF90" PAG3="525312" PAG7="38574.3936342593" PAG8="48E1DBCD03594F0E8CE93D9736BD5698" PAG9="C8E8FB21EE5343FEBA77C040EF1C9BFC" PAG11="39160.5590162037" PAG12="C8E8FB21EE5343FEBA77C040EF1C9BFC" PAG13="39160.5937384259" PAG14="C8E8FB21EE5343FEBA77C040EF1C9BFC" PAG15="" PAG16="" PAG17="0" PAG18="1" PAG19="48E1DBCD03594F0E8CE93D9736BD5698" PAG20="" PAG21="79EA41233D5F4B36B0BAC07286866783" PAG22="0" PAG23="0" PAG29="39160.5937384259" PAG30="0" PAG31="38574.3936342593" PAG32="0" PAG33="0">
<IO_VAL>
<VAL VAL1="3AE6FCFD86D34896A82FCA3B7B76FF90" VAL2="2" VAL3="PAG" VAL4="Advanced HVAC Maintenance" VAL6="3AE6FCFD86D34896A82FCA3B7B76FF90" VAL7="0" VAL8="0" VAL9="38748.7126851852" VAL10="0" />
<VAL VAL1="B6FC365A81BA49F6B87D5F83A385FF50" VAL2="1" VAL3="PGE" VAL4="text" VAL6="B6FC365A81BA49F6B87D5F83A385FF50" VAL7="0" VAL8="0" VAL9="39160.5590046296" VAL10="0">$400<BR>$400</VAL>
<VAL VAL1="0DE7DBA40D9C4570AF7E1052369443CF" VAL2="1" VAL3="PGE" VAL4="text" VAL6="CE65E148437444F6BE216C8C6889B241" VAL7="0" VAL8="0" VAL9="38574.3936342593" VAL10="0">XPOB 556-501<BR>XPOB 556-502</VAL>
<VAL VAL1="6407D6626D1F448389C817DABD01C51F" VAL2="1" VAL3="PGE" VAL4="text" VAL6="6407D6626D1F448389C817DABD01C51F" VAL7="0" VAL8="0" VAL9="39160.3767361111" VAL10="0">6/2-8/4 <BR>6/4-7/11*</VAL>
<VAL VAL1="8B3B923981B346B499770E3DCA8230F0" VAL2="1" VAL3="PGE" VAL4="text" VAL6="D1E8B01771824275997556D439647E4E" VAL7="0" VAL8="0" VAL9="38574.3936342593" VAL10="0">S<BR>MW</VAL>
<VAL VAL1="BAA7472ACAD742E1A8BAED1FDABCE2E9" VAL2="1" VAL3="PGE" VAL4="text" VAL6="BAA7472ACAD742E1A8BAED1FDABCE2E9" VAL7="0" VAL8="0" VAL9="38755.6905902778" VAL10="0">This 40-hour course expands upon the topics covered in the Basic HVAC Maintenance course.<EM>Prerequisite: Basic Heating and Air Conditioning Equipment Maintenance course or instructor approval required prior to registering.</EM> Books not included</VAL>
<VAL VAL1="D48131678F254EDF9D8ABDB2C13EDC6A" VAL2="1" VAL3="PGE" VAL4="text" VAL6="8B75B8517379488CBEBD4E55DBD76E7C" VAL7="0" VAL8="0" VAL9="38574.3936342593" VAL10="0">M<BR>M</VAL>
<VAL VAL1="E316E14FFDC94C4CBC856554ADF971C1" VAL2="1" VAL3="PGE" VAL4="text" VAL6="E316E14FFDC94C4CBC856554ADF971C1" VAL7="0" VAL8="0" VAL9="39160.3768287037" VAL10="0">*No class&nbsp;7/2-4</VAL>
<VAL VAL1="DF2EF049448F41A7AC18B4B71BA6F66D" VAL2="1" VAL3="PGE" VAL4="text" VAL6="467A8FEB25964EE2924BC3183C5FB424" VAL7="0" VAL8="0" VAL9="38574.3936342593" VAL10="0">8 a.m.-noon<BR>8 a.m.-noon</VAL>
</IO_VAL>
</PAG>


The text I would like to extract is in these lines:

VAL10="0">$400<BR>$400</VAL>
VAL10="0">XPOB 556-501<BR>XPOB 556-502</VAL>
VAL10="0">6/2-8/4 <BR>6/4-7/11*</VAL>
VAL10="0">S<BR>MW</VAL>
VAL10="0">This 40-hour course expands upon the topics covered in the Basic HVAC Maintenance course. Course is held in Bldg. <EM>Prerequisite: Basic Heating and Air Conditioning Equipment Maintenance course or instructor approval required prior to registering.</EM> Books not included</VAL>
VAL10="0">M<BR>M</VAL>
VAL10="0">*No class&nbsp;7/2-4</VAL>
VAL10="0">8 a.m.-noon<BR>8 a.m.-noon</VAL>

I have AIX version 5. Any suggestions would be deeply appreciated.
# 2  
09-01-2007
Perl.

Try writing a program in Perl.
# 3  
09-01-2007
Code:
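awk '/VAL10="0">/ {
	  # Find where the VAL10 attribute starts on this line.
	  match($0,"VAL10=\"0\">")
	  v1start=RSTART
	  # Find the closing tag; RLENGTH is then the length of "</VAL>".
	  match($0,"</VAL>")
	  v2start=RSTART
	  # The third argument of substr is a length, not an end position.
	  print substr($0,v1start,v2start-v1start+RLENGTH)
	}
' "file"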

Output:
Code:
# ./test.sh
VAL10="0">$400<BR>$400</VAL>
VAL10="0">XPOB 556-501<BR>XPOB 556-502</VAL>
VAL10="0">6/2-8/4 <BR>6/4-7/11*</VAL>
VAL10="0">S<BR>MW</VAL>
VAL10="0">This 40-hour course expands upon the topics covered in the Basic HVAC Maintenance course.<EM>Prerequisite: Basic Heating and Air Conditioning Equipment Maintenance course or instructor approval required prior to registering.</EM> Books not included</VAL>
VAL10="0">M<BR>M</VAL>
VAL10="0">*No class&nbsp;7/2-4</VAL>
VAL10="0">8 a.m.-noon<BR>8 a.m.-noon</VAL>
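
If only the text itself is wanted, without the VAL10 attribute and the closing tag, something along these lines might serve as a starting point. It is a sketch under a few assumptions: the function name extract_val_text is made up for illustration, joining the two sections with " | " is an arbitrary choice, and the input is assumed to be already converted from the UTF-16 declared in the XML header to UTF-8 or a single-byte encoding. It has not been tried on AIX 5.

```shell
# Hypothetical helper: print only the text inside each <VAL ...>...</VAL> line.
# Reads the named files, or standard input when no file is given.
extract_val_text() {
    awk '/VAL10="0">/ {
        line = $0
        sub(/^.*VAL10="0">/, "", line)   # drop everything up to and including the attribute
        sub(/<\/VAL>.*$/, "", line)      # drop the closing tag and anything after it
        gsub(/<BR>/, " | ", line)        # separate the two sections on one line
        gsub(/<\/?EM>/, "", line)        # remove the emphasis markup
        gsub(/&nbsp;/, " ", line)        # turn the non-breaking-space entity into a space
        print line
    }' "$@"
}
```

If the export really is UTF-16, it could be converted first, for example with `iconv -f UTF-16 -t UTF-8 file | extract_val_text`. The empty-element line (`VAL10="0" />`) is skipped because the pattern requires `>` directly after the closing quote.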

# 4  
09-01-2007
That does the trick. Thank you so much for your help.