Sponsored Content
Top Forums Shell Programming and Scripting Extracting tag values from XML using perl Post 302222131 by Steve_altius on Wednesday 6th of August 2008 05:07:49 AM
Old 08-06-2008
Extracting tag values from XML using perl

Hi All,

I'm trying to extract the values for the 'src' and 'alt' tags within an xml file. In the files that I'm searching, the tags are always enclosed within an 'img' tag. Typically:

<img src="diwiz01.gif" width="576" height="254" alt="Out-of-process and In-process COM Objects"><bookmark name="f003"/></img>

I grep for 'img' and pipe to the following perl code that successfully extracts the required data:

Code:
#!/usr/bin/perl
while (<>) {
   while (m/img src=\"(.*?)\"/ig) {
      print $1,"|";
      }
   while (m/alt=\"(.*?)\"/ig) {
      print $1,"\n";
      }
      }

However, the xml source occasionally contains the 'src' and 'alt' tags in a different order within the 'img' tag. For example:

<img width="470" height="321" alt="A Remote COM Object" src="dicwiz02.gif"><bookmark name="f004"/></img>

Consequently, the above code doesn't work.

The basis of the code was originally used for a different problem and I didn't write it. I've modified it in an attempt to satisfy this problem. Unfortunately, although I know the basics of sed and awk (but hardly any perl), I'm not a programmer and I'm struggling a bit.

Any help gratefully received.

Thanks.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Extracting XML Tag Contents

Hi Jean I require your help in writing a shell script. Iam zero in Unix programming. I have a large file about 400 MB of data, which contains about 50000 XML messages seperated by a Tab, I think. I need to extract only 4 values from each XML message and write it onto a new file. Please help me... (2 Replies)
Discussion started by: pk_eee
2 Replies

2. Shell Programming and Scripting

KSH Script to Get the <TAG Values> from an XML file

Hi All, I am new to Unix I need a KSH script to get the values from XML file to write to a temp file. Like the requirement is from the below TAG <MAPPING DESCRIPTION ="Test Mapping" ISVALID ="YES" NAME ="m_test_xml" OBJECTVERSION ="1" VERSIONNUMBER ="1"> I need the MAPPING DESCRIPTION... (3 Replies)
Discussion started by: perlamohan
3 Replies

3. Programming

Extracting Field values for XML file

i have an input file of XML type with data like <nx-charging:additional-parameter name="NX_INTERNATIONALIZED_CLID" value="919427960829"/><nx-charging:finalStatus>RESPONSE , Not/Applicable , OK</nx-charging:finalStatus></nx-charging:process> i want to extract data such that i get the output... (3 Replies)
Discussion started by: junaid.nehvi
3 Replies

4. UNIX for Dummies Questions & Answers

Extracting values from an XML file

Hello People, I have an xml file from which I need to extract the values of the parameters using UNIX shell commands. Ex : Input is like : <Name>Roger</Name> or <Address>MI</Address> I need the output as just : Roger or MI with the tags removed. Please help. (1 Reply)
Discussion started by: sushant172
1 Replies

5. Shell Programming and Scripting

Extracting the value of an attribute tag from XML

Greetings, I am very new to the UNIX shell scripting and would like to learn. However, I am currently stuck on how to process the below sample of code from an XML file using UNIX comands: <ATTRIBUTE NAME="Memory" VALUE="512MB"/> <ATTRIBUTE NAME="CPU Speed" VALUE="3.0GHz"/> <ATTRIBUTE... (5 Replies)
Discussion started by: JesterMania
5 Replies

6. Shell Programming and Scripting

Extracting the value of an middle attribute tag from XML

Hi All, Please help me out in resolving this.. <secondTag enabled='true' processName='test1' pidFile='/tmp/test1.pid' /> From the above tag, I'm trying to retrieve the value of enabled and pidFile attributes by means of processName attribute. Would be thankful in resolving this..... (5 Replies)
Discussion started by: mjavalkar
5 Replies

7. Shell Programming and Scripting

Find out values between xml tag

Find out values between xml tag ....... ABC><name></ABC><xyz>test</xyz>..here some other tag... <ABC><NUMBER></ABC><xyz>12345</xyz>.... ....... I want to take between bewtween ABC><NUMBER></ABC><xyz> to </xyz> that is 12345 (3 Replies)
Discussion started by: Jairaj
3 Replies

8. Shell Programming and Scripting

To search for a particular tag in xml and collate all similar tag values and display them count

I want to basically do the below thing. Suppose there is a tag called object1. I want to display an output for all similar tag values under heading of Object 1 and the count of the xmls. Please help File: <xml><object1>house</object1><object2>child</object2>... (9 Replies)
Discussion started by: srkmish
9 Replies

9. Shell Programming and Scripting

Extracting the tag name from an xml file

Hi, My requirement is something like this, I have a xml file that contains some tags and nested tags, <n:tag_name1> <n:sub_tag1>val1</n:sub_tag1> <n:sub_tag2>val2</n:sub_tag2> </n:tag_name1> <n:tag_name2> <n:sub_tag1>value</n:sub_tag1> ... (6 Replies)
Discussion started by: Little
6 Replies

10. Shell Programming and Scripting

Moving XML tag/contents after specific XML tag within same file

Hi Forum. I have an XML file with the following requirement to move the <AdditionalAccountHolders> tag and its content right after the <accountHolderName> tag within the same file but I'm not sure how to accomplish this through a Unix script. Any feedback will be greatly appreciated. ... (19 Replies)
Discussion started by: pchang
19 Replies
Parser::Style::Stream(3)				User Contributed Perl Documentation				  Parser::Style::Stream(3)

NAME
XML::Parser::Style::Stream - Stream style for XML::Parser SYNOPSIS
use XML::Parser; my $p = XML::Parser->new(Style => 'Stream', Pkg => 'MySubs'); $p->parsefile('foo.xml'); { package MySubs; sub StartTag { my ($e, $name) = @_; # do something with start tags } sub EndTag { my ($e, $name) = @_; # do something with end tags } sub Characters { my ($e, $data) = @_; # do something with text nodes } } DESCRIPTION
This style uses the Pkg option to find subs in a given package to call for each event. If none of the subs that this style looks for is there, then the effect of parsing with this style is to print a canonical copy of the document without comments or declarations. All the subs receive as their 1st parameter the Expat instance for the document they're parsing. It looks for the following routines: o StartDocument Called at the start of the parse . o StartTag Called for every start tag with a second parameter of the element type. The $_ variable will contain a copy of the tag and the %_ variable will contain attribute values supplied for that element. o EndTag Called for every end tag with a second parameter of the element type. The $_ variable will contain a copy of the end tag. o Text Called just before start or end tags with accumulated non-markup text in the $_ variable. o PI Called for processing instructions. The $_ variable will contain a copy of the PI and the target and data are sent as 2nd and 3rd parameters respectively. o EndDocument Called at conclusion of the parse. perl v5.16.2 2011-05-24 Parser::Style::Stream(3)
All times are GMT -4. The time now is 03:45 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy