Sponsored Content
Top Forums Shell Programming and Scripting Extract value inside <text> tag for a particular condition. Post 302270567 by summer_cherry on Monday 22nd of December 2008 07:46:36 AM
Old 12-22-2008
hi,

you may use perl ASX to process it.

input:
Code:
<?xml version="1.0"?>
<data>
<text top="246" left="160" width="84" height="16" font="3">Business purpose</text>
<text top="260" left="506" width="220" height="16" font="3">giving the right information and new insights </text>
<text top="296" left="160" width="67" height="16" font="3">Characteristic</text>
<text top="296" left="278" width="111" height="16" font="3">Operational processing</text>
<text top="296" left="506" width="120" height="16" font="3">Informational processing</text>
<text top="318" left="160" width="55" height="16" font="3">Orientation</text>
<text top="318" left="278" width="56" height="16" font="3">Transaction</text>
<text top="318" left="506" width="42" height="16" font="3">Analysis</text>
<text top="340" left="160" width="43" height="16" font="3">Function</text>
</data>

code:

Code:
package Leo;
use XML::SAX::Base;
@ISA=qw(XML::SAX::Base);
sub start_document{
	my $self=shift;
	my $doc=shift;
}
sub start_element{
	my $self=shift;
	my $element=shift;   
	foreach my $key (keys %{$element->{Attributes}}){
		my $attr=$element->{Attributes}->{$key};
		$flag=1 if ($attr->{Name} eq "left" && $attr->{Value}==160);
	}
}
sub characters{
	my $self=shift;
	my $char=shift;
	if ($flag==1){
		print "<P>",$char->{Data},"</P>\n";
		$flag=0;
	}	
}
1

Code:
use XML::SAX;
use Leo;
$parser=XML::SAX::ParserFactory->parser(Handler=>Leo->new);
$parser->parse_uri("a.txt");

result:
Code:
you expectation

 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

How do I extract text only from html file without HTML tag

I have a html file called myfile. If I simply put "cat myfile.html" in UNIX, it shows all the html tags like <a href=r/26><img src="http://www>. But I want to extract only text part. Same problem happens in "type" command in MS-DOS. I know you can do it by opening it in Internet Explorer,... (4 Replies)
Discussion started by: los111
4 Replies

2. Shell Programming and Scripting

Finding a string inside A Tag

I have umpteen number of files containing HTML A tags in the below format or I want to find all the lines that contain the word Login= I used this command grep "Login=" * This gave me normal lines as well which contain the word Login= for example, it returned lines which... (2 Replies)
Discussion started by: dahlia84
2 Replies

3. Shell Programming and Scripting

extract xml tag based on condition

Hi All, I have a large xml file of invoices. The file looks like below: <INVOICES> <INVOICE> <NAME>Customer A</NAME> <INVOICE_NO>1234</INVOICE_NO> </INVOICE> <INVOICE> <NAME>Customer A</NAME> <INVOICE_NO>2345</INVOICE_NO> </INVOICE> <INVOICE> <NAME>Customer A</NAME>... (9 Replies)
Discussion started by: angshuman
9 Replies

4. Shell Programming and Scripting

Replace text inside XML file based on condition

Hi All, I want to change the name as SEQ_13 ie., <Property Name="Name">SEQ_13</Property> when the Stage Type is PxSequentialFile ie., <Property Name="StageType">PxSequentialFile</Property> :wall: Input.XML <Main> <Record Identifier="V0S13" Type="CustomStage" Readonly="0">... (3 Replies)
Discussion started by: kmsekhar
3 Replies

5. Shell Programming and Scripting

How to remove string inside html tag <a>

Does anybody know how i can remove string from <a> tag? There are several hundred posts in a few forums that need to be cleaned up. The precise situation is ---------- <a href="http://mydomain.com/cgi-bin/anyboard.cgi?fvp=/family/sexuality_and_spirituality/&cmd=rA&cG=43"> ------------- my... (6 Replies)
Discussion started by: georgi58
6 Replies

6. Shell Programming and Scripting

How can i find texts inside a html tag using sed?

How can i find texts inside a html tag using sed? Html texts: What i tried: cat infile | sed -e 's/\(<kbd*\)\(.*\)\(kbd>\)/\2/ Expected result like this: sed -i -e 's/@colophon/@@colophon/' \ -e 's/doc@cygnus.com/doc@@cygnus.com/' bfd/doc/bfd.texinfo (5 Replies)
Discussion started by: cola
5 Replies

7. Shell Programming and Scripting

Help with XML tag value extraction based on matching condition

sample xml file part <DocumentMinorVersion>0</DocumentMinorVersion> <DocumentVersion>1</DocumentVersion> <EffectiveDate>2017-05-30T00:00:00Z</EffectiveDate> <FollowOnFrom> <ContractRequest _LoadId="export_AJ6iAFoh6g0rE9"> <_LocalId>CRW2218451</_LocalId> ... (4 Replies)
Discussion started by: paul1234
4 Replies

8. Shell Programming and Scripting

Help with XML tag value extraction based on condition

sample xml file part <?xml version="1.0" encoding="UTF-8"?><ContractWorkspace xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" _LoadId="export_AJ6iAFmh+pQHq1" xsi:noNamespaceSchemaLocation="ContractWorkspace.xsd"> <_LocalId>CW2218471</_LocalId> <Active>true</Active> ... (3 Replies)
Discussion started by: paul1234
3 Replies

9. Shell Programming and Scripting

Help with tag value extraction from xml file based on a matching condition

Hi , I have a situation where I need to search an xml file for the presence of a tag <FollowOnFrom> and also , presence of partial part of the following tag <ContractRequest _LoadId and if these 2 exist ,then extract the value from the following tag <_LocalId> which is "CW2094139". There... (2 Replies)
Discussion started by: paul1234
2 Replies

10. UNIX for Beginners Questions & Answers

Replacing tag based on condition

Hi All, I am having a file like below. The file will having information about the records.If you see the file the file is header and data. For example it have 1 men tag and the tag id will be come after headers. The change is I want to convert All pets tag from P to X. I did a sed like below... (5 Replies)
Discussion started by: arunkumar_mca
5 Replies
XML::LibXML::SAX(3)					User Contributed Perl Documentation				       XML::LibXML::SAX(3)

NAME
XML::LibXML::SAX - XML::LibXML direct SAX parser DESCRIPTION
XML::LibXML provides an interface to libxml2 direct SAX interface. Through this interface it is possible to generate SAX events directly while parsing a document. While using the SAX parser XML::LibXML will not create a DOM Document tree. Such an interface is useful if very large XML documents have to be processed and no DOM functions are required. By using this interface it is possible to read data stored within a XML document directly into the application data structures without loading the document into memory. The SAX interface of XML::LibXML is based on the famous XML::SAX interface. It uses the generic interface as provided by XML::SAX::Base. Additionally to the generic functions, which are only able to process entire documents, XML::LibXML::SAX provides parse_chunk(). This method generates SAX events from well balanced data such as is often provided by databases. NOTE: At the moment XML::LibXML provides only an incomplete interface to libxml2's native SAX implementation. The current implementation is not tested in production environment. It may causes significant memory problems or shows wrong behaviour. If you run into specific problems using this part of XML::LibXML, let me know. AUTHORS
Matt Sergeant, Christian Glahn, Petr Pajas VERSION
1.70 COPYRIGHT
2001-2007, AxKit.com Ltd. 2002-2006, Christian Glahn. 2006-2009, Petr Pajas. perl v5.12.1 2009-10-07 XML::LibXML::SAX(3)
All times are GMT -4. The time now is 10:58 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy