Hi All,
I have the following example file.
I want to remove only the HTML tags.
Input File:
<html>
<head>
<title>Software Solutions Inc., </title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor=white leftmargin="0" topmargin="0"... (2 Replies)
I have an HTML file called myfile.html. If I simply run "cat myfile.html" in UNIX, it shows all the HTML tags, like <a href=r/26><img src="http://www>. But I want to extract only the text part.
The same problem happens with the "type" command in MS-DOS.
I know you can do it by opening it in Internet Explorer,... (4 Replies)
I have a large number of files containing HTML A tags in the format below
or
I want to find all the lines that contain the word Login=
I used this command
grep "Login=" *
This also returned ordinary lines that contain the word Login=. For example, it returned lines which... (2 Replies)
Could someone please provide a solution to the following:
I would like to remove some tags from the "head" of multiple HTML documents across the web site. They look like this:
<link rel="alternate" type="application/rss+xml"
title="Business and Investment in the Philippines"... (2 Replies)
Hi
I am new to string extraction in shell scripts. I am trying to extract a string such as #1753 from an HTML tag that looks like the one below.
<a class="model-link tl-tr" href="lastSuccessfulBuild/">Last successful build (#1753), 40 min ago</a>
and want the value as
1753
Could someone help me to... (3 Replies)
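One possible approach, as a minimal sketch: let sed capture the digits between "(#" and ")". The variable names `line` and `build` are illustrative, and the sketch assumes the build number always appears in the form "(#NNNN)" as in the sample line.

```shell
# Sample line copied from the post; variable names are illustrative.
line='<a class="model-link tl-tr" href="lastSuccessfulBuild/">Last successful build (#1753), 40 min ago</a>'

# In a basic regular expression, a bare "(" or ")" is a literal character,
# while \( ... \) marks the captured group. -n plus the p flag prints only
# lines where the substitution succeeded.
build=$(printf '%s\n' "$line" | sed -n 's/.*(#\([0-9][0-9]*\)).*/\1/p')

echo "$build"   # prints 1753
```

If the line has no "(#digits)" part, `build` ends up empty, which a script can check before using the value.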
I want to print everything from the <fruits> tag to the </fruits> tag for blocks in which <fruit> is mango. I also want both <fruits> and </fruits> in the output. Please help.
eg.
<fruits>
<fruit id="111">mango</fruit>
.
another 20 lines
.
</fruits> (3 Replies)
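A hedged sketch of one way to do this with awk: buffer each block and print it only if a mango line was seen. The function name `print_mango_blocks` is hypothetical, and the sketch assumes <fruits> and </fruits> each sit on their own line, as in the sample.

```shell
# Hypothetical helper: print every <fruits>...</fruits> block that contains
# a <fruit> element whose text starts with "mango". Reads the files given
# as arguments, or standard input when there are none.
print_mango_blocks() {
    awk '
        # Opening tag: start a fresh buffer.
        /<fruits>/   { inblock = 1; buf = ""; hit = 0 }
        # Inside a block: remember the line and note any mango entry.
        inblock      { buf = buf $0 "\n"
                       if ($0 ~ /<fruit( [^>]*)?>mango/) hit = 1 }
        # Closing tag: print the buffered block only if mango was seen.
        /<\/fruits>/ { if (inblock && hit) printf "%s", buf
                       inblock = 0 }
    ' "$@"
}
```

It could be called as `print_mango_blocks input.xml`, where input.xml is a placeholder file name. For anything beyond simple line-oriented input, a real XML parser is likely the safer choice.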
How can I find the text inside an HTML tag using sed?
HTML text:
What I tried:
cat infile | sed -e 's/\(<kbd*\)\(.*\)\(kbd>\)/\2/'
Expected result like this:
sed -i -e 's/@colophon/@@colophon/' \ -e 's/doc@cygnus.com/doc@@cygnus.com/' bfd/doc/bfd.texinfo (5 Replies)
Hi,
I have a text file which contains this:
<a href="linux">Linux</a>
<a href="unix">Unix</a>
<a href="oracle">Oracle</a>
<a href="perl">Perl</a>
I'm trying to extract the text between these anchor tags, ignoring everything else, using grep. I managed to ignore the tags but am unable to... (6 Replies)
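As a rough sketch, sed (swapped in here for grep) can print just the anchor text. The function name `anchor_text` is hypothetical, and the sketch assumes one anchor per line with no nested tags inside it, as in the sample.

```shell
# Hypothetical helper: print the text between <a ...> and </a>.
# [^>]* skips the attributes; \([^<]*\) captures the link text.
anchor_text() {
    sed -n 's/.*<a [^>]*>\([^<]*\)<\/a>.*/\1/p' "$@"
}

# Demo on the four sample lines from the post.
printf '%s\n' \
    '<a href="linux">Linux</a>' \
    '<a href="unix">Unix</a>' \
    '<a href="oracle">Oracle</a>' \
    '<a href="perl">Perl</a>' | anchor_text
```

Lines without an anchor are dropped because of `-n`, which matches the goal of ignoring everything else.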
I want to clean an HTML file.
I am trying to remove the script part of the HTML, then remove the remaining tags and empty lines.
The code I am trying to use is the following:
sed '/<script/,/<\/script>/d' webpage.html | sed -e 's/<*>//g' | sed '/^\s*$/d' > output.txt
However, in this method, I can not... (10 Replies)
Discussion started by: YuhuiFeng
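A possible variant of the pipeline above, offered as a sketch rather than a complete HTML cleaner. The pattern `<*>` matches zero or more "<" characters followed by ">", so it only removes the closing bracket; `<[^>]*>` matches a whole tag. The function name `clean_html` is hypothetical, and the sketch assumes each tag opens and closes on the same line.

```shell
# Hypothetical helper combining the three steps into one sed call.
clean_html() {
    # 1. Delete every line from one containing <script to one containing </script>.
    # 2. Remove each complete tag: "<", any run of non-">" characters, ">".
    # 3. Delete lines that are empty or whitespace-only
    #    ([[:space:]] is the POSIX spelling of the GNU-only \s).
    sed -e '/<script/,/<\/script>/d' \
        -e 's/<[^>]*>//g' \
        -e '/^[[:space:]]*$/d' "$@"
}
```

It could be run as `clean_html webpage.html > output.txt`. Because the script range is line-based, any text sharing a line with a script tag is removed along with it, and tags split across lines are not handled.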
XML::Stream::Node(3pm) User Contributed Perl Documentation XML::Stream::Node(3pm)
NAME
XML::Stream::Node - Functions to make building and parsing the tree easier to work with.
SYNOPSIS
Just a collection of functions that do not need to be in memory if you
choose one of the other methods of data storage.
This creates a hierarchy of Perl objects and provides various methods
to manipulate the structure of the tree. It is much like the C library
libxml.
FORMAT
The result of parsing:
<foo><head id="a">Hello <em>there</em></head><bar>Howdy<ref/></bar>do</foo>
would be:
[ tag:       foo
  att:       {}
  children:  [ tag:      head
               att:      {id=>"a"}
               children: [ tag:      "__xmlstream__:node:cdata"
                           children: "Hello "
                         ]
                         [ tag:      em
                           children: [ tag:      "__xmlstream__:node:cdata"
                                       children: "there"
                                     ]
                         ]
             ]
             [ tag:      bar
               children: [ tag:      "__xmlstream__:node:cdata"
                           children: "Howdy "
                         ]
                         [ tag:      ref
                         ]
             ]
             [ tag:      "__xmlstream__:node:cdata"
               children: "do"
             ]
]
METHODS
new()
new(tag)
new(tag,cdata) - creates a new node. If you specify tag, then the root
tag is set. If you specify cdata, then cdata is added to the node as
well. Returns the created node.
get_tag() - returns the root tag of the node.
set_tag(tag) - set the root tag of the node to tag.
add_child(node)
add_child(tag)
add_child(tag,cdata) - adds the specified node as a child to the current
node, or creates a new node with the specified tag as the root node.
Returns the node added.
remove_child(node) - removes the child node from the current node.
remove_cdata() - removes all of the cdata children from the current node.
add_cdata(string) - adds the string as cdata onto the current node's
child list.
get_cdata() - returns all of the cdata children concatenated together
into one string.
get_attrib(attrib) - returns the value of the attrib if it is valid,
or returns undef if attrib is not a real
attribute.
put_attrib(hash) - for each key/value pair specified, create an
attribute in the node.
remove_attrib(attrib) - remove the specified attribute from the node.
add_raw_xml(string,[string,...]) - directly add a string into the XML
packet as the last child, with no
translation.
get_raw_xml() - return all of the XML in a single string, undef if there
is no raw XML to include.
remove_raw_xml() - remove all raw XML strings.
children() - return all of the children of the node in a list.
attrib() - returns a hash containing all of the attributes on this
node.
copy() - return a recursive copy of the node.
XPath(path) - run XML::Stream::XPath on this node.
XPathCheck(path) - run XML::Stream::XPath on this node and return 1 or 0
to see if it matches or not.
GetXML() - return the node in XML string form.
AUTHOR
By Ryan Eatmon in June 2002 for http://jabber.org/
Currently maintained by Darian Anthony Patrick.
COPYRIGHT
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.14.2 2010-01-08 XML::Stream::Node(3pm)