11-30-2015
This /</{:L;N;/>/!bL;d}; is a loop that is entered when a < is encountered: it keeps appending the next line to the pattern space until a > has been read, then deletes the pattern space. It is far from bulletproof (it does not account for nested tags, for example), but it should give you an idea of how you could proceed.
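To try the loop on something small, here is a minimal sketch. The function name strip_multiline_tags and the sample input are made up for illustration. The sed script is the same loop, written one command per line so it does not rely on GNU sed accepting ';' right after a label.

```shell
#!/bin/sh
# strip_multiline_tags is a hypothetical helper name, not from the thread.
# It reads stdin and drops every block that starts on a line containing '<'
# and ends on a later line containing '>'.
strip_multiline_tags() {
  sed '/</{
:L
N
/>/!bL
d
}'
}

# Made-up sample: a tag spread over three lines between two text lines.
printf '%s\n' \
  'keep this line' \
  '<p class="x"' \
  '   id="y"' \
  '>' \
  'keep this line too' | strip_multiline_tags
```

On this sample only the two "keep" lines should survive. Note that N runs before the test for >, so a tag that opens and closes on the same line also swallows the line that follows it, and any text sharing a line with the tag is deleted along with it.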
If you want further help, you need to be far more specific (details, samples, error messages, ...).
vilistextum
VILISTEXTUM(1) General Commands Manual VILISTEXTUM(1)
NAME
vilistextum - html to ascii converter
SYNOPSIS
vilistextum [OPTIONS] [inputfile |-] [outputfile | -]
DESCRIPTION
vilistextum is an HTML to ASCII converter specifically programmed to get the best out of incorrect HTML.
OPTIONS
inputfile,- resp. outputfile,-
replace inputfile with '-' for reading from standard input, likewise outputfile with '-' for writing to standard output.
-a, --no-alt
don't output anything for IMG tags even if they have an ALT attribute. Implies --no-image.
-c, --convert-tags
some tags will be converted to special characters.
-e, --errorlevel NUMBER
increase level of verbosity for error messages (0: No error messages).
-i, --defimage STRING
IMG tags without alt attribute are output as [STRING].
-l, --links
numbers the links in the document and creates footnotes of each link at the end of the file.
-k, --links-inline
print the links directly after the html tag.
-m, --dont-convert-characters
don't convert the entities from windows1252 (€-Ÿ and their proper entity names)
-n, --no-image
don't output [Image] for IMG tags that have no ALT attribute.
-p, --palm
output text more suitable for reading on a PDA.
-r, --remove-empty-alt
if there is an empty ALT attribute in an IMG tag (e.g. <IMG src="..." alt="">), don't output '[]'.
-s, --shrink-lines [NUMBER]
if there are more than NUMBER empty lines, output only NUMBER. Default: 1.
-t, --no-title
don't output title.
-w, --width NUMBER
maximum line width.
-h, --help
display this help and exit
-v, --version
output version information and exit
MULTIBYTE OPTIONS (Only available if compiled with multibyte support)
-u, --output-utf-8
instead of the character set of the html document, everything will be output as utf-8.
-x, --translit
use the //TRANSLIT feature of libiconv. Consult the iconv manual for details.
-y, --charset CHARSET
if the HTML document doesn't provide a character set in the meta tags, use CHARSET.
LIMITATIONS
The rendering of tables is not very good.
The handling of OL is incomplete. The program treats it as UL and more than 10 nested lists confuse it.
Text is never justified.
REPORTING BUGS
Please report bugs to <bhaak@gmx.net>.
AUTHOR
Vilistextum was written by Patric Mueller <bhaak@gmx.net> and may be freely distributed under the terms of the GNU General Public License Version 2. There is ABSOLUTELY NO WARRANTY for this program.
SEE ALSO
iconv(3), lynx(1), links(1), w3m(1)
22 OCT 2006 VILISTEXTUM(1)