UNIX for Dummies Questions & Answers: How do I extract text only from html file without HTML tag
Post 83909 by los111 on Tuesday 20th of September 2005 08:02:44 PM
09-20-2005
thanks

Thanks a lot! I will try this. I never used lynx before, but I hope my Fedora Core already has it.
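
A minimal sketch of what trying lynx might look like, assuming a POSIX shell. The helper name `html_to_text` and the file names `page.html` and `page.txt` are made up for illustration. The `sed` fallback is an assumption added here for machines without lynx, not part of the original suggestion, and it is crude: it only strips tags that open and close on the same line and does not decode entities.

```shell
#!/bin/sh
# Hypothetical helper: print the text of an HTML file without its tags.
# Uses lynx when it is installed; otherwise falls back to a crude sed filter.
html_to_text() {
    if command -v lynx >/dev/null 2>&1; then
        # -dump writes the rendered page to stdout; -nolist omits the link list
        lynx -dump -nolist "$1"
    else
        # Fallback (assumption): delete anything that looks like <...> on one line
        sed -e 's/<[^>]*>//g' "$1"
    fi
}

# Example with a throwaway file (page.html is a made-up name)
printf '<html><body><p>Hello <b>world</b></p></body></html>\n' > page.html
html_to_text page.html > page.txt
```

Running `command -v lynx` on its own is also a quick way to see whether a given Fedora install already has lynx; if it prints nothing, the package would need to be installed first.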
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Parse HTML tag parameters and text

Hi! I have a bunch of HTML files, which I want to parse to CSV files. Every page has a table in it, and I need to parse each row into a csv record. With awk and sed, I managed to put every table row in separate lines. So my file looks like this: <TR> .... </TR> <TR> .... </TR> ...One... (1 Reply)
Discussion started by: senszey
1 Replies

2. Shell Programming and Scripting

SED to extract HTML text data, not quite right!

I am attempting to extract weather data from the following website, but for the Victoria area only: Text Forecasts - Environment Canada I use this: sed -n "/Greater Victoria./,/Fraser Valley./p" But that pattern sometimes does not get it all, and I think perhaps the website has more... (2 Replies)
Discussion started by: lagagnon
2 Replies

3. Shell Programming and Scripting

Parsing HTML, get text between 2 HTML tags

Hi there, I'm quite new to the forum and shell scripting. I want to filter out the "166.0 points". The results that I found in Google / the forum search didn't help me :( <a href="/user/test" class="headitem menu" style="color:rgb(83,186,224);">test</a><a href="/points" class="headitem... (1 Reply)
Discussion started by: Mysthik
1 Replies

4. Shell Programming and Scripting

Removing all except couple of html tags from html file

I tried to find elegant (or at least simple) way to remove all but couple of html tags from html file, but all examples I found dealt with removing all the tags. The logic of the script would be: - if there is <li> or <ul> on the line, do nothing (=write same line to output) - if there is:... (0 Replies)
Discussion started by: juubuntu
0 Replies

5. Shell Programming and Scripting

Add the html tag first and last line the file

Hi, I have 30 HTML files and I want to add the opening <html> tag as the first line and the closing </html> tag as the last line. How can I do this in a script? Thanks, (7 Replies)
Discussion started by: bmk
7 Replies

6. Shell Programming and Scripting

Search for a html tag and print the entire tag

I want to print from <fruits> to </fruits> tag which have <fruit> as mango. Also i want both <fruits> and </fruits> in output. Please help eg. <fruits> <fruit id="111">mango<fruit> . another 20 lines . </fruits> (3 Replies)
Discussion started by: Ashik409
3 Replies

7. UNIX for Dummies Questions & Answers

Extract table from an HTML file

I want to extract a table from an HTML file. the table starts with <table class="tableinfo" and ends with next closing table tag </table> how can I do this with awk/sed... ---------- Post updated at 04:34 PM ---------- Previous update was at 04:28 PM ---------- also I want to... (4 Replies)
Discussion started by: koutroul
4 Replies

8. Shell Programming and Scripting

Extract specific line in an html file starting and ending with specific pattern to a text file

Hi This is my first post and I'm just a beginner. So please be nice to me. I have a couple of html files where a pattern beginning with "http://www.site.com" and ending with "/resource.dat" is present on every 241st line. How do I extract this to a new text file? I have tried sed -n 241,241p... (13 Replies)
Discussion started by: dejavo
13 Replies

9. Shell Programming and Scripting

Extract both contents from a html file and do printing

Hi there, Print IP Address: grep 'HostID :' 10.244.9.124\ nessus.html | awk -F '<br>' '{print $12}' | tr -s ' ' | awk -F ':' '{print "<tr><td>" $2 "</td><td>"}' Print Respective Ports: grep 'classsubsection\|./tcp\|./udp' 10.244.9.124\ nessus.html | grep -v 'h2.classsubsection... (3 Replies)
Discussion started by: alvinoo
3 Replies

10. Shell Programming and Scripting

Extract text from html using perl or awk

I am trying to extract text after keywords from an html file. The keywords are reportLink":, "barcodedSamples": {", "barcodedSamples": {". Both the perl and awk run but the output is just the entire index.html not the desired output. Also for the reportLink": only the text after the second / until... (5 Replies)
Discussion started by: cmccabe
5 Replies
PARSEWIKI(1)                     User Commands                     PARSEWIKI(1)

NAME
       parsewiki - transform marked text into HTML, XHTML, Docbook or LaTeX

SYNOPSIS
       parsewiki [OPTION]... [FILE]

DESCRIPTION
       This manual page documents briefly the parsewiki command. This manual
       page was written for the Debian distribution because the original
       program does not have a manual page.

       parsewiki is a program that transforms a text file with a very minimal
       Wiki style syntax into other formats, including HTML, XHTML, Docbook
       and LaTeX.

       See the file /usr/share/doc/parsewiki/doc/manual-en.txt for a
       description of the parsewiki syntax.

OPTIONS
       -f, --format=FORMAT
              Output format; one of html, xhtml, docbook, latex. (default html)

       -T, --title=TITLE
              Title.

       -t, --template=FILE
              File with a template to use instead of the standard.

       -c, --copyright
              Display copyright and copying permission statement.

       -h, --help
              Show this usage summary.

       FILE is a simple text file with wiki formatting syntax. The result will
       be sent to the Standard Output. If FILE is not given, input will be
       taken from the Standard Input.

EXAMPLES
       $ parsewiki myfile.wiki
       $ cat file.txt | parsewiki -fdocbook --title="An Example" > file.xml

BUGS
       Report bugs to <villate@gnu.org>.

AUTHOR
       This manual page was written by Sergio Talens-Oliag <sto@debian.org>,
       for the Debian project (but may be used by others).

parsewiki 0.4.3                    July 2003                      PARSEWIKI(1)