Shell Programming and Scripting
extract fields from a downloaded html file
Post 302624167 by kalpeer on Monday 16th of April 2012 01:55:14 AM
Please check this thread; it might help you:

https://www.unix.com/shell-programmin...text-file.html
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

How do I extract text only from html file without HTML tag

I have a html file called myfile. If I simply put "cat myfile.html" in UNIX, it shows all the html tags like <a href=r/26><img src="http://www>. But I want to extract only text part. Same problem happens in "type" command in MS-DOS. I know you can do it by opening it in Internet Explorer,... (4 Replies)
Discussion started by: los111
4 Replies

2. UNIX for Dummies Questions & Answers

Extract some common fields from 1 file that are present in another file

I have 2 files:

FILEA
720646363*PHILIPPINES
117183970*USA
116274291*USA
107940983*USA
107395824*USA
106632425*USA
105861926*USA
105208607*USA
053077046*USA
065428026*ENGLAND

FILEB
001125236
001408905
002316511
002521094
020050725
035018308
052288735

(1 Reply)
Discussion started by: unxusr123
1 Reply

3. UNIX for Dummies Questions & Answers

extract fields from text file using delimiter!!

Hi All, I am new to unix scripting; please help me solve this assignment. I have a scenario as follows: 1. I have a text file (read1.txt) with the following data: sairam,123 kamal,122 etc. 2. I have to write a unix... (6 Replies)
Discussion started by: G.K.K
6 Replies

4. Shell Programming and Scripting

Extract urls from index.html downloaded using wget

Hi, I basically need to get a list of all the tarballs located at uri. I am currently doing a wget on uri to get the index.html page. This index page contains the list of uris that I want to use in my bash script. Can someone please guide me? I am new to Linux and shell scripting. ... (5 Replies)
Discussion started by: mnanavati
5 Replies

5. UNIX for Dummies Questions & Answers

How to extract fields from etc/passwd file?

Hi! I want to extract the user and user info fields from the /etc/passwd file to another file. I've tried this: cut -d ':' -f1 ':' -f6 < file but cut can be used to extract only one field and not two. Maybe this is possible with awk? (4 Replies)
Discussion started by: strawhatluffy
4 Replies

6. Shell Programming and Scripting

Extract expressions between two strings in html file

Hello guys, I'm trying to extract all the expressions between the following tags: <b></b> from a HTML file. This is how it looks: big lines containing several dozens expressions (made of 1,2,3,4,6 or even 7 words) I would like to extract: <b>bla ble</b>bla ble</td><tr valign="top"><td... (3 Replies)
Discussion started by: bobylapointe
3 Replies

7. UNIX for Dummies Questions & Answers

Extract table from an HTML file

I want to extract a table from an HTML file. The table starts with <table class="tableinfo" and ends with the next closing table tag </table>. How can I do this with awk/sed... also I want to... (4 Replies)
Discussion started by: koutroul
4 Replies

8. Shell Programming and Scripting

Extract specific line in an html file starting and ending with specific pattern to a text file

Hi This is my first post and I'm just a beginner. So please be nice to me. I have a couple of html files where a pattern beginning with "http://www.site.com" and ending with "/resource.dat" is present on every 241st line. How do I extract this to a new text file? I have tried sed -n 241,241p... (13 Replies)
Discussion started by: dejavo
13 Replies

9. Shell Programming and Scripting

Extract both contents from a html file and do printing

Hi there, Print IP Address: grep 'HostID :' 10.244.9.124\ nessus.html | awk -F '<br>' '{print $12}' | tr -s ' ' | awk -F ':' '{print "<tr><td>" $2 "</td><td>"}' Print Respective Ports: grep 'classsubsection\|./tcp\|./udp' 10.244.9.124\ nessus.html | grep -v 'h2.classsubsection... (3 Replies)
Discussion started by: alvinoo
3 Replies

10. Shell Programming and Scripting

awk to extract multiple values from file and add two additional fields

In the attached file I am trying to use awk to extract multiple values and create the tab-delimited desired output. In the output R_Index is the sequential # and Pre_Enrichment is defaulted to .. I can extract from the values to the side of the keywords, but most are above and I can not... (2 Replies)
Discussion started by: cmccabe
2 Replies
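
Several of the threads listed above ask about pulling delimited fields out of a file, and thread 5 assumes that cut can only return one field. The following is a minimal sketch, not taken from any of those threads, showing that a single -f option accepts a comma-separated field list. The sample data and the function names are hypothetical, and it assumes "user info" means the fifth (GECOS) field of the passwd format rather than the sixth (home directory).

```shell
# Hypothetical sample data in /etc/passwd format (not real accounts).
sample='alice:x:1000:1000:Alice Example:/home/alice:/bin/sh
bob:x:1001:1001:Bob Example:/home/bob:/bin/sh'

# cut takes a comma-separated list of fields after a single -f option.
# Field 1 is the login name; field 5 is assumed to be the "user info".
extract_with_cut() {
    cut -d ':' -f1,5
}

# awk equivalent; OFS sets the separator placed between printed fields.
extract_with_awk() {
    awk -F ':' 'BEGIN { OFS = ":" } { print $1, $5 }'
}

printf '%s\n' "$sample" | extract_with_cut
printf '%s\n' "$sample" | extract_with_awk
```

To run either function against the real file, redirect it in, for example extract_with_cut < /etc/passwd > users.txt (the output file name is only an illustration).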
Mason::Plugin::HTMLFilters(3pm) 			User Contributed Perl Documentation			   Mason::Plugin::HTMLFilters(3pm)

NAME
       Mason::Plugin::HTMLFilters - Filters related to HTML generation

FILTERS
       HTML or H
           Do a basic HTML escape on the content - just the characters '&', '>', '<', and '"'.

               <input name="company" value="<% $company | H %>">

       HTMLEntities
           Do a comprehensive HTML escape on the content, using HTML::Entities::encode_entities.

       URI or U
           URI-escape the content.

               <a href="<% $url | U %>">

       HTMLPara
           Formats a block of text into HTML paragraphs. A sequence of two or more newlines is used as the delimiter for paragraphs, which are then wrapped in HTML "<p>"..."</p>" tags. Taken from Template::Toolkit. e.g.

               % $.HTMLPara {{
               First paragraph.

               Second paragraph.
               % }}

           outputs:

               <p>
               First paragraph.
               </p>

               <p>
               Second paragraph.
               </p>

       HTMLParaBreak
           Similar to HTMLPara above, but uses the HTML tag sequence "<br><br>" to join paragraphs. Taken from Template::Toolkit. e.g.

               % $.HTMLParaBreak {{
               First paragraph.

               Second paragraph.
               % }}

           outputs:

               First paragraph.
               <br><br>
               Second paragraph.

       FillInForm ($form_data, %options)
           Uses HTML::FillInForm to fill in the form with the specified $form_data and %options.

               % $.FillInForm($form_data, target => 'form1') {{
               ...
               <form name='form1'>
               ...
               % }}

SUPPORT
       The mailing list for Mason and Mason plugins is mason-users@lists.sourceforge.net. You must be subscribed to send a message. To subscribe, visit <https://lists.sourceforge.net/lists/listinfo/mason-users>.

       You can also visit us at "#mason" on <irc://irc.perl.org/#mason>.

       Bugs and feature requests will be tracked at RT:

           http://rt.cpan.org/NoAuth/Bugs.html?Dist=Mason-Plugin-HTMLFilters
           bug-mason-plugin-htmlfilters@rt.cpan.org

       The latest source code can be browsed and fetched at:

           http://github.com/jonswar/perl-mason-plugin-htmlfilters
           git clone git://github.com/jonswar/perl-mason-plugin-htmlfilters.git

SEE ALSO
       Mason

perl v5.14.2                      2012-06-11       Mason::Plugin::HTMLFilters(3pm)
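
The behaviour that the man page above describes for the H and HTMLPara filters can be approximated from the shell. This is only a rough sketch under stated assumptions: it is not the plugin's Perl implementation, the function names html_escape and html_para are hypothetical, and the whitespace around the generated tags may differ from what the real filters emit.

```shell
# Sketch of the "H" filter: escape only &, <, > and ".
# The ampersand is replaced first so that the entities produced by the
# later substitutions are not escaped a second time.
html_escape() {
    sed -e 's/&/\&amp;/g' \
        -e 's/</\&lt;/g' \
        -e 's/>/\&gt;/g' \
        -e 's/"/\&quot;/g'
}

# Sketch of HTMLPara: with RS set to the empty string, awk treats text
# separated by one or more empty lines as one record, so each record is
# one paragraph. Each paragraph is wrapped in <p> ... </p>.
html_para() {
    awk 'BEGIN { RS = "" } { print "<p>"; print $0; print "</p>" }'
}

printf '%s\n' 'Smith & "Sons" <Ltd>' | html_escape
printf 'First paragraph.\n\nSecond paragraph.\n' | html_para
```

Both functions read standard input and write standard output, so they can be chained: escaping first and wrapping second keeps the generated <p> tags from being escaped.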
All times are GMT -4.