help with sed needed to extract content from html tags
Post 302604323 by seb001 on Sunday 4th of March 2012 11:58:08 AM
That returns:

Code:
<html><body><form name='sendme' action='http://example.com/' method='POST'>
abc123def678
<textarea name='2nd'>Text</textarea>
<textarea name='3rd'>Text</textarea>
</form></body></html>

I've tried:

Code:
sed '/1st/ s:<textarea[^>]*>\([^<]*\)</textarea>.*:\1:;q' par

with this result:
Code:
<html><body><form name='sendme' action='http://example.com/' method='POST'>
abc123def678

Code:
sed '/1st/ s:.*<textarea[^>]*>\([^<]*\)</textarea>.*:\1:;q' par

This one returns the text of the last textarea instead of the first.

Any idea how to modify it?
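
A possible direction, as a minimal sketch only. The thread does not show the input file, so the sample par below is hypothetical: it assumes the textareas all sit on one line and that the first one carries name='1st'. Under that assumption, the first attempt keeps everything before the textarea because the pattern has no leading .* to consume it, and the second attempt lands on the last textarea because its leading .* is greedy. Naming the wanted textarea inside the pattern, and printing only when the substitution succeeds (-n plus the p flag), avoids both problems:

```shell
# Hypothetical sample input (not taken from the thread): all textareas on
# one line, with the first one named '1st'.
cat > par <<'EOF'
<html><body><form name='sendme' action='http://example.com/' method='POST'><textarea name='1st'>abc123def678</textarea><textarea name='2nd'>Text</textarea><textarea name='3rd'>Text</textarea></form></body></html>
EOF

# -n suppresses the default output; the p flag prints a line only when the
# substitution matched. The leading .* eats the text before the tag, and
# name='1st' in the pattern pins the match to that textarea.
result=$(sed -n "s:.*<textarea name='1st'>\([^<]*\)</textarea>.*:\1:p" par)
echo "$result"
```

If the textareas are actually on separate lines, the same command should still print only the wanted text, since the substitution fails on every line that lacks name='1st'. It does require the opening and closing tags of that textarea to be on the same line.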
 
