extracting Line between HTML tag


 
# 8  
03-01-2012
Quote:
Originally Posted by newlook2011
First, thanks to huaihaizi3 and agama for the quick responses.

@agama note: T is GNU sed only
Yes, I seem to always forget that. A BSD sed just for completeness:

Code:
sed -n '/<tag>/!d
s/.*<tag>//
/<\/tag>/!d
s/<\/tag>.*//
p'

Newlines required.
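
For anyone who wants to try this without an input file, here is a minimal sketch that wraps a `!d`-based script (no `T` or `t` needed, so it should behave the same on GNU and BSD sed) in a shell function and feeds it a few lines. The function name extract_tag and the sample lines are made up for illustration only.

```shell
# extract_tag is a hypothetical helper name; it reads its input from stdin.
extract_tag() {
  # Skip lines without <tag>, strip everything up to and including it,
  # skip lines where no </tag> follows, strip from </tag> onward, then print.
  sed -n '/<tag>/!d
s/.*<tag>//
/<\/tag>/!d
s/<\/tag>.*//
p'
}

# Only the middle line has both the opening and the closing tag.
printf '%s\n' 'no tags here' 'a <tag>hello</tag> b' '<tag>unclosed' | extract_tag
# prints: hello
```

Note that this only handles a tag pair that opens and closes on the same line.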
# 9  
03-02-2012
Quote:
Originally Posted by newlook2011
First, thanks to huaihaizi3, agama [..] By the way, could you explain the code? I have been reading man awk but could not find appropriate answers.
-F\>         Use > as the field separator.
/^tag>/      If a record starts with "tag" followed by >, then
{print $2}   print the second field of the record. Since the field separator is set to >, $1 will be the tag and $2 will be the content.
RS=\<        Use < as the record separator instead of a newline.
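
Put together, the four pieces above would give a command of roughly the following shape. This is a reconstruction from the explanation, not a quote of the earlier post, and the function name explained_awk and the sample input are hypothetical.

```shell
# Reconstructed from the explanation (assumption): records are split on "<",
# fields on ">", and only records that begin with "tag>" are printed.
# The trailing "-" makes awk read standard input after the RS assignment.
explained_awk() {
  awk -F\> '/^tag>/{print $2}' RS=\< -
}

printf '%s\n' '<html><tag>first</tag>' '<other>skip</other><tag>second</tag>' | explained_awk
```

With < as the record separator, the record for the first tag is "tag>first", so $1 is "tag" and $2 is "first". The closing tag ends up in a record of its own that starts with "/", so it does not match.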

---------- Post updated at 09:12 ---------- Previous update was at 09:06 ----------

They can be slightly improved still.

Varying tag:
Code:
awk '$1==t{print $2}' RS=\< FS=\> t="tag" infile

Removing newlines:
Code:
awk '$1==t{gsub(ORS,x);print $2}' RS=\< FS=\> t="tag" infile
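
A small sketch of how the newline-removing variant could be used with different tag names. The wrapper tag_content, the temporary file and its contents are assumptions for illustration; only the awk program itself comes from the post.

```shell
# tag_content is a hypothetical wrapper: $1 is the tag name, $2 the input file.
tag_content() {
  awk '$1==t{gsub(ORS,x);print $2}' RS=\< FS=\> t="$1" "$2"
}

# Made-up sample where the <title> content spans two lines.
sample=$(mktemp)
printf '%s\n' '<doc><title>one,' 'two</title>' '<note>single line</note></doc>' > "$sample"

tag_content title "$sample"   # newline inside the content is deleted
tag_content note "$sample"
rm -f "$sample"
```

Since gsub(ORS,x) deletes the newline rather than replacing it with a space, text on either side of the line break is joined directly.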
