Parsing HTML, get text between 2 HTML tags


# 1  
Parsing HTML, get text between 2 HTML tags

Hi there, I'm quite new to the forum and shell scripting.

I want to extract the "166.0 points" text. The results I found on Google and in the forum search didn't help me.

Code:
<a href="/user/test" class="headitem menu" style="color:rgb(83,186,224);">test</a><a href="/points" class="headitem menu">166.0 points</a>
<div id="inhalt"><div class="contenthead"><h4>your points</h4><span>166.0 points</span></div><div class="mod point_stats">

It might be very easy for you, but it would help me a lot!
Thanks in advance!
# 2  
Hi Mysthik,

One way:
Code:
$ cat infile
<a href="/user/test" class="headitem menu" style="color:rgb(83,186,224);">test</a><a href="/points" class="headitem menu">166.0 points</a>
<div id="inhalt"><div class="contenthead"><h4>your points</h4><span>166.0 points</span></div><div class="mod point_stats">
$ perl -ne 'printf qq[%s\n], $1 if m/<span>([^<]+)<\/span>/' infile
166.0 points
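
If Perl is not available, roughly the same extraction can be sketched with sed. This is only a sketch, assuming the text sits on one line inside a <span> with no nested tags, as in the sample above; the function name extract_span is made up for illustration and is not from the thread.

```shell
# Hypothetical helper, not from the thread: print the text of a
# <span>...</span> pair that contains no nested tags. If a line has
# several such pairs, the greedy leading .* selects the last one.
extract_span() {
    sed -n 's/.*<span>\([^<]*\)<\/span>.*/\1/p' "$@"
}

# Usage with the second sample line from the question, read from standard input.
printf '%s\n' '<div id="inhalt"><div class="contenthead"><h4>your points</h4><span>166.0 points</span></div><div class="mod point_stats">' | extract_span
```

Like the Perl one-liner, this is pattern matching rather than real HTML parsing, so it will likely miss spans that carry attributes or that wrap across lines.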

