Extract expressions between two strings in html file


 
# 1  
Old 06-15-2012

Hello guys,

I'm trying to extract all the expressions between <b> and </b> tags in an HTML file.
The file consists of long lines, each containing several dozen expressions (of one to seven words each) that I would like to extract:



Code:
<b>bla ble</b>bla ble</td><tr valign="top"><td width="50%"><p align="left" style="margin-top:0;margin-bottom:0"><b>ble bla ble</b>bla bla ble</td><td width="50%"><p align="left" style="margin-top:0;margin-bottom:0"><b>ble ble ble bla ble</b>ble ble ble bla ble</td><td width="50%"><p align="left" style="margin-top:0;margin-bottom:0"> and so on.

I would like to print them out into a new file, in the form:

bla ble
ble bla ble
ble ble ble bla ble

etc.

I know several posts in the forum address this question (extracting expressions between two strings using sed, perl or awk), but none of the commands I found work in this situation (the same tags occur several times on one line, and there are many lines).

How could I make any one of these programs go through the WHOLE file in search of every expression that appears between <b> and </b>?


Thank you very much!
# 2  
Old 06-15-2012
Hi


Code:
$ sed 's/<\/b>/&\n/g;' file | sed -e '/<b>/!d' -e 's/.*<b>\([^>]*\)<\/b>/\1/'
bla ble
ble bla ble
ble ble ble bla ble
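
For anyone reading along, here is a commented sketch of the same two-stage idea. It assumes a shortened copy of the sample line saved under the hypothetical name sample.html, and it writes the newline in the replacement as a backslash followed by a real line break, because `\n` in the replacement part of `s` is understood by GNU sed but may not be by other sed implementations. The variable names are made up for the illustration.

```shell
# Sample input modeled on the question (sample.html is a hypothetical name).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.html" <<'EOF'
<b>bla ble</b>bla ble</td><tr valign="top"><td width="50%"><b>ble bla ble</b>bla bla ble</td><td width="50%"><b>ble ble ble bla ble</b>ble ble ble bla ble</td>
EOF

# Stage 1: break the line after every </b>, so that each resulting line
#          holds at most one <b>...</b> pair and ends right after it.
#          The "/g" must start in column 0, otherwise the indentation
#          would become part of the replacement text.
# Stage 2: delete lines without <b>, then keep only the text between the tags.
result=$(
  sed 's/<\/b>/&\
/g' "$tmpdir/sample.html" |
  sed -e '/<b>/!d' -e 's/.*<b>\([^>]*\)<\/b>/\1/'
)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Splitting first matters because `.*` is greedy: on an unsplit line, `.*<b>` would skip to the last <b> and only the final expression would be printed.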

Guru
# 3  
Old 06-16-2012
Thank you sir, it's working perfectly!
# 4  
Old 06-16-2012
This extracts the data that is between the "b" tags:
Code:
$ sed -e 's/<\/b>/<\/b>\n/g' test.txt | sed -n -e 's/.*<b>\(.*\)<\/b>.*/\1/p'
bla ble
ble bla ble
ble ble ble bla ble
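
A single-process alternative, offered only as a sketch: awk can loop over every <b>...</b> pair on a line with match(), so no line splitting is needed. It assumes the text between the tags contains no nested tags (no "<" inside), and sample.html and the variable names are hypothetical.

```shell
# Sample input modeled on the question (sample.html is a hypothetical name).
workdir=$(mktemp -d)
cat > "$workdir/sample.html" <<'EOF'
<b>bla ble</b>bla ble</td><tr valign="top"><td width="50%"><b>ble bla ble</b>bla bla ble</td><td width="50%"><b>ble ble ble bla ble</b>ble ble ble bla ble</td>
EOF

# Repeatedly find the leftmost <b>...</b> pair on the line, print the text
# between the tags, then continue scanning the remainder of the line.
extracted=$(
  awk '{
    while (match($0, /<b>[^<]*<\/b>/)) {
      # <b> is 3 characters and </b> is 4, so the text is RLENGTH - 7 long.
      print substr($0, RSTART + 3, RLENGTH - 7)
      $0 = substr($0, RSTART + RLENGTH)
    }
  }' "$workdir/sample.html"
)

printf '%s\n' "$extracted"
rm -rf "$workdir"
```

To get the new file asked for in the first post, redirect the awk output instead of capturing it, for example by appending `> expressions.txt` (a made-up file name) to the awk command.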
