To Break data out of HTML
I'm working with the output of an html form and trying to get it into CSV. The html is a table with many entries like the following.
HTML Code:
<tr><td nowrap><b><font size=3>NAME</font></b></td><td nowrap><b>License # : </b> LICENSE</td></tr>
<tr><td><b>City : </b> CITY<td nowrap><b>Type : </b> TYPE</td></tr>
<tr><td><b>State :</b> ST<td nowrap><b>Status : </b> STATUS</td></tr>
<tr><td><b>Phone :</b> PHONE<td nowrap><b>Expires: </b> EXPIRES</td></tr><td></td>
<td nowrap><b>Nat. Registry: </b>Y/N</td></tr><tr><td>
<tr><td colspan=2><hr width='100%'></td></tr>
I want each entry to come out as one line like this:
Code:
NAME, LICENSE, CITY, TYPE, ST, STATUS, PHONE, EXPIRES, Y/N
So far I have tried:
Code:
cat appr-test | sed 's_<tr><td nowrap><b><font size=3>\(.*\)</font>_\1'_
If you really want a "best way", that would be a proper HTML parser.
Assuming you wish to stay with something lighter, like sed or awk, perhaps you can elaborate on what is wrong with the sed you have tried so far. (The <font> tag in your example script does not occur in the HTML sample you posted, but I guess that's beside the point here.)
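For the lighter route, here is a rough awk sketch rather than a definitive answer. It is not a real HTML parser. It assumes every label ends in a colon (as in "City :"), that values never end in a colon or contain commas, and that the `<hr>` row closes each entry. The function name `html_to_csv` and the helper names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print one CSV line per table entry.
# Reads the files named as arguments, or standard input if there are none.
html_to_csv() {
  awk '
    # Add a text fragment to the current record unless it is empty
    # or looks like a label (assumption: labels end in a colon).
    function add_field(text) {
      gsub(/^[ \t]+|[ \t]+$/, "", text)
      if (text == "" || text ~ /:$/) return
      rec = (rec == "" ? text : rec ", " text)
    }
    # Print the record collected so far and start a new one.
    function end_record() {
      if (rec != "") print rec
      rec = ""
    }
    {
      line = $0
      # Walk the line tag by tag; the text in front of each tag is a candidate field.
      while (match(line, /<[^>]*>/)) {
        add_field(substr(line, 1, RSTART - 1))
        # Assumption: an <hr> tag marks the end of one entry.
        if (substr(line, RSTART, RLENGTH) ~ /^<hr/) end_record()
        line = substr(line, RSTART + RLENGTH)
      }
      add_field(line)
    }
    END { end_record() }
  ' "$@"
}

# Usage with the sample entry from the question.
html_to_csv <<'EOF'
<tr><td nowrap><b><font size=3>NAME</font></b></td><td nowrap><b>License # : </b> LICENSE</td></tr>
<tr><td><b>City : </b> CITY<td nowrap><b>Type : </b> TYPE</td></tr>
<tr><td><b>State :</b> ST<td nowrap><b>Status : </b> STATUS</td></tr>
<tr><td><b>Phone :</b> PHONE<td nowrap><b>Expires: </b> EXPIRES</td></tr><td></td>
<td nowrap><b>Nat. Registry: </b>Y/N</td></tr><tr><td>
<tr><td colspan=2><hr width='100%'></td></tr>
EOF
# prints: NAME, LICENSE, CITY, TYPE, ST, STATUS, PHONE, EXPIRES, Y/N
```

Because it walks the input tag by tag instead of matching whole lines, it should not matter whether each `<tr>` sits on its own line or the whole table is on one line. With the real data you would run something like `html_to_csv appr-test > out.csv`. If a value can contain a comma or end in a colon, the output will be wrong and a proper parser is the safer choice.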