BASH parsing for html tags


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting BASH parsing for html tags
# 1  
Old 11-24-2011
BASH parsing for html tags

Hello can anyone help me parse this line.
Code:
<tr><td>United States of America</td><td>Dollar</td><td>43.309</td></tr><tr><td>Japan</td><td>Yen</td><td>0.5579</td></tr>

the line above did not break.

so i would like to have a result like this
Code:
United States of America
Dollar 
43.309
Japan
Yen

I would like to parse it by searching the <td> and </td> tags and print the inner value

Thanks.

Moderator's Comments:
Mod Comment How to use code tags

Last edited by Franklin52; 11-24-2011 at 08:30 AM.. Reason: Please use code tags for code and data samples, thank you
# 2  
Old 11-24-2011
One way:
Code:
echo $line | sed 's/<[./]*t[rd]>/\n/g' | awk 'NF'

# 3  
Old 11-25-2011
Code:
string='<tr><td>United States of America</td><td>Dollar</td><td>43.309</td></tr><tr><td>Japan</td><td>Yen</td><td>0.5579</td></tr>'

while :
do
  case $string in
    \<*) string=${string#*>} ;;
    *\<*) printf "%s\n" "${string%%<*}"
       string=${string#*>}
       ;;
    *) [ -n "$string" ] && printf "%s\n" "$string"
      break ;;
  esac
done

# 4  
Old 11-25-2011
Another one:


Code:
$ string="<tr><td>United States of America</td><td>Dollar</td><td>43.309</td></tr><tr><td>Japan</td><td>Yen</td><td>0.5579</td></tr>"

Code:
$ echo $string|tr -s "<td>" "\n"|grep -w "^td>*"|cut -d ">" -f 0,2

United States of America
Dollar
43.309
Japan
Yen
0.5579
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Create html <ui> <li> by parsing text file

Hi you all, this is my first post in this forum. I'm italian (please forgive me) :-) so my english will fail to be correct... Anyway, let's get straight to the point! I have a text file like this: ,,,, Disney: 00961-002,,,, ,Pippo: 00531-002,,, ,,Pluto: 00238-002,, ... (5 Replies)
Discussion started by: alcresio
5 Replies

2. UNIX for Dummies Questions & Answers

HTML parsing with UNIX shell script

Hi there, Infra/LEXUS0157/lexus0157.html-<tr><td>Minimum password age</td><td>3 days</td><td>Win2k8 Server</td></tr> How do I extract from this html with unix, I just need the 1.'Minimum password age' & 2. '3 days' parameter. Tried doing so with python, would like to have a better... (7 Replies)
Discussion started by: alvinoo
7 Replies

3. Shell Programming and Scripting

Perl syntax and html ole parsing

Hi gurus I am trying to understand some advanced (for me) perl constructions (syntax) following this tutorial I am trying to parse html: Using Mojo::DOM | Joel Berger say "div days:"; say $_->text for $dom->find('div.days')->each; say "\nspan hours:"; say $_->text for... (1 Reply)
Discussion started by: wakatana
1 Replies

4. Shell Programming and Scripting

Removing all except couple of html tags from html file

I tried to find elegant (or at least simple) way to remove all but couple of html tags from html file, but all examples I found dealt with removing all the tags. The logic of the script would be: - if there is <li> or <ul> on the line, do nothing (=write same line to output) - if there is:... (0 Replies)
Discussion started by: juubuntu
0 Replies

5. Shell Programming and Scripting

Parsing HTML, get text between 2 HTML tags

Hi there, I'm quite new to the forum and shell scripting. I want to filter out the "166.0 points". The results, that i found in google / the forum search didn't helped me :( <a href="/user/test" class="headitem menu" style="color:rgb(83,186,224);">test</a><a href="/points" class="headitem... (1 Reply)
Discussion started by: Mysthik
1 Replies

6. Shell Programming and Scripting

Html parsing - get line after specific string till a point

Hi all :) It sounds complex, for example I want to find the whole html file (there are 5 entries of this string and I need to get all of them) for the string "<td class="contentheading" width="100%">", get the next line from it only till the point that says "</td>", plus removing \t (tabs) ... (6 Replies)
Discussion started by: hakermania
6 Replies

7. UNIX for Advanced & Expert Users

html parsing using unix

hi all, I had raised the same question a few weeks back but forgot to mention a lot of points ... so i am raising a new thread furnishing my requirement ... sorry for that .... here is my problem. i have a html that look like below <tr class="modifications-oddrow"> <td... (2 Replies)
Discussion started by: sais
2 Replies

8. Shell Programming and Scripting

Parsing: How to go from HTML to CSV?

Dear all, I have to parse a large amount of html files, which I would like to transform into comma separated values. The html-files have the following structure: <tag1> CATEGORY_1 <tag2><tag3> HEADER_1 <tag4> <tag5> paragraph_1 <tag6> <tag5> paragraph_2 <tag6> <tag3>HEADER_2... (2 Replies)
Discussion started by: docdudetheman
2 Replies

9. Shell Programming and Scripting

Remove html tags with bash

Hello, is there a way to go through a file and remove certain html tags with bash? If it needs sed or awk, that'll do too. The reason why I want this is, because I have a monitor script which generates a logfile in HTML and every time it generates a logfile, the tags are reproduced. The tags... (4 Replies)
Discussion started by: dejavu88
4 Replies

10. Shell Programming and Scripting

HTML parsing by PERL

i have a HTML report file..its in attachment(a part of the whole report is attached..name "input html.doc").also its source is attached in "report source code.txt" i just want to seperate the datas like in first line it should be.. NHTEST-3848498958-NHTEST-10.2-no-baloo a and so on for whole... (3 Replies)
Discussion started by: avik1983
3 Replies
Login or Register to Ask a Question