Quote:
|
the only problem with your solution is that most of the <tr> tags are across multiple lines in my html page. ie the tag may be opened on line 7 and then closed on line 20.
|
Well, i told you that, in the absence of any example, i had to make some assumptions. Here is a new version which will work on tags spanning several lines. It will still not catch the case of several "<tr>...</tr>" pairs on one line, though.
Code:
sed -n '/<tr>/,/<\/tr>/ {
s/.*<tr>//
s/<\/tr>.*//
p
}' /path/to/your/file
How this works: the "-n" option stops sed from printing every line it has read, so with an empty script it would print nothing at all. This is to (implicitly) throw out all the lines which are NOT in the specified range.
Everything between the curly braces is executed only when inside the range specified on the first line of the script. As you can see, the last command inside the curly braces is a "p", which will print everything inside this range. If you delete the two "s/..."-commands it would print something like this:
Code:
something....<tr> content of the tr-tag
some more content
even more content</tr> something else....
As you can see, the parts outside the tags ("something...." and "something else....") should be deleted as they are not part of what you want. The two "s/..."-commands (s=substitute) take care of that, along with the tags themselves. Finally the p(rint)-command outputs the result of all the trimming.
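To see the trimming in action, here is a small sketch you can paste into a shell. The five sample lines are made up for illustration (i still have no real example of your page), and the input is fed to sed on standard input instead of from a file; the sed script itself is the one from above.

```shell
# Made-up sample input: one <tr> block spanning three lines,
# with unrelated text before the opening tag and after the closing tag.
result=$(printf '%s\n' \
    '<table>' \
    'something....<tr> content of the tr-tag' \
    'some more content' \
    'even more content</tr> something else....' \
    '</table>' |
sed -n '/<tr>/,/<\/tr>/ {
s/.*<tr>//
s/<\/tr>.*//
p
}')

# Show what survived the trimming.
printf '%s\n' "$result"
```

This should leave only the three lines of table row content, without the tags and without the text outside them. Note that the first line keeps the blank which followed the "<tr>", because the substitution removes only what comes up to and including the tag.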
One more word, though: You got a second answer from me because i appreciated that you were doing genuine research on your own. You almost forfeited this answer because of this:
Quote:
|
[...]withough having to waste time making an example table
|
You might notice i have "wasted time" not only writing a script but have even wasted more time explaining how it works, in the hope of not only solving the problem at hand but enhancing your understanding at the same time. On top of that i "wasted some more time" writing a script in my first post which nobody is going to need because it was based on faulty assumptions. Assumptions which might not have been faulty at all had i been able to work from an example created by "wasting time".
I am even now "wasting some more time" to explain to you why you might sometimes get no answer at all or some answer you can't use. Go figure.
I hope this helps.
bakunin