The UNIX and Linux Forums  
#5
03-19-2008
ShawnMilo
Registered User
Join Date: Jun 2006
Posts: 252
Quote:
Originally Posted by Streetrcr
Thanks bakunin, that is really helpful. I can't post a sample of the HTML page for various reasons. The only problem with your solution is that most of the <tr> tags span multiple lines in my HTML page, i.e. the tag may be opened on line 7 and then closed on line 20. Hence, is it possible with sed to delete everything on a line (including the line) BUT stop when it gets to a <tr> tag and start again when it gets to a </tr>? Alternatively, is there a way to make sed believe that the whole HTML page is on a single line?

As I am not familiar with the capabilities of sed, it is hard for me to know what the best way of completing this task is.
There's no reason you can't mock up an HTML page which looks like the one you're working with but which does not contain any sensitive information. Nobody is interested in throwing darts into a dark room.

If you post something, someone will post code. Otherwise, you're going to have to do it yourself. Try something like replacing all newlines in the file with spaces, splitting the file before each < or after each >, and going from there. If you may have a < or > within the data, then you're going to have to do a little extra work. That's the best I can do for you at the moment.
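
For what it's worth, here is a rough sketch of that idea, not a tested solution for your page. It assumes lowercase <tr> tags, no nested tables, and no stray < or > inside the data. The function name keep_rows and the sample HTML are made up for illustration. It joins the input into one line with tr, starts a new line before every <, and then prints only the ranges that run from a <tr> line to the next </tr> line.

```shell
#!/bin/sh
# keep_rows: read HTML on stdin, print only the <tr>...</tr> blocks.
# Hypothetical helper; assumes lowercase tags and no '<' or '>' in the data.
keep_rows() {
    # 1. Turn every newline into a space so the page is one long line,
    #    then add a single newline back so sed sees a complete line.
    { tr '\n' ' '; echo; } |
    # 2. Start a new line before each '<' (backslash-newline is the
    #    portable way to put a newline in a sed replacement).
    sed 's/</\
</g' |
    # 3. Print only the lines from an opening <tr ...> to the next </tr>.
    sed -n '/^<tr[ >]/,/^<\/tr>/p'
}

# Made-up sample page: the first <tr> tag is spread over two lines.
keep_rows <<'EOF'
<html>
<p>drop me</p>
<tr
  class="a">
<td>keep 1</td>
</tr>
<p>drop me too</p>
<tr><td>keep 2</td></tr>
</html>
EOF
```

On a real page you would run it as keep_rows < page.html. The output has one tag per line, so the original layout is lost; if that matters, or if the rows contain nested tables, this needs more work.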

ShawnMilo