03-19-2008
Thanks bakunin, that is really helpful. I can't post a sample of the HTML page for various reasons. The only problem with your solution is that most of the <tr> tags span multiple lines in my HTML page, i.e. a tag may be opened on line 7 and closed on line 20. So, is it possible with sed to delete everything on a line (including the line itself), but stop when it reaches a <tr> tag and start again after it reaches the </tr>? Alternatively, is there a way to make sed treat the whole HTML page as a single line?
As I am not familiar with the capabilities of sed, it is hard for me to know the best way of completing this task.
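One possible approach to the multi-line case, sketched under assumptions: the sample page below is hypothetical (the real HTML was never posted), and the helper names `extract_rows` and `join_lines` are made up for illustration. It uses a sed address range to print only the lines from an opening `<tr` through the next `</tr>`, and `tr` to fold the page onto a single line as the alternative asked about.

```shell
# Hypothetical sample page; the real HTML was never posted in the thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<html>
<p>intro text</p>
<tr
  class="row">
<td>keep me</td>
</tr>
<p>footer</p>
</html>
EOF

# Print only the line ranges that start at "<tr" and end at "</tr>".
# -n suppresses the default output, so everything outside a range is dropped.
extract_rows() {
    sed -n '/<tr/,/<\/tr>/p' "$1"
}

# Alternative: replace every newline with a space so the page becomes one line.
join_lines() {
    tr '\n' ' ' < "$1"
}

extract_rows "$sample"
```

Two caveats with this sketch: the pattern `<tr` would also match any other tag whose name begins with "tr", and sed only looks for the end of a range on the lines after the one that opened it, so a row opened and closed on the same line would keep the range running until the next `</tr>`.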
SUBST(1) General Commands Manual SUBST(1)
NAME
subst - substitute definitions into file(s)
SYNOPSIS
subst [ -e editor ] -f substitutions victim ...
DESCRIPTION
Subst makes substitutions into files, in a way that is suitable for customizing software to local conditions. Each victim file is altered
according to the contents of the substitutions file.
The substitutions file contains one line per substitution. A line consists of two fields separated by one or more tabs. The first field
is the name of the substitution, the second is the value. Neither should contain the character `#', and use of text-editor metacharacters
like `&' and `\' is also unwise; the name in particular is best restricted to be alphanumeric. A line starting with `#' is a comment and
is ignored.
In the victims, each line on which a substitution is to be made (a target line) must be preceded by a prototype line. The prototype line
should be delimited in such a way that it will be taken as a comment by whatever program processes the file later. The prototype line must
contain a ``prototype'' of the target line bracketed by `=()<' and `>()='; everything else on the prototype line is ignored. Subst
extracts the prototype, changes all instances of substitution names bracketed by `@<' and `>@' to their values, and then replaces the
target line with the result.
OPTIONS
-e Substitutions are done using the sed(1) editor, which must be found in either the /bin or /usr/bin directories. To specify a
different executable, use the ``-e'' flag.
EXAMPLE
If the substitutions file is

    FIRST 111
    SECOND 222

and the victim file is

    x = 2;
    /* =()<y = @<FIRST>@ + @<SECOND>@;>()= */
    y = 88 + 99;
    z = 5;

then ``subst -f substitutions victim'' changes victim to:

    x = 2;
    /* =()<y = @<FIRST>@ + @<SECOND>@;>()= */
    y = 111 + 222;
    z = 5;
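
The prototype/target mechanism can be approximated in a few lines of awk. This is only an illustrative sketch, not the actual subst implementation: the function name `subst_sketch` and the temporary directory are assumptions, it writes the result to standard output instead of rewriting the victim in place, and it skips the temporary-file handling described under FILES.

```shell
# Recreate the two files from the EXAMPLE section in a scratch directory.
dir=$(mktemp -d)
printf 'FIRST\t111\nSECOND\t222\n' > "$dir/substitutions"
cat > "$dir/victim" <<'EOF'
x = 2;
/* =()<y = @<FIRST>@ + @<SECOND>@;>()= */
y = 88 + 99;
z = 5;
EOF

# Hypothetical helper: subst_sketch SUBSTITUTIONS VICTIM
subst_sketch() {
    awk -F'\t+' '
    # First file: load name/value pairs, skipping comment lines.
    NR == FNR { if ($0 !~ /^#/ && NF >= 2) val[$1] = $2; next }
    # The line after a prototype line is the target: emit the expansion instead.
    pending   { print proto; pending = 0; next }
    {
        print
        s = index($0, "=()<"); e = index($0, ">()=")
        if (s && e > s) {
            # Take the text between the brackets and expand each @<NAME>@.
            proto = substr($0, s + 4, e - s - 4)
            for (name in val) gsub("@<" name ">@", val[name], proto)
            pending = 1
        }
    }' "$1" "$2"
}

subst_sketch "$dir/substitutions" "$dir/victim"
```

Because the line following a prototype line is always replaced, the sketch shares the behaviour noted under BUGS: if the dummy target line is missing, whichever line follows the prototype is overwritten.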
FILES
victimdir/substtmp.new new version being built
victimdir/substtmp.old old version during renaming
SEE ALSO
sed(1)
DIAGNOSTICS
Complains and halts if it is unable to create its temporary files or if they already exist.
HISTORY
Written at U of Toronto by Henry Spencer.
Rich $alz added the ``-e'' flag July, 1991.
BUGS
When creating a file to be substed, it's easy to forget to insert a dummy target line after a prototype line; if you forget, subst ends up
deleting whichever line did in fact follow the prototype line.
25 Feb 1990 SUBST(1)