I am scraping data from the Internet in a format similar to what's on this page: Trigger Notice Report.
The code I've written for scraping and storing results works fine when the HTML is well formed, but not when there are mistakes. In particular, the code breaks when the tags that open a table row are missing.
If you view the source for the link above, you will see that there are '</TR>'s which indicate the end of an HTML row, but sometimes not a new '<TR>' to indicate the beginning of the next row.
I need to use something like awk or sed to do the following: insert a line with '<TR>' whenever the previous line was '</TR>' and the current line starts with '<TD [some text]'. For example, in the code below, I need a line with '<TR>' just before the highlighted line. The rest of the HTML file follows pretty much the same pattern. Any suggestions?
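One possible approach, as a minimal sketch rather than a tested solution for the real report: let awk remember whether the previous non-blank line ended with '</TR>', and print a '<TR>' line before any line that then starts with '<TD'. The function name fix_rows is made up for illustration, and the sketch assumes each tag sits on its own line, as described above.

```shell
#!/bin/sh
# fix_rows (hypothetical name): read HTML from the named files or stdin and
# write it to stdout, adding a "<TR>" line wherever a line starting with
# "<TD" directly follows a line ending in "</TR>".
fix_rows() {
  awk '
    {
      line = toupper($0)            # compare case-insensitively
      # The previous row was closed and a cell starts without a new row.
      if (prev_closed && line ~ /^[ \t]*<TD/)
        print "<TR>"
      # Blank lines do not change what counts as the previous line.
      if (line !~ /^[ \t]*$/)
        prev_closed = (line ~ /<\/TR>[ \t]*$/)
      print                         # always emit the original line unchanged
    }
  ' "$@"
}
```

It could be run as `fix_rows report.html > fixed.html`, where both file names are placeholders. Rows that already begin with '<TR>' are left alone, because the check only fires when the line after '</TR>' starts with '<TD'.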
Hello,
Can anyone out there help with this problem?
I have a zip file of about 34MB containing a file in EBCDIC; it resides on a Windows 2000 server.
This zip file is retrieved and read from a UNIX server via SAMBA "SMBCLIENT" (by default the file is transferred via command bin) and issued... (2 Replies)
I have an HTML file called myfile. If I simply run "cat myfile.html" in UNIX, it shows all the HTML tags like <a href=r/26><img src="http://www>. But I want to extract only the text part.
The same problem happens with the "type" command in MS-DOS.
I know you can do it by opening it in Internet Explorer,... (4 Replies)
Hi All,
I have written a script which sends mail using the “sendmail” command, and the mail contains HTML code.
When I run the script from a terminal it works properly, but when I run it through a crontab file it sends a blank mail with the proper subject.
crontab file detail:
00 05... (1 Reply)
Hello,
I have a file into which HTML web pages have been inserted intermittently.
I would like to remove all text between "<html xmlns="http://www.w3.org/1999/xhtml">" and </html> tags.
Can anyone please suggest a sed regular expression for it?
Thanks (3 Replies)
I need some help with adding lines to a file and substituting a pattern.
Ok I have a file:
#cat names.txt
name: John Doe
stationed: 1
name: Michael Sweets
stationed: 41
.
.
.
And would like to change it to:
name: John Doe
employed
permanently
stationed: 1-office (7 Replies)
I tried to find an elegant (or at least simple) way to remove all but a couple of HTML tags from an HTML file, but all the examples I found dealt with removing all the tags.
The logic of the script would be:
- if there is <li> or <ul> on the line, do nothing (=write same line to output)
- if there is:... (0 Replies)
Hi,
I have the following shell script, named test3.sh:
#!/bin/sh
. /home/<user>/.profile
export dt=`date "+%d%b%y"`
export tim=`date "+%d%b%y %HM:%MM"`
cd
export WD=`pwd`
SID="<sid>"
export SID
export ORACLE_SID=$SID
export ORACLE_HOME=/oracle/$SID/102_64
export... (4 Replies)
Hello,
Since I am new to shell scripting, I need some help from you guys. :rolleyes:
I am trying to implement an automaton that reflects the attached photo.
The main idea is to take an array of 0s and 1s from the user, terminated by "end". Then, the string is sent to the function... (1 Reply)
I am looking for HTML code that lets me browse for a text file, greps it against a database file, and then retrieves the result.
txtfileuploaded contains
112233
115599
113366
shell code
grep -F -f txtfileuploaded /data/database.txt
result
112233 Mar 41$
115599 Nov 44$
113366 Oct 33$
attached... (2 Replies)
I have an array in an external file, "array.txt", which contains:
char *testarray={"Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine"};
I want to be able to add an element to this array, and have that element display, whenever I call it, without having to recompile... (29 Replies)