Monitoring an html web page changes | Unix Linux Forums | Shell Programming and Scripting

#1  11-29-2012  prvnrk

Hello,

I need to monitor an HTML web page for any change and be told whether it has been modified since the last query. I don't need to know what was modified; a notification is enough.

This is a simple web page, and I don't need to follow or parse its links.

Is it possible to do this with a shell script? If so, please advise me how.

Thanks!

---------- Post updated at 11:07 AM ---------- Previous update was at 10:24 AM ----------

I got it working by using wget to download the HTML file repeatedly and comparing each download with the previous one. This works, but I would like to know whether there is a better way.

Thanks
#2  11-29-2012  mercy
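Something like this. First, record a baseline:

Code:
# Fetch the page and store its checksum as the baseline.
# (md5 is the BSD tool; on Linux use md5sum instead.)
wget -q -O page.html http://domain.com/path/to/page.html
md5 page.html > previous_md5
rm page.html

Then run this script periodically (from cron):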

Code:
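#!/bin/sh
# Fetch the page again and compare its checksum with the stored one.
wget -q -O page.html http://domain.com/path/to/page.html
md5 page.html > last_md5
# diff exits non-zero when the two checksum files differ.
if ! diff previous_md5 last_md5 > /dev/null ; then
      # Send a notification with an empty body.
      mail -s "page.html changed on $(date)" your@mail.addr < /dev/null
fi
mv last_md5 previous_md5
rm page.html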
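
A variation on the script above, as a minimal sketch: it keeps the comparison in one function, handles the first run (when no checksum has been stored yet), and uses cksum because it is specified by POSIX, whereas md5 and md5sum differ between systems. The function name check_changed and the file names page.html and previous_sum are placeholders chosen for illustration.

```shell
#!/bin/sh
# check_changed FILE STATE
# Prints "baseline" on the first run, otherwise "changed" or "unchanged",
# and stores the new checksum of FILE in STATE.
check_changed() {
    # Reading from stdin keeps the file name out of the cksum output.
    new=$(cksum < "$1") || return 2
    if [ ! -f "$2" ]; then
        status=baseline
    elif [ "$new" = "$(cat "$2")" ]; then
        status=unchanged
    else
        status=changed
    fi
    printf '%s\n' "$new" > "$2"
    printf '%s\n' "$status"
}

# Possible use from cron (URL and address as in the script above):
#   wget -q -O page.html http://domain.com/path/to/page.html &&
#   [ "$(check_changed page.html previous_sum)" = "changed" ] &&
#   mail -s "page.html changed on $(date)" your@mail.addr < /dev/null
```

The checksum is only updated when the download succeeds, so a failed fetch is not reported as a change.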

#3  11-29-2012  prvnrk
wget is not working for this: the downloaded HTML file sometimes differs in size by a few bytes even though the web page has not changed.

It's weird, and I don't think we can rely on wget for this.

Any suggestions would be highly appreciated.

Thanks!
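
One way to find out what actually differs between two downloads is to save two copies a few seconds apart and compare them directly. The following is a minimal sketch; the function name compare_fetches and the file names first.html and second.html are placeholders.

```shell
#!/bin/sh
# compare_fetches FILE1 FILE2
# Prints "identical" if the two saved copies match byte for byte,
# otherwise prints "different" followed by the differing lines.
compare_fetches() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        # diff exits non-zero when the files differ; that is expected here.
        diff "$1" "$2" || :
    fi
}

# Possible use:
#   wget -q -O first.html  http://domain.com/path/to/page.html
#   sleep 5
#   wget -q -O second.html http://domain.com/path/to/page.html
#   compare_fetches first.html second.html
```

If the differing lines turn out to be something like a timestamp or a session token, the page itself is being generated differently on each request.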
#4  11-29-2012  Yoda (Forum Advisor)
Try using lwp-download:

Code:
lwp-download "http://your_URL_here.com" download.html

#5  11-30-2012  alister (Forum Advisor)
Which operating systems must be supported? Some systems have efficient notification interfaces that do not require polling. When a file modification is reported, an email can be sent.

An example of a tool that uses such an API is inotifywait(1); see its Linux man page.

Regards,
Alister
#6  11-30-2012  mercy
Quote:
Originally Posted by prvnrk
wget is NOT working at all because sometimes the downloaded HTML file size is getting different (few bytes) even though no changes in the web page.

If the file size differs, the file has changed. wget does not alter the downloaded file by itself.

If the page sometimes differs (i.e. it has dynamic content), you must find another way to monitor changes instead of fetching the page over the web.

Do you have access to the HTTP server or to the page source (svn/filesystem/other)?
Do you need to monitor the whole page for differences, or only part of it?
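
If only part of the page matters, one option is to drop the volatile lines before computing the checksum. The following is a minimal sketch, assuming the dynamic content sits on lines that can be matched by a pattern; the function name stable_sum and the example pattern 'generated' are hypothetical and would have to be adapted to the real page.

```shell
#!/bin/sh
# stable_sum FILE PATTERN
# Prints a checksum of FILE that ignores every line matching PATTERN
# (a basic regular expression describing the volatile lines).
stable_sum() {
    grep -v -e "$2" "$1" | cksum
}

# Possible use:
#   wget -q -O page.html http://domain.com/path/to/page.html
#   stable_sum page.html 'generated' > last_sum
#   cmp -s previous_sum last_sum || echo "page changed"
#   mv last_sum previous_sum
```

Two downloads that differ only in the ignored lines then produce the same checksum, while a change anywhere else still shows up.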
#7  12-20-2012  prvnrk
inotifywait only monitors file changes on LOCAL file systems.

Neither wget nor lwp-download works consistently (both produce HTML files of different sizes even though nothing has changed).

Could anyone please suggest a better solution? Thanks very much in advance!