Web Page Watcher

Tim ignored_mailbox at yahoo.com.au
Tue Oct 10 08:55:01 UTC 2006


Tim:
>> Have you considered just comparing HTTP headers?

On Mon, 2006-10-09 at 11:59 -0700, Paul Lemmons wrote:
> Ok, I have considered it now and it would be fairly easy to accomplish. 
> I am not sure it would be valuable though. There is a significant amount 
> of data in the headers that is different every time the page is called. 
> Are there particular fields that are only updated when the content of 
> the page has changed? Or were you looking for something else completely?

The Last-Modified, Expires, and ETag headers from the webserver spring
to mind (an ETag is basically a checksum of the page).  Last-Modified
or ETag will tell you that the page has changed, with less data to
parse and less load on the webserver; Expires only says when a cached
copy goes stale.  Have a quick look at the Apache manual, or a website
about caching, for clues.
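
As a rough sketch of the idea (not anything from your existing script),
something like the following Python would do a HEAD request and compare
the headers against the ones saved from the last run.  The names
fetch_validators and page_changed are made up for illustration, and it
assumes the server actually sends an ETag or Last-Modified header; not
all of them do.

```python
import urllib.request

# Headers that normally change only when the page content does.
VALIDATORS = ("etag", "last-modified")


def fetch_validators(url):
    """Send a HEAD request and return whichever validator headers came back."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        found = {}
        for name in VALIDATORS:
            # Header lookup on the response is case-insensitive.
            value = response.headers.get(name)
            if value is not None:
                found[name] = value
        return found


def page_changed(previous, current):
    """Compare two validator dicts; True means the page has probably changed."""
    shared = set(previous) & set(current)
    if not shared:
        # Nothing comparable, so assume a change and fall back to a full fetch.
        return True
    return any(previous[name] != current[name] for name in shared)
```

You would keep the dictionary returned by fetch_validators() from the
previous run (in a file, say) and pass it to page_changed() along with
the fresh one.  Only when that returns True is there any need to
download and parse the whole page.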

-- 
(Currently running FC4, but testing FC5, if that's important.)

Don't send private replies to my address, the mailbox is ignored.
I read messages from the public lists.




