wget
Daniel Carrillo
daniel.carrillo at gmail.com
Fri Jun 27 18:31:38 UTC 2008
2008/6/27 Joy Methew <ml4joy at gmail.com>:
> Hi all,
>
> We can download any site with the "wget -r" option.
> If I want to stop my site from being downloaded from my web server, how
> can I do this?
You can configure Apache to refuse connections whose User-Agent starts
with "wget", but note that wget can send any User-Agent (--user-agent
option), so this check is easy to bypass.
# Set the "blacklist" variable when User-Agent starts with "wget" (any case)
SetEnvIfNoCase User-Agent "^wget" blacklist
<Location />
...
your options
...
# Allow everyone except requests flagged with the "blacklist" variable
Order allow,deny
Allow from all
Deny from env=blacklist
</Location>
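
To see what the SetEnvIfNoCase line above would and would not catch, here
is a rough Python sketch of the same test. It is only an illustration, not
how Apache implements it: the function name is_blacklisted and the sample
User-Agent strings are made up for this example.

```python
import re

# Mirrors: SetEnvIfNoCase User-Agent "^wget" blacklist
# (case-insensitive match anchored at the start of the header value)
BLACKLIST_PATTERN = re.compile(r"^wget", re.IGNORECASE)


def is_blacklisted(user_agent: str) -> bool:
    """Return True if this User-Agent would set the 'blacklist' variable."""
    return BLACKLIST_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    # Hypothetical header values, chosen only for illustration.
    samples = [
        "Wget/1.11.4",         # typical default form: refused
        "wget",                # lower case: still refused
        "Mozilla/5.0 (X11)",   # e.g. wget --user-agent="Mozilla/5.0 (X11)"
        "my-wget-clone",       # "wget" not at the start: not matched
    ]
    for ua in samples:
        print(ua, "->", "deny" if is_blacklisted(ua) else "allow")
```

The third sample is the weak point: a client that sends a browser-like
User-Agent passes this check untouched.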
BTW: robots.txt can only stop "good" crawlers, such as Google, Yahoo,
Alexa, etc.
More information about the redhat-list mailing list