Re: How to mirror web sites
- From: Jeffery Richards <jrichard mda ca>
- To: redhat-install-list redhat com
- Subject: Re: How to mirror web sites
- Date: Tue, 2 Dec 1997 09:01:04 -0800 (PST)
>
> > > Have you tried GNU Wget (should be version 1.4.x) ? I think you might find
> > > it in RPM form in any RedHat mirror site.
> > >
> >
> > I have tried this, but I find that sometimes it fails to follow certain of
> > the links on a page despite them being at the same site and at the same
> > recursion level, and it's not because it is halting due to having downloaded
> > too much data.
>
> What kind of links ? Could they have something to do with the robots.txt
> file ?
>
No, I have robots.txt recognition turned off. The problem seems to be isolated
to relative HREF links, i.e. ones that don't specify the full http://...
address and instead just say something like HREF=file.html. I've also noticed
it sometimes with SRC= attributes for images.
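
For anyone who wants to check which relative links a mirroring tool should have
picked up, here is a rough sketch in Python. It is not how Wget does it
internally; it only lists every HREF= and SRC= value on a page, resolved against
the page's own address. The function name resolve_links and the example.com
address are made up for illustration, and a <BASE HREF=...> tag is not treated
specially.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect HREF= and SRC= values, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, so HREF= and href= both match.
        for name, value in attrs:
            if name in ("href", "src") and value:
                # urljoin leaves absolute URLs alone and resolves relative
                # ones (e.g. file.html) against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def resolve_links(base_url, html):
    """Hypothetical helper: return absolute URLs for all links in html."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    page = '<A HREF=file.html>next</A> <IMG SRC="img/logo.gif">'
    for url in resolve_links("http://www.example.com/docs/index.html", page):
        print(url)
```

Comparing this list against what actually landed in the mirror directory should
show whether the missing files really are the relative ones.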
--
+------------------------------------+----------------------------------------+
| Jeff Richards | Telephone: (604) 231-2667 |
| MacDonald Dettwiler Ltd. +----------------------------------------+
| 13800 Commerce Parkway | Email: jrichard mda ca |
| Richmond, BC CANADA V6V 2J3 | |
+------------------------------------+----------------------------------------+