Re: wget - the question for advanced users



On Wed, 27 Dec 2006 14:32:59 +0100, Piotr wrote this:

Hi!

I'm trying to make a local copy of my homepage with photos:
www.pbase.com/piotrstankiewicz

More precisely, I am trying to copy this site with all child pages
starting with www.pbase.com/piotrstankiewicz (which is not a problem), and
additionally I want to download all the photos used. Unfortunately, this is
where I have a problem, as the photos are hosted on other servers in the
pbase.com domain.

I tried something like this:

wget -r -H -p --convert-links --no-parent -l 3 --html-extension \
  -Dpbase.com --exclude-domains forum.pbase.com,search.pbase.com \
  http://www.pbase.com/piotrstankiewicz

Unfortunately it doesn't work.

I don't know why wget tries to download directories above
www.pbase.com/piotrstankiewicz; it starts downloading, e.g.,
www.pbase.com/register.html (it looks like the --no-parent option doesn't
work, or I don't fully understand its behavior). Once it starts to
download documents from the main folder, it goes on and on
and wants to download the contents of the whole server. :(

How can I make wget ignore all pages that don't start with
www.pbase.com/piotrstankiewicz (with the exception of photos placed on
other servers)?

I tried the option -I /piotrstankiewicz,/piotrstankiewicz/image. In
that case all the web pages are downloaded OK (the way I expect),
but wget doesn't download any photos (e.g. it ignores this photo:
http://i5.pbase.com/o4/43/588543/1/60342441.SA0469_20na30_final.jpg )

I also tried the option --exclude-directories
/galleries,/help,/login (without the -I option), as there are links from
the www.pbase.com/piotrstankiewicz site pointing to the structure above
it, but wget ignores the option and starts to download the contents of
the whole site.

Any ideas?


---------------------------------------------------

http://www.pbase.com/piotrstankiewicz


Did you try the --mirror option?

Start over and use the --mirror option for your homepage. Enable a log
file; then maybe you can see what Wget is doing.
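
Roughly what I have in mind, as an untested sketch rather than a known-good
recipe. The log file name is made up, and the trailing slash on the start
URL is my own assumption: as far as I know, wget treats
".../piotrstankiewicz" without the slash as a file in the root directory,
so --no-parent ends up allowing everything under "/". The script only
prints the command so you can review it before running it yourself.

```shell
#!/bin/sh
# Sketch only. START_URL's trailing slash and the LOG name are assumptions,
# not something taken from the original command.
START_URL='http://www.pbase.com/piotrstankiewicz/'
LOG='wget-mirror.log'   # hypothetical log file name

# --mirror is shorthand for -r -N -l inf --no-remove-listing.
# -o sends wget's messages to the log file instead of the terminal.
# --page-requisites and --convert-links are the long forms of the
# -p and --convert-links options already used in the original attempt.
set -- wget --mirror --no-parent --page-requisites --convert-links \
    -o "$LOG" "$START_URL"

# Print the assembled command for review; nothing is downloaded here.
CMD="$*"
echo "$CMD"
```

Once the printed command has been run, searching the log for a page that
should not have been fetched (register.html, say) should show at which
point wget left your directory. Note that this sketch leaves out -H and
-Dpbase.com, so on its own it will probably not fetch photos stored on
the other pbase.com hosts.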
