The_Document Apprentice
Joined: 03 Feb 2018 Posts: 275
Posted: Mon Feb 04, 2019 10:04 am Post subject: website copying
Tried to use httrack and wget to make a bona fide offline copy of a WordPress site, and neither works.
Code: | wget --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" -m -p -A jpg,jpeg,gif,png,css,js,java,txt,xml,htm,html,php,cgi,asp --adjust-extension -e robots=off --base=./ -k -P ./ -x -t 4 https://example.com |
The httrack devs pointed out that it isn't suited for dynamic sites.
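For reference, a plain httrack mirror attempt looks roughly like this (a sketch based on httrack's documented -O output directory and +filter options; example.com stands in for the real site). As the devs note, it still only captures the pages the crawler can reach, not dynamically generated content:
Code: |
# mirror the site into ./example-mirror, following only links that stay on example.com
httrack "https://example.com/" -O "./example-mirror" "+*.example.com/*" -v
|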
erm67 l33t
Joined: 01 Nov 2005 Posts: 653 Location: EU
Posted: Tue Feb 05, 2019 11:51 am Post subject:
wget never really worked for copying dynamic websites, but I have just read about this on reddit: https://archivebox.io/ ..... maybe it works.
_________________
Ok boomer
True ignorance is not the absence of knowledge, but the refusal to acquire it.
Ab esse ad posse valet, a posse ad esse non valet consequentia
My fediverse account: @erm67@erm67.dynu.net
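For anyone trying it, a minimal ArchiveBox run might look like this (a sketch assuming ArchiveBox is installed, e.g. via pip; commands per its documented CLI, with example.com as a placeholder):
Code: |
mkdir ~/archive && cd ~/archive
# create a new collection in the current directory
archivebox init
# archive the front page plus one level of outgoing links
archivebox add --depth=1 'https://example.com'
# browse the snapshots through the local web UI
archivebox server 0.0.0.0:8000
|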
Hu Administrator
Joined: 06 Mar 2007 Posts: 22968
Posted: Wed Feb 06, 2019 2:54 am Post subject:
For dynamic sites, you are probably better off downloading a backup of the site's backend data and instantiating a clone locally. This is more maintainable, and often smaller, than saving all the different pages generated by the site.
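To make that concrete for a WordPress site, a rough sketch of pulling the backend and cloning it locally could look like this (assumes SSH access to the server; host, paths, database name and credentials are all placeholders):
Code: |
# dump the database on the server and copy it down, along with the site files
ssh user@example.com "mysqldump -u wpuser -p wordpress_db > /tmp/wordpress_db.sql"
scp user@example.com:/tmp/wordpress_db.sql .
rsync -az user@example.com:/var/www/wordpress/ ./wordpress-files/

# import the dump into a local MySQL/MariaDB and serve ./wordpress-files with a
# local LAMP stack or container, then fix the site URL in wp-config.php or the
# wp_options table
mysql -u root -p -e "CREATE DATABASE wordpress_db"
mysql -u root -p wordpress_db < wordpress_db.sql
|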
The_Document Apprentice
Joined: 03 Feb 2018 Posts: 275
Posted: Sun Feb 17, 2019 11:03 am Post subject:
erm67 wrote: | wget never really worked for copying dynamic websites, but I have just read about this on reddit: https://archivebox.io/ ..... maybe it works. |
Tried ArchiveBox; it works well.
Can wget grab all links and files from a local HTML file? I'm not sure how to pass local files to wget.
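In case it helps, wget can take a local HTML file as its list of links via -i together with --force-html (a sketch using wget's documented options; the file name, base URL and output directory are placeholders):
Code: |
# fetch every link referenced in page.html; --base resolves relative links
# against the original site, -p pulls page requisites, -k rewrites links locally
wget --force-html -i ./page.html --base=https://example.com/ -p -k -P ./mirror
|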