eccerr0r (Watchman)
Joined: 01 Jul 2004    Posts: 9890    Location: almost Mile High in the USA

Posted: Wed Jan 12, 2022 9:46 pm    Post subject: wget not liking > 2GiB the first time around? [SOLVED]
```shell
~/www$ ls -l threegig.img
-rw-r--r-- 1 me me 3221225472 Jan 12 14:17 threegig.img
~/www$ mkdir tempcrap
~/www$ cd tempcrap
~/www/tempcrap$ wget http://127.0.0.1/~me/threegig.img
--2022-01-12 14:28:19--  http://127.0.0.1/~me/threegig.img
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2147483647 (2.0G)
Saving to: 'threegig.img'
threegig.img        100%[===================>]   2.00G  57.9MB/s    in 58s
2022-01-12 14:33:22 (35.2 MB/s) - 'threegig.img' saved [2147483647/2147483647]
~/www/tempcrap$ wget -c http://127.0.0.1/~me/threegig.img
--2022-01-12 14:34:33--  http://127.0.0.1/~me/threegig.img
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 3221225472 (3.0G), 1073741825 (1.0G) remaining
Saving to: 'threegig.img'
threegig.img        100%[+++++++++++++======>]   3.00G  40.3MB/s    in 17s
2022-01-12 14:34:50 (60.6 MB/s) - 'threegig.img' saved [3221225472/3221225472]
```
Why is 1GB missing the first time around?
Though this is a semi-contrived case, the problem is real. And there is no X-Y problem here, nor is this illicit traffic: Geofabrik does not offer bittorrent or rsync for downloading OSM extract data...
Also, as another strange observation: if at least 1 byte is already downloaded and one continues, it will detect the full file size... Weird.
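Editor's note, not from the original post: the Length reported on the first run, 2147483647, is exactly 2^31 - 1, the largest signed 32-bit integer, and the "remaining" figure on the resume is exactly the difference between the true size and that value. This is consistent with the 32-bit regression identified in the solution. The arithmetic can be checked in bash:

```shell
# Largest signed 32-bit value: the Length wget printed on the first run.
echo $((2**31 - 1))                  # -> 2147483647
# Real size minus the truncated size: the "remaining" byte count wget
# printed when resuming with -c.
echo $((3221225472 - 2147483647))    # -> 1073741825
```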
-----------------
SOLVED: Do NOT use wget-1.21.1; it has a regression on 32-bit. Upgrade to 1.21.2.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Last edited by eccerr0r on Thu Jan 13, 2022 10:06 am; edited 2 times in total
pingtoo (Veteran)
Joined: 10 Sep 2021    Posts: 1481    Location: Richmond Hill, Canada

Posted: Wed Jan 12, 2022 10:04 pm    Post subject: Re: wget not liking > 2GiB the first time around?
I think if the site/webserver decides not to give out the whole file at once, then there is not much wget can do to make it happen.
eccerr0r wrote:

```shell
~/www$ ls -l threegig.img
-rw-r--r-- 1 me me 3221225472 Jan 12 14:17 threegig.img
~/www$ mkdir tempcrap
~/www$ cd tempcrap
~/www/tempcrap$ wget http://127.0.0.1/~me/threegig.img
--2022-01-12 14:28:19--  http://127.0.0.1/~me/threegig.img
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2147483647 (2.0G)                               <--- site/webserver decision
Saving to: 'threegig.img'
threegig.img        100%[===================>]   2.00G  57.9MB/s    in 58s
2022-01-12 14:33:22 (35.2 MB/s) - 'threegig.img' saved [2147483647/2147483647]
~/www/tempcrap$ wget -c http://127.0.0.1/~me/threegig.img
--2022-01-12 14:34:33--  http://127.0.0.1/~me/threegig.img
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 3221225472 (3.0G), 1073741825 (1.0G) remaining  <--- site/webserver decision
Saving to: 'threegig.img'
threegig.img        100%[+++++++++++++======>]   3.00G  40.3MB/s    in 17s
2022-01-12 14:34:50 (60.6 MB/s) - 'threegig.img' saved [3221225472/3221225472]
```

> Why is 1GB missing the first time around?
> Though this is a semi-contrived case, the problem is real. And there is no X-Y problem here, nor is this illicit traffic: Geofabrik does not offer bittorrent or rsync for downloading OSM extract data...
> Also, as another strange observation: if at least 1 byte is already downloaded and one continues, it will detect the full file size... Weird.
eccerr0r (Watchman)

Posted: Wed Jan 12, 2022 10:57 pm
Well, it did cough up the whole file the second time around...
Also, Firefox slurps down the whole file in one go from the same server(s) - both Geofabrik and my Apache server.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
pingtoo (Veteran)

Posted: Wed Jan 12, 2022 11:17 pm
eccerr0r wrote:
> Well, it did cough up the whole file the second time around...
> Also, Firefox slurps down the whole file in one go from the same server(s) - both Geofabrik and my Apache server.

Your second run used the -c option, which tells the site/webserver to continue from where it left off.
man wget wrote:

```
-c
--continue
    Continue getting a partially-downloaded file.  This is useful when
    you want to finish up a download started by a previous instance of
    Wget, or by another program.  For instance:

        wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z

    If there is a file named ls-lR.Z in the current directory, Wget
    will assume that it is the first portion of the remote file, and
    will ask the server to continue the retrieval from an offset equal
    to the length of the local file.

    Note that you don't need to specify this option if you just want
    the current invocation of Wget to retry downloading a file should
    the connection be lost midway through.  This is the default
    behavior.  -c only affects resumption of downloads started prior to
    this invocation of Wget, and whose local files are still sitting
    around.

    Without -c, the previous example would just download the remote
    file to ls-lR.Z.1, leaving the truncated ls-lR.Z file alone.

    If you use -c on a non-empty file, and the server does not support
    continued downloading, Wget will restart the download from scratch
    and overwrite the existing file entirely.

    Beginning with Wget 1.7, if you use -c on a file which is of equal
    size as the one on the server, Wget will refuse to download the
    file and print an explanatory message.  The same happens when the
    file is smaller on the server than locally (presumably because it
    was changed on the server since your last download
    attempt)---because "continuing" is not meaningful, no download
    occurs.

    On the other side of the coin, while using -c, any file that's
    bigger on the server than locally will be considered an incomplete
    download and only "(length(remote) - length(local))" bytes will be
    downloaded and tacked onto the end of the local file.  This
    behavior can be desirable in certain cases---for instance, you can
    use wget -c to download just the new portion that's been appended
    to a data collection or log file.

    However, if the file is bigger on the server because it's been
    changed, as opposed to just appended to, you'll end up with a
    garbled file.  Wget has no way of verifying that the local file is
    really a valid prefix of the remote file.  You need to be
    especially careful of this when using -c in conjunction with -r,
    since every file will be considered as an "incomplete download"
    candidate.

    Another instance where you'll get a garbled file if you try to use
    -c is if you have a lame HTTP proxy that inserts a "transfer
    interrupted" string into the local file.  In the future a
    "rollback" option may be added to deal with this case.

    Note that -c only works with FTP servers and with HTTP servers
    that support the "Range" header.
```
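Editor's sketch, to make the offset arithmetic above concrete: here is a minimal local simulation of what -c does, using stand-in files instead of a network (wget itself would send a `Range: bytes=<local-length>-` header to the server).

```shell
# Stand-ins for the remote file and a partially downloaded local copy.
printf 'HELLO WORLD' > remote.txt
printf 'HELLO'       > local.txt

# wget -c resumes at an offset equal to the local file's length;
# tail -c +N plays the server's role of serving bytes from that offset.
offset=$(wc -c < local.txt)                       # 5 bytes already saved
tail -c +$((offset + 1)) remote.txt >> local.txt  # append bytes 6..11
cmp -s remote.txt local.txt && echo 'resumed OK'  # -> resumed OK
```

If the local file is not actually a prefix of the remote one, this same append produces the garbled result the man page warns about.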
Maybe the site/webserver is more friendly to Firefox? :-)
eccerr0r (Watchman)

Posted: Thu Jan 13, 2022 12:43 am
Well, it appears so... now the question is: is Firefox right, or is wget right?
They both can't be doing the right behavior....
Curl agrees with Firefox. So net-misc/wget-1.21.1 has a bug, and it looks like it's only on 32-bit, as 64-bit works.
--
Looks like wget-1.21.2 has the bug fixed and is currently getting stabilized. The bug's been around for a while too, lol... wish I had noticed earlier that it really was a bug, instead of putting up with it, oh well.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
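Editor's sketch, not from the thread: forcing the true file size through 32-bit signed wrap-around shows why a 32-bit build can get this wrong - the value simply doesn't fit, which is consistent with the clamped Length of 2147483647 seen on the first run.

```shell
# Interpret 3221225472 (3 GiB) as a two's-complement 32-bit signed
# integer by wrapping it into the range [-2^31, 2^31).
size=3221225472
wrapped=$(( (size + 2**31) % 2**32 - 2**31 ))
echo $wrapped   # -> -1073741824 (negative: out of range for 32-bit)
```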