GreenNeonWhale n00b
Joined: 30 Mar 2016 Posts: 64
Posted: Thu Jan 23, 2025 12:10 am Post subject: Netiquette regarding mirrors and huge downloads
Hi,
I'm a long-time Gentoo user and a big fan of Gentoo.
I would like to have my own personal, locally stored, copy of:
- Gentoo's entire /distfiles directory -- all the source code.
- A subset of the stage3 files.
- A subset of the install .iso images.
I'm seeking to download all of this data and, from time to time, update my locally stored copy.
I know that this is a HUGE download, and could potentially be a burden on whichever mirror I choose. I'm seeking to avoid overburdening and/or selfishly using mirrors -- in short, I don't want to be a dick.
I have a 1Gbit/sec fiber connection at my disposal.
My original idea was to find the fastest mirror available to me, and test its max speed to me with a small download. Then, limit my download to a small fraction of that. I found rsync://mirrors.rit.edu/gentoo/ to be the fastest, at around 90MiB/sec. I figured I would limit my download to 2MiB/sec. I started to do that, but then stopped shortly thereafter.
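A rate-limited pull like the one described might look roughly like the sketch below. It is only an illustration: the `distfiles/` subdirectory on the mirror, the local destination path, and the choice of flags are assumptions, not something stated in this post. rsync's `--bwlimit` is given in KiB/s by default, so 2MiB/sec becomes 2048. The script only prints the command, so it can be checked before anything touches the mirror.

```shell
#!/bin/sh
# Sketch: build a rate-limited rsync command for a distfiles pull.
MIRROR="rsync://mirrors.rit.edu/gentoo/distfiles/"   # subdirectory is an assumption
DEST="/srv/gentoo/distfiles/"                        # hypothetical local path
LIMIT_MIB=2                                          # self-imposed cap in MiB/s

# --bwlimit expects KiB/s by default, so convert MiB/s to KiB/s.
mib_to_kib() {
    echo $(( $1 * 1024 ))
}

# Print the command instead of running it.
mirror_cmd() {
    echo "rsync -av --partial --bwlimit=$(mib_to_kib "$LIMIT_MIB") $MIRROR $DEST"
}

mirror_cmd
```

Running the printed command by hand, perhaps with `--dry-run` added the first time, keeps the real transfer an explicit step.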
So, would the above be a generally acceptable use of a mirror? If not, would a slower speed be okay?
Should I directly contact the mirror admins and check first?
Or is a download of this magnitude simply too big to do without, well, being a dick?
I'd appreciate any advice and guidance from the Gentoo community, especially from the folks who maintain our servers.
Thank you!
Banana Moderator
Joined: 21 May 2004 Posts: 1859 Location: Germany
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54795 Location: 56N 3W
Posted: Thu Jan 23, 2025 1:31 pm
GreenNeonWhale,
I'm tempted to ask why ... but 'because I can' is good enough. :)
Last time I asked, a distfiles mirror was over 250G. That was about 5 years ago. It will be bigger now.
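For a sense of scale, here is a back-of-the-envelope estimate of how long the initial pull would take at the proposed cap. It assumes the "250G" figure means 250 GiB and uses it only as a lower bound, since the mirror is expected to be bigger now; `transfer_hours` is a made-up helper name.

```shell
#!/bin/sh
# Sketch: whole hours needed to move SIZE_GIB at RATE_MIB per second.
transfer_hours() {
    size_gib=$1
    rate_mib=$2
    # GiB -> MiB, then divide by the rate to get seconds.
    seconds=$(( size_gib * 1024 / rate_mib ))
    # Integer division, so the result is rounded down.
    echo $(( seconds / 3600 ))
}

transfer_hours 250 2    # prints 35
```

So even the five-year-old size works out to about a day and a half of continuous transfer at 2MiB/sec.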
Be aware that the mirrors do not carry fetch-restricted packages. You can fetch them but should not host them publicly. They are fetch-restricted for a reason.
Raise a bug on infra stating your intentions and ask if it's OK.
If you don't get a response, go ahead. It's easier to get forgiveness than to get permission. :)
I have a collection of stuff online: olde-distfiles and old Gentoo.
Feel free to mirror anything of interest at 2MiB/sec. That server has a 1Gbit/sec network link and traffic is not metered.
Every now and then I update it/add to it and point users here to it for old distfiles.
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Wewfus n00b
Joined: 23 Mar 2024 Posts: 7
Posted: Thu Jan 23, 2025 6:03 pm
Is there any reason why you want to use rsync over git? If it were me, I'd take full advantage of the git mirror for Gentoo on GitHub, since I'm not opposed to using all of Microsoft's bandwidth that I can, considering what they do with the code uploaded there. Just something to consider.
Edit: I forgot to mention that most mirrors will throttle you anyway if you attempt to pull too much too fast. Same goes for syncing. I sync'd too much by mistake last week when playing around with the default install path from the handbook. I think it denied me access for a few hours. Don't really know for sure how long the ban was in place because I went to sleep afterwards.
I played with the git sync method and found it to be much faster than rsync. Git has some other features compared to rsync which are nice when it comes to source code. It isn't as good as rsync for binary data though. Perhaps a combination of git for source code+rsync for .isos and other files might be more to your liking.
I've had to push/pull a lot of data to GitHub as part of my job over the years, and I don't ever recall it throttling me or failing to max out our meager cable internet connection. I too am getting fiber in the next few weeks (2Gbps symmetrical connection), so I'm interested in keeping a local mirror as well. My plan was to initially seed it from the GitHub mirror and use the rsync mirrors as a fallback should it ever be down. A lot of the tree doesn't update that often, so once you have it initially seeded with the data from GitHub you'd only have to check the other mirrors once every few days and maybe pull down a handful of updated ebuilds.
The GitHub mirror is here by the way: https://github.com/gentoo/gentoo and for GURU: https://github.com/gentoo/guru
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10723 Location: Somewhere over Atlanta, Georgia
Posted: Thu Jan 23, 2025 7:36 pm
OP was asking about distfiles, not repos. And distfiles aren't served with git, for good reason.
- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
nokilli Apprentice
Joined: 25 Feb 2004 Posts: 237
Posted: Tue Jan 28, 2025 5:31 am
NeddySeagoon wrote: | Be aware that the mirrors do not carry fetch restricted packages. You can fetch them but should not host them publicly. They are fetch restricted for a reason. |
Does this mean that a peer-to-peer solution to the distfiles problem can never exist?
In the NBD thread I was talking about rsync-over-nbd to download source and hopefully save bandwidth. I see now that can't ever work.
But I was toying around with using single-writer nbd volumes to propagate distfiles. You'd have /var/cache/distfiles on a writable filesystem on its own block device, where you do your normal distfiles stuff just as before. Because you can publish that block device over the Internet read-only, you can let others easily snag a tarball from your mirror.
That could potentially save Gentoo bandwidth and would definitely enhance the security of a Gentoo user's system.
_________________
We are the block device. The kernel is our client.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54795 Location: 56N 3W
Posted: Tue Jan 28, 2025 10:08 am
nokilli,
There is the universal set of distfiles. That cannot be legally distributed and as far as I know does not exist in one place.
Then there is the subset that are distributed by Gentoo. Gentoo takes care that they can be distributed.
You are free to distribute these too.
Does that help?
nokilli Apprentice
Joined: 25 Feb 2004 Posts: 237
Posted: Tue Jan 28, 2025 12:00 pm
NeddySeagoon wrote: | nokilli,
There is the universal set of distfiles. That cannot be legally distributed and as far as I know does not exist in one place.
Then there is the subset that are distributed by Gentoo. Gentoo takes care that they can be distributed.
You are free to distribute these too.
Does that help? |
I get it now. You're talking about something like oracle-jdk.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54795 Location: 56N 3W
Posted: Tue Jan 28, 2025 12:23 pm
nokilli,
Certainly the packages in the list produced by
Code:
qgrep RESTRICT | grep mirror
are not on the Gentoo mirrors.
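For anyone without `qgrep` handy, a rough equivalent with plain `grep` is sketched below. The repository path is an assumption (the usual default location), `list_restricted` is a made-up helper name, and the pattern only catches single-line `RESTRICT` assignments, so treat the result as approximate. It also matches `fetch`, since fetch-restricted distfiles are not carried by the mirrors either.

```shell
#!/bin/sh
# Sketch: list ebuilds whose RESTRICT line mentions "mirror" or "fetch".
# The default path is an assumption; pass another repository as $1.
list_restricted() {
    repo="${1:-/var/db/repos/gentoo}"
    # -r: recurse, -l: print file names only, -E: extended regex.
    # Matches RESTRICT= and RESTRICT+= at the start of a line.
    grep -rlE --include='*.ebuild' \
        '^[[:space:]]*RESTRICT\+?=.*(mirror|fetch)' "$repo" | sort
}
```

Call it as `list_restricted` for the default location, or `list_restricted /path/to/repo` for an overlay.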