Gentoo Forums
Why not just put all of portage on read-only NBD volumes?
nokilli
Apprentice


Joined: 25 Feb 2004
Posts: 200

Posted: Sat Jan 11, 2025 5:12 pm    Post subject: Why not just put all of portage on read-only NBD volumes?

The tree, the distfiles, all of it.

Set up mirrors just like is being done now with http, https, rsync...

Only now, you're inviting the public to mount the network block device and access it like it's a local drive.

Go further, and make it part of the handbook, where you have everybody setting up the chroot. The same way you would if you were using nfs for portage: do those mounts before the chroot.

The final system could do something which uses dm-cache or lvm's cache or whatever to make it so that files that are effectively downloaded from the NBD volume can be reused later without having to re-fetch.

An NBD server should be a simple affair. You're identifying bandwidth abusers at the block level ffs; people who try dd get throttled quickly and easily.

And you'd have the only distro, as far as I can tell, that allows users some peace of mind as they perform updates of their system. Everybody else seems to either demand simultaneous access to your home directory with the tls connection back to the mothership or work hard to break any proxy solution the community comes up with to gain some measure of control over their data.

The install cd combined with the read-only repository drive should provide excellent privacy for its users.

Moved from "Portage & Programming" to "Gentoo Chat". --Zucca
_________________
Today is the first day of the rest of your Gentoo installation.
John R. Graham
Administrator


Joined: 08 Mar 2005
Posts: 10690
Location: Somewhere over Atlanta, Georgia

Posted: Sat Jan 11, 2025 7:45 pm

I've tried it and I didn't like it—with the repositories, at least. My own personal experience is that Portage performance with a remotely mounted repository is dramatically lower than with a local copy. This is principally during the dependency resolution process. I do sync my home server against Gentoo and then sync all my other machines against my home server, but that's more out of respect for Gentoo's donated bandwidth than anything else. My home server also shares its /usr/portage/distfiles with the rest of the network for the same reason.

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
Hu
Administrator


Joined: 06 Mar 2007
Posts: 22977

Posted: Sat Jan 11, 2025 8:25 pm

What problem is this solving? If users need to avoid revealing what they are reading, then asking them to read it over the Internet from a network block device is not an improvement over asking them to read it over the Internet using a TLS-encrypted connection. They need a way to get a purely local copy and do so without revealing what they are reading. Portage already caches everything it reasonably can, using local files, not a mirror of a block device.
nokilli
Apprentice


Joined: 25 Feb 2004
Posts: 200

Posted: Sun Jan 12, 2025 12:30 am

Performance would be solved using dm-cache. Yes, the first sync pulls in the whole tree, but we're doing that already anyways. Once all of the bits are sitting in your local cache, speeds should be back to normal.

And nobody is asking to avoid revealing what THEY are reading. The point is to be able to update a Linux installation without letting THE PACKAGE MANAGER read any of the data the user has on the system.

I should be able to update my Linux system without exposing my data. This does that.

And it seems so simple.
_________________
Today is the first day of the rest of your Gentoo installation.
Hu
Administrator


Joined: 06 Mar 2007
Posts: 22977

Posted: Sun Jan 12, 2025 1:24 am

I don't see how this solves your stated problem. If you don't want the package manager poking around in the user's home directories, give those home directories permission bits that prevent the Linux user portage from searching them. That can be done just as readily with the current Portage design as with your proposal.
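For illustration, the permission-bit approach described above might be sketched as follows. This is a minimal sketch, not anything Portage itself ships: the helper names `lock_down` and `others_can_search` are hypothetical, and it assumes the process to be kept out is neither the directory's owner nor running as root (root bypasses permission bits).

```python
import os
import stat
import tempfile


def lock_down(path):
    """Restrict a directory to its owner only (rwx------)."""
    os.chmod(path, 0o700)


def others_can_search(path):
    """Return True if group or other users may traverse (search) the directory."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXGRP | stat.S_IXOTH))


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real home directory.
    with tempfile.TemporaryDirectory() as demo_home:
        os.chmod(demo_home, 0o755)
        print("before:", others_can_search(demo_home))
        lock_down(demo_home)
        print("after: ", others_can_search(demo_home))
```

With the search (execute) bit cleared for group and other, a non-owner, non-root user cannot enter the directory or resolve paths beneath it.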

If you want everything cached by dm-cache, would it not be better to download everything in one go to local storage that knows it is primary, rather than letting the cache slowly populate itself as it is accessed? Your proposal does not seem simple to me. It relies on at least two kernel features that most people otherwise do not need to enable. It relies on infrastructure no one is running yet. It seems to me that its performance will at best be equal to the current setup, but likely worse due to demand-loading parts of the tree over time instead of eagerly loading the entire tree into local storage at emerge --sync time.
nokilli
Apprentice


Joined: 25 Feb 2004
Posts: 200

Posted: Sun Jan 12, 2025 1:27 am

John R. Graham wrote:
I do sync my home server against Gentoo and then sync all my other machines against my home server, but that's more out of respect for Gentoo's donated bandwidth than anything else.


I just want to expand on this because I'm trying to do something similar.

I have the home server running Gentoo and it is hosting both /var/db/repos/gentoo and /var/cache/distfiles over nfs. Going to install Gentoo on a new machine connected to that server is a delight; I just remember to perform the nfs mounts prior to the chroot, and then get to skip the sections of the handbook covering how to set up portage and distfiles.

The problem of course is that the home server is running with a limited number of installed packages. The new machine is the laptop and I hope to have sway, Firefox, whichever Doom we're on, etc. I also hope to never access the Internet with this device, leaving all such things to the home server.

Now we arrive at the fork in the road. When you say you sync your home server against Gentoo, do you also sync your distfiles?

If you do, you have a very easy time of it going forward. Any package you want to emerge on the workstation you get to, easily and without fuss.

If, on the other hand, you are not syncing your distfiles, then you have to manually fetch the needed packages from the home server. The current solution to that appears to be running the output of emerge -p on the target machine through some sed/perl/python and then sneakernetting the result onto the host machine, where you do an emerge -f --nodeps to actually get the files.
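A rough sketch of that filtering step in Python, under some assumptions: it expects the default (non-columns) emerge -p output, where each package line starts with `[ebuild`, and the function names `atoms_from_pretend` and `fetch_command` as well as the sample text are made up for illustration.

```python
import re

# Matches lines such as "[ebuild  N     ] app-editors/vim-9.0.1627::gentoo ..."
EBUILD_LINE = re.compile(r"^\[ebuild[^\]]*\]\s+(\S+)")


def atoms_from_pretend(output):
    """Extract exact-version atoms (=cat/pkg-ver) from `emerge -p` output."""
    atoms = []
    for line in output.splitlines():
        match = EBUILD_LINE.match(line.strip())
        if match:
            # Drop a trailing "::repo" suffix if the output includes one.
            package = match.group(1).split("::", 1)[0]
            atoms.append("=" + package)
    return atoms


def fetch_command(atoms):
    """Build the fetch-only command to run on the machine with Internet access."""
    return ["emerge", "-f", "--nodeps", *atoms]


# Illustrative sample only, not captured from a real system.
SAMPLE = """
These are the packages that would be merged, in order:

[ebuild  N     ] app-editors/vim-9.0.1627::gentoo  USE="acl nls"
[ebuild     U  ] sys-libs/zlib-1.3-r2 [1.2.13-r1]
"""

if __name__ == "__main__":
    print(" ".join(fetch_command(atoms_from_pretend(SAMPLE))))
```

The printed command line is what gets carried over to the host machine; lines that do not start with `[ebuild` are ignored.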

And it works! I am happily on my way to a Gentoo-powered home network!

I was in this situation before; I think I had the big bulky machine at home without Internet that I wanted to update via the laptop I could take to the cafe. My solution then was to sync all of distfiles. This was frowned upon here. So I stopped doing it.

But it was great! I could be out in the sticks and have all of Gentoo at my disposal. I could wipe out my install through stupidity, but if the drive holding portage and the distfiles was still intact, I could claw my way back to a running system. Without the Internet.

Anyways, I am very happy with the way my setup is working out, but it could have been so much simpler, is I guess what I'm saying. I want to respect Gentoo's bandwidth too. There's some weird stuff in the distfiles, have you looked? 99% of it is stuff I will never install.

But I don't know in advance which packages make up the 1% that I will install. The ability to simply mount that critical volume over the Internet to quickly and safely satisfy those dependencies reeks of elegance.
_________________
Today is the first day of the rest of your Gentoo installation.
nokilli
Apprentice


Joined: 25 Feb 2004
Posts: 200

Posted: Sun Jan 12, 2025 1:57 am

Hu wrote:
I don't see how this solves your stated problem. If you don't want the package manager poking around in the user's home directories, give those home directories permission bits that prevent the Linux user portage from searching them. That can be done just as readily with the current Portage design as with your proposal.


It involves my having the Gentoo machine connected to the Internet. I appreciate what you're saying about the effectiveness of discretionary access control, but the surface area of that solution is enormous when compared to the solution of simply never connecting to the Internet, or, in that case, making that connection as safe as it can possibly be.

Hu wrote:
If you want everything cached by dm-cache, would it not be better to download everything in one go to local storage that knows it is primary, rather than letting the cache slowly populate itself as it is accessed?


Ok, I can withdraw that part of the recommendation. Nothing wrong with just downloading the snapshot to the local device. Being able to have just one copy on the local network shared across all devices is nice and all, but you're right, no real advantage over the local copy. Other than not having to maintain a local copy, that is.

Simplifying access to the distfiles addresses the pain point.

Hu wrote:
Your proposal does not seem simple to me. It relies on at least two kernel features that most people otherwise do not need to enable...

nbd and dm-cache should already be part of any self-respecting kernel, c'mon.

Hu wrote:
It relies on infrastructure no one is running yet.

An nbd server is a nothing thing. Ten lines of Python, tops. Twenty if you want to throttle abusers.
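To make the throttling half of that concrete, here is a minimal token-bucket sketch. It is not an NBD server and does not speak the NBD protocol; the `TokenBucket` class and its parameters are hypothetical, and it only shows how a server could decide, per client, whether to serve a read of a given size right now.

```python
import time


class TokenBucket:
    """Byte-rate limiter: refills at `rate` bytes/second up to `capacity` bytes."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self, nbytes):
        """Return True and spend tokens if `nbytes` may be served now."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never exceeding the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


if __name__ == "__main__":
    # One bucket per client address; a dd-style bulk reader drains its bucket fast.
    buckets = {}
    client = "203.0.113.7"  # documentation-range address, purely illustrative
    bucket = buckets.setdefault(client, TokenBucket(rate=1_000_000, capacity=4_000_000))
    served = sum(1 for _ in range(100) if bucket.allow(1_000_000))
    print("1 MB reads served immediately:", served)
```

A client doing ordinary file access stays under the refill rate, while a sequential whole-device read exhausts the burst allowance and starts getting refused or delayed.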

But wait, there's more!

Why can't the Gentoo install CD simply be GRUB and a kernel with nbd enabled that's given an nbd server on the Internet to load the initramfs from? EDIT: OK, the initramfs would have to be on the stick, but it can mount the nbd device and then you can switch_root to that.

Once a Gentoo user has this thing burned onto a USB drive, it's good for life and they never have to touch another thing. The image used for the initramfs can be updated constantly. It loads them into a shell where /var/db/repos/gentoo and /var/cache/distfiles are already mounted and they can then do partitioning/formatting, etc. stage3 would be hosted on yet another nbd volume.

Two little [x] things in the kernel but they give so much!
_________________
Today is the first day of the rest of your Gentoo installation.
All times are GMT