Gentoo Forums
Why not just put all of portage on read-only NBD volumes?
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Thu Jan 16, 2025 3:08 am

pingtoo wrote:
If you are unaware that there is a man in the middle and you use the kernel as the NBD client, then any security done at the operating system level is lost.

The man in the middle has the opportunity to control anything the kernel can do (read/write file content, execute new processes without visibility).


pingtoo wrote:
Are you aware that when you run nbd-client, the program just gets the kernel to contact the remote NBD server? The nbd-client program does not perform any I/O for the connected NBD device.


You answered your own question. The kernel is managing this connection just as it does all of the others.

Whatever risk is borne by the kernel acting as client, it appears to be identical to the risk borne hosting any other connection.

The kernel is our client. The protocol is simpler than ping. I honestly don't see the problem, but sure, Linux is huge.
_________________
We are the block device. The kernel is our client.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Thu Jan 16, 2025 3:11 am

pjp wrote:
nokilli wrote:
There is an implicit final tuple of (Math.MAXINT, 1),
Presuming search results are correct, that's a go-ism.

They aren't correct, I mistyped.

Should have said, (2**64, 1).

I know a few languages, but as I get older and switch from one to another, more and more the previous language sort of stays stuck in the matrix.
_________________
We are the block device. The kernel is our client.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Thu Jan 16, 2025 3:40 am

pjp wrote:
Of the threats to be concerned with, I just don't see Linux package managers as being a concern. If you were somehow obligated to use RedHat or Canonical, or some obscure distro, I'd understand more.

Again, it was Fedora rpm-ostree's seeming insistence on denying users proxied access to their repository that got me rolling on this.

And what did I do? I came racing back to Gentoo.

That doesn't mean we can't benefit from a trustless package manager.

Who would you trust more? The package manager that insists on making you connect directly to their servers with an encrypted connection? Or the package manager that lets you manage your installation in an entirely trustless fashion, so much so you don't even need to think about it? Who's got your back?

You guys are taking this personally. Stop it. Read the news. Who do you trust today to keep your system secure when the whole space is a house of mirrors?

Seriously, come up with a good name. Gentoo Hardened was great! Something like that, that conveys a focus on making everything it can trustless, exploiting the one feature that no other distro has: we're source-based.

Why is that important? Because if we do the NBD server right, we can make it really fast for users we can assess to be using the service responsibly. There are places where time to getting a patched system up and running is important, yes?

NBD is fully asynchronous, did you know that? It means our client, the Linux kernel, can request blocks in one order and we, the block device, can serve them in some other order. This means we can easily support burst transfer rates for responsible users even within a strict rate limiting regime that keeps bandwidth abusers in the bulk queue.
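A rough sketch of that scheduling idea in Python, not a real NBD server. It relies on one protocol fact: each NBD request carries a handle (called a cookie in newer revisions of the protocol document) that the server echoes in its reply, which is what lets replies go out in a different order than the requests came in. Everything else here is an assumption made up for illustration: the `Request` and `ReplyScheduler` names, the two-lane priority scheme, and the per-client `burst_credit` byte budget.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Request:
    priority: int                        # 0 = burst lane, 1 = bulk lane
    seq: int                             # arrival order, breaks ties inside a lane
    client: str = field(compare=False)
    cookie: int = field(compare=False)   # echoed in the reply so the client can match it
    offset: int = field(compare=False)
    length: int = field(compare=False)


class ReplyScheduler:
    """Hypothetical policy: clients with burst credit left jump the bulk queue."""

    def __init__(self, burst_credit):
        self.burst_credit = dict(burst_credit)   # client -> bytes still allowed at top priority
        self.queue = []
        self.seq = count()

    def submit(self, client, cookie, offset, length):
        credit = self.burst_credit.get(client, 0)
        if credit >= length:
            self.burst_credit[client] = credit - length
            priority = 0
        else:
            priority = 1
        heapq.heappush(
            self.queue,
            Request(priority, next(self.seq), client, cookie, offset, length),
        )

    def drain(self):
        """Yield requests in the order their replies would be sent."""
        while self.queue:
            yield heapq.heappop(self.queue)


sched = ReplyScheduler({"responsible": 8192, "abuser": 0})
sched.submit("abuser", cookie=1, offset=0, length=4096)
sched.submit("responsible", cookie=2, offset=4096, length=4096)
sched.submit("abuser", cookie=3, offset=8192, length=4096)
sched.submit("responsible", cookie=4, offset=0, length=4096)

order = [(r.client, r.cookie) for r in sched.drain()]
print(order)
```

The two "responsible" requests are answered first even though an "abuser" request arrived before them; a real server would also need to refill credit over time and bound how long the bulk lane can starve.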
_________________
We are the block device. The kernel is our client.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Thu Jan 16, 2025 3:58 am

Facing the serious prospect that the tarballs won't be stable.

Again, if the plan is to store the package source in addition to the tarballs so that the user only need download the one file as opposed to many, there is still the problem of verifying package integrity since it's the tarball that gets signed by upstream, right?

And while you can expand the package source from a tarball, and then recreate the tarball from that source, and do indeed end up with a tarball that contains the same files, it's maybe not the same bits. Upstream is signing a hash of the tarball. One bit out of place and it all breaks.
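To make the "same files, not the same bits" point concrete, here is a small Python sketch using only the standard library. The file names and timestamps are invented for the example; it packs identical file contents twice, changing nothing but the recorded mtime, and compares the hashes. It only covers the tar layer; a compressor on top (gzip, xz, and so on) adds its own sources of variation.

```python
import hashlib
import io
import tarfile


def make_tar(files, mtime):
    """Pack {name: bytes} into an uncompressed tar held in memory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = mtime               # metadata lives in the header, not the file body
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def contents(tar_bytes):
    """Return {name: bytes} for every member of an in-memory tar."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}


files = {
    "pkg-1.0/README": b"hello\n",
    "pkg-1.0/main.c": b"int main(void){return 0;}\n",
}

upstream = make_tar(files, mtime=1_700_000_000)
rebuilt = make_tar(files, mtime=1_700_000_001)      # one second later, nothing else changed
normalized = make_tar(files, mtime=1_700_000_000)   # metadata pinned to upstream's value

same_files = contents(upstream) == contents(rebuilt)
same_bits = hashlib.sha512(upstream).digest() == hashlib.sha512(rebuilt).digest()
print(same_files, same_bits, normalized == upstream)
```

The rebuilt archive only matches when every header field (mtime, owner, mode, member order, tar format) is pinned to what upstream used, which is the information a hint file would have to carry.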

Add to the complexity, I bet the underlying file system plays an important role. Projects die on these rocks all the time.

For the moment I'm going to assume there's an answer, perhaps some additional process in addition to expanding all of these tarballs that produces some kind of hint file that can be used to better construct the tarball on the user's end? If this really can save significant bandwidth then addressing that complexity would be worth it, no?

(cpio guys are doing picard_facepalm.jpg)
_________________
We are the block device. The kernel is our client.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Thu Jan 16, 2025 7:30 am

We are the block device.

How powerful is that?

Here's an example: I changed my mind. Let's make this a writable NBD volume. Then we can put a CoW filesystem on there and instant multiuser filesystem, right?

Wrong. You can't do that because there is no way for the block device to allow the filesystem to perform a meaningful lock.

But wait.

We are the block device. If it was important to us, we could be aware of the filesystem we are hosting.

We would understand that a CoW filesystem has this one block that holds it all together. When you do any kind of write to a CoW filesystem, it does it to all new blocks that nobody else can see, then at the last moment it swaps out the root of the old btree with the root of the new btree you just created, right?

All we have to do here is serialize access to the root block. That means, in our case, developers giving us new tarballs, which are really file copies to the NBD server, which are really writing a bunch of blocks to the block device (that's us) in the hopes that when they are done, the root of the new btree they've just created is accepted as the new root for the filesystem.

Too slow, somebody beat you to it? You have to try again. At some point your copy goes through and then from that moment on becomes an indelible part of the Gentoo distfiles monolith.
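The "serialize access to the root block" step is essentially a compare-and-swap with a retry loop. A minimal Python sketch, assuming a server that tracks a single root block number and writers that behave; `RootPointer`, `commit`, and the block numbers are all hypothetical, and no real filesystem's on-disk format is modelled.

```python
import threading


class RootPointer:
    """Hypothetical server-side record of which block is the current tree root."""

    def __init__(self, root_block):
        self._root = root_block
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._root

    def compare_and_swap(self, expected, new):
        """Install `new` as the root only if the root is still `expected`."""
        with self._lock:
            if self._root != expected:
                return False          # somebody else committed first
            self._root = new
            return True


def commit(root, build_tree, max_tries=5):
    """Rebuild on top of the latest root until the swap goes through."""
    for _ in range(max_tries):
        base = root.read()
        candidate = build_tree(base)  # writes new blocks nobody else can see yet
        if root.compare_and_swap(base, candidate):
            return candidate
    raise RuntimeError("too much contention on the root block")


root = RootPointer(100)
stale = root.read()                            # dev A and dev B both start from block 100
first = root.compare_and_swap(stale, 200)      # dev B gets there first
lost = root.compare_and_swap(stale, 300)       # dev A is too slow and is refused
winner = commit(root, lambda base: base + 1)   # dev A retries on top of the new root
print(first, lost, winner, root.read())
```

This only works if every writer is trusted to build a valid tree before asking for the swap, which is the limitation the edit at the end of this post runs into.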

This is as ideal a filesystem as it gets for where you have a few people doing updates and a great many people downloading the files! And isn't this exactly our use case?

Bonus! It's a live filesystem! The user is now mounting it rw, which means, we can add files to the device even while there are 1000 people who have this thing mounted in their mtabs. That does mean more blocks flying around, so we need a filesystem we can tune.

And now look at the potential latency between dev checkin and user access. Take the case that shows this whole system in its best light: the really large package with the one tiny file change. Dev can submit that block and user can fetch it in the very next kernel call.

Bonus! You have PORTAGE!!! How absolutely wonderful is this, where you can let individual devs so easily post their stuff unimpeded and still let the community meter access via portage!

The Cathedral and the Bazaar! (sort of)

The LLM insists there's a way to get NBD working over TLS. Does that mean there may be some simple kind of authentication mechanism we can use to grant devs write access? Or would you be willing to drink still more koolaid and join me in my campaign to make block knocking a legitimate authentication protocol? Devs are granted access based on correctly reading specific blocks from the block device. This could also be used as a lock hint.

In the same way we can identify the root block for the CoW tree we're hosting, we can also agree on a convention for communicating status information back to the client by simply returning blocks that are updated with the new information in some convenient format, like json. Status information like, whether you were granted a lock or not. Or what lock contention is looking like.

http, ftp... these are protocols best reserved for people who can't block device.

EDIT EDIT EDIT (are you seeing this?) EDIT: this is not entirely stupid, but is useless for public-facing. Requires a client that we can trust. The kernel is our client... but anybody can pretend to be the kernel. Any process can connect and essentially pose as the kernel and ask the block device (you guys) to write bad blocks. No way to enforce use of a specific filesystem. I can do it here in the home network... I could write to the block device from different machines, and even handle competing requests, because I could trust that I was using a correct implementation of, say, btrfs, but what advantage then over nfs on native btrfs? Zero. Officially giving up on public-facing RW. IT CAN NEVER WORK. THAR BE DRAGONS HERE. KIDS DON'T TRY THIS AT HOME.

(but hey, at school, why not?)
_________________
We are the block device. The kernel is our client.


Last edited by nokilli on Thu Jan 16, 2025 2:54 pm; edited 1 time in total
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Thu Jan 16, 2025 7:49 am

I don't have a good feel for how important it is to reduce the time it takes to fix bugs. I'm going to proceed like how I'm looking at bandwidth savings, maybe it's enormous. But then too maybe not.

It could be important.

Does portage have the ability to allow the user to always retain /var/tmp/portage contents?

So user builds a package. All of the build files are allowed to remain. Why? So if this rsync over nbd thing happens in the future, we could conceivably rsync the new version from our local distfiles into the build directory?

What does that do to time from exploit discovery to exploit mitigation? When we do the build, it's conceivable that it's a compile or two and then a link, right? We could do the build at the same time we're recreating the tarball for package verification, and only on successful verification do the ebuild install?

Fastest gun in the west. Which is pretty impressive when you consider that Gentoos are not known for being gun-wielding birds.

Ok I'll stop.
_________________
We are the block device. The kernel is our client.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Thu Jan 16, 2025 7:53 am

Just one last thought...

Make it a pay service. Gentoo Express. Figure out a better form of authentication.

It's acceptable to pay $5 to a distro to get the CD, right? That's just another block device.

Ok, stopping.
_________________
We are the block device. The kernel is our client.
pjp
Administrator
Joined: 16 Apr 2002
Posts: 20580

Posted: Fri Jan 17, 2025 4:14 am

nokilli wrote:
Again, it was Fedora rpm-ostree's seeming insistence on denying users proxied access to their repository that got me rolling on this.
I missed the earlier reference to that detail. I've never used Fedora, so I don't know what "rpm-ostree" is. Organizations running content through a proxy seems commonplace, so I'm surprised they block that.

nokilli wrote:
That doesn't mean we can't benefit from a trustless package manager.
I never said it did. Has someone else?

nokilli wrote:
Who would you trust more? The package manager that insists on making you connect directly to their servers with an encrypted connection? Or the package manager that lets you manage your installation in an entirely trustless fashion, so much so you don't even need to think about it? Who's got your back?
Well, as I mentioned earlier... If I were concerned about Gentoo stealing my private data, I'd stop using it. And while I wasn't explicit, I thought it reasonable that "Gentoo" could be replaced by any distro / OS of concern.

nokilli wrote:
You guys are taking this personally. Stop it.
I'm not sure what gave you that impression, but my interest is purely curiosity. Nothing about it is personal. Some time ago I noticed the nbd feature and considered using it. I never tried it. That curiosity was the only reason I opened the thread. I look forward to seeing your results. Maybe it will eventually become an option to consider. My interest goes no farther than that.

nokilli wrote:
Read the news. Who do you trust today to keep your system secure when the whole space is a house of mirrors?
Since I have no idea what you're referring to, I can only guess. If my employer requires me to use a particular product or service, that's their choice. As for the house of mirrors, my concern is primarily focused on web based problems, including but not limited to javascript.

nokilli wrote:
Seriously, come up with a good name. Gentoo Hardened was great! Something like that, that conveys a focus on making everything it can trustless, exploiting the one feature that no other distro has: we're source-based.
Until now, I didn't really understand your issue / problem, and I'm not certain I do now.

nokilli wrote:
Why is that important? Because if we do the NBD server right, we can make it really fast for users we can assess to be using the service responsibly. There are places where time to getting a patched system up and running is important, yes?
Sure, fast is good, for whatever responsible use might entail. Currently the "irresponsible" syncing policy is pretty forgiving.

I have no idea how your solution would improve time to patching a system over the current implementation. That's where testing and/or the service in beta testing would be helpful, whenever the appropriate time for that may be.

nokilli wrote:
NBD is fully asynchronous, did you know that?
You may have mentioned something about it. Do you have a target date when you'll start your own testing?

Anyway, good luck with your efforts.
_________________
Quis separabit? Quo animo?
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Fri Jan 17, 2025 6:52 am

pjp wrote:
nokilli wrote:
That doesn't mean we can't benefit from a trustless package manager.
I never said it did. Has someone else?

I could have made it clearer that my concern was about the process; a few people took it the wrong way, I think.

It's pretty standard in my experience: I rely on the package manager running as root, increasingly using tls connections, and with my data sitting right there the entire time. We just take it for granted that that's the way it has to be.

To be clear: the one part of rpm-ostree that sits atop dnf accepts the usual http(s)_proxy business. It's the part that streams blocks in over ostree that appears to be stubborn.

A lot of the criticism on my end is easily mitigated with something like apt-cacher-ng.

But then I go read the apt-cacher-ng developer notes and they at one point had outbound message body parts on their to-do list.

I don't want outbound body message parts on their to-do list. If there's a POST coming from my machine, I want it to originate with Firefox.

This is why I want to hit things with rocks. The general trend towards greater interoperability is wonderful in every context but security. I can perform a system update securely by simply selectively rsyncing against a read-only directory?

That's one big satisfying rock.

pjp wrote:
I have no idea how your solution would improve time to patching a system over the current implementation. That's where testing and/or the service in beta testing would be helpful, whenever the appropriate time for that may be.

I'm finishing the testing component for it and so I guess I create the github repository and post a link here then. Should be soon.
_________________
We are the block device. The kernel is our client.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Fri Jan 17, 2025 6:55 am

dmraid

mirrorselect becomes mirrorsselect

Gentoo suffers same bandwidth cost, but user gets to enjoy improved latency when accessing the repository.

We we we we are are are are the the the the block block block block device device device device.

And the kernel is our client.
_________________
We are the block device. The kernel is our client.
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54725
Location: 56N 3W

Posted: Fri Jan 17, 2025 12:14 pm

nokilli,

... and things disappearing from the NBD server before they are needed?
How is that addressed?

My Pi5 takes 24 hours to build chromium and a couple of days for a complete update.
My AMD E350 takes over a week for an update.

I would get a bit upset if ebuilds and distfiles vanished after the dependency tree had been calculated, or even in mid dependency tree calculation.
That takes over an hour too.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
John R. Graham
Administrator
Joined: 08 Mar 2005
Posts: 10700
Location: Somewhere over Atlanta, Georgia

Posted: Fri Jan 17, 2025 2:12 pm

nokilli wrote:
...Again, if the plan is to store the package source in addition to the tarballs so that the user only need download the one file as opposed to many, there is still the problem of verifying package integrity since it's the tarball that gets signed by upstream, right?
Wrong, or at least not relevant in the Gentoo context. It's each released instance of the repository that's signed, not the individual tarballs. The repository contains hashes of the source tarballs, though, so the repo structure effectively extends the chain of trust to the tarballs.

Just a technical point, but it's another thing that'd need to be addressed in your scheme: the security model of the Gentoo repo would need significant alteration.
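As a sketch of what "the repository contains hashes of the source tarballs" looks like in practice: a Manifest `DIST` entry, as I understand the format, is the file name, its size, and then pairs of hash name and hex digest. The Python below parses one such line and checks some bytes against it. The payload, the file name, and the helper names `parse_dist_line` and `verify` are made up for the example; this is not Portage's code, and checking the signature on the repository itself is out of scope here.

```python
import hashlib

# Hash names as they appear in a Manifest, mapped to hashlib constructors.
ALGOS = {"BLAKE2B": hashlib.blake2b, "SHA512": hashlib.sha512}


def parse_dist_line(line):
    """Split 'DIST <name> <size> <ALGO> <hex> [<ALGO> <hex> ...]' into its parts."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "DIST":
        raise ValueError("not a DIST entry")
    name, size = fields[1], int(fields[2])
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return name, size, hashes


def verify(data, size, hashes):
    """True only if the size and every hash we know how to compute match."""
    if len(data) != size:
        return False
    known = [algo for algo in hashes if algo in ALGOS]
    if not known:
        return False                  # refuse to pass a file we cannot actually check
    return all(ALGOS[a](data).hexdigest() == hashes[a].lower() for a in known)


payload = b"example distfile bytes"
line = "DIST example-1.0.tar.gz {} BLAKE2B {} SHA512 {}".format(
    len(payload),
    hashlib.blake2b(payload).hexdigest(),
    hashlib.sha512(payload).hexdigest(),
)

name, size, hashes = parse_dist_line(line)
ok = verify(payload, size, hashes)
tampered = verify(payload[:-1] + b"X", size, hashes)
print(name, ok, tampered)
```

The check is over the exact bytes of the distfile, so a tarball reassembled from an unpacked tree passes only if it is bit-for-bit identical to the one the Manifest entry was computed from.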

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Fri Jan 17, 2025 2:57 pm

NeddySeagoon wrote:
nokilli,

... and things disappearing from the NBD server before they are needed?
How is that addressed?

My Pi5 takes 24 hours to build chromium and a couple of days for a complete update.
My AMD E350 takes over a week for an update.

I would get a bit upset if ebuilds and distfiles vanished after the dependency tree had been calculated, or even in mid dependency tree calculation.
That takes over an hour too.


Two things:

First, I'm no longer interested in involving portage with this. Portage is what lets us go wild and crazy with this. I'm only interested in the distfiles, trustless package management as well as reducing bandwidth costs, so I could be talking about using CompuServe over 1200 baud modems and it wouldn't matter because at the end of the day the tarball needs to verify and portage is what decides whether that happened. So no vanishing ebuilds ever, at least not because of any of this.

Second. We put up an image. People use it. When it comes time, we put up a new image but we let people continue to use the previous image. No more users? Server process ends, image unmounts, disk space reclaimed. Users we regard as abusers don't factor into that, so they won't lock the image open.
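That image lifecycle can be sketched as simple reference counting in Python. The `ImagePool` class, the image names, and the idea of a fixed set of flagged abusers are all assumptions for illustration; how abusers get identified is not covered.

```python
class ImagePool:
    """Hypothetical bookkeeping: keep an old image only while counted users hold it."""

    def __init__(self, abusers=()):
        self.users = {}               # image name -> set of counted clients
        self.current = None
        self.abusers = set(abusers)

    def publish(self, image):
        """Make `image` the one new connections get; retire the old one if idle."""
        previous = self.current
        self.users.setdefault(image, set())
        self.current = image
        if previous is not None:
            self._reap(previous)

    def connect(self, client):
        """Attach a client to the current image; abusers are served but not counted."""
        if client not in self.abusers:
            self.users[self.current].add(client)
        return self.current

    def disconnect(self, client, image):
        self.users.get(image, set()).discard(client)
        self._reap(image)

    def _reap(self, image):
        """Drop a superseded image once no counted user remains (space reclaimed)."""
        if image != self.current and image in self.users and not self.users[image]:
            del self.users[image]


pool = ImagePool(abusers={"mallory"})
pool.publish("distfiles-01")
alice_image = pool.connect("alice")
mallory_image = pool.connect("mallory")

pool.publish("distfiles-02")          # alice is still on the old image, so it stays
live_before = sorted(pool.users)

pool.disconnect("alice", alice_image) # mallory alone cannot hold the old image open
live_after = sorted(pool.users)
print(live_before, live_after)
```

Keeping three or twenty images instead of two would only change when `_reap` is allowed to delete, not the counting itself.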

Maybe two images aren't enough for that. Maybe you need three, or twenty, to maintain some sort of balance between user accessibility and the affordability of continuing to provide the service. My understanding is that the disk space isn't the expensive part.

But let's say it's just one.

For the distfiles to vanish and impact the process I am describing would mean it would have to happen during ebuild fetch. ebuild fetch gets patched to initiate the rsync over nbd of the package tree in the update case, so it would have to be very well-timed. And in any case, recovery is re-mounting the volume and re-initiating the emerge.

And it's always been my impression that you guys maintain a deep enough version history with the tarballs to essentially preclude this sort of thing from happening anyways.

That's not the problem.

The problem I want to hear you speak to is how to guarantee that we are always able to let the user's system correctly create the tarball portage needs to verify. After using rsync over nbd to update the project on the user's machine, we have the entire project tree, everything we need to build. But portage can't verify the tarball, unless we re-create it. The whole point of this is to not download it if we don't have to! And we can do that, and it can hold all of the right files, but does it hold the right bits??? I'm betting you're one of these tar guys. There's like a switch or something, right?
_________________
We are the block device. The kernel is our client.
NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 54725
Location: 56N 3W

Posted: Fri Jan 17, 2025 3:14 pm

nokilli,

I'm a retired electronics engineer. Software is close to black magic for me. :)
I did do a lot of systems engineering too. Lesson one there is to start at a high level with the system requirements, then add detail to approach a solution.

You are starting with a perceived solution then trying to fit it to the problem. That won't work.
Poorly defined, or changing requirements lead to cost and schedule overruns, sometimes with big expensive projects being abandoned.

Start with a complete problem definition, then decompose it into lots of smaller problems. That may change the original complete problem definition but that's OK, it's an iterative process.

Once you have your problem decomposed and it's stable, look at potential solutions to all the smaller problems.
Even make a prototype, which you will throw away.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
pingtoo
Veteran
Joined: 10 Sep 2021
Posts: 1405
Location: Richmond Hill, Canada

Posted: Fri Jan 17, 2025 3:23 pm

nokilli wrote:
...
we have the entire project tree, everything we need to build. But portage can't verify the tarball,
...
Unfortunately, current Portage (emerge) CANNOT work on a project source tree. That is not because it needs to verify tarball signatures; it is because emerge is not designed to use a source tree as the source to build from.
nokilli
Apprentice
Joined: 25 Feb 2004
Posts: 235

Posted: Fri Jan 17, 2025 3:36 pm

pjp is right, you guys need to see results and there's nothing out there.

So trying to get the ball rolling.

https://github.com/nokilli/sandbag

Neddy doesn't know tar.
_________________
We are the block device. The kernel is our client.
All times are GMT
Page 3 of 3

 