Is it possible to max out modern hardware with portage?

therik · n00b Joined: 14 Jul 2024 Posts: 7

I wonder if I'm missing some settings, but I started the

pietinger · Posted: Sun Jul 14, 2024 9:44 pm Post subject:

therik,

Welcome to Gentoo Forums !

Please see this short article which explains why you will often see only one running compile job:
https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Optimize_compile_times#Using_EMERGE_DEFAULT_OPTS
_________________
https://wiki.gentoo.org/wiki/User:Pietinger

eccerr0r · Posted: Sun Jul 14, 2024 9:50 pm Post subject:

Yeah, there is something called Amdahl's Law, and Portage is stuck with a lot of unparallelizable code. There is also a lot of serialization in Portage, which doesn't help.
Distcc is really only helpful on big projects like qtwebengine or chrome; otherwise it is somewhat of a burden. A lot of small ebuilds are actually hindered by distcc.
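
For reference, Amdahl's Law mentioned above can be put into a few lines of Python. This is only a sketch: the 80% parallel fraction is an assumed figure for illustration, not a measurement of Portage.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed example: 80% of the run parallelizes, the rest is serial.
    for workers in (1, 4, 16, 32, 64):
        print(f"{workers:3d} workers -> {amdahl_speedup(0.8, workers):.2f}x")
```

With a 20% serial share the speedup can never reach 5x, no matter how many cores are added.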

I just made my E5-2690v2 build a kernel. 10 mins for kernel+modules. Not really that fast but it's not been that fast for me since the 2m 30sec kernel build times with my dual celeron-450 way back when (without modules so not apples vs apples).

distcc has been somewhat helpful however. Been trying to load down my E5-2690v2 and it has been taking a lot of distcc load and it couldn't be faster having the other machines compiling on their own.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

dimko · Apprentice Joined: 12 Feb 2006 Posts: 201

I am not a programmer, but if I understand correctly, when you compile some programs, they require certain libs to exist on the system so that some functionality of said libs can be checked.
Which means there can be a situation where a bunch of packages are 'waiting', as the package requires, for that lib to be compiled. Now multiply that by several such libs and you get misery.
And I suspect it's not so easily calculated mathematically, as it's a 'traveling salesman' problem.
_________________
Just a user.

logrusx · Advocate Joined: 22 Feb 2018 Posts: 2420

There are two things here:

- As dimko has noted, there might not be enough jobs available. The rest may be waiting on a single dependency to be compiled.
- /var/db/pkg cannot be written concurrently.
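
The first point can be sketched with a toy dependency graph. The package names below are hypothetical, and the model assumes every package in a batch finishes before the next batch starts, which is simpler than what Portage really does.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical graph: each package maps to the packages it needs built first.
DEPS = {
    "libA": set(),
    "libB": {"libA"},
    "app1": {"libB"},
    "app2": {"libB"},
    "app3": {"libB"},
    "tool": set(),
}


def ready_batches(deps):
    """Return the groups of packages that could be built at the same time."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose deps are done
        batches.append(ready)
        sorter.done(*ready)  # pretend the whole batch finished together
    return batches


if __name__ == "__main__":
    for step, batch in enumerate(ready_batches(DEPS), start=1):
        print(f"step {step}: {len(batch)} package(s) ready: {', '.join(batch)}")
```

In this toy graph only one package is ready while libB builds, so any extra job slots sit idle during that step.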

It took me ~8 hours to emerge ~1600 packages using the binary packages host.

p.s. No, it's not the traveling salesperson problem, which is intractable. Also, you should be careful what you point at when naming its complexity.

Best Regards,
Georgi

e8root · Tux's lil' helper Joined: 09 Feb 2024 Posts: 94

It should be obvious you cannot parallelize compilation jobs infinitely because a lot of packages depend on other packages.
What you could do (instead of asking "What is holding it up?" questions...) is to double-check if Portage is doing what it is supposed to be doing correctly. I mean obviously this is the kind of issue which in theory is simple but if you want to make it optimal it can get very complex very fast. Maybe Portage is too conservative to the point of not utilizing all resources optimally but maybe that is the only sane way to do it - and what I would assume is the case.

In fact it should be obvious that Portage cannot be 100% optimal, from the simple fact that you cannot know beforehand how a given build will load the system. Not to mention each system is different, with different bottlenecks and settings. You will for example get a much lower average CPU load when using LTO than without, etc. It is not even obvious that it would be beneficial to run some tasks in parallel if you could, though I guess Portage doesn't care and will run things in parallel if it can and has enough "jobs" slots.

As for distcc... when you add more computers with local memory, local storage etc., the whole optimization problem becomes much more complex. Especially when those computers have different performance. I have never used distcc, so I don't know how it actually schedules things, but I can imagine that throwing slower computers into the mix as some sort of helper resources can be very detrimental to performance if a long-running task, e.g. LTO linking, is executed on these slower machines. Not an issue when all computers in the distcc network have identical or very similar performance, but if not there might be serious bottlenecks.

All in all I would assume less than 100% CPU utilization cannot be helped on anything that isn't single core. And even then you would get I/O and network bottlenecks. I don't see any way around this issue.

Also - your settings are wrong.
Sure, 2GB per compilation job is an unrealistic value, but 30 jobs times 31 threads each is in theory 930 running processes. Even with a fraction of the recommended 2GB of memory each, it could (if it were possible to run that many package builds in parallel) overwhelm your computer. In that case it would start to stutter heavily, and with just 16GB of swap things would start crashing. So, if it doesn't work as you expected anyway, it is probably best to reduce --jobs to a more reasonable value just to be on the safe side.
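
A quick way to check that arithmetic is sketched below. The 0.5GB line is an assumed "fraction of 2GB" picked only for illustration, and the function name is made up.

```python
def worst_case(emerge_jobs: int, make_jobs: int, gb_per_process: float):
    """Return (process count, memory in GB) if every slot were busy at once."""
    processes = emerge_jobs * make_jobs
    return processes, processes * gb_per_process


if __name__ == "__main__":
    # 30 parallel package builds, each running 31 compiler processes.
    for gb in (2.0, 0.5):  # recommended figure, then an assumed smaller one
        procs, mem = worst_case(30, 31, gb)
        print(f"{procs} processes at {gb} GB each -> {mem:.0f} GB")
```

This is a theoretical ceiling only; as the thread notes, dependencies rarely let that many builds run at once.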
_________________
Unix Wars - Episode V: AT&T Strikes Back

pjp · Administrator Joined: 16 Apr 2002 Posts: 20485

logrusx · Advocate Joined: 22 Feb 2018 Posts: 2420

pjp · Administrator Joined: 16 Apr 2002 Posts: 20485

That still seems beyond reasonably slow. Do other binary installs take ~4hrs? If so, a LOT has changed since I last installed one in ~2012.
_________________
Quis separabit? Quo animo?

logrusx · Advocate Joined: 22 Feb 2018 Posts: 2420

Well, I remember a thread where someone was asking why emerging a virtual took 10 times or even more time than it used to some years before, so I guess a lot has changed.

Tomorrow I'll see if I can derive an average of how long it takes to merge a binary package.

Best Regards,
Georgi

eccerr0r · Posted: Mon Jul 15, 2024 11:35 pm Post subject:

# time emerge -1 virtual/ssh # (note: real time is higher than timestamps below because dependency computation is not timestamped. Also this is a 32-bit KVM on a Core2 Quad.)
real 0m34.434s
user 0m19.483s
sys 0m7.333s

Mon Jul 15 17:42:23 MDT 2024 virtual/ssh clean
Mon Jul 15 17:42:24 MDT 2024 virtual/ssh setup
Mon Jul 15 17:42:28 MDT 2024 virtual/ssh install
Mon Jul 15 17:42:29 MDT 2024 virtual/ssh
Mon Jul 15 17:42:32 MDT 2024 virtual/ssh instprep
Mon Jul 15 17:42:33 MDT 2024 virtual/ssh
Mon Jul 15 17:42:33 MDT 2024 virtual/ssh preinst
Mon Jul 15 17:42:34 MDT 2024 virtual/ssh
Mon Jul 15 17:42:36 MDT 2024 virtual/ssh prerm
Mon Jul 15 17:42:37 MDT 2024 virtual/ssh postrm
Mon Jul 15 17:42:37 MDT 2024 virtual/ssh cleanrm
Mon Jul 15 17:42:38 MDT 2024 virtual/ssh postinst
Mon Jul 15 17:42:39 MDT 2024 virtual/ssh
Mon Jul 15 17:42:40 MDT 2024 virtual/ssh
Mon Jul 15 17:42:41 MDT 2024 virtual/ssh clean

hmm... no really really bad steps but all take a chunk out of the pie. However if there were 600 packages on the system and all were as fast as virtuals, at 3 packages per minute, it would take over 3 hours to install on this 32 bit KVM on a Core2Quad (64-bit)... which sounds really bad because there's more to it than virtual packages.
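
The estimate can be redone from the first and last timestamps in the log above. This is a rough sketch: it ignores dependency calculation, and the 600-package count is the hypothetical figure from this post.

```python
from datetime import datetime

# First and last time-of-day stamps from the virtual/ssh log above.
START = "17:42:23"
END = "17:42:41"


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two HH:MM:SS stamps on the same day."""
    fmt = "%H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


def total_hours(packages: int, seconds_per_package: float) -> float:
    """Hours needed to merge the packages one after another."""
    return packages * seconds_per_package / 3600


if __name__ == "__main__":
    per_package = seconds_between(START, END)
    print(f"{per_package:.0f} s per package, {60 / per_package:.1f} packages per minute")
    print(f"600 packages at that rate: {total_hours(600, per_package):.1f} h")
    print(f"600 packages at 3 per minute: {total_hours(600, 20):.1f} h")
```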
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

logrusx · Advocate Joined: 22 Feb 2018 Posts: 2420

Ah it was your thread! I lost track of it however.

It takes 10 seconds for a virtual on my system. And if I remember correctly it was taking at least 15 seconds for a binary package. I'm not keen on trying to parse emerge.log to extract averages, so I'll leave it at that. But emerging binary packages is not as fast as one might expect. Portage is still doing a lot of work. And it can't be done in parallel because of the lock on /var/db/pkg. Maybe there are ways to improve on that, but it increases the volume of information that might be lost during an unexpected termination of emerge.
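
For anyone who does want averages, a minimal sketch follows. It assumes emerge.log lines look like `<epoch>:  >>> emerge (N of M) cat/pkg-ver to /` and `<epoch>:  ::: completed emerge (N of M) cat/pkg-ver to /`; the sample log and package names are made up, so check the patterns against a real log first.

```python
import re
from statistics import mean

# Assumed line formats for the start and completion of one package merge.
START_RE = re.compile(r"^(\d+):\s+>>> emerge \(\d+ of \d+\) (\S+) to ")
DONE_RE = re.compile(r"^(\d+):\s+::: completed emerge \(\d+ of \d+\) (\S+) to ")

# Made-up sample data, not taken from a real system.
SAMPLE_LOG = """\
1721086943:  >>> emerge (1 of 2) virtual/example-a-1 to /
1721086953:  ::: completed emerge (1 of 2) virtual/example-a-1 to /
1721086953:  >>> emerge (2 of 2) app-misc/example-b-2.0 to /
1721086973:  ::: completed emerge (2 of 2) app-misc/example-b-2.0 to /
"""


def merge_durations(lines):
    """Map each package to the seconds between its start and completion lines."""
    started = {}
    durations = {}
    for line in lines:
        match = START_RE.match(line)
        if match:
            started[match.group(2)] = int(match.group(1))
            continue
        match = DONE_RE.match(line)
        if match and match.group(2) in started:
            package = match.group(2)
            durations[package] = int(match.group(1)) - started.pop(package)
    return durations


if __name__ == "__main__":
    result = merge_durations(SAMPLE_LOG.splitlines())
    for package, seconds in result.items():
        print(f"{package}: {seconds} s")
    print(f"average: {mean(result.values()):.1f} s")
```

To try it on a real system, the sample string could be replaced by the lines of the log file, usually /var/log/emerge.log.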

Best Regards,
Georgi

therik · n00b Joined: 14 Jul 2024 Posts: 7

pingtoo · Posted: Tue Jul 16, 2024 1:34 pm Post subject:

logrusx · Advocate Joined: 22 Feb 2018 Posts: 2420

pingtoo · Posted: Tue Jul 16, 2024 2:36 pm Post subject:

logrusx,

Understood, and I completely agree with your point.

In fact, in my Gentoo practice, I don't update my system until I have a new need. Or I will just build a new image from scratch when I feel my system is too out of date. Unlike some who update frequently, I am not concerned with how up to date each application is, but with whether the application serves my need. If an application functions correctly and I don't need a new feature, I see no reason to update. I see security not as making each individual point secure; I lean more toward making sure the outermost layer is secure.

My idea of a study is purely academic. The questions are: can it be made even faster? Is there a bottleneck? Can the bottleneck be overcome? Can it be done without a complete rewrite?

logrusx · Advocate Joined: 22 Feb 2018 Posts: 2420

Genone · Posted: Thu Jul 18, 2024 11:30 am Post subject:

pingtoo · Posted: Thu Jul 18, 2024 2:13 pm Post subject:

hoegger · n00b Joined: 06 Apr 2008 Posts: 50

On a 22-core Xeon E5 with 128 GB RAM I have observed 44 threads fluctuating around 90% while emerging chromium.
While emerging opencv, most of the cores did barely anything.

wanne32 · n00b Joined: 11 Nov 2023 Posts: 69

Usually the reaction times of the disks are the problem, not so much throughput. Also, RAM is today often a much bigger bottleneck than the CPU. This is the reason why compiling on a modern M2 Pro Mac is often faster by a decent factor than on an old 2X-Core DDR4/SATA server: the older machine's RAM has at least 2 times longer access times, and its throughput is often smaller by an even bigger factor. SATA disks do much worse. So you are often maxing out your hardware. It's just the memory and not the CPU that is at 100%.
But since waiting for tmpfs (RAM) counts as CPU usage on most systems, you should still see your CPU at 100%. How do you measure overall CPU usage? I usually get >>90% CPU usage when I am using tmpfs. If you set your load average to 31 and your parallel processes per CPU also to 31, it is more or less clear that it rarely starts a second process. This will hamper your CPU utilisation but maybe not so much your compile time, since more cache hits will compensate for the idle time.
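
The load-average point can be sketched as a tiny model. It follows the emerge man page wording for --load-average (no new build starts while other builds are running and the load is at or above the limit); it is a simplified assumption, not Portage's actual scheduler code, and the function name is made up.

```python
def may_start_build(running: int, jobs_limit: int,
                    load_average: float, load_limit: float) -> bool:
    """Simplified model of whether emerge would start another package build."""
    if running >= jobs_limit:
        return False  # every --jobs slot is already in use
    if running == 0:
        return True   # the first build is not held back by the load check
    return load_average < load_limit


if __name__ == "__main__":
    # One build running with make -j31 already puts the load near 31.
    print(may_start_build(running=1, jobs_limit=30, load_average=31.0, load_limit=31.0))
    # A mostly serial build leaves the load low, so a second build may start.
    print(may_start_build(running=1, jobs_limit=30, load_average=8.0, load_limit=31.0))
```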

But also, compiling on only a 32G tmpfs sounds tight for bigger things like firefox/kde/chromium. What are you compiling? Could it be that you are saturating your internet connection, since you are downloading a lot of big things that don't need much compile time?

Also, llvm/clang are among the packages that usually keep my CPU from running at max. The compilation process doesn't seem very parallel, and a lot of other packages depend on it. So you cannot go on with compiling other packages.

Edit: my 100,0000th post!

logrusx · Advocate Joined: 22 Feb 2018 Posts: 2420

wanne32 · n00b Joined: 11 Nov 2023 Posts: 69

logrusx · Advocate Joined: 22 Feb 2018 Posts: 2420

eccerr0r · Posted: Fri Jul 19, 2024 6:56 pm Post subject:

Lol I was inspecting