Gentoo Forums
Is it possible to max out modern hardware with portage?
therik
n00b
Joined: 14 Jul 2024
Posts: 7
Posted: Sun Jul 14, 2024 8:26 pm

I wonder if I'm missing some settings, but I started the
Code:
emerge --ask --emptytree @world
as part of the switch to 23.0 profiles about 6 hours ago, about 1100 packages to compile.
After two hours of barely getting anywhere, I interrupted the run and edited make.conf:
Code:
FEATURES="parallel-fetch parallel-install -ebuild-locks -merge-wait"
MAKEOPTS="-j31 -l34"
EMERGE_DEFAULT_OPTS="--jobs=30 --load-average=31"

Specifically, I disabled ebuild-locks and merge-wait, the rest was the same.
Still, my 1 min load averages rarely got over 10 for the first few hours, most of the time hovering around 3-4, with overall CPU usage around 10-30%, ~10GiB of mem used and occasional blips on drive IO.

What is holding it up? Do I have some wrong settings somewhere? Is it just a line of sequential operations that can't be parallelized?
If so, what's the purpose of distcc, if we can't really parallelize even on a single system?

I then tried mounting /var/tmp/portage as a tmpfs and while the compilation did pick up a bit, it looks more like a coincidence. All the parallelism is still coming from makeopts and gcc, not from portage running many emerges at once.

I guess what I don't understand is that there are many warnings online about how makeopts -jX and --jobs can multiply each other, but I see emerge very rarely run more than 1 job at a time.

Typically, this is what I see:
Code:
.....
>>> Installing (191 of 470) net-libs/libproxy-0.5.5::gentoo
>>> Completed (191 of 470) net-libs/libproxy-0.5.5::gentoo
>>> Emerging (192 of 470) app-emacs/po-mode-0.22::gentoo
>>> Installing (192 of 470) app-emacs/po-mode-0.22::gentoo
>>> Completed (192 of 470) app-emacs/po-mode-0.22::gentoo
>>> Emerging (193 of 470) sys-devel/llvm-common-17.0.6::gentoo
>>> Jobs: 192 of 470 complete, 1 running            Load avg: 5.3, 10.1, 14.8

Code:

Gentoo /var/tmp/portage # df -h
Filesystem      Size  Used Avail Use% Mounted on
....
tmpfs            32G  1.5G   30G   5% /var/tmp/portage

Code:

Gentoo /var/tmp/portage # free -h
               total        used        free      shared  buff/cache   available
Mem:            62Gi        12Gi        25Gi       2.3Gi        28Gi        50Gi
Swap:           15Gi        15Mi        15Gi

pietinger
Moderator
Joined: 17 Oct 2006
Posts: 5104
Location: Bavaria
Posted: Sun Jul 14, 2024 9:44 pm

therik,

Welcome to Gentoo Forums ! :D

Please see this short article which explains why you will often see only one running compile job:
https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Optimize_compile_times#Using_EMERGE_DEFAULT_OPTS
_________________
https://wiki.gentoo.org/wiki/User:Pietinger

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA
Posted: Sun Jul 14, 2024 9:50 pm

Yeah, there is something called Amdahl's Law, and portage is stuck with a lot of unparallelizable code. There is also a lot of serialization in portage, which doesn't help.
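
As a rough illustration of Amdahl's Law mentioned above, here is a minimal sketch. The parallel fractions are made-up values chosen to show the shape of the curve; they are not measurements of Portage.

Code:
```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical fractions, for illustration only.
for p in (0.50, 0.90, 0.99):
    print(f"parallel fraction {p:.2f}, 32 workers: {amdahl_speedup(p, 32):.1f}x")
```

If half of the work is serial, 32 workers give a bit under 2x, which is the kind of ceiling being described here.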
Distcc is really only helpful on big projects like qtwebengine or chrome; otherwise it is actually somewhat of a burden. A lot of small ebuilds are kind of hindered by distcc.

I just made my E5-2690v2 build a kernel. 10 mins for kernel+modules. Not really that fast but it's not been that fast for me since the 2m 30sec kernel build times with my dual celeron-450 way back when (without modules so not apples vs apples).

distcc has been somewhat helpful however. Been trying to load down my E5-2690v2 and it has been taking a lot of distcc load and it couldn't be faster having the other machines compiling on their own.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

dimko
Apprentice
Joined: 12 Feb 2006
Posts: 201
Posted: Sun Jul 14, 2024 10:59 pm    Post subject: perhaps issue is not with portage?

I am not a programmer, but if I understand correctly, when you compile some programs, they require some libs to exist on the system so some functionality of said libs can be checked.
Which means there can be a situation where a bunch of packages are 'waiting', as per the requirement of the package, for that lib to be compiled. Now multiply by several such libs and you get misery.
And I suspect it's not so easily mathematically calculated, as it's a 'traveling salesman' problem.
_________________
Just a user.

logrusx
Advocate
Joined: 22 Feb 2018
Posts: 2418
Posted: Mon Jul 15, 2024 5:20 am

There are two things here:

  • as dimko has noted, there might not be enough jobs available. The rest may be waiting on a single dependency to be compiled.
  • /var/db/pkg cannot be written concurrently.
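
The first point can be sketched with a toy dependency graph. The package names and the graph are hypothetical, and the sketch assumes every ready package is built at once; it is not how Portage's scheduler is implemented.

Code:
```python
from graphlib import TopologicalSorter

# Hypothetical graph: each package maps to the packages it needs first.
deps = {
    "toolchain": set(),
    "libA": {"toolchain"},
    "libB": {"libA"},
    "app1": {"libB"},
    "app2": {"libB"},
    "app3": {"libB"},
}


def ready_per_round(graph):
    """List which packages become buildable in each round."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    rounds = []
    while sorter.is_active():
        ready = sorter.get_ready()      # everything whose dependencies are done
        rounds.append(sorted(ready))
        sorter.done(*ready)             # pretend the whole round finished
    return rounds


rounds = ready_per_round(deps)
for number, names in enumerate(rounds, 1):
    print(number, names)
```

In this graph the first three rounds contain a single package each, so no --jobs value could keep more than one build running until libB is done.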

It took me ~8 hours to emerge ~1600 packages using the binary packages host.

p.s. no, it's not the traveling salesperson problem, which is intractable. Also you should be careful what you point at when naming its complexity.

Best Regards,
Georgi

e8root
Tux's lil' helper
Joined: 09 Feb 2024
Posts: 94
Posted: Mon Jul 15, 2024 6:18 am

It should be obvious you cannot parallelize compilation jobs infinitely because a lot of packages depend on other packages.
What you could do (instead of asking "What is holding it up?" questions...) is to double-check if Portage is doing what it is supposed to be doing correctly. I mean obviously this is the kind of issue which in theory is simple but if you want to make it optimal it can get very complex very fast. Maybe Portage is too conservative to the point of not utilizing all resources optimally but maybe that is the only sane way to do it - and what I would assume is the case.

In fact it should be obvious Portage cannot be 100% optimal, from the simple fact that you cannot know beforehand how a given build will load the system. Not to mention each system is different, with different bottlenecks and settings. You will for example get much lower average CPU load when using LTO than without, etc. It is not even that obvious it would be that beneficial to run some tasks in parallel if you could - though I guess Portage doesn't care and it will run things in parallel if it can and has enough "jobs" slots.

As for distcc... when you add more computers with local memory, local storage etc. the whole optimization problem becomes much more complex. Especially when those computers have different performance. I never used distcc so I don't know how it actually schedules things but I can imagine throwing slower computers to the mix as some sort of helper resources can be very detrimental to performance if long-running task is executed on these slower machines e.g. LTO linking. Not an issue when all computers in the distcc network have identical or very similar performance but if not there might be serious bottlenecks.

All in all I would assume less than 100% CPU utilization cannot be helped on anything that isn't single core. And even then you would get I/O and network bottlenecks. I don't see any way around this issue.

Also - your settings are wrong.
Sure, 2GB per compilation job is an unrealistic value, but 30 jobs times 31 threads each is in theory 930 running processes. Even with a fraction of the recommended 2GB of memory per process, it could overwhelm your computer (if it were possible to run that many package builds in parallel). In that case it would start to stutter heavily, and with just 16GB of swap things would start crashing. So... since it doesn't work as you expected anyway, it is probably best to reduce --jobs to a more reasonable value just to be on the safe side.
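
The worst-case arithmetic can be written out as below. It uses the --jobs and -j values from the first post and the totals from the free -h output; the 2 GiB per process figure is the rule of thumb mentioned above, not a measurement, and the sketch ignores the load-average limits that would throttle things well before this point.

Code:
```python
def worst_case(emerge_jobs: int, make_jobs: int, gib_per_process: float):
    """Return (process count, memory in GiB) if every slot were busy at once."""
    processes = emerge_jobs * make_jobs
    return processes, processes * gib_per_process


# --jobs=30 and MAKEOPTS -j31 from the first post; 2 GiB is a rule of thumb.
processes, needed_gib = worst_case(emerge_jobs=30, make_jobs=31, gib_per_process=2)

# "Mem: total" and "Swap: total" from the free -h output, rounded to GiB.
available_gib = 62 + 15

print(f"{processes} processes, {needed_gib} GiB needed, {available_gib} GiB available")
```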
_________________
Unix Wars - Episode V: AT&T Strikes Back

pjp
Administrator
Joined: 16 Apr 2002
Posts: 20485
Posted: Mon Jul 15, 2024 7:11 pm

logrusx wrote:
It took me ~8 hours to emerge ~1600 packages using the binary packages host.
That seems horribly slow. Did you also compile a lot of packages?
_________________
Quis separabit? Quo animo?

logrusx
Advocate
Joined: 22 Feb 2018
Posts: 2418
Posted: Mon Jul 15, 2024 7:41 pm

pjp wrote:
logrusx wrote:
It took me ~8 hours to emerge ~1600 packages using the binary packages host.
That seems horribly slow. Did you also compile a lot of packages?


I remember one big package in particular but I don't think it took more than an hour. I was surprised by the time it took too. But most of the time no parallel jobs were running and many were waiting on the lock of /var/db/pkg. I didn't notice a big difference between larger and smaller binary packages, so I think it was mostly portage doing the merging stuff.

EDIT: just checked, it was 6:25 hours. Most of the time I used the computer, but it wasn't under load anyway.

Best Regards,
Georgi

pjp
Administrator
Joined: 16 Apr 2002
Posts: 20485
Posted: Mon Jul 15, 2024 8:28 pm

That still seems beyond reasonably slow. Do other binary installs take ~4hrs? If so, a LOT has changed since I last installed one in ~2012.
_________________
Quis separabit? Quo animo?

logrusx
Advocate
Joined: 22 Feb 2018
Posts: 2418
Posted: Mon Jul 15, 2024 9:05 pm

Well, I remember a thread where someone was asking why emerging a virtual took 10 times or even more time than it used to some years before, so I guess a lot has changed.

Tomorrow I'll see if I can derive an average of how long it takes to merge a binary package.

Best Regards,
Georgi

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA
Posted: Mon Jul 15, 2024 11:35 pm

# time emerge -1 virtual/ssh # (note: real time is higher than timestamps below because dependency computation is not timestamped. Also this is a 32-bit KVM on a Core2 Quad.)
real 0m34.434s
user 0m19.483s
sys 0m7.333s

Mon Jul 15 17:42:23 MDT 2024 virtual/ssh clean
Mon Jul 15 17:42:24 MDT 2024 virtual/ssh setup
Mon Jul 15 17:42:28 MDT 2024 virtual/ssh install
Mon Jul 15 17:42:29 MDT 2024 virtual/ssh
Mon Jul 15 17:42:32 MDT 2024 virtual/ssh instprep
Mon Jul 15 17:42:33 MDT 2024 virtual/ssh
Mon Jul 15 17:42:33 MDT 2024 virtual/ssh preinst
Mon Jul 15 17:42:34 MDT 2024 virtual/ssh
Mon Jul 15 17:42:36 MDT 2024 virtual/ssh prerm
Mon Jul 15 17:42:37 MDT 2024 virtual/ssh postrm
Mon Jul 15 17:42:37 MDT 2024 virtual/ssh cleanrm
Mon Jul 15 17:42:38 MDT 2024 virtual/ssh postinst
Mon Jul 15 17:42:39 MDT 2024 virtual/ssh
Mon Jul 15 17:42:40 MDT 2024 virtual/ssh
Mon Jul 15 17:42:41 MDT 2024 virtual/ssh clean

hmm... no really really bad steps but all take a chunk out of the pie. However if there were 600 packages on the system and all were as fast as virtuals, at 3 packages per minute, it would take over 3 hours to install on this 32 bit KVM on a Core2Quad (64-bit)... which sounds really bad because there's more to it than virtual packages.
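
The timestamps above can be turned into per-entry gaps with a short script. This is only a sketch: it assumes each gap belongs to the entry that precedes it (the post does not say whether a timestamp marks the start or the end of a phase), it ignores the time zone token, and it assumes the run does not cross midnight.

Code:
```python
# Timestamps copied from the log extract above.
LOG = """\
Mon Jul 15 17:42:23 MDT 2024 virtual/ssh clean
Mon Jul 15 17:42:24 MDT 2024 virtual/ssh setup
Mon Jul 15 17:42:28 MDT 2024 virtual/ssh install
Mon Jul 15 17:42:29 MDT 2024 virtual/ssh
Mon Jul 15 17:42:32 MDT 2024 virtual/ssh instprep
Mon Jul 15 17:42:33 MDT 2024 virtual/ssh
Mon Jul 15 17:42:33 MDT 2024 virtual/ssh preinst
Mon Jul 15 17:42:34 MDT 2024 virtual/ssh
Mon Jul 15 17:42:36 MDT 2024 virtual/ssh prerm
Mon Jul 15 17:42:37 MDT 2024 virtual/ssh postrm
Mon Jul 15 17:42:37 MDT 2024 virtual/ssh cleanrm
Mon Jul 15 17:42:38 MDT 2024 virtual/ssh postinst
Mon Jul 15 17:42:39 MDT 2024 virtual/ssh
Mon Jul 15 17:42:40 MDT 2024 virtual/ssh
Mon Jul 15 17:42:41 MDT 2024 virtual/ssh clean
"""


def seconds_of_day(line):
    """Return the HH:MM:SS field of a log line as seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in line.split()[3].split(":"))
    return hours * 3600 + minutes * 60 + seconds


def phase_gaps(log_text):
    """Pair each entry's phase label with the seconds until the next entry."""
    lines = [ln for ln in log_text.splitlines() if ln.strip()]
    gaps = []
    for current, following in zip(lines, lines[1:]):
        fields = current.split()
        phase = fields[7] if len(fields) > 7 else "(no label)"
        gaps.append((phase, seconds_of_day(following) - seconds_of_day(current)))
    return gaps


gaps = phase_gaps(LOG)
total = sum(seconds for _, seconds in gaps)
for phase, seconds in gaps:
    print(f"{phase:<12}{seconds:>3}s")
print(f"{'total':<12}{total:>3}s")
```

The logged entries span 18 seconds, against the 34 seconds of wall-clock time reported by time, which fits the note that dependency computation is not timestamped.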
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

logrusx
Advocate
Joined: 22 Feb 2018
Posts: 2418
Posted: Tue Jul 16, 2024 5:55 am

Ah it was your thread! I lost track of it however.

It takes 10 seconds for a virtual on my system. And if I remember correctly it was taking at least 15 seconds for a binary package. I don't feel keen on trying to parse emerge.log to extract averages, so I leave it at that. But emerging binary packages is not as fast as one might expect. Portage is still doing a lot of work. And it can't be done in parallel because of the lock on /var/db/pkg. Maybe there are ways to improve on that, but that would increase the volume of information that might be lost during an unexpected termination of emerge.
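
To show what a single lock does to parallel workers, here is a toy model. It uses an in-process threading.Lock as a stand-in for the package database lock and a short sleep as a stand-in for the merge work; none of it is taken from Portage's code.

Code:
```python
import threading
import time

db_lock = threading.Lock()        # stand-in for the single database lock
counter_lock = threading.Lock()   # protects the two counters below
inside = 0                        # workers currently "merging"
max_inside = 0                    # highest value of `inside` ever seen


def merge_package(name: str) -> None:
    """Pretend to merge one package while holding the shared lock."""
    global inside, max_inside
    with db_lock:                 # only one worker gets past this at a time
        with counter_lock:
            inside += 1
            max_inside = max(max_inside, inside)
        time.sleep(0.01)          # pretend to update the database
        with counter_lock:
            inside -= 1


threads = [threading.Thread(target=merge_package, args=(f"pkg{i}",)) for i in range(8)]
start = time.monotonic()
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
elapsed = time.monotonic() - start

print(f"max workers inside the lock: {max_inside}, elapsed: {elapsed:.2f}s")
```

Eight workers are started together, yet the locked section runs one at a time, so the total time is roughly the sum of the individual merges rather than the longest one.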

Best Regards,
Georgi

therik
n00b
Joined: 14 Jul 2024
Posts: 7
Posted: Tue Jul 16, 2024 12:59 pm

logrusx wrote:
pjp wrote:
logrusx wrote:
It took me ~8 hours to emerge ~1600 packages using the binary packages host.
That seems horribly slow. Did you also compile a lot of packages?


I remember one big package in particular but I don't think it took more than an hour. I was surprised by the time it took too. But most of the time no parallel jobs were running and many were waiting on the lock of /var/db/pkg. I didn't notice a big difference between larger and smaller binary packages, so I think it was mostly portage doing the merging stuff.

EDIT: just checked, it was 6:25 hours. Most of the time I used the computer, but it wasn't under load anyway.

Best Regards,
Georgi


This is pretty much what I saw: a lot of waiting, 1 package at a time, most of the system being almost idle. The compilation itself was rarely the bottleneck.

What's different about portage in this regard? Other distros seem to run through it much faster.

pingtoo
Veteran
Joined: 10 Sep 2021
Posts: 1245
Location: Richmond Hill, Canada
Posted: Tue Jul 16, 2024 1:34 pm

therik wrote:
What's different about portage in this regard? Other distros seem to run through it much faster.

Because the package manager was written in Python? The package database is just a plain file? Just joking :lol:

Seriously, since so many have joined the conversation, it must mean there is great interest in this topic. Can someone devise a plan, or share some knowledge on how to profile a typical emerge run? I am willing to put in my time for such an effort but I'm not sure where to start.

logrusx
Advocate
Joined: 22 Feb 2018
Posts: 2418
Posted: Tue Jul 16, 2024 1:50 pm

therik wrote:

What's different about portage in this regard? Other distros seem to run through it much faster.


I guess portage was never meant to be fast. I mean, you're waiting for all those compile jobs to finish, so what does it matter how fast portage is? Also, portage is old. Back in the days of HDDs that could barely have been noticed. And last but not least, as pingtoo pointed out, the DB is plain text files. Files do not support concurrent access, and concurrent access poses much greater threats to data integrity.

Other distributions run much faster, but how often do you hear of broken updates? At least back when I was into other distributions, I couldn't work out for myself why there were so many upgrade breakages. Now I know why portage breaks so rarely, and it's exactly the fact that it's doing it slowly but safely.

And that's not the typical case either.

You either emerge a small amount of packages or wait for compile jobs, so it's not noticeable.

And one emptytree emerge as part of a migration is not something that's worth the time invested into improving the time of emerging binary packages. You'll forget about it soon enough.

@pingtoo, look at the sentence above.

Best Regards,
Georgi

pingtoo
Veteran
Joined: 10 Sep 2021
Posts: 1245
Location: Richmond Hill, Canada
Posted: Tue Jul 16, 2024 2:36 pm

logrusx,

Understood, and I completely agree with your point.

In fact, in my Gentoo practice, I don't update my system until I have a new need. Or I will just build a new image from scratch when I feel my system is too out of date. Unlike some who update frequently, I am not concerned with how up to date each application is, but with whether the application serves my need. If an application functions correctly and I don't need a new feature, I see no reason to update. I see security not as making each individual point secure; I lean more toward making sure the outermost layer is secure.

My idea of a study is purely academic. The questions are: can it be made even faster? Is there a bottleneck? Can the bottleneck be overcome? Can it be done without a complete rewrite?

logrusx
Advocate
Joined: 22 Feb 2018
Posts: 2418
Posted: Tue Jul 16, 2024 3:41 pm

pingtoo wrote:

My idea of a study is purely academic. The questions are: can it be made even faster? Is there a bottleneck? Can the bottleneck be overcome? Can it be done without a complete rewrite?


I believe this theme has come up many times through the years and I think the answer is no, considering the current state of portage. Also I have very strong trust we have very good developers working on it and I wouldn't even attempt to improve what they've done. If I knew Python I would dig into it, but I don't feel very keen on learning it, so... :)

Best Regards,
Georgi

Genone
Retired Dev
Joined: 14 Mar 2003
Posts: 9608
Location: beyond the rim
Posted: Thu Jul 18, 2024 11:30 am

pingtoo wrote:
My idea of a study is purely academic. The questions are: can it be made even faster? Is there a bottleneck? Can the bottleneck be overcome? Can it be done without a complete rewrite?


You certainly wouldn't need a complete rewrite, but you'd probably have to sacrifice something (flexibility, stability, compatibility, maintainability, ...). E.g. if the mentioned /var/db/pkg lock is a major obstacle (relatively speaking) you could remove or redesign it to be more granular, but that might increase the risk of data corruption or other hard to identify problems later on. Similar reason why python has the GIL to this day despite numerous attempts to get rid of it.

As for identifying the potential bottleneck, going by the emerge.log extract above it would seem that there is a significant amount of time spent in phase setup and teardown. So not really doing anything, but just checks and cleanup that may or may not be necessary in the majority of cases, but have to be done anyway for safety. Another idea would be to add logic to identify which phases actually need to be executed for a given ebuild (in particular virtuals and the like) and outright skip the rest, but that is much more complex and error-prone than you'd expect.

pingtoo
Veteran
Joined: 10 Sep 2021
Posts: 1245
Location: Richmond Hill, Canada
Posted: Thu Jul 18, 2024 2:13 pm

Genone wrote:
pingtoo wrote:
My idea of a study is purely academic. The questions are: can it be made even faster? Is there a bottleneck? Can the bottleneck be overcome? Can it be done without a complete rewrite?


You certainly wouldn't need a complete rewrite, but you'd probably have to sacrifice something (flexibility, stability, compatibility, maintainability, ...). E.g. if the mentioned /var/db/pkg lock is a major obstacle (relatively speaking) you could remove or redesign it to be more granular, but that might increase the risk of data corruption or other hard to identify problems later on. Similar reason why python has the GIL to this day despite numerous attempts to get rid of it.

As for identifying the potential bottleneck, going by the emerge.log extract above it would seem that there is a significant amount of time spent in phase setup and teardown. So not really doing anything, but just checks and cleanup that may or may not be necessary in the majority of cases, but have to be done anyway for safety. Another idea would be to add logic to identify which phases actually need to be executed for a given ebuild (in particular virtuals and the like) and outright skip the rest, but that is much more complex and error-prone than you'd expect.


Thank you very much for your input.

This is exactly what I am trying to identify, except I would like to have a defined way (some sort of profiling tool) that would give out this information for a typical run on a given system.

Being able to quantify the effort is one of the key things I am thinking about, because only then do we know whether it is worth spending. For example, suppose we found (with some magic effort) a solution to one bottleneck, yet that implementation called for a big upgrade on the user's end. If the gain only shows up on a relatively powerful machine, then it may not be worth it, because the gain as a whole (to the Gentoo community) is small.

Of course the above is just an assumption, and that is why I was thinking some effort needs to be spent on profiling a typical run, in order to learn whether any effort should be spent on it.

Come to think of it, would a central database along the lines of the Portage File List concept be good? It could collect machine type, number of packages built in a given run, time spent in each phase, etc...

hoegger
n00b
Joined: 06 Apr 2008
Posts: 50
Posted: Thu Jul 18, 2024 9:39 pm

On a 22-core Xeon E5 with 128 GB RAM I have observed 44 threads fluctuating around 90% while emerging chromium.
While emerging opencv, most of the cores did barely anything.

wanne32
n00b
Joined: 11 Nov 2023
Posts: 69
Posted: Fri Jul 19, 2024 1:27 pm

Usually the reaction times of the disks are the problem. Not so much throughput. Also, RAM is today often a much bigger bottleneck than the CPU. This is the reason why compiling on a modern M2 Pro Mac is often faster by a decent factor than on an old 2X-core DDR4/SATA server, since the server's RAM has at least 2 times longer access times and its throughput is often smaller by an even bigger factor. SATA disks do much worse. So you are often maxing out your hardware. It's just the memory and not the CPU that is at 100%.
But since waiting for tmpfs (RAM) counts as CPU usage on most systems, you should still see your CPU at 100%. How do you measure overall CPU usage? I usually get >>90% CPU usage when I am using tmpfs. If you set your load average to 31 and your parallel processes per CPU also to 31, it is more or less clear that it rarely starts a second process. This will hamper your CPU utilisation but maybe not so much your compile time, since more cache hits will compensate for the idle time.
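
The interaction between the load-average limit and the job limit can be sketched with a toy admission rule. The function below is a simplified model written for this illustration, not Portage's scheduler; the function name and parameters are hypothetical, and it assumes a new job is refused whenever the 1-minute load average is at or above the limit.

Code:
```python
import os


def may_start_another_job(running_jobs, max_jobs, load_limit, current_load=None):
    """Toy rule: start a job only if a slot is free and the load is below the limit."""
    if current_load is None:
        current_load = os.getloadavg()[0]   # 1-minute load average (Unix only)
    if running_jobs == 0:
        return True                         # assumption: never sit completely idle
    return running_jobs < max_jobs and current_load < load_limit


# One build running with make -j31 and the load already at the limit:
print(may_start_another_job(1, max_jobs=30, load_limit=31.0, current_load=31.5))
# The load of 5.3 shown in the emerge output of the first post:
print(may_start_another_job(1, max_jobs=30, load_limit=31.0, current_load=5.3))
```

Under this model, a single package that keeps 31 compiler processes busy can hold the load at the limit on its own, so the remaining 29 job slots stay empty. With the low loads reported in the first post, the load limit would not be what blocks a second job.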

But also, compiling on only 32G of tmpfs sounds tight for bigger things like firefox/kde/chromium. What are you compiling? Can it be that you are saturating your internet connection, since you are compiling a lot of big things that don't need much compile time?

Also, llvm/clang are among the packages that usually stop my CPU from running at max. The compilation process doesn't seem very parallel, and a lot of other packages depend on it. So you cannot go on with compiling other packages.

Edit: my 100,0000th post!

logrusx
Advocate
Joined: 22 Feb 2018
Posts: 2418
Posted: Fri Jul 19, 2024 2:16 pm

wanne32 wrote:
Usually the reaction times of the disks are the Problem.


I'll only address this, as the rest is completely irrelevant. First, most of it happens in memory buffers, so that makes it redundant to use tmpfs if you have a decent amount of memory. Second, modern SSDs are very fast.

We're talking about binary packages here, so no compilation happens. That's why most of your post is irrelevant.

Best Regards,
Georgi

wanne32
n00b
Joined: 11 Nov 2023
Posts: 69
Posted: Fri Jul 19, 2024 4:37 pm

logrusx wrote:
First, most of it happens in memory buffers, so that makes it redundant to use tmpfs if you have decent amount of memory.
First and foremost: they don't solve much, since make waits until data is written to disk. So only reads get a speedup. Just try it: compile llvm or similar on disk and on tmpfs. Only really big things like compiling chromium with -O3 won't profit that much. But even there the speedups are considerable.
Quote:
Second, modern SSD's are very fast.
Usual modern SATA SSDs have access times of 0.11ms or 110,000ns. Usual access times for memory are 11ns. So just 10,000 times slower, so yes, very fast!
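
The unit conversion behind this comparison, written out as a quick check. The 0.11 ms and 11 ns figures are the ones given in the post, not measurements.

Code:
```python
NS_PER_MS = 1_000_000     # nanoseconds in one millisecond

ssd_access_ms = 0.11      # SATA SSD access time as given in the post
ram_access_ns = 11        # memory access time as given in the post

ssd_access_ns = ssd_access_ms * NS_PER_MS
ratio = ssd_access_ns / ram_access_ns

print(f"SSD: {ssd_access_ns:,.0f} ns, RAM: {ram_access_ns} ns, ratio: {ratio:,.0f}x")
```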
Quote:
We're talking about binary packages here
Where do you see that?

@therik: you are not in an container/VM or similar?

logrusx
Advocate
Joined: 22 Feb 2018
Posts: 2418
Posted: Fri Jul 19, 2024 4:50 pm

wanne32 wrote:

Quote:
We're talking about binary packages here
Where do you see that?


Read the whole discussion, don't just drop in. The rest I'm not commenting on, you should know what is relevant and what significant.

Best Regards,
Georgi

eccerr0r
Watchman
Joined: 01 Jul 2004
Posts: 9824
Location: almost Mile High in the USA
Posted: Fri Jul 19, 2024 6:56 pm

Lol I was inspecting
Code:
# qlop -mvt virtual/*

and I notice
Code:
2023-02-25T21:23:15 >>> virtual/editor-0-r4: 19'19"

I don't think it really took 20 minutes, but it was probably waiting on a lock for 19 minutes...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

All times are GMT
Page 1 of 2