a use model for ccache...to solve llvm-16 running out of RAM

eccerr0r

Argh.

It appears that building llvm-16 no longer fits in 2GB of RAM, probably because I wanted to use distcc. Once again llvm-tblgen is consuming all the RAM and causing the computer to thrash.

Well, what I was wondering: perhaps I should start using ccache and basically do this:

# enable ccache first

FEATURES=distcc MAKEOPTS="-j 32" emerge -1 llvm
# wait until it starts thrashing in llvm-tblgen, then abort with ^C
MAKEOPTS="-j1" emerge -1 llvm

Then the second run should pick up the pieces from ccache and, hopefully, actually finish.

It's unfortunate that this has to be done in two steps. I really wish the RAM-intensive or undistributable steps could automatically take a smaller -j option while the distributable segments get a large one.

The same problem applies to dev-lang/rust: a larger -j is good for compiling llvm at the beginning, as that part is distributable, but a lower -j is needed for the rustc portion later on, as rustc is not distributable.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

sam_ · Developer Joined: 14 Aug 2020 Posts: 1957

I really doubt the time saved is worth the manual babysitting you have to do. Just pick an appropriate MAKEOPTS for the package and set it via /etc/portage/package.env for larger software. Then go spend your time on something else.
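
A minimal sketch of that package.env setup, for illustration. The file name llvm-lowmem.conf, the -j2 value, and the CONF_ROOT scratch variable are all assumptions made for this example; on a real system the files live under /etc/portage, and the package atom should match whatever your tree calls llvm.

```shell
# Sketch only: write a per-package MAKEOPTS override into a scratch directory.
# CONF_ROOT is a hypothetical variable for this example; the real location is /etc/portage.
CONF_ROOT="${CONF_ROOT:-$(mktemp -d)}"
mkdir -p "$CONF_ROOT/env"

# Environment file holding the reduced job count (the value is an assumption, tune it).
printf 'MAKEOPTS="-j2"\n' > "$CONF_ROOT/env/llvm-lowmem.conf"

# Map the package to that environment file (appending keeps any existing entries).
printf 'sys-devel/llvm llvm-lowmem.conf\n' >> "$CONF_ROOT/package.env"

echo "wrote files under $CONF_ROOT"
```

Every other package keeps the global MAKEOPTS from make.conf; only the listed atom picks up the override.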

eccerr0r · Posted: Tue Oct 10, 2023 4:50 pm Post subject:

Yeah, it is a pain. I wish the build scripts would take the expected workload into account, specifically for processes that generate a load average over 1 per invocation, and feed that back to make/ninja/...

I just want to optimize completion time/decrease latency sometimes, for example when the thing you're emerging is necessary for something else, and that something may not even be in portage. Say a private project needs a new version of llvm ...

szatox · Advocate Joined: 27 Aug 2013 Posts: 3424

You could try limiting parallelism by load. I think it's -l in MAKEOPTS.
Hitting swap should make your load skyrocket due to processes waiting for disk I/O.
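
A minimal sketch of the load-limit idea, assuming GNU make's -l (--load-average) option. The job count and the choice of the local CPU count as the limit are placeholders to tune, not recommendations.

```shell
# Sketch only: build a MAKEOPTS string that caps parallelism by load average.
jobs=32                   # assumed upper bound, matching the -j 32 used earlier in the thread
load_limit="$(nproc)"     # assumed limit: roughly one runnable process per local CPU
MAKEOPTS="-j${jobs} -l${load_limit}"
echo "MAKEOPTS=\"${MAKEOPTS}\""
```

With -l, make holds off starting new jobs while the load average is above the limit; jobs that are already running are not affected.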

Another thing, emerge has a bunch of hooks. Perhaps there is one that could be used to restart a failed build with some options changed?

pingtoo · Posted: Tue Oct 10, 2023 7:36 pm Post subject:

eccerr0r · Posted: Tue Oct 10, 2023 7:44 pm Post subject:

Zucca · Posted: Tue Oct 10, 2023 7:59 pm Post subject:

sam_ · Developer Joined: 14 Aug 2020 Posts: 1957

See also bug 692576 and https://github.com/gentoo/portage/pull/913.

eccerr0r · Posted: Tue Oct 10, 2023 11:38 pm Post subject:

Currently I think the two classes of evil are (for distcc/portage)...

- All sorts of LTO. For some reason LTO thinks it's OK to spawn a whole bunch of jobs. It really needs to be stopped from doing that when make is the one submitting these jobs. Using -l with make will at least limit things, but not if one job spawns 16 sub-jobs. IIRC, every time I've seen an LTO build it spawned a whole bunch of jobs, and the load average ended up way higher than the -l specification.
- Anything that has mixed builds of some non-distcc'able language like rust plus C/C++. Things like librsvg and cbindgen are fine since they're pretty much all rust, so we can make a package.env for them, but rust itself (C++ for LLVM, and rust for the rest of rust) and things like firefox (a mixture of C++ and rust) are the problem cases.

I don't know if portage integration is really the problem. Using emerge --jobs is one thing; I mainly run it for faster merges of ebuilds that have large single-threaded portions compared to the compile portion, so some parallelism can be gained there, but that work is usually more disk I/O than CPU.

Hu · Administrator Joined: 06 Mar 2007 Posts: 22626

Exactly what command line options are you using to enable LTO? Depending on how you invoke it, it can interface with the GNU make jobserver to manage the maximum number of job tokens the LTO processing consumes.
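
For reference, a sketch of the forms the LTO flag can take, assuming a GCC toolchain (Clang/lld control LTO parallelism differently). The -O2 -pipe baseline is only an assumed example, not a suggestion for this particular machine.

```shell
# Sketch only, GCC-specific: ways to bound the parallelism of the LTO step.

# Take job tokens from GNU make's jobserver, so LTO stays within make's -j budget.
# GCC's documentation notes the calling recipe must expose the jobserver (e.g. a '+' prefix).
COMMON_FLAGS="-O2 -pipe -flto=jobserver"

# Alternatives (commented out):
# COMMON_FLAGS="-O2 -pipe -flto=auto"   # use the jobserver if available, else autodetect CPU threads
# COMMON_FLAGS="-O2 -pipe -flto=2"      # fixed cap of 2 parallel LTO jobs

echo "COMMON_FLAGS=\"${COMMON_FLAGS}\""
```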

eccerr0r · Posted: Wed Oct 11, 2023 1:01 am Post subject:

Not sure, I thought the default should deal with it?

I'm not even sure what llvm-tblgen is doing; it's the part of the llvm build that seems to be disobeying the load average for some reason.