Author |
Message |
L1NTHALO n00b
Joined: 27 Aug 2024 Posts: 33
|
Posted: Wed Oct 30, 2024 11:41 am Post subject: Do aggressive optimizations make sense? |
|
|
There are many options to "optimize" gentoo and I've been wondering which one makes the most sense. You hear a lot of different advice, so it would be great to hear from someone who knows
I myself am on the extreme end of aggressive optimizations. I have set -march=native, -pipe, -flto, -O3, -fgraphite-identity, -floop-nest-optimize, -fno-semantic-interposition, -fipa-pta and -fdevirtualize-at-ltrans globally, enabled the lto, pgo and graphite USE flags, and compiled my kernel with Clang ThinLTO.
The question is does this even make sense? I've heard that O3 is not necessarily always beneficial and that you should only use lto on packages that have the use flag. Am I doing more harm than good by setting all these flags? What about graphite, does it make sense to enable globally?
Would it be better to just use safe CFLAGS and maybe graphite globally and only set lto for packages that support it? What about Kernel LTO?
Does it all even make a difference? |
szatox Advocate
Joined: 27 Aug 2013 Posts: 3477
|
Posted: Wed Oct 30, 2024 12:02 pm Post subject: |
|
|
Quote: | You hear a lot of different advice, so it would be great to hear from someone who knows | This is because people have a lot of different needs, purposes, and priorities.
Optimization is about trade-offs: you sacrifice a thing you don't care about to maximize something of greater value to you.
-O0 is optimal when going for short compilation time.
Back in the days when games would be printed in magazines, when you'd have to type the code into your computer before you could play, optimization meant minimizing the size of the source code.
Inlining function calls speeds up execution by reducing the number of jumps, but increases the size of the executable.
Basically, figure out what you want first, and THEN think about how to get there. In many cases just accepting the defaults is optimal, because it saves you time you'd waste on research. _________________ Make Computing Fun Again |
eschwartz Developer
Joined: 29 Oct 2023 Posts: 240
|
Posted: Wed Oct 30, 2024 3:31 pm Post subject: |
|
|
L1NTHALO wrote: | There are many options to "optimize" gentoo and I've been wondering which one makes the most sense. You hear a lot of different advice, so it would be great to hear from someone who knows their shit.
I myself am on the extreme end of aggressive optimizations. I have set -march=native, -pipe, -flto, -O3, -fgraphite-identity, -floop-nest-optimize, -fno-semantic-interposition, -fipa-pta and -fdevirtualize-at-ltrans globally, enabled lto, pgo and graphite USE Flags and compiled my kernel with clang ThinLTO.
The question is does this even make sense? I've heard that O3 is not necessarily always beneficial |
-O3 is less commonly used, so compiler developers have less commonly tested it and it is possible you may be the first person to try something and hit a bug in that thing. It's unlikely to be an issue though.
Aside from that, -O3 enables more aggressive optimizations. Many optimizations are a gamble with reasonably determined stakes: doing one thing instead of something else is *usually* faster, and is therefore worth always doing because it's an average win. It's entirely possible to use -O2 and get slower code than -O0 in extremely carefully rigged scenarios -- but it is doubtful. In practice you will always win because of averages. Is -O3 a bit less of a clear win? Maybe? Cost-benefit analyses are fiddly. It's probably fine. You are unlikely to end up with slower programs anyway.
L1NTHALO wrote: | and that you should only use lto on packages that have the use flag. |
The reverse is true. Gentoo is systematically removing USE=lto from packages that do have it in favor of expecting -flto to be in your *FLAGS variables. The expectation is that packages which fail to build with LTO enabled in your flags, should be reported as a bug -- many such bugs are already reported at https://bugs.gentoo.org/show_bug.cgi?id=lto
There are still 100+ bugs opened for LTO issues, but there are a total of 1143 if you include the ones which were closed as fixed so we are doing pretty good. Eventually they will all be fixed. In many cases that means using "filter-lto" in the ebuild to modify the *FLAGS from make.conf to not include -flto when it doesn't work.
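To make the idea of filtering concrete, here is a rough shell sketch of what stripping -flto from user flags amounts to. The function name `strip_lto` is hypothetical and this is not the flag-o-matic implementation; the real "filter-lto" helper is called from within an ebuild and may cover more flags than the two patterns matched here.

```shell
#!/bin/bash
# Illustrative stand-in only: strip_lto is a made-up name, not a Portage API.
# It drops -flto and -flto=<value> from a space-separated flag string.
strip_lto() {
    local out="" flag
    for flag in $1; do              # intentional word splitting on spaces
        case "${flag}" in
            -flto|-flto=*) ;;       # skip LTO flags
            *) out="${out:+${out} }${flag}" ;;
        esac
    done
    printf '%s\n' "${out}"
}

# Flags as a user might have them in make.conf (assumed example values)
CFLAGS="-march=native -O3 -flto -pipe"
CFLAGS="$(strip_lto "${CFLAGS}")"
echo "${CFLAGS}"    # prints: -march=native -O3 -pipe
```

The point is only that the user's global flags stay untouched; the package that cannot cope with LTO sees a reduced set at build time.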
L1NTHALO wrote: | Am I doing more harm than good by setting all these flags? What about graphite, does it make sense to enable globally? |
graphite is not well maintained in GCC and it will likely end up removed entirely one of these days (as the optimizations it was supposed to make are just being implemented via other methods anyway). It's not necessarily a bug to use it but its *benefits* are rather doubtful. |
L1NTHALO n00b
Joined: 27 Aug 2024 Posts: 33
|
Posted: Wed Oct 30, 2024 3:46 pm Post subject: |
|
|
So if I don't care about potential bugs (which I don't) -O3 and -flto should be a win on average right?
What about the other optimizations related to lto (-fdevirtualize-at-ltrans -fipa-pta)? They are bundled in with the gentooLTO project. |
eschwartz Developer
Joined: 29 Oct 2023 Posts: 240
|
Posted: Wed Oct 30, 2024 4:04 pm Post subject: |
|
|
L1NTHALO wrote: | So if I don't care about potential bugs (which I don't) -O3 and -flto should be a win on average right?
What about the other optimizations related to lto (-fdevirtualize-at-ltrans -fipa-pta)? They are bundled in with the gentooLTO project. |
-fipa-pta is abandoned, does not scale, needs a major redesign, and is horribly prone to making the compiler itself segfault with an Internal Compiler Error.
-fdevirtualize-at-ltrans is mainly good for being buggy and also really slow...
The gentooLTO project is effectively dead and has been reimplemented from scratch in mainline Gentoo. I don't think their recommendations were ever rigorously researched.
Anyways you can use all these flags if you like but if you don't want to spend your time potentially reporting miscompiles to GCC itself, I would stick with -march=native -O3 -flto.
Note also w.r.t. -pipe: that isn't an optimization flag. It is part of the default CFLAGS because it makes the compiler faster at passing data between its different phases, and it should only be left out when part of your toolchain is not from GNU and is incapable of reading from pipes. It's a pretty exotic failure mode. |
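As a minimal sketch, the conservative set suggested in this post could be written in /etc/portage/make.conf roughly like this. The COMMON_FLAGS layout follows the convention of Gentoo's stock make.conf; the exact values are an assumption based on the flags named above, and anything about LDFLAGS is deliberately left out.

```shell
# /etc/portage/make.conf (fragment, sketch only)
# -march=native -O3 -flto are the flags recommended in the post;
# -pipe is kept because it is part of the default CFLAGS.
COMMON_FLAGS="-march=native -O3 -pipe -flto"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
FCFLAGS="${COMMON_FLAGS}"
FFLAGS="${COMMON_FLAGS}"
```

The graphite, -fipa-pta and -fdevirtualize-at-ltrans flags are simply absent here, which matches the advice to drop them unless you want to chase compiler bugs.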
L1NTHALO n00b
Joined: 27 Aug 2024 Posts: 33
|
Posted: Wed Oct 30, 2024 4:11 pm Post subject: |
|
|
Thank you very much! So much amazing info.
One last question: where would one be informed about such developments (e.g. graphite being dead, gentoo trying to remove USE=lto)? Is there any news platform or is it just part of being a developer? |
eschwartz Developer
Joined: 29 Oct 2023 Posts: 240
|
Posted: Thu Oct 31, 2024 4:04 pm Post subject: |
|
|
Kind of part of being a developer. For example, you can hear a lot of interesting stories about the state of compiler optimization passes if you hang out with Gentoo's toolchain team (several of whom are also upstream developers for gcc, glibc, binutils...) |
CaptainBlood Advocate
Joined: 24 Jan 2010 Posts: 3978
|
Posted: Thu Oct 31, 2024 4:32 pm Post subject: |
|
|
@eschwartz
Nice sum up.
Maybe it's time to remove sys-devel/gcc USE=graphite here...
I removed it from CFLAGS years ago.
How to cancel ebuild lto filtering
Thks 4 ur attention, interest & support. _________________ USE="-* ..." in /etc/portage/make.conf here, i.e. a countermeasure to portage implicit braces, belt & diaper paradigm
LT: "I've been doing a passable imitation of the Fontana di Trevi, except my medium is mucus. Sooo much mucus. " |