pjp Administrator
Joined: 16 Apr 2002 Posts: 20485
|
Posted: Sun Mar 31, 2019 12:13 am Post subject: Build and run mixed native and generic binaries? |
|
|
Yet another post related to building binaries. But after reading a lot of posts on the subject (thanks NeddySeagoon, Hu and others!), I still didn't come across answers to these questions.
I'd like to not do any build work on my laptop. And I'd prefer to keep some features of the CPU, such as acpi. At least I'm guessing that would be useful for a laptop? I wasn't able to identify what it did, but would guess "power saving" for the CPU if nothing else. Besides, that's just one obvious example.
I've mostly eliminated distcc as not a good candidate, primarily due to exclusive use of wireless with the laptop. Having to park the laptop away with a cable is not particularly helpful.
A last resort would be building generic binaries in chroots, probably one per host (X, no X, etc).
Laptop: Intel, currently -march=native.
Build host: AMD Phenom, also -march=native.
First question, as in the title, can the AMD system build generic binaries for the laptop and have them co-exist with the laptop's native binaries? This doesn't solve the "no build work on the laptop" part, but I could at least offload the worst offenders. An alternate might be mostly generic binaries, except for those needing features such as acpi?
Second, would be whether or not crossdev would work to build native binaries for the laptop. If I understand correctly, then I'd probably have to do some of that with qemu? Related to that process, how readily can this be automated? From what I've read, not everything will build, and some of that will require using qemu to compile anything that failed. And if that is possible and reasonable, there remains the issue of some packages contaminating the host's files. That seems particularly worrisome.
A third and final question is mostly educational and has to do with identifying the common set of CPU instructions.
native appears to identify as broadwell:
Code: | $ gcc -### -E - -march=native 2>&1 | sed -r '/cc1/!d;s/(")|(^.* - )//g'
-march=broadwell -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -msgx -mbmi2 -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave -mxsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mclflushopt -mxsavec -mxsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-avx5124fmaps -mno-avx5124vnniw -mno-clwb -mno-mwaitx -mno-clzero -mno-pku -mno-rdpid --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=3072 -mtune=generic |
However, this comparison -- perhaps bogus -- of native to broadwell, seems to show some features disabled with broadwell? Which would seem to indicate that native != broadwell. What I was looking for here was a way to specify a greatest common arch (such as broadwell) and then add or subtract only a few specific instructions. This seemed like a potentially cleaner solution than listing every specific instruction in common (I also have a slightly older AMD than the previously mentioned Phenom). Code: | $ diff <(gcc -march=native -Q --help=target) <(gcc -march=broadwell -Q --help=target)
12c12
< -mabm [enabled]
---
> -mabm [disabled]
28,29c28,29
< -mavx256-split-unaligned-load [enabled]
< -mavx256-split-unaligned-store [enabled]
---
> -mavx256-split-unaligned-load [disabled]
> -mavx256-split-unaligned-store [disabled]
47c47
< -mclflushopt [enabled]
---
> -mclflushopt [disabled]
126c126
< -msgx [enabled]
---
> -msgx [disabled]
150c150
< -mtune= generic
---
< -mxsavec [enabled]
---
> -mxsavec [disabled]
160c160
< -mxsaves [enabled]
---
> -mxsaves [disabled] |
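One rough way to find the common set is to dump the expanded -march=native flags on each machine and intersect the two lists. The following is only a minimal sketch under stated assumptions: common_flags is a hypothetical helper, and the two flag lists are shortened, made-up stand-ins for the real output of the gcc -### command shown earlier.

```shell
# Hypothetical helper: print the -m flags that appear in BOTH
# space-separated lists, one per line, sorted.
common_flags() {
    {
        # De-duplicate each list on its own first, so that a flag can only
        # show up twice in the combined stream if both machines have it.
        printf '%s\n' "$1" | tr -s ' ' '\n' | LC_ALL=C sort -u
        printf '%s\n' "$2" | tr -s ' ' '\n' | LC_ALL=C sort -u
    } | grep '^-m' | LC_ALL=C sort | uniq -d
}

# Shortened, illustrative lists (assumptions, not real measurements).
laptop_flags="-march=broadwell -mmmx -msse -msse2 -mavx2 -mpopcnt -mno-sse4a"
phenom_flags="-march=amdfam10 -mmmx -m3dnow -msse -msse2 -msse4a -mpopcnt"

common_flags "$laptop_flags" "$phenom_flags"
```

Only tokens starting with -m are compared, so --param values are ignored, and differing -march names drop out on their own. The surviving flags would then be added on top of a conservative baseline -march.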
Thanks for any feedback. _________________ Quis separabit? Quo animo? |
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Sun Mar 31, 2019 1:26 pm Post subject: |
|
|
- Yes, it can build generic binaries for the other machine (assuming they share the same arch; otherwise it's more of an arch problem, but I assume both are i686 or amd64).
- The binaries cannot coexist: the AMD-native binary packages share the same names as the generic ones, so you need a wrapper that changes the binary package target directory to a specific one rather than the default. See PKGDIR.
- The problem is not really coexistence of the binaries; it's more about having the same tree on both computers.
Here's an example showing the real problem:
The Intel host uses: pg1 (which needs pg2) and pg3[something] (which needs pg2[ssl]).
The AMD host uses: pg2[-ssl] for pg4.
When you build the generic pg3 for the Intel host, the AMD host will need to build pg1 and pg3[something], but also rebuild pg2[ssl], because its existing pg2 was built with -ssl.
But asking for pg2[ssl] on the AMD host will blow up with blockers, because pg4 needs pg2[-ssl].
You can avoid this by using the USE flags of the target host, with one remaining limit: it cannot work for an ebuild that has a build dependency (because you build the build dependency but don't install it, so the package that needs it to build will lack it if the build host doesn't have it).
You should set --usepkg (to reuse previously made binaries and avoid rebuilding them), --buildpkgonly (to keep your AMD system safe), PORTAGE_CONFIGROOT (to use a copy of your target host's /etc/portage, setting all the USE flags, CFLAGS, etc. that it uses in one shot), and you should also set PKGDIR to build these binaries elsewhere and reuse them.
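The options above could be combined roughly as follows. This is a sketch, not a tested recipe: both paths are hypothetical, laptop_build_cmd is a made-up helper, and it only prints the command so it can be reviewed before being run as root. PORTAGE_CONFIGROOT is expected to contain etc/portage beneath it.

```shell
# Hypothetical locations on the build host.
LAPTOP_CONFIGROOT=/srv/laptop-config   # holds a copy of the laptop's etc/portage
LAPTOP_PKGDIR=/srv/laptop-binpkgs      # binary packages for the laptop land here

# Print (rather than run) an emerge command that builds binary packages
# with the laptop's configuration and does not merge them on the build host.
laptop_build_cmd() {
    printf 'PORTAGE_CONFIGROOT=%s PKGDIR=%s emerge --usepkg --buildpkgonly %s\n' \
        "$LAPTOP_CONFIGROOT" "$LAPTOP_PKGDIR" "$*"
}

laptop_build_cmd app-editors/vim
```

As noted above, --buildpkgonly does not install build dependencies, so packages whose build dependencies are missing on the build host will still fail this way.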
From my own experience:
- distcc is the easiest
- chroot is the best if the build CPU has no problem running target code
- building binaries for the other machine is easy if they have a common tree and USE flags, a pain if they are too different |
Hu Administrator
Joined: 06 Mar 2007 Posts: 22686
|
Posted: Sun Mar 31, 2019 4:49 pm Post subject: Re: Build and run mixed native and generic binaries? |
|
|
pjp wrote: | I'd like to not do any build work on my laptop. And I'd prefer to keep some features of the CPU, such as acpi. At least I'm guessing that would be useful for a laptop? I wasn't able to identify what it did, but would guess "power saving" for the CPU if nothing else. Besides, that's just one obvious example. | ACPI generally isn't an optional feature on user packages. equery hasuse acpi returns no hits for me. It's an optional kernel feature, and generally you want it enabled for laptops and desktops. Your concern makes sense in the broader context.
pjp wrote: | I've mostly eliminated distcc as not a good candidate, primarily due to exclusive use of wireless with the laptop. Having to park the laptop away with a cable is not particularly helpful. | I don't understand how one implies the other. You can use distcc over any medium that can transport TCP. It may be unacceptably slow in some cases, but it can work.
pjp wrote: | First question, as in the title, can the AMD system build generic binaries for the laptop and have them co-exist with the laptop's native binaries? | Yes. As far as I know, differing -march values should never change the ABI. You should be able to freely mix generic and customized files, assuming the executing CPU understands the instructions in both.
pjp wrote: | An alternate might be mostly generic binaries, except for those needing features such as acpi? | Yes. You might also benefit from FEATURES=binpkg-multi-instance, which lets you keep multiple builds of the same version of a package.
pjp wrote: | Second, would be whether or not crossdev would work to build native binaries for the laptop. If I understand correctly, then I'd probably have to do some of that with qemu? | You might make it work, but it's a lot of trouble for no gain. These two CPUs are both x86-compatible, so there's no reason to use qemu. They are probably both 64-bit capable, so if you run both of them as amd64 (not x86), you don't need crossdev.
pjp wrote: | native appears to identify as broadwell: | I read that to say that -march=broadwell enables fewer features than -march=native. If I were to guess, it would be that your chip is not a first-generation broadwell, or maybe not even a broadwell at all, but rather a later generation that is strictly superior to v1.0 broadwell (handles new instructions, may have a better clock speed than v1.0 broadwell, etc.). However, the next named family gcc recognizes enables even more capabilities than your chip understands, so gcc cannot use that name. Instead, it starts at broadwell, then adds on features that your chip handles that a basic v1.0 broadwell would not (or perhaps should not, if early broadwell understood those instructions but executed them poorly).
There's no risk in using -march=broadwell and ignoring the extra options. You won't give the compiler quite as much permission to optimize as your chip could handle, but the resulting code should run correctly.
Generally, I would avoid using -mno-insn parameters for disabling features you cannot use. A buggy build system might delete those, and then build a program you cannot run. If instead you choose the next lowest -march and add features, this hypothetical buggy build system would produce suboptimal but usable code. |
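As a make.conf sketch of that "named baseline plus additions" approach: the extra -m flags below are the instruction-set options that the diff in the first post shows enabled for -march=native but disabled for -march=broadwell. The -O2 -pipe part and the COMMON_FLAGS variable name are assumptions for illustration, not something stated in the thread.

```shell
# make.conf fragment (sketch): start from a named -march and only ADD
# features, so a build system that drops the extra flags still produces
# code the CPU can run.
COMMON_FLAGS="-O2 -pipe -march=broadwell -mabm -mclflushopt -msgx -mxsavec -mxsaves"
CFLAGS="${COMMON_FLAGS}"
CXXFLAGS="${COMMON_FLAGS}"
```

If a build system were to strip the trailing -m options, the result would still be plain -march=broadwell code, which is the failure mode the advice above prefers.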
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54578 Location: 56N 3W
|
Posted: Sun Mar 31, 2019 9:31 pm Post subject: |
|
|
pjp,
Let's not confuse building and running.
If you put Code: | CFLAGS="-march=broadwell -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -msgx -mbmi2 -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave -mxsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mclflushopt -mxsavec -mxsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-avx5124fmaps -mno-avx5124vnniw -mno-clwb -mno-mwaitx -mno-clzero -mno-pku -mno-rdpid --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=3072 -mtune=generic" | into a chroot on the Phenom,
the Phenom will be quite happy building like that.
You will experience issues when you come to run the code on the Phenom, so don't. This means that most of the binaries in the stage3 must not be built with the Code: | CFLAGS=-march=broadwell... |
However, you can still (mostly) FEATURES=buildpkgonly those packages with the laptop CFLAGS.
A few times you will get lucky. The instruction sets have a lot in common, so some things built for the laptop will run on the Phenom, but that's only because gcc did not need any laptop-only instructions.
You will discover a few odd build systems that build code, then run it during the course of the build; they may fail with illegal instruction errors.
Don't forget the laptop CPU_FLAGS_X86 in the chroot.
You can save space by deleting the -mno- flags. That's the default anyway, so they need not be specified explicitly. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
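Dropping the -mno- flags from a long flag list can be done mechanically. The following is a minimal sketch, assuming (as the post above says) that the -mno- entries only restate the defaults for the chosen -march, which is worth confirming with gcc -Q --help=target. strip_mno is a hypothetical helper name, and the sample flag string is shortened.

```shell
# Hypothetical helper: remove every token that starts with -mno- from a
# space-separated flag string and print the rest on one line.
strip_mno() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | grep -v '^-mno-' | paste -sd ' ' -
}

# Shortened sample of the flag list from the thread.
strip_mno "-march=broadwell -mmmx -mno-3dnow -msse -mno-sse4a --param l1-cache-size=32 -mtune=generic"
```

Token order is preserved, so two-word options such as --param l1-cache-size=32 stay together.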
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6148 Location: Dallas area
|
Posted: Sun Mar 31, 2019 10:06 pm Post subject: |
|
|
Google "gcc march vs mtune" for the difference and how it might help with what you're trying to do. _________________ UM780, 6.1 zen kernel, gcc 13, profile 17.0 (custom bare multilib), openrc, wayland |
pjp Administrator
Joined: 16 Apr 2002 Posts: 20485
|
Posted: Mon Apr 01, 2019 5:05 am Post subject: |
|
|
krinn wrote: | - The binaries cannot coexist: the AMD-native binary packages share the same names as the generic ones, so you need a wrapper that changes the binary package target directory to a specific one rather than the default. See PKGDIR. | Sorry, I worded that poorly. I don't mean native /usr/bin/foo and generic /usr/bin/foo. I meant native foo and generic bar. I was thinking there could be a problem if they both required the same library.
krinn wrote: | - The problem is not really coexistence of the binaries; it's more about having the same tree on both computers.
Here's an example showing the real problem:
The Intel host uses: pg1 (which needs pg2) and pg3[something] (which needs pg2[ssl]).
The AMD host uses: pg2[-ssl] for pg4.
When you build the generic pg3 for the Intel host, the AMD host will need to build pg1 and pg3[something], but also rebuild pg2[ssl], because its existing pg2 was built with -ssl.
But asking for pg2[ssl] on the AMD host will blow up with blockers, because pg4 needs pg2[-ssl].
You can avoid this by using the USE flags of the target host, with one remaining limit: it cannot work for an ebuild that has a build dependency (because you build the build dependency but don't install it, so the package that needs it to build will lack it if the build host doesn't have it). | Correct. That's why I mentioned the server with no X and the laptop with X. The build host should never be exposed to what it builds for the laptop, other than the build process itself. That's mainly why I suspected a chroot may be the only option.
krinn wrote: | You should set --usepkg (to reuse previously made binaries and avoid rebuilding them), --buildpkgonly (to keep your AMD system safe), PORTAGE_CONFIGROOT (to use a copy of your target host's /etc/portage, setting all the USE flags, CFLAGS, etc. that it uses in one shot), and you should also set PKGDIR to build these binaries elsewhere and reuse them. | --usepkg on the laptop? I believe --getbinpkg (-g) and --usepkgonly (-K) would be what I want for the laptop.
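On the laptop side, that combination might look roughly like this. It is only a sketch: the binhost URL is a made-up placeholder, laptop_update_cmd is a hypothetical helper, and the --update --deep --newuse options are an assumption about how the update would be run. The command is printed rather than executed.

```shell
# Hypothetical URL where the build host serves the laptop's PKGDIR.
BINHOST_URL=http://buildhost.example/laptop-binpkgs

# Print an emerge command that fetches binary packages and refuses to
# compile anything locally. Defaults to @world when no target is given.
laptop_update_cmd() {
    printf 'PORTAGE_BINHOST=%s emerge --getbinpkg --usepkgonly --update --deep --newuse %s\n' \
        "$BINHOST_URL" "${*:-@world}"
}

laptop_update_cmd
```

With --usepkgonly, emerge is expected to stop with an error when a required binary package is missing instead of falling back to a source build, which fits the "no build work on the laptop" goal.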
I'll need to do more reading on how PORTAGE_CONFIGROOT / SYSROOT works. This appears to be similar to what is used in the cross build guide which warns that some packages may make changes to the host and not the target. That concerns me. What happens if it is a read only file system? It also seems like it ought to drop root privileges to avoid such damage to the host environment.
krinn wrote: | From my own experience:
- distcc is the easiest
- chroot is the best if the build CPU has no problem running target code
- building binaries for the other machine is easy if they have a common tree and USE flags, a pain if they are too different | To that last point, I think they are appreciably different.
It's looking more and more like a chroot. So if I set up the chroot to be generic, I believe it should then be able to create specific broadwell binaries that do not need to run in the chroot.
Thanks! _________________ Quis separabit? Quo animo? |
pjp Administrator
Joined: 16 Apr 2002 Posts: 20485
|
Posted: Mon Apr 01, 2019 5:56 am Post subject: Re: Build and run mixed native and generic binaries? |
|
|
Hu wrote: | ACPI generally isn't an optional feature on user packages. equery hasuse acpi returns no hits for me. It's an optional kernel feature, and generally you want it enabled for laptops and desktops. Your concern makes sense in the broader context. | I hadn't considered that ACPI was probably a kernel issue. Makes sense. That just happened to be an instruction that jumped out as a potential problem. That probably means I'm more likely to be "safe" with an amd64/x86_64 environment rather than worrying about anything broadwell-specific. That also helps clarify binary distros not having that kind of problem with laptops or similar hardware issues.
Hu wrote: | pjp wrote: | I've mostly eliminated distcc as not a good candidate, primarily due to exclusive use of wireless with the laptop. Having to park the laptop away with a cable is not particularly helpful. | I don't understand how one implies the other. You can use distcc over any medium that can transport TCP. It may be unacceptably slow in some cases, but it can work. | The unacceptably slow part seemed to make it effectively unusable, even if theoretically possible. That was just the primary reason. It is also just another layer of complexity I didn't want to have to deal with or maintain. I'm also wanting to use a similar setup for VMs (and if applicable, containers). Running distcc in VMs on the same host doesn't sound like the best method (unless it is more of a necessity with an ARM). In the long run, I may end up trying to do all this with Catalyst. But initially not with the laptop.
Hu wrote: | Yes. As far as I know, differing -march values should never change the ABI. You should be able to freely mix generic and customized files, assuming the executing CPU understands the instructions in both. | Excellent. Although I may not need to bother given the previous discussion about ACPI. I was guessing that it should work, but was concerned about @system and libraries that would be shared between a generic and native binary.
Hu wrote: | pjp wrote: | An alternate might be mostly generic binaries, except for those needing features such as acpi? | Yes. You might also benefit from FEATURES=binpkg-multi-instance, which lets you keep multiple builds of the same version of a package. | Thanks, looks interesting. I'm amazed at how much Portage can do. At times it can be overwhelming.
Hu wrote: | You might make it work, but it's a lot of trouble for no gain. These two CPUs are both x86-compatible, so there's no reason to use qemu. They are probably both 64-bit capable, so if you run both of them as amd64 (not x86), you don't need crossdev. | This was mainly if other solutions wouldn't work. I was thinking about the AMD host building broadwell binaries and whether or not it might have to run them for part of the process. I'm thinking this way probably because I still don't have a firm grasp of how the build host separates building stuff for itself vs. the build duties for the laptop in such a way that the build host isn't affected by the results.
It seems like most descriptions of the process focus around the build host building what it is going to build for itself, and that every other system match it in configuration to use those binaries (or the build host matches a common set of everything else). That's all fine and well, but as soon as server vs. desktop or laptop comes into play, that method seems to no longer be a viable method. I hadn't come across anything which describes how to use a build host for maintaining multiple variations of server, VMs and one or more types of desktop or laptop. Or if I did read it, I didn't realize that's what I was reading.
Hu wrote: | I read that to say that -march=broadwell enables fewer features than -march=native. If I were to guess, it would be that your chip is not a first-generation broadwell, or maybe not even a broadwell at all, but rather a later generation that is strictly superior to v1.0 broadwell (handles new instructions, may have a better clock speed than v1.0 broadwell, etc.). However, the next named family gcc recognizes enables even more capabilities than your chip understands, so gcc cannot use that name. Instead, it starts at broadwell, then adds on features that your chip handles that a basic v1.0 broadwell would not (or perhaps should not, if early broadwell understood those instructions but executed them poorly).
There's no risk in using -march=broadwell and ignoring the extra options. You won't give the compiler quite as much permission to optimize as your chip could handle, but the resulting code should run correctly.
Generally, I would avoid using -mno-insn parameters for disabling features you cannot use. A buggy build system might delete those, and then build a program you cannot run. If instead you choose the next lowest -march and add features, this hypothetical buggy build system would produce suboptimal but usable code. | Thanks. I thought that would probably work, but had no idea. native being newer than what gcc thought of as a broadwell seemed the likely reason. But I do my best to avoid assumptions that might have a deleterious effect. I think that helps for when I eventually get a Ryzen (I'll then have 3 AMD systems) and maybe a Raspberry Pi (or something low powered that can use WOL to poke other systems).
Thanks! _________________ Quis separabit? Quo animo? |
pjp Administrator
Joined: 16 Apr 2002 Posts: 20485
|
Posted: Mon Apr 01, 2019 6:09 am Post subject: |
|
|
NeddySeagoon wrote: | The Phenom will be quite happy building like that.
You will experience issues when you come to run the code on the Phenom, so don't. This means that most of the binaries in the stage3 must not be built with the Code: | CFLAGS=-march=broadwell... |
However, you can still (mostly) FEATURES=buildpkgonly those packages with the laptop CFLAGS.
A few times you will get lucky. The instruction sets have a lot in common, so some things built for the laptop will run on the Phenom, but that's only because gcc did not need any laptop-only instructions. | In theory, I understand the build vs. run. I'm just unclear on the run part and how it all comes together in the end. I'll start with a chroot for a basic VM tomorrow.
NeddySeagoon wrote: | You will discover a few odd build systems that build code, then run it during the course of the build; they may fail with illegal instruction errors. | This is the type of issue that makes me think it isn't a process that lends itself well to automation. Along the lines of sync however often, check for a reason to update, then build the new stuff. The reason to or not to update would be manual, but the other stuff would be mostly automatic.
NeddySeagoon wrote: | Don't forget the laptop CPU_FLAGS_X86 in the chroot.
You can save space by deleting the -mno- flags. That's the default anyway, so they need not be specified explicitly. | Thanks for the reminder. I've made so many notes that I'm likely to forget something. Although I'm now leaning more toward the generic approach for simplicity.
Thanks! _________________ Quis separabit? Quo animo? |
pjp Administrator
Joined: 16 Apr 2002 Posts: 20485
|
Posted: Mon Apr 01, 2019 6:17 am Post subject: |
|
|
Anon-E-moose wrote: | Google "gcc march vs mtune" for the difference and how it might help with what you're trying to do. | Will do. I've read some about it while looking for other things, but I hadn't yet searched with the purpose of understanding what they do. Thanks for the suggestion. _________________ Quis separabit? Quo animo? |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54578 Location: 56N 3W
|
Posted: Mon Apr 01, 2019 10:24 am Post subject: |
|
|
pjp,
Consider the following chroot process.
Make a chroot and untar the stage3 there. It's built with no -march set and -mtune=generic, so it runs anywhere.
For each package in the stage3, set package-specific CFLAGS containing -mtune=generic.
You probably want per package CPU_FLAGS_X86 for the system set too.
Optionally, you can do better by using CFLAGS and CPU_FLAGS_X86 in common but that changes the performance of the chroot and laptop, not the concept.
Now to get binary packages for the system set that will run on both systems.
The content of CFLAGS and CPU_FLAGS_X86 in make.conf does not matter at this time, as the per package setting will be used.
Set up CFLAGS and CPU_FLAGS_X86 in make.conf to suit the laptop and any other settings that affect the build of packages.
Copy the laptop's world file to the chroot
to rebuild everything to suit the laptop.
More pedantically, this will rebuild everything on the laptop, using the CFLAGS and CPU_FLAGS_X86 from make.conf for all the packages outside of the @system set.
It will rebuild @system too but using the per package CFLAGS and CPU_FLAGS_X86 again.
That's a bit of a waste and can be avoided by not doing the above, but it's easier to explain and understand in complete logical chunks.
If you want non @system set packages in the chroot because you don't like nano for $EDITOR, add them to the package list that gets the best compromise instruction set.
Likewise, if there are things there you will never use in the chroot, remove them from the per package list.
None of the above changes the ABI in use. Only the instructions called to implement the ABI are affected.
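The per-package settings described above can be expressed with Portage's package.env mechanism: an env file under /etc/portage/env/ holds the flags, and /etc/portage/package.env maps package atoms to that file. The sketch below generates the mapping lines. The file name generic.conf, the helper gen_package_env, and the two sample atoms are assumptions for illustration only.

```shell
# Assumed content of /etc/portage/env/generic.conf (hypothetical file name):
#   CFLAGS="-O2 -pipe -mtune=generic"
#   CXXFLAGS="${CFLAGS}"

# Hypothetical helper: read package atoms (one per line) on stdin and print
# package.env lines that attach the env file named in $1 to each atom.
gen_package_env() {
    env_file=$1
    while IFS= read -r atom; do
        # Skip blank lines and comments.
        case $atom in ''|'#'*) continue ;; esac
        printf '%s %s\n' "$atom" "$env_file"
    done
}

# Two atoms standing in for the full stage3 package list.
printf '%s\n' sys-apps/coreutils sys-devel/gcc | gen_package_env generic.conf
```

The output would be appended to /etc/portage/package.env inside the chroot. Removing an atom from that list hands the package back to the make.conf CFLAGS, which matches the "remove them from the per package list" step.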
There are a few special packages, like linux-firmware and kernel sources, that do not do any building: one is binary blobs, the other is text, so making binary packages of them is not useful.
On the topic of foreign arch VMs or chroots, cross compiling on the host is useful. The choice is between running the native compiler in QEMU or the cross compiler on the host.
The latter is more efficient. Some of the gains are offset by QEMU driving the network stack in the guest. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |