eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9890 Location: almost Mile High in the USA
|
Posted: Mon Feb 03, 2025 6:21 pm Post subject: old binary incompatibility... |
|
|
I was just looking at my logs and noticed a program I wrote 20 years ago stopped working. Not sure when it stopped, but it now segfaults. I hadn't touched the binary for years.
I wonder what happened to it.
Timestamp looks old -- 20 years+ ... I don't have backups that old anymore to see if it was changed.
Code: | brokenbinary: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.0.0, not stripped |
I found the source code and recompiled it, and the new binary works fine. Weird!
Code: | newbinary: ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0, not stripped |
Hmm. 3.2 vs 2.0 ... could this be the problem?
Then I tried running gdb on the broken binary:
Code: | Program received signal SIGSEGV, Segmentation fault.
0xb7e89376 in __fstatat64_time64 () from /lib/libc.so.6 |
hmm https://github.com/taviso/123elf/issues/12 ?
I guess things do break backward compatibility... I wonder if there are more binaries out there on my filesystem that no longer work... _________________ Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching? |
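One rough way to look for other candidates is to walk a directory and flag 32-bit ELF files whose header type is ET_EXEC, which is how the old non-PIE binary above would be recorded (the rebuilt one is reported as "pie"). This is only a sketch: the function names are made up for illustration, and a match only narrows the list. It says nothing about whether a given binary actually crashes.

```python
import os
import struct

ELF_MAGIC = b"\x7fELF"
ELFCLASS32 = 1   # e_ident[EI_CLASS] value for 32-bit objects
ELFDATA2LSB = 1  # e_ident[EI_DATA] value for little-endian files
ET_EXEC = 2      # e_type of a fixed-address (non-PIE) executable


def is_old_style_elf32(path):
    """Return True for a 32-bit ELF file whose e_type is ET_EXEC."""
    try:
        with open(path, "rb") as f:
            header = f.read(18)  # e_ident (16 bytes) + e_type (2 bytes)
    except OSError:
        return False
    if len(header) < 18 or header[:4] != ELF_MAGIC:
        return False
    if header[4] != ELFCLASS32:
        return False
    byte_order = "<" if header[5] == ELFDATA2LSB else ">"
    (e_type,) = struct.unpack(byte_order + "H", header[16:18])
    return e_type == ET_EXEC


def find_candidates(root):
    """Yield regular files under root that look like 32-bit non-PIE executables."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if is_old_style_elf32(path):
                yield path
```

Something like `for p in find_candidates("/usr/local/bin"): print(p)` would list the files worth test-running by hand; the directory is just an example.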
|
|
|
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6218 Location: Dallas area
|
Posted: Mon Feb 03, 2025 8:23 pm Post subject: |
|
|
What does ldd report from the old binary? _________________ UM780, 6.12 zen kernel, gcc 13, openrc, wayland |
|
|
|
John R. Graham Administrator
Joined: 08 Mar 2005 Posts: 10729 Location: Somewhere over Atlanta, Georgia
|
Posted: Mon Feb 03, 2025 10:09 pm Post subject: |
|
|
Anecdotally, Linus keeps very old binaries around for kernel testing, some built when pre-1.* kernels were still new. I find it surprising that it stopped working.
- John _________________ I can confirm that I have received between 0 and 499 National Security Letters.
Last edited by John R. Graham on Tue Feb 04, 2025 3:09 am; edited 1 time in total |
|
|
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3923 Location: Rasi, Finland
|
Posted: Mon Feb 03, 2025 10:33 pm Post subject: |
|
|
John R. Graham wrote: | I find it surprising that it stopped working. | As do I.
I merely joined here to watch how this progresses. _________________ ..: Zucca :..
My gentoo installs: | init=/sbin/openrc-init
-systemd -logind -elogind seatd |
Quote: | I am NaN! I am a man! |
|
|
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9890 Location: almost Mile High in the USA
|
Posted: Mon Feb 03, 2025 11:03 pm Post subject: |
|
|
According to the website it's a glibc compilation issue.
Anyway,
Code: | $ ldd brokenbinary
linux-gate.so.1 (0xb7722000)
libc.so.6 => /lib/libc.so.6 (0xb74dd000)
/lib/ld-linux.so.2 (0x80050000) |
It apparently still is a glibc stack alignment issue? Does glibc have the same backward-compatibility ideal as the kernel? |
|
|
|
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6218 Location: Dallas area
|
Posted: Tue Feb 04, 2025 12:10 am Post subject: |
|
|
From a thread about glibc and ncurses, but probably applies, in general
Quote: | This old System V/386 code uses 4 byte stack alignment (i.e. stack pointer is incremented and decremented in multiples of 4 bytes).
We're linking it dynamically to libc and ncurses. If host libc and ncurses use some SSE instructions in their compiled code (which will be the case on all modern mainstream OSs), these instructions expect 16 byte stack alignment.
The old System V/386 code might leave the stack aligned to a multiple of 4 bytes that's not a multiple of 16 bytes at some point before jumping into glibc or ncurses, which will cause a segfault.
There are a couple ways to make this old code work on a modern system:
use -mstackrealign when compiling libc and ncurses for the system 123elf will run on. Some distributions do this for 32-bit versions of libraries. This flag will make the compiler generate extended function prologues and epilogues that will check stack alignment on each function call and align it to 16 bytes if necessary. |
The old glibc 2.0 was probably built with 4-byte stack alignment, and all the newer glibcs default to 16 bytes. |
|
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9890 Location: almost Mile High in the USA
|
Posted: Tue Feb 04, 2025 1:34 am Post subject: |
|
|
Correct, at least that's what I think is going on here.
So basically it's a userland breakage of old binaries and a recompile is needed?
Or is it enough to simply point the old binary at a glibc that doesn't have the alignment issue? Not sure... |
|
|
|
Hu Administrator
Joined: 06 Mar 2007 Posts: 23080
|
Posted: Tue Feb 04, 2025 1:55 am Post subject: |
|
|
As I read the piece quoted above:
- Very old binaries use a stack that is not well-aligned for use of SSE.
- Modern libraries may, if not told otherwise, use SSE and may assume the stack is well-aligned.
- The conflicting assumptions set up for a crash.
Possible fixes:
- Rebuild the old binary with a compiler that provides a well-aligned stack.
- Rebuild the libraries not to use SSE, and therefore not require a highly-aligned stack.
- Rebuild the libraries with -mstackrealign, so that the library fixes the stack alignment before trying to use SSE.
- Use an old library that was built without SSE / without the expectation that the caller would provide a well-aligned stack.
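To make the arithmetic behind that summary concrete, here is a small Python model of a stack pointer that moves in 4-byte steps. It is only an illustration: the starting address is made up, the helper names are hypothetical, and `realign` mimics what a realigning prologue roughly does by masking the stack pointer (an `and` with -16 on x86). It is not a description of what any particular glibc build emits.

```python
def is_aligned(sp, boundary=16):
    """True if the stack pointer value is a multiple of boundary."""
    return sp % boundary == 0


def realign(sp, boundary=16):
    """Round sp down to the boundary (the stack grows downwards on x86)."""
    return sp & ~(boundary - 1)


# Hypothetical starting value that happens to be 16-byte aligned.
start = 0xBFFFF000
for pushes in range(5):
    sp = start - 4 * pushes  # old code adjusts the stack in 4-byte steps
    print(f"{pushes} pushes: sp={sp:#x} "
          f"aligned={is_aligned(sp)} realigned={realign(sp):#x}")
```

Only every fourth 4-byte step lands on a 16-byte boundary, so a callee that assumes 16-byte alignment is safe only by luck unless it (or its caller) realigns first.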
|
|
|
|
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6218 Location: Dallas area
|
Posted: Tue Feb 04, 2025 10:47 am Post subject: |
|
|
Hu, from my understanding that's correct.
Just because I was curious, I went looking for what 2.0.0 vs 3.2.0 meant.
It's based on the kernel version: long ago, glibc would bump up the minimum kernel version needed for each version of glibc,
but in 2016 they changed that, and 3.2.0 has been unchanged since then.
eccerr0r, you must have compiled that binary on a very old kernel version. |
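For the curious, the "for GNU/Linux 2.0.0" text comes from a small ELF note (owner "GNU", type 1) whose payload is four 32-bit words: an OS id followed by major, minor and patch numbers. Below is a minimal sketch of decoding one such note record in Python. The function name is made up for illustration, finding the note inside the file (via the PT_NOTE segment or the .note.ABI-tag section) is left out, and `readelf -n` shows the same information without any code.

```python
import struct

NT_GNU_ABI_TAG = 1  # note type used for the ABI tag


def _pad4(n):
    """Round n up to a multiple of 4, as ELF note fields are padded."""
    return (n + 3) & ~3


def decode_abi_tag(note, byte_order="<"):
    """Decode one raw ELF note record; return e.g. 'GNU/Linux 2.0.0' or None."""
    if len(note) < 12:
        return None
    namesz, descsz, note_type = struct.unpack_from(byte_order + "III", note, 0)
    name = note[12:12 + namesz]
    desc_start = 12 + _pad4(namesz)
    if name != b"GNU\0" or note_type != NT_GNU_ABI_TAG or descsz < 16:
        return None
    if len(note) < desc_start + 16:
        return None
    os_id, major, minor, patch = struct.unpack_from(
        byte_order + "IIII", note, desc_start)
    os_name = "GNU/Linux" if os_id == 0 else f"OS {os_id}"
    return f"{os_name} {major}.{minor}.{patch}"


# Hand-built note equivalent to what the old binary would carry.
example = struct.pack("<III4sIIII", 4, 16, NT_GNU_ABI_TAG, b"GNU\0", 0, 2, 0, 0)
print(decode_abi_tag(example))  # GNU/Linux 2.0.0
```

The version in the tag is the minimum kernel the binary's libc startup code was built to expect, which is why it tracks the glibc of the build machine and not the kernel that was running.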
|
|
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3923 Location: Rasi, Finland
|
Posted: Tue Feb 04, 2025 11:03 am Post subject: |
|
|
I wonder how other libc implementations fare in this..? Although musl had its first release in 2011 (according to Wikipedia), so we can't really go too far back in time with comparisons against it. uClibc, on the other hand, is much older. |
|
|
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9890 Location: almost Mile High in the USA
|
Posted: Tue Feb 04, 2025 11:14 am Post subject: |
|
|
If musl took sse into account I suspect they wouldn't have to deal with this, though it would mean they waste bytes.
What is the design goal, should it use sse or will it prefer size/compatibility (does it use x87 math?) |
|
|
|
Anon-E-moose Watchman
Joined: 23 May 2008 Posts: 6218 Location: Dallas area
|
Posted: Tue Feb 04, 2025 11:34 am Post subject: |
|
|
Interesting read (long) on this issue https://stackoverflow.com/questions/49391001/why-does-the-x86-64-amd64-system-v-abi-mandate-a-16-byte-stack-alignment
Quote: | Footnote 1: 32-bit Linux
Not all 32-bit platforms broke backwards compatibility with existing binaries and hand-written asm the way Linux did; some like i386 NetBSD still only use the historical 4-byte stack alignment requirement from the original version of the i386 SysV ABI.
The historical 4-byte stack alignment was also insufficient for efficient 8-byte double on modern CPUs. Unaligned fld / fstp are generally efficient except when they cross a cache-line boundary (like other loads/stores), so it's not horrible, but naturally-aligned is nice.
Even before 16-byte alignment was officially part of the ABI, GCC used to enable -mpreferred-stack-boundary=4 (2^4 = 16-bytes) on 32-bit. This currently assumes the incoming stack alignment is 16 bytes (even for cases that will fault if it's not), as well as preserving that alignment. I'm not sure if historical gcc versions used to try to preserve stack alignment without depending on it for correctness of SSE code-gen or alignas(16) objects. |
|
|
|
sam_ Developer
Joined: 14 Aug 2020 Posts: 2119
|
Posted: Tue Feb 04, 2025 12:05 pm Post subject: |
|
|
The SSE alignment issue is an interesting topic, but it's not clear to me that's actually the problem here. Not only have I not seen any (substantive) evidence given for that, but not long ago we also had an example of a bug in glibc where something completely unrelated had broken and was promptly fixed: https://sourceware.org/PR32148. If stack alignment is the cause, building glibc with USE=stack-realign will help.
If not, we would ideally see disassembly of the calling function and the crashing one in glibc. If you want it fixed, it's likely you'll need to provide that binary in an upstream bug report to glibc too.
Fixed the link so that it doesn't contain the period after it. --Zucca
Last edited by sam_ on Tue Feb 04, 2025 12:08 pm; edited 1 time in total |
|
|
|
sam_ Developer
Joined: 14 Aug 2020 Posts: 2119
|
Posted: Tue Feb 04, 2025 12:07 pm Post subject: |
|
|
eccerr0r wrote: | If musl took sse into account I suspect they wouldn't have to deal with this, though it would mean they waste bytes.
What is the design goal, should it use sse or will it prefer size/compatibility (does it use x87 math?) |
"into account"? The only difference there is that musl doesn't have a bunch of SIMD implementations of e.g. memcpy and friends. The question of ABI compatibility with musl is not something people often ask anyway, not least because on 32-bit platforms, you (realistically) lost it with the time_t change, even if IIRC musl's own ABI didn't change there, but surely used libraries would have by necessity. |
|
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9890 Location: almost Mile High in the USA
|
Posted: Tue Feb 04, 2025 6:46 pm Post subject: |
|
|
I don't know the history of musl, since it's fairly new to the scene, or whether musl has wanted 16-byte stack alignment from the get-go, given that SIMD was already around when it started, unlike with glibc. However, since musl targets size, I can't see that happening, as alignment trades memory for speed.
The funny thing is that the software I have does not actually use time_t directly but uses time to seed the PRNG. At least recompiling was an option for me because I still had the source code, but I was kind of alarmed when it stopped working. Not sure if anyone wants the binary to play with anyway, so I suppose it just needs to be left at this for now. |
|
|
|
sam_ Developer
Joined: 14 Aug 2020 Posts: 2119
|
Posted: Tue Feb 04, 2025 7:42 pm Post subject: |
|
|
Please try USE=stack-realign as suggested above. |
|
|
|
|