Gentoo Forums
old binary incompatibility...
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9890
Location: almost Mile High in the USA

Posted: Mon Feb 03, 2025 6:21 pm    Post subject: old binary incompatibility...

I was just looking at my logs and noticed a program I wrote 20 years ago stopped working. Not sure when it stopped, but it now segfaults. I hadn't touched the binary for years.

I wonder what happened to it.

Timestamp looks old -- 20 years+ ... I don't have backups that old anymore to see if it was changed.
Code:
brokenbinary: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.0.0, not stripped

I found the source code and recompiled it, and the new binary works fine. Weird!
Code:
newbinary:     ELF 32-bit LSB pie executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 3.2.0, not stripped

Hmm. 3.2 vs 2.0 ... could this be the problem?

Then I tried running gdb on the broken binary:
Code:
Program received signal SIGSEGV, Segmentation fault.
0xb7e89376 in __fstatat64_time64 () from /lib/libc.so.6

hmm https://github.com/taviso/123elf/issues/12 ?

I guess things do break backward compatibility... I wonder if there are more binaries on my filesystem that no longer work...
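
One rough way to look for other candidates is to walk a directory tree and flag files whose ELF header matches the broken binary: 32-bit, little-endian, i386, and type ET_EXEC (what `file` prints as "executable" rather than "pie executable"). The sketch below reads only the fixed header fields; the function names are made up for illustration, and a match only means "built the old way", not "known to crash".

```python
import os
import struct


def is_old_style_i386_exec(path: str) -> bool:
    """True for 32-bit little-endian i386 ELF files of type ET_EXEC (non-PIE)."""
    try:
        with open(path, "rb") as f:
            head = f.read(20)
    except OSError:
        return False
    if len(head) < 20 or head[:4] != b"\x7fELF":
        return False
    ei_class, ei_data = head[4], head[5]
    if ei_class != 1 or ei_data != 1:  # ELFCLASS32, ELFDATA2LSB
        return False
    # e_type and e_machine are the two 16-bit fields right after e_ident.
    e_type, e_machine = struct.unpack_from("<HH", head, 16)
    return e_type == 2 and e_machine == 3  # ET_EXEC, EM_386


def find_candidates(root: str):
    """Yield regular files under `root` that look like old-style i386 executables."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if is_old_style_i386_exec(path):
                yield path
```

Something like `for p in find_candidates("/usr/local/bin"): print(p)` would list the matches; each one still has to be run to see whether it actually segfaults.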
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6218
Location: Dallas area

Posted: Mon Feb 03, 2025 8:23 pm

What does ldd report from the old binary?
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland

John R. Graham
Administrator

Joined: 08 Mar 2005
Posts: 10729
Location: Somewhere over Atlanta, Georgia

Posted: Mon Feb 03, 2025 10:09 pm

Anecdotally, Linus keeps very old binaries around for kernel testing, some built when pre-1.* kernels were still new. I find it surprising that it stopped working.

- John
_________________
I can confirm that I have received between 0 and 499 National Security Letters.


Last edited by John R. Graham on Tue Feb 04, 2025 3:09 am; edited 1 time in total

Zucca
Moderator

Joined: 14 Jun 2007
Posts: 3923
Location: Rasi, Finland

Posted: Mon Feb 03, 2025 10:33 pm

John R. Graham wrote:
I find it surprising that it stopped working.
As do I.
I merely joined here to watch how this progresses.
_________________
..: Zucca :..

My gentoo installs:
init=/sbin/openrc-init
-systemd -logind -elogind seatd

Quote:
I am NaN! I am a man!

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9890
Location: almost Mile High in the USA

Posted: Mon Feb 03, 2025 11:03 pm

According to the website it's a glibc compilation issue.
Anyway,
Code:
$ ldd brokenbinary
        linux-gate.so.1 (0xb7722000)
        libc.so.6 => /lib/libc.so.6 (0xb74dd000)
        /lib/ld-linux.so.2 (0x80050000)

So apparently it's still a glibc stack alignment issue? Does glibc hold to the same backward-compatibility ideal as the kernel?
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6218
Location: Dallas area

Posted: Tue Feb 04, 2025 12:10 am

This is from a thread about glibc and ncurses, but it probably applies in general:

Quote:
This old System V/386 code uses 4 byte stack alignment (i.e. stack pointer is incremented and decremented in multiples of 4 bytes).

We're linking it dynamically to libc and ncurses. If host libc and ncurses use some SSE instructions in their compiled code (which will be the case on all modern mainstream OSs), these instructions expect 16 byte stack alignment.

The old System V/386 code might leave the stack aligned to a multiple of 4 bytes that's not a multiple of 16 bytes at some point before jumping into glibc or ncurses, which will cause a segfault.

There are a couple ways to make this old code work on a modern system:

use -mstackrealign when compiling libc and ncurses for the system 123elf will run on. Some distributions do this for 32-bit versions of libraries. This flag will make the compiler generate extended function prologues and epilogues that will check stack alignment on each function call and align it to 16 bytes if necessary.


The old glibc 2.0 was probably built with 4-byte stack alignment, while all the newer glibcs default to 16-byte alignment.
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9890
Location: almost Mile High in the USA

Posted: Tue Feb 04, 2025 1:34 am

Correct, at least that's what I think is going on here.
So basically it's a userland breakage of old binaries, and a recompile is needed?
Or is it simply a matter of pointing the old binary at a glibc that doesn't have the alignment issue? Not sure...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Hu
Administrator

Joined: 06 Mar 2007
Posts: 23080

Posted: Tue Feb 04, 2025 1:55 am

As I read the piece quoted above:
  • Very old binaries use a stack that is not well-aligned for use of SSE.
  • Modern libraries may, if not told otherwise, use SSE and may assume the stack is well-aligned.
  • The conflicting assumptions set up for a crash.
Possible fixes:
  • Rebuild the old binary with a compiler that provides a well-aligned stack.
  • Rebuild the libraries not to use SSE, and therefore not require a highly-aligned stack.
  • Rebuild the libraries with -mstackrealign, so that the library fixes the stack alignment before trying to use SSE.
  • Use an old library that was built without SSE / without the expectation that the caller would provide a well-aligned stack.
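
The mismatch in the list above comes down to simple arithmetic on the stack pointer. As a minimal sketch (the starting address and helper names are made up, and it ignores details such as the return address that a call pushes), here is how a pointer that moves in 4-byte steps can be 4-byte aligned without being 16-byte aligned, and what a realigning prologue of the -mstackrealign kind would round it to:

```python
def is_aligned(sp: int, boundary: int) -> bool:
    """Return True if the stack pointer is a multiple of `boundary` bytes."""
    return sp % boundary == 0


def realign(sp: int, boundary: int = 16) -> int:
    """Round the stack pointer down to `boundary` (the stack grows downward)."""
    return sp & ~(boundary - 1)


# Hypothetical 32-bit stack pointer that starts out 16-byte aligned.
sp = 0xBFFFF000

# Old code adjusts the stack in 4-byte steps, e.g. three pushes.
for _ in range(3):
    sp -= 4

print(hex(sp), is_aligned(sp, 4), is_aligned(sp, 16))
print(hex(realign(sp)), is_aligned(realign(sp), 16))
```

After the three pushes the pointer is still a multiple of 4 but no longer a multiple of 16, which is the state in which an aligned SSE load or store against the stack would fault.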

Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6218
Location: Dallas area

Posted: Tue Feb 04, 2025 10:47 am

Hu, from my understanding that's correct.

Just because I was curious, I went looking for what 2.0.0 vs 3.2.0 means.
It's based on the kernel version: long ago, glibc bumped up the minimum kernel version needed for each version of glibc,
but in 2016 they changed that, and 3.2.0 has been unchanged since then.

eccerr0r, you must have compiled that binary on a very old kernel version.
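
For anyone who wants to see where that number lives: it is stored in an ELF note (name "GNU", type 1, normally in the .note.ABI-tag section) whose payload is four 32-bit words: OS, major, minor, patch. The sketch below decodes such a note from raw bytes. The two sample notes are synthetic, built to match the `file` output earlier in the thread, and the function name is made up for illustration; on a real binary, `readelf -n` should show the same information.

```python
import struct

NT_GNU_ABI_TAG = 1  # note type used for the GNU ABI tag


def parse_abi_tag(note: bytes, little_endian: bool = True):
    """Decode one ELF note; return (os_name, 'major.minor.patch') or None."""
    end = "<" if little_endian else ">"
    # Note header: name size, descriptor size, note type.
    namesz, descsz, ntype = struct.unpack_from(end + "3I", note, 0)
    name = note[12:12 + namesz]
    desc_off = 12 + ((namesz + 3) & ~3)  # the name is padded to 4 bytes
    if name != b"GNU\0" or ntype != NT_GNU_ABI_TAG or descsz < 16:
        return None
    os_id, major, minor, patch = struct.unpack_from(end + "4I", note, desc_off)
    os_name = {0: "Linux"}.get(os_id, f"os#{os_id}")
    return os_name, f"{major}.{minor}.{patch}"


# Synthetic notes matching "for GNU/Linux 2.0.0" and "for GNU/Linux 3.2.0".
old_note = struct.pack("<3I4s4I", 4, 16, NT_GNU_ABI_TAG, b"GNU\0", 0, 2, 0, 0)
new_note = struct.pack("<3I4s4I", 4, 16, NT_GNU_ABI_TAG, b"GNU\0", 0, 3, 2, 0)

print(parse_abi_tag(old_note))
print(parse_abi_tag(new_note))
```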
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland

Zucca
Moderator

Joined: 14 Jun 2007
Posts: 3923
Location: Rasi, Finland

Posted: Tue Feb 04, 2025 11:03 am

I wonder how other libc implementations fare in this? musl had its first release in 2011 (according to Wikipedia), so we can't go very far back in time for comparisons with it. uClibc, on the other hand, is much older.
_________________
..: Zucca :..

My gentoo installs:
init=/sbin/openrc-init
-systemd -logind -elogind seatd

Quote:
I am NaN! I am a man!

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9890
Location: almost Mile High in the USA

Posted: Tue Feb 04, 2025 11:14 am

If musl took sse into account I suspect they wouldn't have to deal with this, though it would mean they waste bytes.

What is the design goal, should it use sse or will it prefer size/compatibility (does it use x87 math?)
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6218
Location: Dallas area

Posted: Tue Feb 04, 2025 11:34 am

Interesting read (long) on this issue https://stackoverflow.com/questions/49391001/why-does-the-x86-64-amd64-system-v-abi-mandate-a-16-byte-stack-alignment

Quote:
Footnote 1: 32-bit Linux

Not all 32-bit platforms broke backwards compatibility with existing binaries and hand-written asm the way Linux did; some like i386 NetBSD still only use the historical 4-byte stack alignment requirement from the original version of the i386 SysV ABI.

The historical 4-byte stack alignment was also insufficient for efficient 8-byte double on modern CPUs. Unaligned fld / fstp are generally efficient except when they cross a cache-line boundary (like other loads/stores), so it's not horrible, but naturally-aligned is nice.

Even before 16-byte alignment was officially part of the ABI, GCC used to enable -mpreferred-stack-boundary=4 (2^4 = 16-bytes) on 32-bit. This currently assumes the incoming stack alignment is 16 bytes (even for cases that will fault if it's not), as well as preserving that alignment. I'm not sure if historical gcc versions used to try to preserve stack alignment without depending on it for correctness of SSE code-gen or alignas(16) objects.

_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland

sam_
Developer

Joined: 14 Aug 2020
Posts: 2119

Posted: Tue Feb 04, 2025 12:05 pm

The SSE alignment issue is an interesting topic, but it's not clear to me that it's actually the problem here. Not only have I not seen any (substantive) evidence given for that, but not long ago we had an example of a bug in glibc where something completely unrelated had broken and was promptly fixed: https://sourceware.org/PR32148. If stack alignment does turn out to be the cause, building glibc with USE=stack-realign will help.

If not, we would ideally see disassembly of the calling function and the crashing one in glibc. If you want it fixed, it's likely you'll need to provide that binary in an upstream bug report to glibc too.

Fixed the link so that it doesn't contain the period after it. --Zucca


Last edited by sam_ on Tue Feb 04, 2025 12:08 pm; edited 1 time in total

sam_
Developer

Joined: 14 Aug 2020
Posts: 2119

Posted: Tue Feb 04, 2025 12:07 pm

eccerr0r wrote:
If musl took sse into account I suspect they wouldn't have to deal with this, though it would mean they waste bytes.

What is the design goal, should it use sse or will it prefer size/compatibility (does it use x87 math?)


"into account"? The only difference there is that musl doesn't have a bunch of SIMD implementations of e.g. memcpy and friends. The question of ABI compatibility with musl is not something people often ask anyway, not least because on 32-bit platforms, you (realistically) lost it with the time_t change, even if IIRC musl's own ABI didn't change there, but surely used libraries would have by necessity.

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9890
Location: almost Mile High in the USA

Posted: Tue Feb 04, 2025 6:46 pm

I don't know the history of musl, since it's fairly new to the scene: whether it wanted 16-byte stack alignment from the get-go because SIMD was already around, unlike glibc, which started out before SIMD existed. However, since musl targets size, I can't see that happening, as alignment trades memory for speed.

The funny thing is that the software I have does not actually use time_t directly, but uses time to seed the PRNG. At least recompiling was an option for me because I still had the source code, but I was kind of alarmed when it stopped working. Not sure if anyone wants the binary to play with anyway, so I suppose it will just be left at this for now.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

sam_
Developer

Joined: 14 Aug 2020
Posts: 2119

Posted: Tue Feb 04, 2025 7:42 pm

Please try USE=stack-realign as suggested above.
All times are GMT