Gentoo Forums
Gentoo on old CPUs (e.g. no sse2)
mmogilvi
n00b
n00b


Joined: 13 May 2011
Posts: 63

Posted: Tue Oct 09, 2018 3:45 am    Post subject: Gentoo on old CPUs (e.g. no sse2)

Recently, various upstream projects seem to have started assuming that SSE2 instructions are available by default on 32-bit x86 processors. This raises the question of how far Gentoo should continue to support CPUs without SSE2. If they are no longer supported, that should be officially announced, or at least clearly flagged in the relevant ebuilds.

I think most or all pre-SSE2 processors are more than ten years old, and today they are painfully slow for compiling the whole system, even when they work.

Note that AMD Athlon processors (circa early 2000s) typically did not support SSE2, although the contemporary Pentium 4 did. However, that was one of the few times in history when (ignoring cost and wattage) the fastest single-core AMD performance was often faster than the fastest Intel single-core performance.

Recently I (partially) updated a few old machines that I hadn't updated in several months, and ran into new processor-related problems. The machines are a dual-processor Athlon MP machine, a VIA C3 machine ("VIA Ezra"), and a Pentium 4 machine that does have SSE2. Some of these machines also have temperature-related memory issues, which show up as occasional random "segmentation faults" once every few hours depending on ambient temperature, but I'm filtering those unrepeatable problems out of this post.

The SSE2 problems don't particularly surprise me; who is going to regularly test such old/slow systems these days? I don't make any real use of these machines, but I thought I would document my findings here, in case anyone actually cares about similarly old machines:

PROBLEM 1: CHOST vs CHOST_x86

This doesn't actually involve SSE2 specifically:

On my Athlon machine, make.conf has set CHOST to "i486-pc-linux-gnu" ever since I first installed Gentoo years ago. But this is inconsistent with CHOST_x86 (as reported by the "portageq envvar CHOST_x86" command from https://wiki.gentoo.org/wiki/Changing_the_CHOST_variable ), which is set to "i686-pc-linux-gnu" by default. The above link warns about this possibility when changing CHOST, but I've never actually changed CHOST... The inconsistency wasn't a problem until recently: llvm apparently uses CHOST_x86 for the installation names of various executables/etc, but clang just uses CHOST, and so fails to find the llvm files. (As a tangent, most CPU features seem to be indicated by CPU feature flags rather than by the number embedded in CHOST i[34567]86. What exactly are the modern guidelines about the distinction between the CHOST number and feature flags?)

WORKAROUND/FIX: Set CHOST_x86 to the same thing as CHOST, and rebuild llvm.

POSSIBLE GENTOO IMPROVEMENTS:

  • Maybe CHOST_x86 default should be smarter on ancient systems where CHOST was never i686?
  • Maybe something should warn about the inconsistency?
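
The "warn about the inconsistency" idea could look roughly like the following. This is only a minimal Python sketch, not anything Portage actually ships: the function names are made up for illustration, and the only external command it relies on is the "portageq envvar" call mentioned above. It only makes sense on a plain x86 install; on a multilib amd64 system CHOST and CHOST_x86 are expected to differ.

```python
import shutil
import subprocess


def portage_var(name):
    """Return Portage's value for `name`, or None if portageq is unavailable."""
    if shutil.which("portageq") is None:
        return None
    result = subprocess.run(
        ["portageq", "envvar", name],
        capture_output=True, text=True, check=False,
    )
    value = result.stdout.strip()
    return value or None


def chost_mismatch(chost, chost_x86):
    """Return a warning string if both values are set and differ, else None."""
    if chost and chost_x86 and chost != chost_x86:
        return f"CHOST={chost} but CHOST_x86={chost_x86}"
    return None


if __name__ == "__main__":
    warning = chost_mismatch(portage_var("CHOST"), portage_var("CHOST_x86"))
    print(warning or "CHOST and CHOST_x86 agree (or portageq is not available)")
```

With the values from this post, chost_mismatch("i486-pc-linux-gnu", "i686-pc-linux-gnu") returns a warning string, while two identical values return None.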


PROBLEM 2: Qt SSE2 configuration

When built with gcc-6.4.0-r1, the qtcore-5.9.6 package builds the qHash function to require some SSE2 instructions even when I explicitly add the -mno-sse2 option to CFLAGS/CXXFLAGS. This doesn't show up until it tries to build qtwidgets: when it tries to actually use the "uic" it builds early on, uic crashes immediately with an illegal instruction in qtcore's qHash function.

This is somewhat similar to bug 552942, but not exactly the same. Currently, Qt5 appears to fully disable SSE2 only if the "can I compile SSE2" configure auto-test fails, or if it is explicitly disabled on the configure command line ("-no-sse2", but not via CFLAGS?). It doesn't appear to try to RUN the compiled auto-test program. My guess is that it is supposed to detect SSE2 availability at runtime, but something about the compiler-generated code (inlining related?) is letting some SSE2 (a movq, at least) into a code path that has not verified the availability of SSE2.

WORKAROUND: Force the "can I compile SSE2" test to fail intentionally with a user patch (epatch_user), by adding /etc/portage/patches/dev-qt/qtcore/qt-configure-failSSE2.patch
Code:
diff --git a/configure.json b/configure.json
index ce20aa3..9c554c4 100644
--- a/configure.json
+++ b/configure.json
@@ -367,8 +367,8 @@
             "test": {
                 "include": "emmintrin.h",
                 "main": [
-                    "__m128i a = _mm_setzero_si128();",
-                    "_mm_maskmoveu_si128(a, _mm_setzero_si128(), 0);"
+                    "__m128i a = INTENTIONALLY_FAIL1();",
+                    "INTENTIONALLY_FAIL3(a, INTENTIONALLY_FAIL2(), 0);"
                 ],
                 "qmake": [
                     "!defined(QMAKE_CFLAGS_SSE2, var): error(\"This compiler does not support SSE2\")",

I haven't confirmed if this needs to be copied/symlinked to other Qt packages or not. As a defensive measure, I've symlinked the qtcore directory with this patch to several other qt packages as well, but I don't know if that is necessary.

POSSIBLE GENTOO (or upstream) IMPROVEMENTS:

  • Maybe there is a bug in gcc about what functions may or may not use SSE2 in dynamically-determined cases, that could be fixed (or maybe is already fixed in other versions)?
  • Maybe the qHash logic in Qt source could be reworked to avoid this, and only use SSE2 if dynamically determined to be enabled?
  • Maybe Qt should actually try to run an SSE2 test program (and validate some output) before deciding to enable SSE2?
  • Maybe re-introduce some way to explicitly pass -no-sse2 to Qt's configure script from qt5-build.eclass when indicated by some kind of user configuration option? There used to be a technique based on adapting CFLAGS, from the aforementioned bug 552942, but it was removed in commit https://github.com/gentoo/qt/commit/19e67f928a60a88953d6b85443e630367cedf46a#diff-5b814a94d0e5e878c4a513d23ecbceff Also consider whether a USE-flag technique would be better than a CFLAGS technique.
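
One way to read the "actually run a test" idea above: on Linux the kernel already reports CPU features in /proc/cpuinfo, so a build script could consult that before enabling SSE2. The following is a minimal Python sketch under that assumption, not what Qt's configure does; the function names are hypothetical, and it ignores cross-compilation, where the build host's CPU says nothing about the target.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_sse2(cpuinfo_text):
    """True if the exact flag 'sse2' is listed (not just 'sse')."""
    return "sse2" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("sse2 available:", has_sse2(f.read()))
    except OSError:
        print("/proc/cpuinfo is not readable on this system")
```

The flags are compared as whole words, so a CPU that lists "sse" but not "sse2" (such as the Athlon MP discussed here) is reported as lacking SSE2.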


More details: The following notes were from Qt 5.9.4 (not 5.9.6) some months ago, but 5.9.6 seems to fail in the same way (I just didn't copy down the newer details):
Code:
/var/tmp/portage/dev-qt/qtwidgets-5.9.4-r1/work/qtbase-opensource-src-5.9.4/src/
widgets/uic_wrapper.sh dialogs/qfiledialog.ui -o .uic/ui_qfiledialog.h
make: *** [Makefile:1373: .uic/ui_qfiledialog.h] Illegal instruction
make: *** Waiting for unfinished jobs....

Code:
Dump of assembler code for function _Z5qHashRK10QByteArrayj:
   0xb7b85470 <+0>:     push   %ebp
   0xb7b85471 <+1>:     push   %edi
   0xb7b85472 <+2>:     push   %esi
   0xb7b85473 <+3>:     push   %ebx
   0xb7b85474 <+4>:     call   0xb7b2dad0
   0xb7b85479 <+9>:     add    $0x43ab87,%ebx
   0xb7b8547f <+15>:    sub    $0x1c,%esp
   0xb7b85482 <+18>:    mov    0x30(%esp),%eax
   0xb7b85486 <+22>:    mov    0x34(%esp),%ecx
   0xb7b8548a <+26>:    mov    (%eax),%ebp
   0xb7b8548c <+28>:    mov    -0x25c(%ebx),%eax
   0xb7b85492 <+34>:    mov    0x4(%ebp),%edi
   0xb7b85495 <+37>:    mov    %eax,(%esp)
   0xb7b85498 <+40>:    add    0xc(%ebp),%ebp
   0xb7b8549b <+43>:    movq   (%eax),%xmm0
=> 0xb7b8549f <+47>:    movq   %xmm0,0x8(%esp)
   0xb7b854a5 <+53>:    mov    0xc(%esp),%edx
   0xb7b854a9 <+57>:    mov    0x8(%esp),%eax
   0xb7b854ad <+61>:    mov    %edx,%esi
   0xb7b854af <+63>:    or     %eax,%esi
   0xb7b854b1 <+65>:    je     0xb7b854f8 <_Z5qHashRK10QByteArrayj+136>
   0xb7b854b3 <+67>:    xor    %edx,%edx
   0xb7b854b5 <+69>:    and    $0x100000,%eax
   0xb7b854ba <+74>:    mov    %edx,%esi
   0xb7b854bc <+76>:    or     %eax,%esi
   0xb7b854be <+78>:    jne    0xb7b854e8 <_Z5qHashRK10QByteArrayj+120>
   0xb7b854c0 <+80>:    test   %edi,%edi
   0xb7b854c2 <+82>:    mov    %ecx,%eax
   0xb7b854c4 <+84>:    je     0xb7b854dc <_Z5qHashRK10QByteArrayj+108>
   0xb7b854c6 <+86>:    add    %ebp,%edi
   0xb7b854c8 <+88>:    mov    %eax,%ecx
   0xb7b854ca <+90>:    inc    %ebp
   0xb7b854cb <+91>:    shl    $0x5,%ecx
   0xb7b854ce <+94>:    sub    %eax,%ecx
   0xb7b854d0 <+96>:    mov    %ecx,%eax
   0xb7b854d2 <+98>:    movzbl -0x1(%ebp),%ecx
   0xb7b854d6 <+102>:   add    %ecx,%eax
   0xb7b854d8 <+104>:   cmp    %ebp,%edi
   0xb7b854da <+106>:   jne    0xb7b854c8 <_Z5qHashRK10QByteArrayj+88>
   0xb7b854dc <+108>:   add    $0x1c,%esp
   0xb7b854df <+111>:   pop    %ebx
   0xb7b854e0 <+112>:   pop    %esi
   0xb7b854e1 <+113>:   pop    %edi
   0xb7b854e2 <+114>:   pop    %ebp
   0xb7b854e3 <+115>:   ret
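
For anyone collecting dumps like this from several packages, the "=>" line (gdb's marker for the current instruction, here the one that raised the illegal-instruction fault) can be pulled out mechanically. This is a minimal Python sketch that assumes the exact gdb AT&T-syntax layout shown above; the function names are made up for illustration. Note that XMM registers arrived with SSE, so an XMM operand alone does not prove SSE2; as far as I know, though, the form of movq that works on XMM registers is an SSE2 instruction, while the MMX form uses the %mm registers.

```python
import re

# Matches a gdb disassembly line such as
#   "=> 0xb7b8549f <+47>:    movq   %xmm0,0x8(%esp)"
LINE_RE = re.compile(r"^(=>)?\s*(0x[0-9a-f]+)\s+<\+(\d+)>:\s+(\S+)\s*(.*)$")


def current_instruction(disassembly):
    """Return (address, offset, mnemonic, operands) for the '=>' line, or None."""
    for line in disassembly.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(1):
            address, offset, mnemonic, operands = match.group(2, 3, 4, 5)
            return address, int(offset), mnemonic, operands.strip()
    return None


def uses_xmm(operands):
    """True if any operand names an XMM register (AT&T syntax, e.g. %xmm0)."""
    return re.search(r"%xmm\d+", operands) is not None
```

Feeding it the dump above picks out the movq at offset 47, and uses_xmm() on its operands is true.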


PROBLEM 3: Rust bootstrapping requires SSE2

Rust is annoyingly needed for new firefox, and it increases the overall firefox build time by a factor of 2 or more, which is a significant concern on obsolete hardware. I'm not sure whether firefox-bin requires SSE2, or what other browser options might work.

The rust ebuild now explicitly specifies REQUIRED_USE=sse2 on x86, although it did not when I first started updating these machines.

(Another tangent: the rust ebuild apparently always downloads and uses a binary bootstrap build, even if you happen to already have a full build of rust installed and running? Is that necessary?)

WORKAROUND: None I can find. (Although rust takes so long to compile on old machines I kind of doubt anyone would bother, even if it can be made to work.)

POSSIBLE GENTOO IMPROVEMENTS: Some quick googling for SSE2 and rust suggests that an i586 bootstrap build existed a few years ago for such cases. Not sure if it still exists. Perhaps use that for bootstrapping if SSE2 is not available?
krinn
Watchman
Watchman


Joined: 02 May 2003
Posts: 7470

Posted: Tue Oct 09, 2018 9:18 am

for #1
That wiki article is horribly written, because you cannot tell what it is trying to show. If I understand its example correctly, it shows an old i386 CHOST set on an i686-capable computer, and tries to show the user how to migrate from i386 to a full i686 CHOST.
This is a kind of stupid example: i686 has been out for years, and the users with an i386/i486/i586 CHOST wishing to migrate to i686 could be counted on one's fingers.
Or it could help users who picked up the wrong stage3 file and wish to move it to a full i686 CHOST.
Not the best example to pick for changing CHOST.

#2
Setting sse2 in CFLAGS doesn't control anything outside gcc. By enabling sse2 in gcc, you are telling gcc "use sse2 wherever you can optimize the code with it"; there is no check of whether your CPU handles sse2 or not, just an instruction that tells gcc: use sse2 if you see something that could be changed into sse2 code.
This is how gcc has always worked, because of cross-building, or just to build code for another CPU with the feature enabled or disabled as you wish.

The problem is that it only acts on C code. By saying -mno-sse2 you are not saying "do not use sse2", but "do not optimize the C code into sse2 instructions, even if you see something that could be changed into sse2 instructions"; and if someone uses assembler or inline sse2 instructions, the result will use sse2 even though you have told gcc not to create any by itself.
movq was added with MMX, which makes movq legitimate on a non-sse2 CPU, so using movq would only be bad if your code has to run on a non-MMX CPU ; see https://en.wikipedia.org/wiki/X86_instruction_listings#Original_MMX_instructions

To control optional sse2 within a program, you use CPU_FLAGS_X86 ; however, it is the Gentoo dev who controls whether the package makes use of CPU_FLAGS_X86 or not, and whether the package can be built with or without sse2.

#3
In case 3, the Gentoo dev has properly set CPU_FLAGS_X86 to force sse2, because firefox has announced that it will stop building firefox for hosts unable to handle sse2 ; see https://support.mozilla.org/en-US/kb/your-hardware-no-longer-supported
There is nothing you can really do except use another browser.
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 54596
Location: 56N 3W

Posted: Tue Oct 09, 2018 5:35 pm

krinn,

The CHOST changing guide was written for when glibc dropped i386 support and many users that had installed the i386 stage3 in error needed to change their CHOST, so that they could update glibc. That's ancient history. 64 bit CPUs were just a dream :)

The CHOST sets which CPU the toolchain will run on. It does not affect other things.

C(XX)FLAGS control how C/C++ is compiled. Ebuilds can filter C(XX)FLAGS, so while passing -mno-sse2 is the right thing to do, it may never get to the build system.
That will vary by package.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Brakio
n00b
n00b


Joined: 12 Nov 2018
Posts: 3

Posted: Mon Nov 12, 2018 9:34 pm

Hi,

I'm in a similar situation to mmogilvi.
I have an AMD Athlon XP, and have been maintaining Gentoo on it because it is one of the only distributions that does not use SSE2 instructions in some applications, for instance Firefox, which was crashing on several other distributions even with an i686 configuration (Gentoo lets me configure software to suit my PC and keep it up to date without too much difficulty).

Gentoo runs well on that PC (at least as efficiently as Linux on a Raspberry Pi, but with more RAM, a better video card and a good audio card), and I thought it would be sufficient for my son, for simple internet browsing and simple tasks like text editing.

But now that I cannot have an up-to-date browser (Firefox), which has been the case for a few months already, I'm wondering whether I need to get rid of the machine, even though it still works properly for everything else....
or just use a frozen Firefox version that will not be maintained anymore....


krinn wrote:

#3
In case 3, the Gentoo dev has properly set CPU_FLAGS_X86 to force sse2, because firefox has announced that it will stop building firefox for hosts unable to handle sse2 ; see https://support.mozilla.org/en-US/kb/your-hardware-no-longer-supported
There is nothing you can really do except use another browser.


I have tried a lot of other browsers, but none of them build or work properly except the text-mode ones (lynx for instance)
or those with very little graphics support (maybe links, if I remember correctly)?

I did not find a decent browser that does not require SSE2. Correct me if I am wrong, but as WebKit has required SSE2 for longer than Firefox, Chrome/Chromium and other WebKit-based browsers require SSE2 too. What are good alternatives?

EDIT: I also tried Dillo. It is currently one of the better alternatives I have found, but it is still really limited.
Ant P.
Watchman
Watchman


Joined: 18 Apr 2009
Posts: 6920

Posted: Tue Nov 13, 2018 1:45 pm

Netsurf is better than Dillo for most uses, and doesn't require SSE2 at all (it's designed primarily for low-end ARM).
NeddySeagoon
Administrator
Administrator


Joined: 05 Jul 2003
Posts: 54596
Location: 56N 3W

Posted: Tue Nov 13, 2018 5:49 pm

mmogilvi,

mmogilvi wrote:
When built with gcc-6.4.0-r1, the qtcore-5.9.6 package builds the qHash function to require some SSE2 instructions even when I explicitly add the -mno-sse2 option to CFLAGS/CXXFLAGS.

The problem may be some hand optimised assembly code that assumes the use of sse2 rather than anything gcc is doing.
Brakio
n00b
n00b


Joined: 12 Nov 2018
Posts: 3

Posted: Wed Nov 14, 2018 8:43 pm

NeddySeagoon wrote:
mmogilvi,

mmogilvi wrote:
When built with gcc-6.4.0-r1, the qtcore-5.9.6 package builds the qHash function to require some SSE2 instructions even when I explicitly add the -mno-sse2 option to CFLAGS/CXXFLAGS.

The problem may be some hand optimised assembly code that assumes the use of sse2 rather than anything gcc is doing.


If I remember correctly, I solved this manually by modifying the Makefiles or the ebuild, because sse2 was being used explicitly on the gcc command line even though CHOST/CFLAGS were correct (I know I could have helped the community by submitting a patch somewhere, but my modifications were not clean and would have affected those who want sse2). This was not optimized assembly.

EDIT:
In fact no, looking at my history log, I used a patch from someone else, and I think it did the job:
https://bugs.gentoo.org/648004
Brakio
n00b
n00b


Joined: 12 Nov 2018
Posts: 3

Posted: Wed Nov 14, 2018 9:06 pm

Ant P. wrote:
Netsurf is better than Dillo for most uses, and doesn't require SSE2 at all (it's designed primarily for low-end ARM).


Thank you, I'll try this one.
All times are GMT