[tutorial] BadRAM 2025 update

ecko · Tux's lil' helper Joined: 04 Jul 2010 Posts: 116

I have been dealing with RAM issues, and I found the existing information online to be slightly outdated, so I decided to write a short tutorial.

Previous references on this forum:

https://forums.gentoo.org/viewtopic-t-1140476-start-0.html "[SOLVED] qtwebengine fails to build" (2021)
https://forums.gentoo.org/viewtopic-t-1165113-start-0.html "badmem static allocation of memory hole" (2023)

Symptoms

I had compiler segfaults when building big packages, such as dev-qt/qtwebengine-6.8.2 or other Qt or KDE applications. With --keep-going in the options, I let the run continue to the end and then relaunched it; the second attempt was often successful, though very large packages like qtwebengine or chromium would fail again.

Memtest86+

I installed sys-apps/memtest86+-7.20 and rebooted into it. Memtest86+ runs 10 different tests (e.g. simple read/write, block move, modulo 20). On my machine it ran at about 1 GB/min, meaning roughly 2 hours for the 128 GB I have.

At any time during the test, press <F1><F4><F4> to switch to the badram mode, whose output is the most usable for the solution. Instead of listing individual failed bytes (which could number in the thousands), memtest86+ tries to summarize them into ranges using a mask. The output is limited to 10 lines.
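
To illustrate how a mask can summarize many failing addresses, here is a minimal Python sketch. It assumes the usual BadRAM convention of address,mask pairs, where only the bits set in the mask are compared and the cleared bits act as wildcards. The function names and the example values are hypothetical, chosen for illustration, and are not taken from my memtest86+ output.

```python
def is_bad(address: int, base: int, mask: int) -> bool:
    """Return True if `address` is covered by the pattern `base,mask`."""
    # Only the bits set in the mask are compared; cleared bits are wildcards.
    return (address & mask) == (base & mask)


def count_covered(mask: int, address_bits: int = 32) -> int:
    """Number of addresses one pattern matches in an address_bits-wide space."""
    significant = bin(mask & ((1 << address_bits) - 1)).count("1")
    return 1 << (address_bits - significant)


# Hypothetical pattern, for illustration only.
BASE = 0x12345000
MASK = 0xFFFFF000  # low 12 bits are wildcards, so this covers one 4 KiB page

print(is_bad(0x12345ABC, BASE, MASK))  # True: same page as BASE
print(is_bad(0x12346000, BASE, MASK))  # False: next page
print(count_covered(MASK))             # 4096
```

The fewer bits set in the mask, the more addresses a single line covers, which is how thousands of failures can fit into at most 10 lines, at the cost of also excluding some good memory.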

Here is the output in my case, copied from a mobile phone picture:

pietinger · Posted: Sun Feb 09, 2025 9:58 am

Moved from Kernel & Hardware to Documentation, Tips & Tricks.
_________________
https://wiki.gentoo.org/wiki/User:Pietinger

NeddySeagoon · Posted: Sun Feb 09, 2025 1:44 pm

ecko,

Two things
A memtest failure does not always mean the RAM is faulty.
Often, removing the RAM and refitting it fixes the problem. It's called 'wiping the contacts'.

This information would be better as a Wiki page. It can get lost/forgotten here.
_________________
Regards,

NeddySeagoon

ecko · Tux's lil' helper Joined: 04 Jul 2010 Posts: 116

In my case I tested with the 4 DIMMs together (initial configuration), then tested them one by one (I numbered them with a pencil), then two by two, then all 4 again. The results were consistent with 2 particular DIMMs reporting errors, and not the other 2. I did not attempt to clean the contacts though. You're right, reseating the components should be mentioned as part of the procedure.

I will consider creating an account on the wiki and moving the howto there so more contributors can improve the instructions.