Gentoo Forums
Cannot emerge world

sam_
Developer
Joined: 14 Aug 2020
Posts: 1737
Posted: Tue Jun 25, 2024 7:17 am

The ICEs are weird in that it's abnormal to get so many within a single package, even if something is pretty badly broken by a real bug.

Death within GCC's hash tables or garbage collector almost always means HW failure (usually bad RAM, but possibly overclocking-induced or overheating). How long did you run memtest for? Is your CPU overclocked?

logrusx
Veteran
Joined: 22 Feb 2018
Posts: 1833
Posted: Tue Jun 25, 2024 8:28 am

sam_ wrote:
The ICEs are weird in that it's abnormal to get so many within a single package, even if something is pretty badly broken by a real bug.

Death within GCC's hash tables or garbage collector almost always means HW failure (usually bad RAM, but possibly overclocking-induced or overheating). How long did you run memtest for? Is your CPU overclocked?


It's pretty consistent though. I have some experience with bad RAM and it's not like that at all. Actually, back in the day I had a faulty CPU too, and it's never the same error. It just hangs in many very different situations, sometimes just while idling.

I'm leaning towards a combination of a bug and circumstances.

Best Regards,
Georgi

sam_
Developer
Joined: 14 Aug 2020
Posts: 1737
Posted: Tue Jun 25, 2024 9:30 am

I've seen bad RAM manifest exactly like this. As I said, the hash tables in GCC (and the GC) are sensitive to any corruption and hence easily trip over it. (That's why I said it. I deal with real ICEs all the time, and I also get plenty of bogus reports caused by HW.)

Anyway, if it isn't (though I'm sceptical), I'll need preprocessed source from running the failing command manually in the builddir and appending -save-temps. It'll create a .ii file. See also https://wiki.gentoo.org/wiki/GCC_ICE_reporting_guide.
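
As a rough illustration of the step above, the sketch below takes a failing compiler command, as it might be copied from a build log, and appends -save-temps to it. Python is used only to keep the example self-contained. The command line, the file name foo.cc, and the helper name with_save_temps are hypothetical placeholders, not taken from this thread.

```python
import shlex


def with_save_temps(failing_command: str) -> list[str]:
    """Return the failing compiler command with -save-temps appended once."""
    args = shlex.split(failing_command)
    if "-save-temps" not in args:
        args.append("-save-temps")
    return args


# Hypothetical command copied from a build log; substitute the real one.
cmd = with_save_temps("x86_64-pc-linux-gnu-g++ -O2 -c foo.cc -o foo.o")

# Print a shell-safe command line to paste into a terminal in the builddir.
print(shlex.join(cmd))
```

Running the printed command manually in the build directory should reproduce the ICE and produce the preprocessed .ii file mentioned above.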

logrusx
Veteran
Joined: 22 Feb 2018
Posts: 1833
Posted: Tue Jun 25, 2024 5:30 pm

sam_ wrote:
I've seen bad RAM manifest exactly like this. As I said, the hash tables in GCC (and the GC) are sensitive to any corruption and hence easily trip over it. (That's why I said it. I deal with real ICEs all the time, and I also get plenty of bogus reports caused by HW.)

Anyway, if it isn't (though I'm sceptical), I'll need preprocessed source from running the failing command manually in the builddir and appending -save-temps. It'll create a .ii file. See also https://wiki.gentoo.org/wiki/GCC_ICE_reporting_guide.


Yes, that actually makes sense. Those structures consume the most memory, so the chance of corruption showing up exactly there is significantly higher.

cfgauss wrote:
    fedeliallalinea wrote:
        Code:
        internal compiler error: Segmentation fault

        Usually this error is related to a hardware issue.

    I used smartctl to test my NVMEs (no hard disk), memtest86+ to test my memory, and s-tui to test my CPU and the hardware passes these tests.

    How would I go about checking for software problems?


smartctl by itself is not actually a test. I don't remember whether it can schedule a read/write test on an NVMe drive. I remember I could test my HDD from the BIOS of my old ThinkPad, but I don't know about NVMe drives.

Memtest86+ should be run for at least 8 passes (I usually mention this and don't know why I didn't ask about it in this thread). I remember somebody here on the forums had to run it even longer before it found memory errors, until the 12th pass or so. I think we later discovered that the user was running an overclocking profile for the memory, counting on some "warranties" and claims made by the memory vendor. Could you run it overnight, so it can really test the memory hard, and make sure you're not using any memory overclocking profile? Replacing a faulty memory module would be the easiest fix, compared to hunting for the issue anywhere else.

What tests did you run on the NVMe drives, and how long did you run the memory tests? Do you happen to remember which pass they got up to?

Best Regards,
Georgi

cfgauss
l33t
Joined: 18 May 2005
Posts: 716
Location: USA
Posted: Wed Jun 26, 2024 5:15 pm

sam_ wrote:
Is your CPU overclocked?

No.

I tested my 128 GB of RAM by running memtest86+ from grub overnight. This resulted in 4 passes and 0 errors. I'm away from my box for a week and only have ssh access. When I get back, for how many passes should I run memtest86+?

I'll then report the results here.

logrusx
Veteran
Joined: 22 Feb 2018
Posts: 1833
Posted: Wed Jun 26, 2024 5:49 pm

cfgauss wrote:
When I get back, for how many passes should I run memtest86+?


At least 8 is recommended in the documentation, if I remember correctly.
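
For planning the next run, a back-of-the-envelope estimate can be sketched as below. It assumes every pass takes about the same time, which memtest86+ does not guarantee. The 10-hour figure for "overnight" is an assumption, since the thread does not state the actual duration, and the function name hours_for_passes is hypothetical.

```python
def hours_for_passes(observed_passes: int, observed_hours: float,
                     target_passes: int) -> float:
    """Scale an observed run linearly to estimate the time for more passes."""
    if observed_passes <= 0:
        raise ValueError("observed_passes must be positive")
    return observed_hours / observed_passes * target_passes


# 4 passes were reported for the overnight run on 128 GB of RAM.
# 10.0 hours is an assumed length for "overnight", not a figure from the thread.
print(hours_for_passes(4, 10.0, 8))  # prints 20.0
```

Under those assumptions, 8 passes would need roughly twice the original overnight run, so a single night is unlikely to be enough.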

Best Regards,
Georgi

pjp
Administrator
Joined: 16 Apr 2002
Posts: 20150
Posted: Fri Jun 28, 2024 12:42 am

cfgauss wrote:
sam_ wrote:
Is your CPU overclocked?

No.

I tested my 128 GB of RAM by running memtest86+ from grub overnight. This resulted in 4 passes and 0 errors. I'm away from my box for a week and only have ssh access. When I get back, for how many passes should I run memtest86+?

I'll then report the results here.

I had a memory issue on the 2nd channel of a motherboard. The only obvious symptom was that the system would reboot while trying to compile a hardened kernel and only a hardened kernel. There were no other apparent problems. That was a long time ago, but I had to run memtest for quite a while. Then when it finally found something, I had to move the physical RAM around to identify that the errors were associated with a particular slot on the motherboard and not the memory itself. I _think_ I recall that overnight wasn't long enough. I'd imagine it takes longer to go through maybe 16 times the amount of RAM I had.
_________________
Quis separabit? Quo animo?

All times are GMT.