Gentoo Forums
[SOLVED] Cannot emerge world

sam_
Developer


Joined: 14 Aug 2020
Posts: 2038

Posted: Tue Jun 25, 2024 7:17 am

The ICE is weird in that it's abnormal to get so many within a single package even if something is pretty badly broken (but a real bug).

Death within GCC's hash tables or garbage collector almost always means HW failure (usually bad RAM, but possibly overclocking-induced or overheating). How long did you run memtest for? Is your CPU overclocked?
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2526

Posted: Tue Jun 25, 2024 8:28 am

sam_ wrote:
The ICE is weird in that it's abnormal to get so many within a single package even if something is pretty badly broken (but a real bug).

Death within GCC's hash tables or garbage collector almost always means HW failure (usually bad RAM, but possibly overclocking-induced or overheating). How long did you run memtest for? Is your CPU overclocked?


It's pretty consistent though. I have some experience with bad RAM and it's not like that at all. Actually, back in the day I had a faulty CPU too, and it was never the same error. It would just hang in many very different situations, sometimes just while idling.

I'm leaning towards a combination of a bug and circumstances.

Best Regards,
Georgi
sam_
Developer


Joined: 14 Aug 2020
Posts: 2038

Posted: Tue Jun 25, 2024 9:30 am

I've seen bad RAM manifest exactly like this. As I said, the hash table in GCC (and the GC) are sensitive to any corruption and hence easily trip over it. (That's why I said it. I deal with real ICEs all the time, and I also get plenty of bogus reports caused by HW.)

Anyway, if it isn't (though I'm sceptical), I'll need preprocessed source from running the failing command manually in the builddir and appending -save-temps. It'll create a .ii file. See also https://wiki.gentoo.org/wiki/GCC_ICE_reporting_guide.
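
As a rough illustration of the step described above, here is a minimal shell sketch. The helper name save_temps is made up for this example (it is not part of Portage or GCC), and the build path and compile command in the usage comments are placeholders: the real ones have to be copied from the build log of the failing package.

```shell
# Hypothetical helper (not part of Portage or GCC): run the given compile
# command again with -save-temps appended, so that GCC keeps its intermediate
# files (a .ii file for preprocessed C++).
save_temps() {
    "$@" -save-temps
}

# Usage sketch, commented out because both the directory and the command
# depend on the failing build. The path assumes Portage's default
# PORTAGE_TMPDIR; the command must be the exact failing one from the log.
# cd /var/tmp/portage/<category>/<package-version>/work/<builddir>
# save_temps x86_64-pc-linux-gnu-g++ <flags from the build log> -c <file>.cpp
```

The resulting .ii file is what gets attached to the bug report; the wiki guide linked above covers the remaining steps.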
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2526

Posted: Tue Jun 25, 2024 5:30 pm

sam_ wrote:
I've seen bad RAM manifest exactly like this. As I said, the hash table in GCC (and the GC) are sensitive to any corruption and hence easily trip over it. (That's why I said it. I deal with real ICEs all the time, and I also get plenty of bogus reports caused by HW.)

Anyway, if it isn't (though I'm sceptical), I'll need preprocessed source from running the failing command manually in the builddir and appending -save-temps. It'll create a .ii file. See also https://wiki.gentoo.org/wiki/GCC_ICE_reporting_guide.


Yes, it actually makes sense. Those things consume the most memory and the chance of breaking exactly there is significantly higher.

cfgauss wrote:
fedeliallalinea wrote:
Code:
internal compiler error: Segmentation fault

Usually this error is related to a hardware issue.

I used smartctl to test my NVMEs (no hard disk), memtest86+ to test my memory, and s-tui to test my CPU and the hardware passes these tests.

How would I go about checking for software problems?


Smartctl is not actually a test. I don't remember whether it can schedule a read/write test on an NVMe drive. I remember I could test my HDD from the BIOS of my old ThinkPad, but I don't know about NVMe drives.

Memtest86+ should be run for at least 8 passes (I usually mention this and don't know why I didn't ask about it in this thread), and I remember somebody here on the forums had to run it even longer to discover memory errors, until the 12th pass or so. I think we later discovered that the user was using an overclocking profile for the memory, counting on some "warranties" and claims made by the memory vendor. Could you run it overnight and make sure you're not using any overclocking memory profile, so it can really test the memory hard? Replacing a faulty memory module would be the easiest fix, compared to hunting for the issue anywhere else.

What tests did you run on the NVMe, and for how long did you run the memory tests? Do you happen to remember which pass they ran up to?

Best Regards,
Georgi
cfgauss
l33t


Joined: 18 May 2005
Posts: 726
Location: USA

Posted: Wed Jun 26, 2024 5:15 pm

sam_ wrote:
Is your CPU overclocked?

No.

I tested my 128 GB of RAM by running memtest86+ from grub overnight. This resulted in 4 passes and 0 errors. I'm away from my box for a week and only have ssh access. When I get back, for how many passes should I run memtest86+?

I'll then report the results here.
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2526

Posted: Wed Jun 26, 2024 5:49 pm

cfgauss wrote:
When I get back, for how many passes should I run memtest86+?


At least 8 is recommended in the documentation, if I remember correctly.

Best Regards,
Georgi
pjp
Administrator


Joined: 16 Apr 2002
Posts: 20521

Posted: Fri Jun 28, 2024 12:42 am

cfgauss wrote:
sam_ wrote:
Is your CPU overclocked?

No.

I tested my 128 GB of RAM by running memtest86+ from grub overnight. This resulted in 4 passes and 0 errors. I'm away from my box for a week and only have ssh access. When I get back, for how many passes should I run memtest86+?

I'll then report the results here.

I had a memory issue on the 2nd channel of a motherboard. The only obvious symptom was that the system would reboot while trying to compile a hardened kernel and only a hardened kernel. There were no other apparent problems. That was a long time ago, but I had to run memtest for quite a while. Then when it finally found something, I had to move the physical RAM around to identify that the errors were associated with a particular slot on the motherboard and not the memory itself. I _think_ I recall that overnight wasn't long enough. I'd imagine it takes longer to go through maybe 16 times the amount of RAM I had.
_________________
Quis separabit? Quo animo?
cfgauss
l33t


Joined: 18 May 2005
Posts: 726
Location: USA

Posted: Sat Jul 06, 2024 9:05 pm

sam_ wrote:
Death within GCC's hash tables or garbage collector almost always means HW failure (usually bad RAM, but possibly overclocking-induced or overheating). How long did you run memtest for?

I ran memtest86+ for about three days (12 complete passes) with no errors. This bug report describes a user who has the same error message compiling dev-qt/qtdeclarative as I do. He's using gcc 14.1.1 with an AMD Ryzen Threadripper 1950X. I'm using the same gcc version but with an AMD Ryzen Threadripper 2950X. He was able to compile with Clang, as am I.

Does this suggest recompiling gcc or some other approach?
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2526

Posted: Sun Jul 07, 2024 6:40 pm

I finally emerged gcc-14.1.1_p20240622, and qtdeclarative-6.7.2 didn't break with it.

cfgauss wrote:
sam_ wrote:
Death within GCC's hash tables or garbage collector almost always means HW failure (usually bad RAM, but possibly overclocking-induced or overheating). How long did you run memtest for?

I ran memtest86+ for about three days (12 complete passes) with no errors. This bug report describes a user who has the same error message compiling dev-qt/qtdeclarative as I do. He's using gcc 14.1.1 with an AMD Ryzen Threadripper 1950X. I'm using the same gcc version but with an AMD Ryzen Threadripper 2950X. He was able to compile with Clang, as am I.

Does this suggest recompiling gcc or some other approach?


I don't think so. If you read comment #4, it redirects to a GCC bug that goes back as early as version 10 and involves znver1. I guess this is what you hit. I'm on znver3 here, which I guess is why I didn't hit it.

Best Regards,
Georgi
cfgauss
l33t


Joined: 18 May 2005
Posts: 726
Location: USA

Posted: Sun Jul 07, 2024 9:58 pm

cfgauss wrote:
Does this suggest recompiling gcc or some other approach?

The Gentoo Wiki describes how to create /etc/portage/env/compiler-clang and use it with /etc/portage/package.env to list the packages to be compiled by Clang.

Is it "safe" to have Clang compile, say, only packages that fail with an ICE error and keep gcc for all the others?

If it is "safe", can or should an inclusion be an entire category in /etc/portage/package.env, e.g. dev-qt/* compiler-clang?
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2526

Posted: Mon Jul 08, 2024 4:29 am

Oh, last night I must have read only up to "Does this suggest recompiling gcc". I'm sorry.

Yes, it seems it should be some other approach like the one you've taken.

Given that the Clang wiki suggests setting up a GCC fallback, it should be safe to have Clang compile some packages. As to whether it can be a whole category in your package.env file, I don't know. You should first test whether Portage supports it.
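
For reference, a minimal sketch of the per-package setup being discussed. It writes into a scratch directory so it is harmless to run as-is; on a real system the two files live under /etc/portage and need root. The env file contents are an assumption (only CC and CXX are set here; the wiki version may set more), and only the single package that hits the ICE is listed, since the dev-qt/* form is untested in this thread.

```shell
# Scratch stand-in for /etc/portage so the sketch can be run safely.
conf="$(mktemp -d)/etc/portage"
mkdir -p "$conf/env"

# env/compiler-clang: plain VAR="value" assignments applied to matching packages.
# Assumption: CC and CXX alone are enough for this illustration.
cat > "$conf/env/compiler-clang" <<'EOF'
CC="clang"
CXX="clang++"
EOF

# package.env: one "<package atom> <env file name>" pair per line.
# If package.env is a directory on your system, put this line in a file inside it.
cat >> "$conf/package.env" <<'EOF'
dev-qt/qtdeclarative compiler-clang
EOF

# Show what was written.
cat "$conf/env/compiler-clang" "$conf/package.env"
```

Every package not listed in package.env keeps building with the system default compiler, which is GCC in this setup.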

Best Regards,
Georgi
All times are GMT