Author |
Message |
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Wed Sep 27, 2017 12:57 am Post subject: Re: Doing another system build |
|
|
pjp wrote: | vaxbrat wrote: | The fab info on the chip is week 9 | That sucks, and suggests it will be a while before it will be "safe" to pick one up. |
I'm seeing great deals on Bulldozer and would snatch one up if my mobo was am3+ instead of am3 and am2+. Maybe I should check mobo prices and see if they are cheap too. I could reuse my memory then instead of buying DDR4. Screw AMD and their "If it doesn't crash on Win10, we don't care if it crashes on Linux" attitude. |
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20583
|
Posted: Wed Sep 27, 2017 2:11 am Post subject: |
|
|
They're replacing CPUs though, aren't they? That would work for me. If I had the spare cash for an entirely new system, I'd buy one. _________________ Quis separabit? Quo animo? |
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Wed Sep 27, 2017 2:27 am Post subject: |
|
|
pjp wrote: | They're replacing CPUs though, aren't they? That would work for me. If I had the spare cash for an entirely new system, I'd buy one. |
It sounds like they are making people jump through hoops and taking a long time. I had a new car whose battery died at three months on a holiday. GM picked up the car, towed it to the dealer and replaced the battery THAT DAY. |
|
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Wed Sep 27, 2017 7:08 am Post subject: |
|
|
Because many batteries have already been produced...
You are comparing a battery with the latest CPU. They started mass production a few days before release; now that it is fixed, it will again take days before they have enough good units to supply everyone.
I wonder if AMD plans to keep selling its stock of bad units, because Windows users are not really able to see the problem.
That would kill the second-hand market for Ryzen.
And resellers buying on the grey market are fucked for good and will keep selling bad units |
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Wed Sep 27, 2017 1:25 pm Post subject: |
|
|
krinn wrote: | Because many batteries have already been produced...
You are comparing a battery with the latest CPU. They started mass production a few days before release; now that it is fixed, it will again take days before they have enough good units to supply everyone.
I wonder if AMD plans to keep selling its stock of bad units, because Windows users are not really able to see the problem.
That would kill the second-hand market for Ryzen.
And resellers buying on the grey market are fucked for good and will keep selling bad units |
Very true. However, the RMA'd units could be resold to mass market PC sellers at a discount. It won't matter to the 96+% who just watch YouTube on Win10. |
|
mir3x Guru
Joined: 02 Jun 2012 Posts: 455
|
Posted: Wed Sep 27, 2017 2:16 pm Post subject: |
|
|
wrc1944 wrote: |
Is this pretty normal for requesting an AMD RMA, or should I complain and/or re-submit a new request? Maybe they are swamped with RMA's now that the word is out?
|
They replied to me in 23 hours and 30 minutes, as if they have a 24-hour deadline to reply. So just send them another message. And remove the line where you demand an RMA; they might not like people demanding, it should be their idea to give you an RMA.
I wrote and attached screenshot of failed compilation:
Quote: | > > Its Ryzen 7 1700X. Gcc fails in compilation sometimes, it cannot
> > compile gcc7.2 itself in 70% cases, gcc7.1 works mostly ok, but 6.3
> > and 6.4 fail sometimes, gcc7.2 very often. In few cases it stucked
> > somewhere, like zombie process but it wasn't shown as zombie.
> > Motherboard is asus-prime b350 plus ( i upgraded bios yesterday to
> > 0808 version, it didnt helped) Ram is Kingston HyperX HX424C15FBK4/64
> > Bios set ram to 2400 by default. I removed 2 sticks and checked with
> > 2x16GB. Also checked with 1866Mhz. Fan is silentiumpc Fortis 3. K10
> > from kernel 4.13 shows temperature on idle about 30-35C. With hard use
> > max 49C. Bios shows 41-43C"
|
_________________ Sent from Windows |
|
nasaiya Apprentice
Joined: 17 May 2007 Posts: 157
|
Posted: Wed Sep 27, 2017 3:03 pm Post subject: Re: possible kernel gotcha |
|
|
Naib wrote: | do you have RCU configured, especially CONFIG_RCU_NOCB_CPU
also do you have a BIOS option associated with Cstate... try turning that off |
Would you mind explaining this a little better? I'm having daily hard lockups as well. What are the correct settings for this?
I also have had the segfault issues but since disabling cool & quiet and c9 (or c6 or whatever it was in the bios), and rebuilding everything with gcc 7.1 everything has worked perfectly. I even went 4 days without a lockup but the lockup problem is back _________________ If it ain't broke - fix it till it is! |
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20583
|
Posted: Wed Sep 27, 2017 3:54 pm Post subject: |
|
|
Tony0945 wrote: | pjp wrote: | They're replacing CPUs though, aren't they? That would work for me. If I had the spare cash for an entirely new system, I'd buy one. |
It sounds like they are making people jump through hoops and taking a long time. I had a new car whose battery died at three months on a holiday. GM picked up the car, towed it to the dealer and replaced the battery THAT DAY. | I'm doubtful that is solely because they are users of Linux. Maybe they aren't really well staffed for handling RMAs (in which case they should have gone through the seller). Or maybe they have a limited supply available to them. There are a lot of potential issues. I'm not saying they are handling it as well as they could, but I don't see anything "malicious." Maybe that's just because I don't want Intel to be my only choice. _________________ Quis separabit? Quo animo? |
|
thumper Guru
Joined: 06 Dec 2002 Posts: 554 Location: Venice FL
|
Posted: Wed Sep 27, 2017 8:52 pm Post subject: Re: possible kernel gotcha |
|
|
nasaiya wrote: | Naib wrote: | do you have RCU configured, especially CONFIG_RCU_NOCB_CPU
also do you have a BIOS option associated with Cstate... try turning that off |
Would you mind explaining this a little better? I'm having daily hard lockups as well. What are the correct settings for this?
I also have had the segfault issues but since disabling cool & quiet and c9 (or c6 or whatever it was in the bios), and rebuilding everything with gcc 7.1 everything has worked perfectly. I even went 4 days without a lockup but the lockup problem is back |
Code: | CONFIG_RCU_NOCB_CPU:
Use this option to reduce OS jitter for aggressive HPC or
real-time workloads. It can also be used to offload RCU
callback invocation to energy-efficient CPUs in battery-powered
asymmetric multiprocessors.
This option offloads callback invocation from the set of
CPUs specified at boot time by the rcu_nocbs parameter.
For each such CPU, a kthread ("rcuox/N") will be created to
invoke callbacks, where the "N" is the CPU being offloaded,
and where the "x" is "b" for RCU-bh, "p" for RCU-preempt, and
"s" for RCU-sched. Nothing prevents this kthread from running
on the specified CPUs, but (1) the kthreads may be preempted
between each callback, and (2) affinity or cgroups can be used
to force the kthreads to run on whatever set of CPUs is desired.
Say Y here if you want to help to debug reduced OS jitter.
Say N here if you are unsure.
Symbol: RCU_NOCB_CPU [=y]
Type : boolean
Prompt: Offload RCU callback processing from boot-selected CPUs
Location:
-> General setup
-> RCU Subsystem
Defined at kernel/rcu/Kconfig:218
Depends on: (TREE_RCU [=y] || PREEMPT_RCU [=n]) && (RCU_EXPERT [=y] || NO_HZ_FULL [=n])
Selected by: NO_HZ_FULL [=n] && <choice> && !ARCH_USES_GETTIMEOFFSET [=n] && GENERIC_CLOCKEVENTS [=y]\
&& SMP [=y] && HAVE_CONTEXT_TRACKING [=y] && HAVE_VIRT_CPU_ACCOUNTING_GEN [=y] |
I can confirm that it used to work, I'm not sure when but I think the problem returned in 4.13 and this bit was removed:
Code: | CONFIG_RCU_NOCB_CPU_ALL=y |
I had not paid close enough attention but it seems something needs to be set on the kernel command line at boot time.
So I'm just digging into this now, this bit seems to be important:
Code: | from the set of CPUs specified at boot time by the rcu_nocbs parameter. |
Explained better here:
Code: | The RCU_NOCB_CPU_ALL=y Kconfig option, which causes all CPUs
to be offloaded. On a 16-CPU system, this is equivalent to
"rcu_nocbs=0-15" |
George |
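For anyone who wants to check this on their own box, here is a minimal sketch of the "offload all CPUs" case described in the help text above. It only prints things and changes nothing. The function names (rcu_nocbs_all, has_rcu_nocbs) are made up for illustration, and the suggestion to put the value in GRUB_CMDLINE_LINUX assumes a GRUB2 setup using /etc/default/grub.

```shell
#!/bin/sh
# Sketch only: derive the "all CPUs" rcu_nocbs range and check the running
# kernel's command line. Helper names are hypothetical, not from any tool.

# Print the rcu_nocbs value covering CPUs 0..N-1, where N is the CPU count.
rcu_nocbs_all() {
    ncpus=${1:-1}
    echo "rcu_nocbs=0-$((ncpus - 1))"
}

# Succeed if the given command-line string already carries rcu_nocbs=.
has_rcu_nocbs() {
    case " $1 " in
        *" rcu_nocbs="*) return 0 ;;
        *) return 1 ;;
    esac
}

param=$(rcu_nocbs_all "$(nproc)")
if has_rcu_nocbs "$(cat /proc/cmdline)"; then
    echo "rcu_nocbs is already set on this boot"
else
    echo "not set; candidate for GRUB_CMDLINE_LINUX in /etc/default/grub: $param"
fi
```

On a 16-thread chip, rcu_nocbs_all 16 gives rcu_nocbs=0-15, matching the equivalence quoted above. After editing /etc/default/grub the GRUB config would still have to be regenerated and the machine rebooted before /proc/cmdline shows the parameter.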
|
Bloot Tux's lil' helper
Joined: 10 Mar 2006 Posts: 99 Location: Barcelona
|
Posted: Thu Sep 28, 2017 5:20 pm Post subject: |
|
|
I'm done with AMD.
I sent my faulty CPU two weeks ago, they said they'd send me a replacement as soon as I'd send them the parcel tracking number. They wrote yesterday saying they were restocking their warehouse, and that it would not be completed until October 2nd. That means 3 weeks, at best, since I sent my faulty processor.
I liked Ryzen very much but this is unacceptable, I'd rather have contacted Amazon if I had known it would take this long. I'll sell it the moment it arrives, if it ever does.
Sorry for the rant, I needed it. |
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20583
|
Posted: Thu Sep 28, 2017 7:23 pm Post subject: |
|
|
Sorry to hear you're that disappointed by a delay, but everyone has their limits. Mine was the Intel monopoly. _________________ Quis separabit? Quo animo? |
|
Bloot Tux's lil' helper
Joined: 10 Mar 2006 Posts: 99 Location: Barcelona
|
Posted: Thu Sep 28, 2017 8:15 pm Post subject: |
|
|
I wouldn't exactly call a three-week wait a delay, but never mind.
Hope they can finally address this problem and replace every faulty unit, I won't be on the boat anymore. It was nice to share Ryzen experiences with you all though
Cheers |
|
mir3x Guru
Joined: 02 Jun 2012 Posts: 455
|
Posted: Thu Sep 28, 2017 8:29 pm Post subject: |
|
|
Bloot wrote: | I'm done with AMD.
I sent my faulty CPU two weeks ago, they said they'd send me a replacement as soon as I'd send them the parcel tracking number. They wrote yesterday saying they were restocking their warehouse, and that it would not be completed until October 2nd. That means 3 weeks, at best, since I sent my faulty processor.
|
It sucks, especially since AMD wrote:
Quote: | You will receive your original processor
approximately 3-5 business days from the ship date from the AMD
Returns Center, depending upon your location. |
You can read the forum there; one guy got that email and a new CPU a few hours later ...
https://community.amd.com/thread/215773?start=1650&tstart=0
Anyway, at least you got such an email, I didn't. Seems like at least one more week of waiting.
15 mins after I sent them the tracking number, they sent me an email on how to pack the CPU and a new, different RMA number to write on the package label, but I had already sent it !!!
EDIT: watch and say fck AMD, fck CPUs, fck computers :
https://www.youtube.com/watch?v=5g9zxduFtSM _________________ Sent from Windows |
|
nasaiya Apprentice
Joined: 17 May 2007 Posts: 157
|
Posted: Sun Oct 01, 2017 3:31 pm Post subject: Re: possible kernel gotcha |
|
|
thumper wrote: |
...
George |
Thanks, enabling all that seems to have done the trick (so far anyway - no lockups in several days)... now I suppose I'll have to decide whether to start a return or keep gcc 7.1 forever...
I didn't add anything to the kernel command line btw so I don't know if that kernel feature is actually doing anything, but at least it's not locking up anymore. _________________ If it ain't broke - fix it till it is! |
|
vaxbrat l33t
Joined: 05 Oct 2005 Posts: 731 Location: DC Burbs
|
Posted: Sun Oct 01, 2017 5:08 pm Post subject: Seems to be stabilizing here finally |
|
|
My Taichi based build on kernel 4.12.12 has been up for over two days now without a lockup. The last thing I tweaked on that one is the SOC VCC getting set to 1.18 volts. I did leave cool and quiet turned on. I need to get my cluster on the same version of Ceph Jewel before I put a couple of OSD daemons on it.
My Gaming k4 build was still locking up yesterday even though I made the VCC change. I was about to turn off cool and quiet and considered switching from on-demand to performance cpu governor but I decided to switch to 4.13.3 on it and took George's advice above about the RCU settings. It's been good so far and even appears able to use all 16 cores to emerge mesa with gcc 6.4.0. I'm going to hold off on 7 until there's better agreement about both stability and zen support. I've been pleasantly surprised by amdgpu support and wine allowing me to play Fallout New Vegas. The last time I had that working was with a Geforce and Nvidia-drivers before all the insanity started happening with that and the KDE Plasma compositor in opengl mode. |
|
thumper Guru
Joined: 06 Dec 2002 Posts: 554 Location: Venice FL
|
Posted: Sun Oct 01, 2017 8:47 pm Post subject: Re: possible kernel gotcha |
|
|
nasaiya wrote: | thumper wrote: |
...
George |
Thanks, enabling all that seems to have done the trick (so far anyway - no lockups in several days)... now I suppose I'll have to decide whether to start a return or keep gcc 7.1 forever...
I didn't add anything to the kernel command line btw so I don't know if that kernel feature is actually doing anything, but at least it's not locking up anymore. |
I am using GCC 7.2 and it's working fantastic for me, since the latest update for my BIOS, no more segfaults, seems like kernel 4.13.3 had a fix for it too.
But the lockups still occurred randomly until I added this to my kernel command line in /etc/default/grub as well as those .config changes.
I have the 1800X, and so far so good.
I'm not sure if these bits are relevant, but I'm using these versions of the tools
Code: |
sys-devel/libtool-2.4.6-r3
sys-devel/binutils-2.28.1
sys-libs/glibc-2.23-r4
sys-devel/gcc-7.2.0
sys-kernel/linux-headers-4.13
|
George |
|
Naib Watchman
Joined: 21 May 2004 Posts: 6069 Location: Removed by Neddy
|
Posted: Sun Oct 01, 2017 9:43 pm Post subject: Re: possible kernel gotcha |
|
|
thumper wrote: | nasaiya wrote: | thumper wrote: |
...
George |
Thanks, enabling all that seems to have done the trick (so far anyway - no lockups in several days)... now I suppose I'll have to decide whether to start a return or keep gcc 7.1 forever...
I didn't add anything to the kernel command line btw so I don't know if that kernel feature is actually doing anything, but at least it's not locking up anymore. |
I am using GCC 7.2 and it's working fantastic for me, since the latest update for my BIOS, no more segfaults, seems like kernel 4.13.3 had a fix for it too.
But the lockups still occurred randomly until I added this to my kernel command line in /etc/default/grub as well as those .config changes.
I have the 1800X, and so far so good.
I'm not sure if these bits are relevant, but I'm using these versions of the tools
Code: |
sys-devel/libtool-2.4.6-r3
sys-devel/binutils-2.28.1
sys-libs/glibc-2.23-r4
sys-devel/gcc-7.2.0
sys-kernel/linux-headers-4.13
|
George | what -march are you using for gcc-7.2 ? _________________ #define HelloWorld int
#define Int main()
#define Return printf
#define Print return
#include <stdio>
HelloWorld Int {
Return("Hello, world!\n");
Print 0; |
|
thumper Guru
Joined: 06 Dec 2002 Posts: 554 Location: Venice FL
|
Posted: Sun Oct 01, 2017 9:54 pm Post subject: Re: possible kernel gotcha |
|
|
Naib wrote: |
what -march are you using for gcc-7.2 ? |
Using native in the kernel as well.
George |
|
amaroc Tux's lil' helper
Joined: 13 Nov 2005 Posts: 99
|
Posted: Mon Oct 02, 2017 2:12 pm Post subject: Ryzen upgrade |
|
|
As this is "Gentoo Chat" and it's Ryzen related I post it here - even if I do not really contribute to the Ryzen technical discussion. Think it might be interesting to some - otherwise please skip.
Summary about an upgrade to Ryzen CPU
Two weeks ago my good old Phenom II X6 1055T (6x2.80GHz) died all of a sudden. It might have been the motherboard as well, as there was literally nothing - no screen, no beep, no other sign of life. I tried some disconnect/connect cycles for cards and cables around, CMOS battery replacement and RTC reset, measured power supply - but no joy. So either CPU or mobo died somehow. Ok, this is what may happen after 7 years and more than 10k hours running.
Time to think about an update or other options.
- Android tablet only? - not much fun, not really
- Laptop? - the tablet allows for this flexibility already
So there shall be a desktop update.
- ARM or x86? - I've got still some x86 code around and w/ ARM I would very likely end up with Android someday - no!
- AMD or Intel? - One year ago I would probably have decided for Intel but there is Ryzen now for some time.
OK, so reading about Ryzen then - OMG - even this thread and the predecessor are full of segfault messages, memory incompatibilities, etc.
In addition - do I want to continue with Gentoo? I'm still running grub legacy, never thought about UEFI and overall - I want the system up and running asap as at least online banking needs to get restored soon. That will require a proper browser in a graphical environment - so I will need a quick graphics card decision as well.
- AMD or nVidia? - AMD seems to be more on open source - so that was easy
- Old or new chip family? - As my screen is also almost ten years old I should look for something more recent. I was brave and decided for the RX550.
So, mobo decision was easy - ASUS prime B350-PLUS as there is an (onboard) SPDIF out and the reviews are not bad. For DDR4 I decided for SR Crucial - 2400 speed grade seems to be sufficient and 2x 8GB is an improvement over the old 12GB. A Ryzen 1700 with the bundled fan should also be sufficient. As my 2GB HDD was a bit tight already I decided for a 8TB update as well. Everything else - 1TB SSD, DVD-RW, BD-ROM, case, power supply, etc. should stay. The mobo requires an 8-pin EATX 12V connector but my power supply has a 4-pin ATX 12V connector - Google said that this is OK.
So, I ordered the items above and after some days I could start the assembly, booted into the BIOS, made an update to the recent BIOS version - here we are.
Now - what to boot? Even if I have recent backups on external drives I wanted to re-use my installation as much as possible and therefore doing a tar backup of my SSD root partition seemed to be a good idea.
A little bit out of curiosity I decided for Ubuntu 17.10 beta as there should be a recent kernel, graphics, etc.
I ended up with acpi=off and a gnome desktop. Not bad and OK for a tar backup but chroot and compile a kernel on one core out of eight? That's no fun. So I tarred my SSD root partition just in case and decided for a Gentoo installation medium.
What - the CD image does not support UEFI and the DVD is about one year old? As I'm lazy I decided against UEFI and switched to the legacy boot scheme with the Gentoo CD image.
The genkernel booted fine, all 8 cores were there and I could chroot. I took the opportunity to go for grub2 according to the wiki and followed the Ryzen kernel instructions. Playing safe I decided for Generic-x86-64, gcc 5.4.0 and -march=x86-64.
The next boot showed a nice grub2 menu but the kernel crashed again - with good old 80x25 I couldn't see much. I tried memtest86 from the install medium - it crashed as well. It took me some time to understand that the old memtest86 does not like the Ryzen so I decided to continue the journey.
Why not being brave and use the old Phenom kernel? Surprise - it booted and even Xorg and KDE did start - wow! OK, I was on six cores rather than eight (+8x HT) as my Phenom customization was still in but that was acceptable for a kernel update. The old ATI radeon code obviously worked even for Xorg and I didn't care about sw rendering at all.
So I modified the kernel slightly to have eight cores plus ht - that was easy.
Later I performed an incremental kernel update process according to the Ryzen kernel instructions and found the IOMMU option to be the root cause. Google told me that this might be an issue on ASUS and B350 motherboards - but I do not really need this feature right now - so it stays out for now.
As I was OK with the current graphics solution and Xorg-log and system-log stayed quiet I decided for some stability tests. gcc compile gave me some non Ryzen issues for tmpfs inodes and docbook issues but besides that everything runs rock solid. I did a bunch of parallel emerges on gcc and webkit-gtk utilizing all cores and a lot of memory and didn't have any issue.
Clearly there will be more work to do for gcc flags, AMDGPU, sensors, etc. but that's not urgent at all.
Summary:
I'm very pleased with the update process and after 12 years with Gentoo I have to admit - Gentoo still rocks |
|
Naib Watchman
Joined: 21 May 2004 Posts: 6069 Location: Removed by Neddy
|
Posted: Tue Oct 03, 2017 9:08 am Post subject: Re: possible kernel gotcha |
|
|
thumper wrote: | Naib wrote: |
what -march are you using for gcc-7.2 ? |
Using native in the kernel as well.
George | ok now this is odd... I have been using -march=znver1 since I converted to GCC-6.x
GCC-7.2 is the only package I have ever had an issue with (I have done pretty much all of the stress testing as I fear I have a dodgy chip).
I changed my march to native and it builds oO I then built the toolchain twice (libtool, binutils, glibc, gcc) with no problems
I left emerge continually building gcc-7.2 overnight and every one built fine...
-march=native should imply -march=znver1 so the results should be identical. So either there is a bug in gcc building or maybe there was an inconsistency in my toolchain that over the last few weeks multiple rebuilds helped align. _________________ #define HelloWorld int
#define Int main()
#define Return printf
#define Print return
#include <stdio>
HelloWorld Int {
Return("Hello, world!\n");
Print 0; |
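One way to test the "native should imply znver1" assumption is to ask GCC what it reports for each setting and diff the two. A rough sketch, assuming a GCC new enough to know znver1 and using its -Q --help=target query; target_opts is just a name picked for this example. Note that -march=native can also pass cache-size parameters that this query does not list, so an empty diff is a hint and not proof that the builds are identical.

```shell
#!/bin/sh
# Sketch: compare the target options GCC reports for -march=native and
# -march=znver1. target_opts is a hypothetical helper; CC defaults to gcc.

# Print the target options the compiler reports for one -march value.
target_opts() {
    "${CC:-gcc}" -march="$1" -Q --help=target
}

if command -v "${CC:-gcc}" >/dev/null 2>&1; then
    tmp=$(mktemp -d)
    target_opts native > "$tmp/native.txt" 2>/dev/null
    target_opts znver1 > "$tmp/znver1.txt" 2>/dev/null
    # diff prints the differing lines; it is silent when both dumps match.
    if diff "$tmp/native.txt" "$tmp/znver1.txt"; then
        echo "no difference in reported target options"
    fi
    rm -rf "$tmp"
else
    echo "no compiler found, nothing to compare"
fi
```

Any lines diff prints are the options where the two settings disagree on that machine, which would be the first place to look if one setting builds and the other does not.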
|
Chewi Developer
Joined: 01 Sep 2003 Posts: 886 Location: Edinburgh, Scotland
|
Posted: Tue Oct 03, 2017 9:14 am Post subject: |
|
|
Hello all. I'm back after going through the RMA process. I was 2½ weeks without my desktop. Could have been worse but I expected better. It only took 4 days for my old CPU to reach the warehouse in the Netherlands but it took them another 13 days to deliver the replacement. At least it gave me time to play with my ARM box.
I can see that the replacement was manufactured in 1730 (July) and the good news is that the segfaults appear to have gone after re-enabling ASLR. The not so good news is that I still got a freeze overnight after disabling the RCU stuff. There's a chance that this may have been down to the amd-staging DC stuff but I have my doubts. Trying without that at the moment. I could turn the RCU stuff back on but not having C6 sucks for power usage and I really want to get to the bottom of it. Now that the segfaults are ruled out, I will try to find out more. Maybe I'll contact Gigabyte.
@nasaiya, what motherboard do you have? |
|
fcl Tux's lil' helper
Joined: 31 Dec 2016 Posts: 77
|
Posted: Tue Oct 03, 2017 3:00 pm Post subject: |
|
|
To be fair C6 doesn't lower power usage THAT much. I wouldn't worry about it |
|
thumper Guru
Joined: 06 Dec 2002 Posts: 554 Location: Venice FL
|
Posted: Tue Oct 03, 2017 7:52 pm Post subject: Re: possible kernel gotcha |
|
|
Naib wrote: | ok now this is odd... I have been using -march=znver1 since I converted to GCC-6.x
GCC-7.2 is the only package I have ever had an issue with (I have done pretty much all of the stress testing as I fear I have a dodgy chip).
I changed my march to native and it builds oO I then built the toolchain twice (libtool, binutils, glibc, gcc) with no problems
I left emerge continually building gcc-7.2 overnight and every one built fine...
-march=native should imply -march=znver1 so the results should be identical. So either there is a bug in gcc building or maybe there was an inconsistency in my toolchain that over the last few weeks multiple rebuilds helped align. |
I've been using GCC 7.2 since a few days before it hit portage proper, 7.1 prior to that.
For what it's worth I rebuilt my toolchain plus some other packages a half dozen times or so out of fear when the segfaults were showing up, until I got a clean pass.
Used this:
Code: | #!/bin/bash
# Rebuild the toolchain and related packages in one pass (-1 = --oneshot, -v = --verbose).
emerge -1v sys-kernel/linux-headers sys-libs/glibc sys-devel/binutils-config sys-libs/binutils-libs sys-devel/binutils dev-libs/boost sys-devel/gcc-config sys-devel/gcc sys-devel/libtool sys-devel/llvm sys-devel/clang |
George |
|
Naib Watchman
Joined: 21 May 2004 Posts: 6069 Location: Removed by Neddy
|
Posted: Tue Oct 03, 2017 8:24 pm Post subject: Re: possible kernel gotcha |
|
|
thumper wrote: | Naib wrote: | ok now this is odd... I have been using -march=znver1 since I converted to GCC-6.x
GCC-7.2 is the only package I have ever had an issue with (I have done pretty much all of the stress testing as I fear I have a dodgy chip).
I changed my march to native and it builds oO I then built the toolchain twice (libtool, binutils, glibc, gcc) with no problems
I left emerge continually building gcc-7.2 overnight and every one built fine...
-march=native should imply -march=znver1 so the results should be identical. So either there is a bug in gcc building or maybe there was an inconsistency in my toolchain that over the last few weeks multiple rebuilds helped align. |
I've been using GCC 7.2 since a few days before it hit portage proper, 7.1 prior to that.
For what it's worth I rebuilt my toolchain plus some other packages a half dozen times or so out of fear when the segfaults were showing up, until I got a clean pass.
Used this: Code: | #!/bin/bash
emerge -1v sys-kernel/linux-headers sys-libs/glibc sys-devel/binutils-config sys-libs/binutils-libs sys-devel/binutils dev-libs/boost sys-devel/gcc-config sys-devel/gcc sys-devel/libtool sys-devel/llvm sys-devel/clang |
George | good to know. Funny thing is I swore I was using 7.2 but then on closer inspection it would appear I wasn't & then any attempt to build it failed at exactly the same point. I feared a chip issue and was rigorously applying every stress-test that Ryzen7 users were stating would cause issues (none would on my setup). Fearing this 7.2 was a chip issue I did everything recommended when it was (disable smt, -j1, disable ....) and still no luck.
there should be zero difference between -march=native and -march=znver1 yet -march=native worked and then repeated rebuilds worked every single time... Maybe this coincided with a binutils bump, who knows (another one today).
when gcc-7.2 did build via gcc-7.1+march=native I rebuilt it 3 times in a row to be sure. Then I switched to gcc-7.2 and rebuilt the toolchain a number of times (emerge libtool glibc binutils gcc #toolchain rebuild), then left emerge gcc in a loop overnight and no problems...
I am now doing an emerge -e @world to rebuild all with gcc-7.2
VERY VERY ODD... but it is working _________________ #define HelloWorld int
#define Int main()
#define Return printf
#define Print return
#include <stdio>
HelloWorld Int {
Return("Hello, world!\n");
Print 0; |
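The "leave it building in a loop overnight" test can be sketched roughly like this. It is not the exact loop used above, just one way to do it: run the same command a fixed number of times and stop at the first failure so the failing pass is easy to find. repeat_build is a made-up helper name, and the demo at the end uses true as a stand-in for a real build.

```shell
#!/bin/sh
# Sketch: run a build command repeatedly and stop at the first failure.
# repeat_build is a hypothetical helper, not part of portage.

# repeat_build COUNT CMD [ARGS...]
# Runs CMD up to COUNT times. Returns 0 if every run succeeded; otherwise
# reports the failing pass number on stderr and returns 1.
repeat_build() {
    count=$1
    shift
    pass=1
    while [ "$pass" -le "$count" ]; do
        if ! "$@"; then
            echo "pass $pass of $count failed" >&2
            return 1
        fi
        pass=$((pass + 1))
    done
    echo "all $count passes succeeded"
}

# Harmless demonstration with a command that always succeeds.
repeat_build 3 true
```

For the real thing the call would be something like repeat_build 20 emerge -1 sys-devel/gcc, run as root; the count of 20 is arbitrary. Stopping on the first failure keeps the build log of the bad pass from being overwritten by the next one.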
|
Chewi Developer
Joined: 01 Sep 2003 Posts: 886 Location: Edinburgh, Scotland
|
Posted: Tue Oct 03, 2017 10:04 pm Post subject: |
|
|
So disabling DC didn't help. Now I've disabled "Global C-State Control" in the BIOS, which I gather disables C6. That should hold up but I want to check it makes a difference.
When it did freeze, I was able to capture the kernel output using netconsole. To the other people experiencing freezes, it would be great if you could try netconsole so that we can compare notes.
Code: | INFO: rcu_preempt detected stalls on CPUs/tasks:
	6-...: (0 ticks this GP) idle=c54/0/0 softirq=72512/72512 fqs=0
	7-...: (10 GPs behind) idle=248/0/0 softirq=55050/55050 fqs=0
	8-...: (1 GPs behind) idle=0a0/0/0 softirq=53085/53086 fqs=0
	9-...: (8 GPs behind) idle=1cc/0/0 softirq=24670/24670 fqs=0
	10-...: (5 GPs behind) idle=a98/0/0 softirq=48138/48138 fqs=0
	11-...: (1 GPs behind) idle=f20/0/0 softirq=25923/25924 fqs=0
	(detected by 2, t=63859 jiffies, g=189965, c=189964, q=171)
Sending NMI from CPU 2 to CPUs 6:
Sending NMI from CPU 2 to CPUs 7:
Sending NMI from CPU 2 to CPUs 8:
Sending NMI from CPU 2 to CPUs 9:
Sending NMI from CPU 2 to CPUs 10:
NETDEV WATCHDOG: wan0 (igb): transmit queue 0 timed out
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at /home/chewi/Projects/linux/net/sched/sch_generic.c:316 dev_watchdog+0x212/0x220
Modules linked in: it87(O) hwmon_vid netconsole ip6table_mangle nf_log_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_connmark iptable_mangle xt_helper ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_limit nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_multiport xt_conntrack nf_conntrack_ftp nf_conntrack_tftp nf_conntrack_irc nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack ip6table_filter ip6_tables iptable_filter ip_tables x_tables nfsd auth_rpcgss oid_registry lockd grace bnep cachefiles fscache bluetooth ecdh_generic xfs ftdi_sio usbserial kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic amdgpu snd_hda_codec_hdmi snd_hda_intel snd_hda_codec irqbypass snd_hwdep crct10dif_pclmul crc32_pclmul ghash_clmulni_intel mfd_core drm_kms_helper pcbc cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops aesni_intel cfbcopyarea ttm drm aes_x86_64 snd_hda_core snd_pcm snd_timer snd tun ccp crypto_simd glue_helper cryptd ppp_generic slhc loop crc32c_intel alx igb mdio i2c_algo_bit sunrpc [last unloaded: netconsole]
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.13.4-01268-g72d693146ebf #61
Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming 5/AX370-Gaming 5, BIOS F9a 09/08/2017
task: ffffffff8180e480 task.stack: ffffffff81800000
RIP: 0010:dev_watchdog+0x212/0x220
RSP: 0018:ffff88041ec03e90 EFLAGS: 00010286
RAX: 0000000000000037 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88041ec0c8f8 RDI: ffff88041ec0c8f8
RBP: ffff88040781441c R08: 0000000000000001 R09: 000000000000047b
R10: 0000000000001000 R11: 000000002b300077 R12: ffff880407814000
R13: 0000000000000000 R14: 0000000000000008 R15: ffff880408a30940
FS: 0000000000000000(0000) GS:ffff88041ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff137422000 CR3: 000000040a109000 CR4: 00000000003406f0
Call Trace:
<IRQ>
? qdisc_rcu_free+0x40/0x40
? qdisc_rcu_free+0x40/0x40
? call_timer_fn.isra.6+0x11/0x70
? expire_timers+0x92/0xa0
? run_timer_softirq+0x9f/0xd0
? tick_sched_timer+0x4c/0x70
? timerqueue_add+0x52/0x80
? ktime_get+0x36/0x98
? __do_softirq+0xc9/0x208
? irq_exit+0xa3/0xa8
? smp_apic_timer_interrupt+0x5e/0x80
? apic_timer_interrupt+0x7f/0x90
</IRQ>
? acpi_idle_do_entry+0x2b/0x40 |
|
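For those who have not used netconsole before, a rough sketch of how the module parameter is put together. Everything concrete here is a placeholder: the IP addresses, the eth0 interface name and the MAC address are made up, netconsole_param is a name invented for this example, and 6665/6666 are used as the source and target ports. The script only prints the modprobe line, it does not load anything.

```shell
#!/bin/sh
# Sketch: assemble a netconsole module parameter of the form
#   netconsole=<src-port>@<src-ip>/<dev>,<tgt-port>@<tgt-ip>/<tgt-mac>
# All addresses below are placeholders for illustration.

# netconsole_param SRC_IP DEV TGT_IP [TGT_MAC]
netconsole_param() {
    src_ip=$1
    dev=$2
    tgt_ip=$3
    tgt_mac=${4:-}
    echo "netconsole=6665@${src_ip}/${dev},6666@${tgt_ip}/${tgt_mac}"
}

param=$(netconsole_param 192.168.1.10 eth0 192.168.1.20 00:11:22:33:44:55)
# Print the command instead of running it; loading the module needs root.
echo "modprobe netconsole $param"
```

On the receiving machine something has to listen for UDP on the target port, for example a netcat in UDP listen mode writing to a file (the exact flags differ between netcat variants). With that running before the freeze, the last kernel messages end up on the other box even when the local disk and display are gone.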