Author |
Message |
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jun 27, 2018 5:20 pm Post subject: [SOLVED] compiling kernel from minimal cd freezes new system |
|
|
On a new custom-built Intel Core i9 7940X based system, I am using the 2018-04-15 AMD64 Minimal Installation CD and the latest stage3 tarball. In the chroot environment, when I run 'make' to compile the kernel, the kernel compiles for a couple of minutes, then the system freezes, and five to seven seconds later, with no action from me, the system reboots. I tried the make with both "make -j28" and "make -j8". The system freezes on both, but at different points in the compile. (I have not yet tried with just a single thread.) There are NO error messages in the compile output or in /var/log/messages (not in chroot).
I thought this might be a hardware issue. 18 hours of memtest-86 (single threaded) showed no memory errors. I tried several CPU stress tests from the "Ultimate Boot CD" http://www.ultimatebootcd.com/. Although I could get the CPU up to 80 degC using one of the tests (28 threads) for 20 min., I could not get an error. Also, System Rescue CD http://www.system-rescue-cd.org/ works without issue.
Finally I tried to "emerge gcc" in the Minimal Install CD chroot environment and after a minute or so the system froze again and then rebooted itself.
Any suggestions on installing Gentoo on my new system? (I have been using Gentoo for years, back to starting with a stage 1 install, and have never had an install issue.)
----
Things already on my try list:
Minimal install one more time, this time compiling the kernel with a single thread (i.e. no -jN argument to make).
If that fails:
Try running a LiveCD version of Knoppix (32-bit) or Ubuntu or Mint (64-bit) and see if I can compile a kernel that way. If so I guess I could try a chroot environment from that Linux Distro.
Could this have anything to do with the Processor family of the kernel on the Minimal Install CD being set to "Generic-x86-64"? My kernel configuration (the one that freezes the system) has Processor family set to "Core 2/newer Xeon".
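For what it's worth, one way to see which Processor family a given kernel config actually selects is to look for the matching CONFIG_ symbol. This is only a rough sketch: the function name cpu_family is made up for illustration, and it assumes the usual x86-64 symbols (CONFIG_GENERIC_CPU for "Generic-x86-64", CONFIG_MCORE2 for "Core 2/newer Xeon") and a readable .config file.

```shell
# Print the x86-64 "Processor family" choice found in a kernel .config file.
# cpu_family is a hypothetical helper name; the argument is the path to the config.
cpu_family() {
    config="$1"
    for sym in GENERIC_CPU MCORE2 MK8 MPSC MATOM; do
        # An enabled choice shows up as CONFIG_<symbol>=y at the start of a line.
        if grep -q "^CONFIG_${sym}=y" "$config"; then
            echo "$sym"
            return 0
        fi
    done
    echo "unknown"
}
```

For example, "cpu_family /usr/src/linux/.config" would report the choice for the kernel being built. Comparing it with the install CD's kernel only works if that kernel exposes its config (for instance through /proc/config.gz), which is not guaranteed.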
Last edited by jagdpanther on Sun Jul 22, 2018 4:12 pm; edited 1 time in total |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Wed Jun 27, 2018 5:26 pm Post subject: |
|
|
Can you try systemrescue cd for installation?
Can you post the specs of the rest of your new system? |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jun 27, 2018 6:11 pm Post subject: |
|
|
System Specs:
Intel Core i9 7940X
64 GB Ram (Corsair Vengeance - 1.2V at 2666)
Asus WS X299 Sage (May BIOS firmware update)
... (because Asus doesn't sell the WS X299 PRO in N America)
Nvidia (EVGA) 1070
Samsung 970 NVMe SSD
Crucial MX500 SATA SSD
Seagate SATA HDD
Seasonic 850W PSU
----
I'll try the install via System Rescue CD tonight or tomorrow morning. |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Wed Jun 27, 2018 6:59 pm Post subject: |
|
|
On the Asus website, BIOS 0502 from 2018/5/8 is available. |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jun 27, 2018 7:28 pm Post subject: |
|
|
Keruskerfuerst:
Thanks. I have already upgraded the BIOS to 0502. That did not fix the freeze issue. |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Wed Jun 27, 2018 7:36 pm Post subject: |
|
|
Then you should try linux dvds like Systemrescue CD or similar. |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jun 27, 2018 7:47 pm Post subject: |
|
|
Quote: | Then you should try linux dvds like Systemrescue CD or similar. |
Does it matter whether the Linux LiveCD that I chroot from is 32-bit or 64-bit? |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Thu Jun 28, 2018 5:47 am Post subject: |
|
|
I think it should be a 64 Bit Live CD/DVD. |
|
The Doctor Moderator
Joined: 27 Jul 2010 Posts: 2678
|
Posted: Thu Jun 28, 2018 6:12 am Post subject: |
|
|
jagdpanther wrote: | Quote: | Then you should try linux dvds like Systemrescue CD or similar. |
Does it matter whether the Linux LiveCD that I chroot from is 32-bit or 64-bit? |
A 32 bit kernel cannot run 64 bit binaries, so it cannot chroot into or build a 64 bit system. So yes, if you want to build a 64 bit install (and you do), you must use 64 bit media.
_________________
First things first, but not necessarily in that order.
Apologies if I take a while to respond. I'm currently working on the dematerialization circuit for my blue box. |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Thu Jun 28, 2018 3:58 pm Post subject: |
|
|
Opensuse has a live medium.
Ubuntu can also boot into live mode. |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Fri Jun 29, 2018 3:11 pm Post subject: |
|
|
Thanks for the suggestions.
I did get a little further using "System Rescue CD" and chrooting from it. I successfully compiled a new kernel (gentoo-sources-4.17.3). However, in both the System Rescue CD chroot environment and the successfully booted new system, if I run a large compile like "emerge gcc", I eventually get the same system freeze, without any log messages indicating what the problem is.
(During one test, I was also running "watch sensors" in another xterm, both using ssh to connect to the new system. When the freeze occurred, gcc was doing some "checking" and there was almost NO CPU load (temps all around 35C), so I don't think there is a CPU temp issue. I am starting to suspect a hardware issue, perhaps the PSU or motherboard. If I can duplicate the error in a LiveCD Ubuntu run, I'll try a different power supply.) |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Fri Jun 29, 2018 3:32 pm Post subject: |
|
|
Can you run a system full load with
a) 3D Mark
b) PC Mark
c) Prime
d) Cinebench
.... |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Mon Jul 02, 2018 4:04 pm Post subject: |
|
|
Keruskerfuerst wrote: | Can you run a system full load with
a) 3D Mark
b) PC Mark
c) Prime
d) Cinebench
.... |
Good idea. I'll work on those. |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Mon Jul 02, 2018 4:20 pm Post subject: |
|
|
So far, here is what fails on this new system. Fails = system freeze followed five seconds later by an automatic reboot.
(When I am running from the SSD and not a LiveCD there are no errors or warnings in /var/log/messages.)
Fails:
1. Gentoo minimal install CD in chroot environment: compiling Linux kernel
2. Gentoo minimal install CD in chroot environment: emerge gcc
3. System Rescue CD in chroot environment: emerge gcc
Note that compiling the Linux kernel works from System Rescue CD in chroot which is how I installed the system.
The following work without issue and do NOT cause a freeze:
Works:
1. 20 min. of CPU Burn
2. 18 hours of memtest-86
3. Ubuntu Live CD: I could compile gcc source (using the Ubuntu provided gcc). This works compiling to /dev/shm and to a mount point on the NVMe drive where gentoo is installed.
During one failure I happened to be watching the console at Alt-F12 and saw a kernel panic. The auto reboot cleared the screen before I could copy down the error message. I'll try to replicate this and take a picture with a camera of the screen. |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Tue Jul 03, 2018 5:55 am Post subject: |
|
|
I used to work in PC production and had to repair PCs.
When a PC did not work properly, the RAM was often faulty.
Is the RAM overclocked? XMP or something similar?
I have Kingston HyperX RAM in my computer and the PC works without any hard lock or... |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jul 04, 2018 2:40 am Post subject: |
|
|
The only overclock on this system is the RAM via XMP. (Which gives 1.2V at 2666 MHz.)
I am running Corsair memory and it passed 18 hours of Memtest86.
It is not written to /var/log/messages, but I took a picture of today's system freeze with a camera. It appears there is a "Machine Check Exception: 5". Here is a picture of the screen. Any idea what it means?
https://www.dropbox.com/sh/go58lo8z5s48d57/AACJ3lrHNmGEQrYlVxAjVY0ia?dl=0&preview=HardwareError.png
Last edited by jagdpanther on Sat Jul 07, 2018 10:58 pm; edited 1 time in total |
|
The Doctor Moderator
Joined: 27 Jul 2010 Posts: 2678
|
Posted: Wed Jul 04, 2018 2:52 am Post subject: |
|
|
I'm surprised no one has brought this up: is ventilation an issue? It sounds like overheating. If it were me, I'd increase ventilation around the computer and see what happens. This is the cheapest test, so it is worth looking into. Maybe with half the case open or similar.
EDIT: no, scratch that. Your error is a bad ram error. Memtest only identifies some bad ram. No failures does not mean it is necessarily working. |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Wed Jul 04, 2018 4:15 am Post subject: |
|
|
I have disabled XMP due to hard locks of my computer. |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jul 04, 2018 4:57 pm Post subject: |
|
|
Quote: | no, scratch that. Your error is a bad ram error. Memtest only identifies some bad ram. No failures does not mean it is necessarily working. |
Although I ran 18 hours of Memtest86+ on this system without error, this was using the single-threaded option. When I try to run Memtest86+ in multi-threaded mode it instantly causes a system freeze. (I was not too concerned about the multi-threaded mode failure, as I have seen several other systems, including big Dell servers, fail about 10 min. into a multi-threaded Memtest86+ run but run fine in single-threaded mode for 24 hours and then run normally for months of uptime without issue as matlab compute servers.)
Quote: | I have disabled XMP due to hard locks of my computer. |
I just disabled XMP in the BIOS, rebooted and tried to emerge GCC. As is "normal" on this system, about 10 to 15 min. into the emerge there was the system freeze. (I also note the freeze had the same "CPU 13" error as the picture posted above.)
It does not matter if I have XMP enabled or disabled, I still get the freeze.
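Since the same CPU number keeps showing up, it might help to tally which CPU each Machine Check Exception is reported on once a few panics have been captured as text (typed in from photos, or from a serial console or netconsole). A minimal sketch, assuming the lines keep the kernel's "CPU <n>: Machine Check Exception" wording; the function name mce_cpu_tally is made up for illustration.

```shell
# Count Machine Check Exception lines per reported CPU.
# Reads text on stdin; lines without the expected wording are ignored.
# mce_cpu_tally is a hypothetical helper name.
mce_cpu_tally() {
    sed -n 's/.*CPU \([0-9][0-9]*\): Machine Check Exception.*/\1/p' |
        sort -n | uniq -c |
        awk '{ print "CPU " $2 ": " $1 " event(s)" }'
}
```

Something like "mce_cpu_tally < mce-notes.txt" (with a notes file of your own) would then print one line per CPU. A tally that always lands on one CPU is only a hint about where to look next, not a diagnosis.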
(I wonder why I do NOT get the freeze if I LiveCD boot Ubuntu 64-bit and compile GCC by hand (with make -j28).)
I chose the memory on this system because with enabled XMP it still runs at 1.2V and because it was on the Asus motherboard's list of certified memory.
Corsair: CMK64GX4M4A2666C16. (Vengeance LPX)
I wonder if I should replace it with the identical type of memory or try something else. Perhaps Corsair Dominator memory that is on Asus's list? (I have no plans to overclock the system other than enabling XMP; I'm more interested in 24x7 stability.)
Also, could this be a power supply problem? I think I'll temporarily put in an older power supply and see if the issue persists. (It probably will.) |
|
ali3nx l33t
Joined: 21 Sep 2003 Posts: 732 Location: Winnipeg, Canada
|
Posted: Wed Jul 04, 2018 5:36 pm Post subject: Re: compiling kernel from minimal cd freezes new system |
|
|
One key issue worth considering is that many newer generations of PC hardware have become much less tolerant of an OS that is not booted in UEFI mode, or, perhaps more commonly, one booted with any legacy compatibility BIOS features enabled.
As the Gentoo minimal LiveCDs do not support UEFI boot, this may be a contributing source of hardware stability issues when running on very new UEFI-based PC hardware.
On my Gentoo workstation, which runs a Skylake Intel 6700K and an Nvidia 1060 graphics card, entirely disabling legacy compatibility BIOS features such as the launch CSM and fast boot fixes several hardware compatibility problems with running Linux.
The legacy compat CSM and fast boot are both enabled by default on every Skylake and Coffee Lake PC build I have, and on every one of them, running Linux or Windows, I have disabled both of those BIOS features. The UEFI firmware boot times are longer when booting in "pure EFI mode", but it's necessary for my Nvidia graphics card to not throw kernel warnings about VGA console driver incompatibility.
Code: | [ 6.615376] NVRM: Your system is not currently configured to drive a VGA console
[ 6.615379] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
[ 6.615381] NVRM: requires the use of a text-mode VGA console. Use of other console
[ 6.615383] NVRM: drivers including, but not limited to, vesafb, may result in
[ 6.615384] NVRM: corruption and stability problems, and is not supported. |
While this is just one example of legacy bios compat via the launch CSM or fast boot directly causing a hardware and software driver conflict there certainly could be others that merit entirely disabling launch CSM and fast boot if your uefi bios permits doing so.
The best explanation I can offer as to why the launch CSM is a source of conflicts with newer PC hardware comes from envisioning the initialization process a UEFI BIOS completes. When the launch CSM is enabled, that BIOS "feature module" is processed first and claims any hardware device it is able to, or that it needs to negotiate with, in order to register devices that require a hardware compatibility layer.
Older legacy hardware such as pre uefi era pci or pci express expansion cards could certainly benefit from this but when newer hardware negotiates with the launch csm module before the pure uefi bios code this could potentially be a cause of stability or system misbehavior issues with pc hardware that does not require the services of a "hardware interpreter" _________________ Compiling Gentoo since version 1.4
Thousands of Gentoo Installs Completed
Emerged on every continent but Antarctica
Compile long and Prosper! |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jul 04, 2018 7:41 pm Post subject: |
|
|
The kernel compile from a chroot environment works without issue. emerging gcc from the System Rescue CD chroot environment still fails.
The only gcc compiles that worked on my new system were ones I did from an Ubuntu 64-bit LiveCD. (I tried with the source and build directory both in /dev/shm and on my NVMe drive. Both worked.) I wonder if the Ubuntu 64-bit LiveCD uses a UEFI boot?
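One quick way to answer that from inside any booted Linux environment, sketched here with a made-up helper name (boot_mode): the kernel only exposes /sys/firmware/efi when it was started through UEFI, so checking for that directory tells the two cases apart. The optional path argument is only there so the check can be pointed somewhere else for testing.

```shell
# Report whether the running system was booted via UEFI or legacy BIOS/CSM.
# boot_mode is a hypothetical helper name; with no argument it checks /sys/firmware/efi.
boot_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "UEFI"
    else
        echo "legacy BIOS"
    fi
}

boot_mode
```

Running this once from the Ubuntu LiveCD and once from System Rescue CD would show whether the working and failing environments differ in boot mode.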
Today I also tried replacing my new Seasonic 850W power supply with an older (2014) non-modular 860W power supply. The same freeze happens during a GCC emerge. I guess there is not a power supply issue.
Quote: | While this is just one example of legacy bios compat via the launch CSM or fast boot directly causing a hardware and software driver conflict there certainly could be others that merit entirely disabling launch CSM and fast boot if your uefi bios permits doing so. |
My new motherboard (Asus WS X299 Sage) allows me to turn off CSM and fast boot. (I already have fast boot turned off.) There are also options for boot devices: "UEFI driver first".
Because I am getting near the 30 day return window for my memory, I think I will both replace the memory per The Doctor post and try a rebuild to use a UEFI boot.
Should I start a UEFI boot build via the Gentoo 2016-07-04 Hybrid ISO (Live DVD)? Is there some newer Linux Live DVD that I can chroot from? (I have built about a half-dozen Gentoo systems, all using legacy BIOS. A UEFI build will be new for me.) |
|
Keruskerfuerst Advocate
Joined: 01 Feb 2006 Posts: 2289 Location: near Augsburg, Germany
|
Posted: Wed Jul 04, 2018 7:54 pm Post subject: |
|
|
Does the installation of binary distro work? |
|
ali3nx l33t
Joined: 21 Sep 2003 Posts: 732 Location: Winnipeg, Canada
|
Posted: Wed Jul 04, 2018 8:05 pm Post subject: |
|
|
Quote: | Should I start a UEFI boot build via the Gentoo 2016-07-04 Hybrid ISO (Live DVD)? Is there some newer Linux Live DVD that I can chroot from? |
The Gentoo LiveDVD is two years out of date and its kernel is likely not up to date with the hardware support needed to run such a new CPU. This is likely also why so many Gentoo users with newer hardware recommend using sysrescuecd: the kernel included with the current version of sysrescuecd is Linux 4.14.
I haven't personally used the Gentoo LiveDVD recently, but I have doubts that the LiveDVD kernel is even Linux version 4, considering the last LiveDVD update was exactly two years ago, on July 4th 2016. Why Gentoo is still providing a two-year-old LiveDVD is something that should have been addressed long ago.
Overall sysrescuecd should be the better choice of the two, but there very well could be other hardware issues here that need to be addressed and that would be very challenging to diagnose without a hands-on hardware diagnostic.
Quote: | I have built about a half-dozen Gentoo systems, all using legacy BIOS. A UEFI build will be new for me. |
UEFI Gentoo installs are simple. When I completed my first one and looked at what I had accomplished, the most challenging (and ironically simple) thing to come to terms with was that correct disk partitioning is the first requirement for UEFI boot to function, and the first disk partition must be FAT32 formatted.
Yes fat32.. that one blew my mind the most.
Using parted to make your partition layout, you only need a minimum of two partitions for UEFI boot to be functional, or three if you want a swap partition, which can still be wise as a disaster mitigation strategy against the OOM killer or if you want to use hibernation.
Here are the general parted commands I typically use to create a UEFI-compatible partition layout.
Start parted with optimal partition alignment for best performance:
Code: | parted -a optimal /dev/sdX |
or if you have an NVMe SSD (the disk is the namespace device, for example /dev/nvme0n1)
Code: | parted -a optimal /dev/nvmeXnY |
The rest of the sauce
Code: | mklabel gpt
mkpart ESP fat32 1MiB 513MiB
mkpart primary linux-swap 513MiB 2561MiB
mkpart primary ext4 2561MiB 100%
set 1 boot on
name 2 swap
name 3 rootfs |
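The partitions still need filesystems afterwards. Below is a hedged dry-run sketch that only prints the formatting commands rather than running them; the helper names (part_dev, print_format_commands) are made up for illustration, and it assumes the three-partition layout above (1 = ESP, 2 = swap, 3 = rootfs). It also accounts for disks whose name ends in a digit, such as /dev/nvme0n1, where the kernel puts a "p" before the partition number.

```shell
# Build the partition device name for a disk and partition number.
# Disks whose name ends in a digit (nvme0n1, mmcblk0) get a "p" separator.
part_dev() {
    case "$1" in
        *[0-9]) echo "${1}p${2}" ;;
        *)      echo "${1}${2}" ;;
    esac
}

# Dry run: print (do not execute) the formatting commands for the layout
# 1 = ESP (FAT32), 2 = swap, 3 = root filesystem (ext4).
print_format_commands() {
    disk="$1"
    echo "mkfs.vfat -F 32 $(part_dev "$disk" 1)"
    echo "mkswap $(part_dev "$disk" 2)"
    echo "mkfs.ext4 $(part_dev "$disk" 3)"
}

print_format_commands /dev/nvme0n1
```

Check the printed device names against "parted -l" before running anything, since these commands destroy existing data on the partitions.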
K.I.S.S uefi compatible partition layout
Quote: | fenrir ~ # parted -l /dev/sda
Model: ATA WDC WD2003FZEX-0 (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system     Name    Flags
 1      1049kB  538MB   537MB   fat32           ESP     boot, esp
 2      538MB   4832MB  4294MB  linux-swap(v1)  swap
 3      4832MB  2000GB  1996GB  ext4            rootfs
sda1 is fat32 uefi filesystem mounted to /boot/EFI
sda3 is rootfs obviously |
One other thing you really want to try with uefi boot is using UUID based disk mounts in fstab. with GPT partition labels using uuid disk mount ID's has become the modern standard.
Code: | fenrir ~ # blkid
/dev/sda1: UUID="77A1-1E9F" TYPE="vfat" PARTLABEL="ESP" PARTUUID="cefa4dd4-94c6-47c9-aa77-5d9ba976f8a4"
/dev/sda2: UUID="5b9bbf2a-4842-4439-99e5-10549eee8c3e" TYPE="swap" PARTLABEL="swap" PARTUUID="96e97a1d-d5dd-4e06-bf41-d3bcb4cd8f54"
/dev/sda3: UUID="c73b7d61-2bc4-416c-9238-d6494394d75d" TYPE="ext4" PARTLABEL="rootfs" PARTUUID="c189b859-a40f-4311-b234-2b351e750081"
# matching /etc/fstab entries
UUID=77A1-1E9F /boot/EFI vfat noauto,defaults 1 2
UUID=c73b7d61-2bc4-416c-9238-d6494394d75d / ext4 defaults 0 1
UUID=5b9bbf2a-4842-4439-99e5-10549eee8c3e none swap sw 0 0
UUID=75a471c3-75f2-41c8-bd49-5384f49bf1d8 /home ext4 defaults 0 1
#/dev/cdrom /mnt/cdrom auto noauto,ro 0 0
|
Added bonus if you check the root directory of any booted sysrescuecd there's a Linux kernel config in the root directory that has all the options preset for uefi boot to function.
Also, the last thing to wrap your nugget around with UEFI boot is that the bootloader isn't in the disk MBR. When you configure grub (I still greatly prefer just using the grub bootloader), grub interfaces with efibootmgr, which creates a boot entry in the UEFI firmware pointing to the GRUB EFI binary on the ESP (the EFI System Partition).
I recently updated a small hobby project of mine you could use to test hardware stability with Gentoo.
https://www.reddit.com/r/Gentoo/comments/8st37h/prebuilt_gentoo_systemdplasma_install_tarball/
I precompiled a UEFI boot compliant chroot build of Gentoo using the plasma systemd profile that I'm certain has no system inconsistency issues. If you can get that chroot tarball booted, compile gcc with UEFI active and see how the results work for you.
It's something you could use as a software tool for stability testing: a completed Gentoo chroot build prepared by someone with fifteen years of experience using Gentoo. Get the chroot build booted first, then do the stability testing.
Lastly, if you have been using full unmasked ~arch builds, you've discovered why using full unmasked ~arch builds is unstable.
"I enabled it because it looked cool!" doesn't always work well when you're the master of your own destiny using Gentoo Linux.
Last edited by ali3nx on Thu Jul 05, 2018 2:28 am; edited 7 times in total |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jul 04, 2018 8:27 pm Post subject: |
|
|
Quote: | Does the installation of binary distro work? |
I don't know. I did not try. The only binary distro I tried was the current Ubuntu 64-bit LiveCD. In that environment I could not get a freeze. I tried compiling the gcc source using the Ubuntu-provided gcc (apt-get ...) |
|
jagdpanther l33t
Joined: 22 Nov 2003 Posts: 762
|
Posted: Wed Jul 04, 2018 8:37 pm Post subject: |
|
|
ali3nx: Thanks for all the UEFI build info.
Quote: | One other thing you really want to try with uefi boot is using UUID based disk mounts in fstab. with GPT partition labels using uuid disk mount ID's has become the modern standard. |
Instead of UUIDs, since I was using and will keep using GPT, what about PARTLABEL? That lets you use the names you assign in parted.
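For reference, mount accepts PARTLABEL= in /etc/fstab the same way it accepts UUID=. A rough sketch, assuming the partition names ESP, swap and rootfs from the parted example earlier in the thread and the same mount points and options as the UUID-based fstab; the helper names are made up for illustration.

```shell
# List the PARTLABEL values found in blkid-style output read from stdin.
# partlabels is a hypothetical helper name.
partlabels() {
    sed -n 's/.* PARTLABEL="\([^"]*\)".*/\1/p'
}

# Print fstab entries keyed on PARTLABEL instead of UUID.
# Labels, mount points and options mirror the example fstab in this thread.
print_partlabel_fstab() {
    cat <<'EOF'
PARTLABEL=ESP     /boot/EFI  vfat  noauto,defaults  1 2
PARTLABEL=rootfs  /          ext4  defaults         0 1
PARTLABEL=swap    none       swap  sw               0 0
EOF
}

print_partlabel_fstab
```

Piping "blkid" into partlabels shows which names are actually set on the disks. Partition names are not guaranteed to be unique the way UUIDs are, so this works best when every disk in the machine uses distinct names.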
|