escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Fri Aug 05, 2016 5:28 am Post subject: Newly built kernels are unbootable, provoke power cycle |
|
|
I'm running a Linux kernel, version 4.1.12, that I built back in November 2015, and I'm trying to update to 4.1.15-r1.
I emerged the package, emerged @module-rebuild, ran make oldconfig and then make && make modules_install, copied the kernel into my /boot and reran the grub configuration maker, all as per usual. Incidentally, the .config for the 4.1.15 kernel and the /proc/config.gz of my running 4.1.12 kernel are identical (except for the header, obviously).
However I can't boot into the new kernel at all. As soon as I select it in grub, the machine resets (hardware down and hardware up - as though the power had been cycled). No panic seems to happen, no messages are visible on the screen (beyond what grub itself echoes) and nothing is written to the /var/log/kern.log - it is as though the failed boot never happened. Old kernels, be it 4.1.12 or previously built kernels I have lying around, all work. Booting into windows works too.
I tried booting into the new kernel via kexec to see if it was a grub problem, but as soon as I do kexec -e the exact same reset happens. Again, old kernels work just fine with kexec, and again the last thing written to the kern.log are the shutdown messages from the working 4.1.12 kernel.
I've tried with linux 4.4.6 to see if it was something funny with 4.1.15, but the same problem persists. This leads me to suspect that it's something in how the kernel is being built, but I can't figure out what. cat /proc/version for the 4.1.12 kernel shows:
Code: |
Linux version 4.1.12-gentoo (root@Silence) (gcc version 4.9.3 (Gentoo 4.9.3 p1.2, pie-0.6.3) ) #1 SMP Tue Nov 3 18:44:09 EST 2015
|
While the gcc version shows:
Code: |
gcc (Gentoo 4.9.3 p1.5, pie-0.6.4) 4.9.3
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
So they look to be by and large the same versions of gcc, so that's probably not the problem.
I'm running the commercial ati video drivers, but I usually don't have trouble with those beyond gui applications crashing and what have you. I kinda doubt they're the cause here, just because it seems as though a video driver issue would at least cause the kernel to panic.
I'm wondering if someone around here can lend a hand, cause I'm really at my wits' end here.
Thanks! |
|
Syl20 l33t
Joined: 04 Aug 2005 Posts: 621 Location: France
|
Posted: Fri Aug 05, 2016 10:20 am Post subject: Re: Newly built kernels are unbootable, provoke power cycle |
|
|
escozzia wrote: | I emerged the package, emerged @module-rebuild, used make oldconfig and then make && make modules_install, copied over the kernel into my /boot and reran the grub configuration maker, all as per usual. |
I don't know if that will solve your problem, but you should emerge @module-rebuild after installing your newly built kernel.
If not, could you pastebin your .config and give us more information about your hardware? |
|
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Fri Aug 05, 2016 1:11 pm Post subject: Re: Newly built kernels are unbootable, provoke power cycle |
|
|
Syl20 wrote: |
I don't know if that will solve your problem, but you should emerge @module-rebuild after installing your newly built kernel. |
I did not know that, but unfortunately doing that didn't solve the problem
Syl20 wrote: |
If not, could you pastebin your .config and give us more information about your hardware? |
Sure, the .config is up here.
My hardware is pretty prosaic: I'm running a four-core i5-3570K on an Asus P8 Z77-V LX mobo, the graphics card is an AMD Radeon HD 7850, and I'm using 8GB of RAM.
Here's what lspci has to say:
Code: |
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.4 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 5 (rev c4)
00:1c.5 PCI bridge: Intel Corporation 82801 PCI Bridge (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Z77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Pitcairn PRO [Radeon HD 7850 / R7 265 / R9 270 1024SP]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
04:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 03)
|
And here's what lscpu says
Code: |
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 58
Model name: Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz
Stepping: 9
CPU MHz: 3400.000
CPU max MHz: 3401.0000
CPU min MHz: 1600.0000
BogoMIPS: 6820.38
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-3
|
|
|
Maitreya Guru
Joined: 11 Jan 2006 Posts: 445
|
Posted: Fri Aug 05, 2016 2:59 pm Post subject: |
|
|
Quote: |
used make oldconfig
|
Although it is a very nice helper, it is far from perfect.
Consider using menuconfig to check that everything you need is still there and that the stuff you don't need is gone. |
|
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Fri Aug 05, 2016 3:02 pm Post subject: |
|
|
Can you post the output of emerge --info? |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54790 Location: 56N 3W
|
Posted: Fri Aug 05, 2016 5:35 pm Post subject: |
|
|
escozzia,
I suspect that you have made the kernel for the wrong CPU.
The kernel does not always detect this. What happens then is that you get an illegal instruction exception before the exception handler is set up, so the system resets.
Your CPU is
Code: | Model name: Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz |
Your .config,
Code: |
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.1.15-gentoo-r1 Kernel Configuration
|
looks OK in the CPU department.
What about your 4.4.6 .config?
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Fri Aug 05, 2016 11:50 pm Post subject: |
|
|
krinn wrote: |
Can you post the output of emerge --info?
|
Sure thing
NeddySeagoon wrote: | escozzia,
I suspect that you have made the kernel for the wrong CPU.
The kernel does not always detect this. What happens then is that you get an illegal instruction exception before the exception handler is set up, so the system resets.
Your CPU is
Code: | Model name: Intel(R) Core(TM) i5-3570K CPU @ 3.40GHz |
Your .config,
Code: |
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.1.15-gentoo-r1 Kernel Configuration
|
looks OK in the CPU department.
What about your 4.4.6 .config? |
Hmm, that makes sense, but the kernel that I have working is 4.1.12, with both 4.1.15 and 4.4.6 giving me trouble.
The particularly weird thing, which also gets to Maitreya's point about make oldconfig is that the working 4.1.12 and the non working 4.1.15 are basically identical:
Code: |
Silence src # diff linux-4.1.15-gentoo-r1/.config linux-4.1.12-gentoo/.config
3c3
< # Linux/x86 4.1.15-gentoo-r1 Kernel Configuration
---
> # Linux/x86 4.1.12-gentoo Kernel Configuration
Silence src #
|
(Of course, there's lots of differences between the working 4.1.12 and the non working 4.4.6, but from what I can tell most of those look like normal kernel upgrade stuff.)
Edit:
the 4.4.6 also looks okay for my CPU:
Code: |
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
|
|
|
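The .config comparison in the post above can be sketched as a small helper. This is only a sketch: `config_diff` is a hypothetical name, and it assumes the only line worth ignoring is the generated "Kernel Configuration" header. Lines such as "# CONFIG_FOO is not set" carry meaning, so they are deliberately kept in the comparison.

```shell
#!/bin/sh
# config_diff OLD NEW: diff two kernel .config files, ignoring only the
# generated "# Linux/<arch> <version> Kernel Configuration" header line.
# "# CONFIG_FOO is not set" lines are kept, because they are real settings.
config_diff() {
    a=$(mktemp)
    b=$(mktemp)
    grep -v '^# Linux/.* Kernel Configuration$' "$1" > "$a"
    grep -v '^# Linux/.* Kernel Configuration$' "$2" > "$b"
    status=0
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    # 0 means the two configs are identical apart from the header.
    return "$status"
}
```

Something like `config_diff linux-4.1.15-gentoo-r1/.config linux-4.1.12-gentoo/.config && echo identical` would then print "identical" for the two trees compared above.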
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54790 Location: 56N 3W
|
Posted: Sat Aug 06, 2016 7:29 am Post subject: |
|
|
escozzia,
Those aren't the CPU kernel settings I had in mind.
It's these menu items:
Code: |
Processor family (Opteron/Athlon64/Hammer/K8) --->
[*] Supported processor vendors --->
|
Use wgetpaste to put your entire .config on a pastebin. |
|
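The two menu items named above can also be read straight out of a .config without opening menuconfig. A minimal sketch, assuming the x86 symbol names I recall from arch/x86/Kconfig.cpu (CONFIG_MK8, CONFIG_MPSC, CONFIG_MCORE2, CONFIG_MATOM, CONFIG_GENERIC_CPU, CONFIG_PROCESSOR_SELECT and the CONFIG_CPU_SUP_* vendor switches); check them against your own tree. `cpu_settings` is a hypothetical helper name.

```shell
#!/bin/sh
# cpu_settings FILE: print the enabled "Processor family" choice and the
# "Supported processor vendors" switches found in a kernel .config.
# The symbol list is an assumption based on the 4.x x86 Kconfig.cpu.
cpu_settings() {
    grep -E '^CONFIG_(MK8|MPSC|MCORE2|MATOM|GENERIC_CPU|PROCESSOR_SELECT|CPU_SUP_[A-Z0-9_]+)=' "$1"
}
```

Run it as `cpu_settings /usr/src/linux/.config`. Only enabled options are printed, so a processor family that does not match the installed CPU stands out quickly.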
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Sat Aug 06, 2016 2:46 pm Post subject: |
|
|
NeddySeagoon wrote: | escozzia,
Those aren't the CPU kernel settings I had in mind.
It's these menu items:
Code: |
Processor family (Opteron/Athlon64/Hammer/K8) --->
[*] Supported processor vendors --->
|
Use wgetpaste to put your entire .config on a pastebin. |
Sure, here it is: http://bpaste.net/show/57ecfdf0fa4d |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54790 Location: 56N 3W
|
Posted: Sat Aug 06, 2016 3:48 pm Post subject: |
|
|
escozzia,
That looks mostly harmless.
Please explain how you configured your 4.4.6 kernel. |
|
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Sat Aug 06, 2016 3:51 pm Post subject: |
|
|
NeddySeagoon wrote: | escozzia,
That looks mostly harmless.
Please explain how you configured your 4.4.6 kernel. |
I copied /proc/config.gz (this one: http://bpaste.net/show/86df5ef2838c) into .config, then ran make oldconfig (saying N to most of the new stuff as prompted), then make && make modules_install. Lastly, I copied the kernel into /boot/ and ran grub2-mkconfig -o /boot/grub/grub.cfg |
|
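The procedure described above can be written down as a dry-run script, which makes each step easy to review. This is a sketch under assumptions: /proc/config.gz is gzip-compressed, so it is unpacked with zcat rather than copied verbatim; the destination name /boot/kernel-test is hypothetical; and `run` and `rebuild_kernel` are illustrative helper names. By default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Dry-run by default: print each step instead of executing it.
# Set DRY_RUN=0 to really run the steps (as root, from the kernel source tree).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

rebuild_kernel() {
    # Hypothetical destination; use whatever name your grub setup expects.
    target=${1:-/boot/kernel-test}
    run sh -c 'zcat /proc/config.gz > .config' &&
    run make oldconfig &&
    run make &&
    run make modules_install &&
    run cp arch/x86/boot/bzImage "$target" &&
    run grub2-mkconfig -o /boot/grub/grub.cfg
}
```

Calling `rebuild_kernel /boot/kernel-4.4.6-test` prints the six steps in order. The `&&` chain means a real run stops at the first failing step instead of installing a half-built kernel.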
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Sat Aug 06, 2016 4:24 pm Post subject: |
|
|
How could you use a kernel without FHANDLE and have a working system? You don't use (e)udev?
However, it should not disturb the kernel's early boot; your symptoms seem more like the CPU is refusing (or cannot understand) your kernel.
What does a simple "file /boot/kernel_4.4.6.binary" output? |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54790 Location: 56N 3W
|
Posted: Sat Aug 06, 2016 4:27 pm Post subject: |
|
|
escozzia,
That's the right answer, so I'm beginning to suspect it's not the .config.
Can you rebuild your 4.1.12 and see if it still works?
If you edit the Makefile, at the top it will have something like
Code: |
VERSION = 4
PATCHLEVEL = 6
SUBLEVEL = 0
EXTRAVERSION = -gentoo
|
Change the EXTRAVERSION so that everything is kept separate.
Take care that EXTRAVERSION does not end with whitespace.
Start the build with to force everything to be rebuilt.
Did you ever edit .config with $EDITOR? |
|
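The EXTRAVERSION advice above can be scripted so that the trailing-whitespace check is not forgotten. A minimal sketch: `set_extraversion` and `has_trailing_space` are hypothetical helper names, and the new value is assumed to contain no "/" characters, since it is spliced into a sed expression.

```shell
#!/bin/sh
# set_extraversion MAKEFILE VALUE: rewrite the EXTRAVERSION line in place.
# VALUE must not contain "/" because it is used inside a sed s/// command.
set_extraversion() {
    tmp=$(mktemp)
    sed "s/^EXTRAVERSION =.*/EXTRAVERSION = $2/" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# has_trailing_space MAKEFILE: succeed if the EXTRAVERSION line ends in
# whitespace, which is the condition the post warns about.
has_trailing_space() {
    grep -q '^EXTRAVERSION =.*[[:space:]]$' "$1"
}
```

For example, `set_extraversion Makefile -test-kernel` followed by `has_trailing_space Makefile && echo "fix the Makefile first"` keeps the test build and its modules separate from the working kernel.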
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Sat Aug 06, 2016 4:35 pm Post subject: |
|
|
krinn wrote: | How could you use kernel without FHANDLE and have a working system? You don't use (e)udev? |
I do use udev, and while it does complain about the missing FHANDLE when I emerge it, it's never actually given me trouble for some reason. Luck of the ignorant I suppose.
krinn wrote: | However it should not disturb kernel early boot, your symptoms seems more like cpu is refusing/cannot understand your kernel.
What gives a simple "file /boot/kernel_4.4.6.binary" output? |
Code: |
arch/x86/boot/bzImage: Linux kernel x86 boot executable bzImage, version 4.4.6-gentoo (root@Silence) #4 SMP Thu Aug 4 23:14:18 EDT 2016, RO-rootFS, swap_dev 0x4, Normal VGA
|
NeddySeagoon wrote: |
escozzia,
That's the right answer, so I'm beginning to suspect its not the .config.
Can you rebuild your 4.1.12 and see if it still works?
|
Unfortunately I cannot - I've since (stupidly) depcleaned gentoo-sources-4.1.12 away, and the ebuild is gone:
Code: |
Silence linux-4.1.12-gentoo # ls /usr/portage/sys-kernel/gentoo-sources/gentoo-sources-4.1.*
/usr/portage/sys-kernel/gentoo-sources/gentoo-sources-4.1.15-r1.ebuild /usr/portage/sys-kernel/gentoo-sources/gentoo-sources-4.1.29.ebuild
/usr/portage/sys-kernel/gentoo-sources/gentoo-sources-4.1.27.ebuild
|
Do you know if there's somewhere I can get an archival gentoo-sources-4.1.12 ebuild?
NeddySeagoon wrote: |
Did you ever edit .config with $EDITOR?
|
Nope, always via the make *config commands |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54790 Location: 56N 3W
|
Posted: Sat Aug 06, 2016 5:05 pm Post subject: |
|
|
escozzia,
The ebuild will be in git but I don't know how to dig it out.
There are a few other threads on the forums on getting a single file out of git.
The sources will still be available. You may even have them unless you have cleaned your distfiles. |
|
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Sat Aug 06, 2016 5:16 pm Post subject: |
|
|
I still have gentoo-sources-4.1.12.ebuild:
Code: |
# Copyright 1999-2015 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Id$
EAPI="5"
ETYPE="sources"
K_WANT_GENPATCHES="base extras experimental"
K_GENPATCHES_VER="16"
K_DEBLOB_AVAILABLE="0"
K_KDBUS_AVAILABLE="1"
inherit kernel-2
detect_version
detect_arch
KEYWORDS="alpha amd64 ~arm ~arm64 -hppa ia64 ~mips ppc ppc64 ~s390 ~sh sparc x86"
HOMEPAGE="https://dev.gentoo.org/~mpagano/genpatches"
IUSE="experimental"
DESCRIPTION="Full sources including the Gentoo patchset for the ${KV_MAJOR}.${KV_MINOR} kernel tree"
SRC_URI="${KERNEL_URI} ${GENPATCHES_URI} ${ARCH_URI}"
pkg_postinst() {
	kernel-2_pkg_postinst
	einfo "For more info on this patchset, and how to report problems, see:"
	einfo "${HOMEPAGE}"
}
pkg_postrm() {
	kernel-2_pkg_postrm
}
|
You might also have a copy in /var/db/pkg/sys-kernel/gentoo-sources-4.1.12 |
|
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Sat Aug 06, 2016 5:23 pm Post subject: |
|
|
Ok, I was able to emerge gentoo-sources-4.1.12 and build a new 4.1.12 kernel, with the exact same config as my working kernel:
Code: |
Silence linux-4.1.12-gentoo # !diff
diff config .config
3c3
< # Linux/x86 4.1.12-gentoo Kernel Configuration
---
> # Linux/x86 4.1.12-test-kernel Kernel Configuration
|
But I can't boot into the newly built 4.1.12 either, same reset problem. |
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Sat Aug 06, 2016 5:54 pm Post subject: |
|
|
What does your grub kernel configuration file look like? |
|
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54790 Location: 56N 3W
|
Posted: Sat Aug 06, 2016 7:36 pm Post subject: |
|
|
escozzia,
That's progress. It confirms that it's not the kernel .config.
What does show?
I am beginning to suspect either hardware or your toolchain. |
|
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Sat Aug 06, 2016 7:56 pm Post subject: |
|
|
NeddySeagoon wrote: | escozzia,
That's progress. It confirms that its not the kernel .config.
What does show?
I am beginning to suspect either hardware or your toolchain. |
Code: |
Silence ~ # gcc-config -l
[1] x86_64-pc-linux-gnu-4.9.3 *
|
Which I think looks like the same gcc that built the working kernel:
Code: |
Silence ~ # cat /proc/version
Linux version 4.1.12-gentoo (root@Silence) (gcc version 4.9.3 (Gentoo 4.9.3 p1.2, pie-0.6.3) ) #1 SMP Tue Nov 3 18:44:09 EST 2015
|
|
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54790 Location: 56N 3W
|
Posted: Sat Aug 06, 2016 8:26 pm Post subject: |
|
|
escozzia,
Humour me a little ...
Here's what I've done:
Code: |
emerge =gentoo-sources-4.4.6 -1
cd /usr/src/linux-4.4.6-gentoo/
wget https://bpaste.net/raw/57ecfdf0fa4d
cp 57ecfdf0fa4d .config
make oldconfig
make -j10
scp arch/x86/boot/bzImage neddyseagoon@dev.gentoo.org:/home/neddyseagoon/public_hlml
|
Then I logged in, sorted out the mess and renamed the kernel to escozzia_bzImage
Oh, make oldconfig did nothing, which was expected.
That's gentoo-sources-4.4.6 built on my system with gcc-5.4.0. It's only the bzImage file to put into /boot.
You can have the modules if you want but they are not needed for this test.
Put that file into /boot. Tell grub about it, then try to boot it. What happens? |
|
krinn Watchman
Joined: 02 May 2003 Posts: 7470
|
Posted: Sat Aug 06, 2016 8:53 pm Post subject: |
|
|
escozzia wrote: | NeddySeagoon wrote: |
I am beginning to suspect either hardware or your toolchain. |
Which I think looks like the same gcc that built the working kernel:
Code: |
Silence ~ # cat /proc/version
Linux version 4.1.12-gentoo (root@Silence) (gcc version 4.9.3 (Gentoo 4.9.3 p1.2, pie-0.6.3) ) #1 SMP Tue Nov 3 18:44:09 EST 2015
|
|
That's not the same gcc, that's just the same version.
From your post #1, your current gcc is gcc (Gentoo 4.9.3 p1.5, pie-0.6.4) 4.9.3 |
|
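To make the point above concrete: the gcc version number (4.9.3) matches, but the Gentoo patch level does not. A minimal sketch, assuming the build tag always appears as a parenthesised "Gentoo ..." string; `gentoo_tag` is a hypothetical helper name, and the two sample strings are copied from the first post.

```shell
#!/bin/sh
# gentoo_tag STRING: extract the "Gentoo <version> <patchlevel>, pie-<x>" tag
# from a /proc/version line or from the first line of `gcc --version`.
gentoo_tag() {
    printf '%s\n' "$1" | sed -n 's/.*(\(Gentoo [^)]*\)).*/\1/p'
}

# Sample strings taken from the first post in this thread.
kernel_built_with='Linux version 4.1.12-gentoo (root@Silence) (gcc version 4.9.3 (Gentoo 4.9.3 p1.2, pie-0.6.3) ) #1 SMP Tue Nov 3 18:44:09 EST 2015'
current_gcc='gcc (Gentoo 4.9.3 p1.5, pie-0.6.4) 4.9.3'

a=$(gentoo_tag "$kernel_built_with")
b=$(gentoo_tag "$current_gcc")
if [ "$a" = "$b" ]; then
    echo "same compiler build: $a"
else
    echo "different compiler builds: '$a' vs '$b'"
fi
```

On a live system the two inputs would be `"$(cat /proc/version)"` and `"$(gcc --version | head -n 1)"`. Comparing the whole tag, rather than just the version number, is what exposes the p1.2 versus p1.5 difference.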
escozzia n00b
Joined: 08 May 2011 Posts: 14
|
Posted: Sat Aug 06, 2016 9:47 pm Post subject: |
|
|
krinn wrote: |
That's not the same gcc, that's just the same version.
From your post #1, your current gcc is gcc (Gentoo 4.9.3 p1.5, pie-0.6.4) 4.9.3
|
Whoop, you're right, my bad - so there's definitely been a gcc change since.
NeddySeagoon wrote: | escozzia,
Humour me a little ...
Here what I've done
Code: | emerge =gentoo-sources-4.4.6 -1
cd /usr/src/linux-4.4.6-gentoo/
wget https://bpaste.net/raw/57ecfdf0fa4d
cp 57ecfdf0fa4d .config
make oldconfig
make -j10
scp arch/x86/boot/bzImage neddyseagoon@dev.gentoo.org:/home/neddyseagoon/public_hlml |
Then I logged in, sorted out the mess and renamed the kernel to escozzia_bzImage
Oh, make oldconfig did nothing, which was expected.
That's gentoo-sources-4.4.6 built on my system with gcc-5.4.0. Its only the bzImage file to put into /boot.
You can have the modules if you want but they are not needed for this test.
Put that file into /boot. Tell grub about it, then try to boot it. What happens? |
Ah, yours booted perfectly!
I think that points to a toolchain issue. |
|
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54790 Location: 56N 3W
|
Posted: Sat Aug 06, 2016 10:50 pm Post subject: |
|
|
escozzia,
Looks like it. You cannot rely on your toolchain to (re)build itself correctly, as it seems to be broken.
It's worth booting into memtest to see what that says about your hardware. |
|