exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sat Jul 20, 2013 11:34 pm Post subject: Kernel keeps renaming root device [UNSOLVABLE] |
|
|
I can no longer boot because the root device I specify is never correct. The root device is supposed to be /dev/sda5. However, it appears that the kernel is now giving it a different name every time I try to boot.
When I try to boot, I almost always get a kernel panic because the kernel has named the device "/dev/sde" or "/dev/sdg" or anything BUT /dev/sda.
If the kernel names the boot device /dev/sdc, I'll edit the line in grub's menu.lst to "root=/dev/sdc5"--but then the kernel switches the name on me again and names the boot device something else. There is no way for me to guess what the kernel is going to name the boot device and I can't find a way to make it stop.
I've tried specifying the root device by UUID in menu.lst. I found some instructions online for doing this. It doesn't work--grub doesn't seem to be able to do that sort of thing. (Then why are there instructions for it?)
I've tried creating udev rules to try to keep the device names from changing, but that didn't work either.
What can I do to keep the device names from changing so I can boot?
Last edited by exhausted on Thu Apr 03, 2014 1:48 am; edited 2 times in total |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sat Jul 20, 2013 11:36 pm Post subject: |
|
|
Specifying the UUID for / in /etc/fstab doesn't seem to help either. |
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55162 Location: 56N 3W
|
Posted: Sun Jul 21, 2013 12:18 am Post subject: |
|
|
exhausted,
Explain the storage devices attached to your system and how they are connected.
For finding root by UUID, you need an initrd. It works for me. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sun Jul 21, 2013 12:37 am Post subject: |
|
|
Thanks for the quick reply!
Gah! I forgot that the kernel can't interpret UUIDs passed directly to it via a bootloader. No wonder my attempt at getting Grub to work by specifying the UUID didn't work. Since I'm not using an initramfs and would prefer not to, maybe instead of trying to specify the root device by UUID, I should go back to my original plan: Use a rule to force udev to use the device names I specify.
The devices are two solid state drives connected via SATA and a couple of external hard drives connected via USB. The root device is a SATA SSD. |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sun Jul 21, 2013 1:00 am Post subject: |
|
|
Okay, here's what I've done:
I've changed the relevant line in menu.lst back to specifying "root=/dev/sda5".
I created a udev rule file, /etc/udev/rules.d/20-persistent_disk_name.rules with the following rule:
Code: | SUBSYSTEM=="scsi", ATTRS{model}=="INTEL SSDSA2CW08", KERNEL=="sd*", SYMLINK+="sda%n" |
I still can't boot because the kernel is still changing the name of the boot device. The udev rule doesn't appear to do anything.
Edit: Maybe the udev rules are useless for forcing a specific device name for a boot device? Is it not possible to force the correct root device name using a udev rule? |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sun Jul 21, 2013 1:08 am Post subject: |
|
|
I have also tried specifying the root device in /etc/fstab by UUID:
Code: | UUID=ebbc6ab0-0c0f-4d21-98e6-63ac2ee4d84d / reiserfs defaults,noatime,data=ordered,notail 1 2 |
This doesn't seem to work either. |
 |
VoidMage Watchman


Joined: 14 Oct 2006 Posts: 6196
|
Posted: Sun Jul 21, 2013 1:43 am Post subject: |
|
|
If your disk has a GPT partition table, you could boot by PARTUUID (well, if your kernel is recent enough, you could even do it with an MBR partition table, though it's a bit quirky there). |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sun Jul 21, 2013 1:54 am Post subject: |
|
|
I'm still using the old MBR partition scheme.
As for the kernel, I'm using 3.8.13-gentoo.
[rant]
This just seems patently insane. The way device names behaved in Linux has worked for many years. The device names were predictable. They didn't just change at fate's whimsy every time a system booted. The new behavior makes it impossible to know what the device names are going to be from one boot to another without--and this is my major gripe--providing an option to retain the old behavior and without providing a practical way to keep the names from changing. I'm all for progress and changes for the better, but this... this is insane! I can't even boot because I don't know what the name of my boot device is going to be!
[/rant] |
 |
PaulBredbury Watchman


Joined: 14 Jul 2005 Posts: 7310
|
Posted: Sun Jul 21, 2013 3:20 am Post subject: |
|
|
PARTUUID does not need an initrd. Works fine in syslinux:
Code: | LABEL Current
LINUX /boot/3.9.10-x86_64
APPEND root=PARTUUID=00020ed2-01 rootfstype=ext4 usbhid.mousepoll=2 apparmor=1 blah blah |
The kernel shows the PARTUUID values on the right-hand side, during bootup.
Edit: Hopefully removed confusion of UUID with PARTUUID.
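For anyone wondering where a value like 00020ed2-01 comes from on an MBR disk: as far as I understand it, the kernel builds it from the 32-bit disk signature stored at byte offset 440 of the first sector, followed by the partition number as two hex digits. Below is a minimal Python sketch under that assumption. The function name and the fake sector are made up for illustration; the output of blkid on the real machine remains the authoritative check.

```python
import struct


def mbr_partuuid(mbr_sector: bytes, partition_number: int) -> str:
    """Build an MBR-style PARTUUID string: <disk signature>-<partition number>."""
    if len(mbr_sector) < 512:
        raise ValueError("need the full 512-byte first sector")
    # The disk signature is a little-endian 32-bit value at offset 440 (0x1B8).
    (signature,) = struct.unpack_from("<I", mbr_sector, 440)
    return f"{signature:08x}-{partition_number:02x}"


# Hypothetical first sector whose signature bytes decode to 0x00020ed2.
fake_sector = bytearray(512)
fake_sector[440:444] = bytes.fromhex("d20e0200")

print(mbr_partuuid(bytes(fake_sector), 1))  # 00020ed2-01
print(mbr_partuuid(bytes(fake_sector), 5))  # 00020ed2-05
```

On a real system the sector would have to be read from the whole-disk device as root, and the fifth partition (such as sda5) would end in -05.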
Last edited by PaulBredbury on Sun Jul 21, 2013 4:15 am; edited 1 time in total |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sun Jul 21, 2013 3:34 am Post subject: |
|
|
Now I'm confused as all hell. I've read a lot of documentation that specifically states that you can't just pass the UUID of a partition to the Linux kernel as a boot parameter because the kernel can't interpret it. This explains why it doesn't work with grub.
I don't understand how you're getting it to work.
If I try to specify the root device in menu.lst by UUID, it does not work.
How are you getting it to work?
I suspect that you might be confusing UUID with PARTUUID. |
 |
The Doctor Moderator


Joined: 27 Jul 2010 Posts: 2678
|
Posted: Sun Jul 21, 2013 4:05 am Post subject: |
|
|
Observation: I don't think writing a udev rule is going to do anything, because if your kernel can't mount your root partition, udev and your rule will not even be loaded, as they reside on your root partition. The only way udev will play any role in this is if you are using an initramfs with udev, in which case you may as well mount your root partition directly.
Short term possibility: Disconnect your external drives to see if that helps. If the names are still switching randomly, at least you will have a 50% chance of booting.
Oh, and PaulBredbury is using syslinux instead of grub. It may be worth trying a different boot loader to see if that is the problem. Syslinux doesn't have as many features as the new grub, which I find to be a distinct advantage because it makes it much easier to use. _________________ First things first, but not necessarily in that order.
Apologies if I take a while to respond. I'm currently working on the dematerialization circuit for my blue box. |
 |
PaulBredbury Watchman


Joined: 14 Jul 2005 Posts: 7310
|
Posted: Sun Jul 21, 2013 4:14 am Post subject: |
|
|
exhausted wrote: | confusing UUID with PARTUUID. |
I suppose I am.
So, why don't you forget about udev rules (which run too late to be helpful) and use PARTUUID?
I know that PARTUUID works, because my USB-connected phone steals the sda name if it's plugged in during boot. What is Linus thinking?? |
 |
Hu Administrator

Joined: 06 Mar 2007 Posts: 23327
|
Posted: Sun Jul 21, 2013 4:21 am Post subject: |
|
|
exhausted wrote: | I'm still using the old MBR partition scheme.
As for the kernel, I'm using 3.8.13-gentoo.
[rant]
This just seems patently insane. The way device names behaved in Linux has worked for many years. The device names were predictable. They didn't just change at fate's whimsy every time a system booted. The new behavior makes it impossible to know what the device names are going to be from one boot to another without--and this is my major gripe--providing an option to retain the old behavior and without providing a practical way to keep the names from changing. I'm all for progress and changes for the better, but this... this is insane! I can't even boot because I don't know what the name of my boot device is going to be!
[/rant] |
When did this break for you? I am not aware of any recent changes in the kernel rules for how to name SCSI/SATA devices. However, some systems have been known to exhibit a random discovery order, particularly when using external USB devices. |
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55162 Location: 56N 3W
|
Posted: Sun Jul 21, 2013 10:58 am Post subject: |
|
|
exhausted,
You didn't explain the storage devices attached to your system and how they are connected.
Your lspci output would be useful too.
 |
dwbowyer Apprentice

Joined: 18 Apr 2008 Posts: 155
|
Posted: Sun Jul 21, 2013 10:32 pm Post subject: |
|
|
Not sure this helps OP, but might point in the right direction:
On some systems, mixed PATA (legacy IDE) and SATA internal drives can exhibit this behavior too, if you have unplugged and replugged one of them. It's not random though, as the names just swap. I've had to unplug all drives and plug them back in, in the order I wanted them named. It's also why it's not advised to mix CONFIG_IDE in the kernel along with the SATA drivers. |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sat Aug 03, 2013 9:24 pm Post subject: |
|
|
Everything was perfect until an update. I believe that it was either a kernel update or a udev update that caused the problem. The kernel is no longer assigning the name /dev/sda to the boot device. I must be able to specify the boot partition by device name; I can't use UUID or anything else. If the kernel doesn't assign the correct name to the boot device, I can't boot.
NeddySeagoon wrote: | You didn't explain the storage devices attached to your system and how they are connected. |
My apologies. My storage devices are two solid state drives connected via SATA and a couple of external hard drives connected via USB. The root device is a SATA SSD which has always been named sda.
Here's my lspci output:
Code: | 00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, medium devsel, latency 32
Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
I/O behind bridge: 0000a000-0000bfff
Memory behind bridge: fa400000-fa5fffff
Capabilities: [c0] HyperTransport: Slave or Primary Interface
Capabilities: [f0] HyperTransport: Interrupt Discovery and Configuration
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
Subsystem: Advanced Micro Devices [AMD] AMD-8111 LPC
Flags: bus master, 66MHz, medium devsel, latency 0
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03) (prog-if 8a [Master SecP PriP])
Subsystem: Advanced Micro Devices [AMD] AMD-8111 IDE
Flags: medium devsel
[virtual] Memory at 000001f0 (32-bit, non-prefetchable) [size=8]
[virtual] Memory at 000003f0 (type 3, non-prefetchable)
[virtual] Memory at 00000170 (32-bit, non-prefetchable) [size=8]
[virtual] Memory at 00000370 (type 3, non-prefetchable)
I/O ports at ffa0 [size=16]
00:07.2 SMBus: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0 (rev 02)
Subsystem: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0
Flags: medium devsel, IRQ 9
I/O ports at c480 [size=32]
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
Subsystem: Advanced Micro Devices [AMD] AMD-8111 ACPI
Flags: medium devsel
00:07.5 Multimedia audio controller: Advanced Micro Devices [AMD] AMD-8111 AC97 Audio (rev 03)
Subsystem: Tyan Computer Device 2885
Flags: bus master, medium devsel, latency 32, IRQ 17
I/O ports at c800 [size=256]
I/O ports at cc00 [size=64]
Kernel driver in use: snd_intel8x0
00:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, medium devsel, latency 32
Bus: primary=00, secondary=02, subordinate=04, sec-latency=32
Memory behind bridge: fa600000-fa8fffff
Prefetchable memory behind bridge: 00000000ca000000-00000000ca1fffff
Capabilities: [a0] PCI-X bridge device
Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration
Capabilities: [c0] HyperTransport: Slave or Primary Interface
00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
Subsystem: Advanced Micro Devices [AMD] Device 36c0
Flags: bus master, medium devsel, latency 0
Memory at fa9ff000 (64-bit, non-prefetchable) [size=4K]
00:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, medium devsel, latency 32
Bus: primary=00, secondary=05, subordinate=05, sec-latency=32
Capabilities: [a0] PCI-X bridge device
Capabilities: [b8] HyperTransport: Interrupt Discovery and Configuration
00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01) (prog-if 10 [IO-APIC])
Subsystem: Advanced Micro Devices [AMD] Device 36c0
Flags: bus master, medium devsel, latency 0
Memory at fa9fe000 (64-bit, non-prefetchable) [size=4K]
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
Flags: fast devsel
Capabilities: [80] HyperTransport: Host or Secondary Interface
Capabilities: [a0] HyperTransport: Host or Secondary Interface
Capabilities: [c0] HyperTransport: Host or Secondary Interface
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
Flags: fast devsel
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
Flags: fast devsel
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
Flags: fast devsel
Kernel driver in use: k8temp
01:00.0 USB controller: Advanced Micro Devices [AMD] AMD-8111 USB OHCI (rev 0b) (prog-if 10 [OHCI])
Subsystem: Advanced Micro Devices [AMD] AMD-8111 USB OHCI
Flags: bus master, medium devsel, latency 32, IRQ 19
Memory at fa5fd000 (32-bit, non-prefetchable) [size=4K]
Kernel driver in use: ohci_hcd
01:00.1 USB controller: Advanced Micro Devices [AMD] AMD-8111 USB OHCI (rev 0b) (prog-if 10 [OHCI])
Subsystem: Advanced Micro Devices [AMD] AMD-8111 USB OHCI
Flags: bus master, medium devsel, latency 32, IRQ 19
Memory at fa5fe000 (32-bit, non-prefetchable) [size=4K]
Kernel driver in use: ohci_hcd
01:0a.0 Multimedia audio controller: VIA Technologies Inc. ICE1712 [Envy24] PCI Multi-Channel I/O Controller (rev 02)
Subsystem: VIA Technologies Inc. M-Audio Delta 1010
Flags: bus master, medium devsel, latency 32, IRQ 16
I/O ports at b080 [size=32]
I/O ports at b000 [size=16]
I/O ports at ac00 [size=16]
I/O ports at a880 [size=64]
Capabilities: [80] Power Management version 1
Kernel driver in use: snd_ice1712
01:0b.0 Mass storage controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
Subsystem: Silicon Image, Inc. SiI 3114 SATALink Controller
Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 17
I/O ports at bc00 [size=8]
I/O ports at b880 [size=4]
I/O ports at b800 [size=8]
I/O ports at b480 [size=4]
I/O ports at b400 [size=16]
Memory at fa5ffc00 (32-bit, non-prefetchable) [size=1K]
Expansion ROM at fa500000 [disabled] [size=512K]
Capabilities: [60] Power Management version 2
Kernel driver in use: sata_sil
02:07.0 PCI bridge: Hint Corp HB6 Universal PCI-PCI bridge (non-transparent mode) (rev 15) (prog-if 00 [Normal decode])
Flags: bus master, medium devsel, latency 32
Bus: primary=02, secondary=03, subordinate=03, sec-latency=32
Memory behind bridge: fa600000-fa6fffff
Capabilities: [80] Power Management version 2
Capabilities: [90] CompactPCI hot-swap <?>
Capabilities: [a0] Vital Product Data
02:08.0 PCI bridge: Pericom Semiconductor Device e111 (rev 02) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, medium devsel, latency 32
Bus: primary=02, secondary=04, subordinate=04, sec-latency=0
Memory behind bridge: fa700000-fa7fffff
Prefetchable memory behind bridge: 00000000ca000000-00000000ca0fffff
Capabilities: [80] PCI-X bridge device
Capabilities: [a8] Subsystem: Device 0000:0000
Capabilities: [b0] Express PCI/PCI-X to PCI-Express Bridge, MSI 00
Capabilities: [d8] Vital Product Data
Capabilities: [f0] MSI: Enable- Count=1/1 Maskable- 64bit+
02:09.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5703X Gigabit Ethernet (rev 02)
Subsystem: Tyan Computer Device 2885
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 24
Memory at fa8e0000 (64-bit, non-prefetchable) [size=64K]
Expansion ROM at fa8b0000 [disabled] [size=64K]
Capabilities: [40] PCI-X non-bridge device
Capabilities: [48] Power Management version 2
Capabilities: [50] Vital Product Data
Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
Kernel driver in use: tg3
03:00.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01) (prog-if 10 [OHCI])
Subsystem: AFAVLAB Technology Inc Device 702a
Flags: bus master, medium devsel, latency 32, IRQ 15
Memory at fa6ff800 (32-bit, non-prefetchable) [size=2K]
Memory at fa6f8000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [44] Power Management version 2
03:01.0 USB controller: NEC Corporation OHCI USB Controller (rev 43) (prog-if 10 [OHCI])
Subsystem: Siig Inc Device 131f
Flags: bus master, medium devsel, latency 32, IRQ 27
Memory at fa6fd000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] Power Management version 2
Kernel driver in use: ohci_hcd
03:01.1 USB controller: NEC Corporation OHCI USB Controller (rev 43) (prog-if 10 [OHCI])
Subsystem: Siig Inc Device 131f
Flags: bus master, medium devsel, latency 32, IRQ 24
Memory at fa6fe000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [40] Power Management version 2
Kernel driver in use: ohci_hcd
03:01.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04) (prog-if 20 [EHCI])
Subsystem: Siig Inc Device 00e0
Flags: bus master, medium devsel, latency 32, IRQ 25
Memory at fa6ff400 (32-bit, non-prefetchable) [size=256]
Capabilities: [40] Power Management version 2
Kernel driver in use: ehci_hcd
04:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller (rev 04) (prog-if 30 [XHCI])
Subsystem: NEC Corporation uPD720200 USB 3.0 Host Controller
Flags: bus master, fast devsel, latency 0, IRQ 27
Memory at fa7fe000 (64-bit, non-prefetchable) [size=8K]
Capabilities: [50] Power Management version 3
Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
Capabilities: [90] MSI-X: Enable- Count=8 Masked-
Capabilities: [a0] Express Endpoint, MSI 00
Kernel driver in use: xhci_hcd
06:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-8151 System Controller (rev 14)
Subsystem: Advanced Micro Devices [AMD] AMD-8151 System Controller
Flags: bus master, medium devsel, latency 0
Memory at <ignored> (32-bit, prefetchable) [size=128M]
Capabilities: [a0] AGP version 3.0
Capabilities: [c0] HyperTransport: Slave or Primary Interface
Kernel driver in use: agpgart-amd64
06:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8151 AGP Bridge (rev 14) (prog-if 00 [Normal decode])
Flags: bus master, 66MHz, medium devsel, latency 32
Bus: primary=06, secondary=07, subordinate=07, sec-latency=32
Memory behind bridge: faa00000-feafffff
Prefetchable memory behind bridge: ca300000-ea2fffff
07:00.0 VGA compatible controller: NVIDIA Corporation NV40 [GeForce 6800 Ultra] (rev a1) (prog-if 00 [VGA controller])
Flags: bus master, 66MHz, medium devsel, latency 248, IRQ 16
Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
Memory at d0000000 (32-bit, prefetchable) [size=256M]
Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
[virtual] Expansion ROM at feae0000 [disabled] [size=128K]
Capabilities: [60] Power Management version 2
Capabilities: [44] AGP version 3.0
Kernel driver in use: nvidia
Kernel modules: nvidia |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sat Aug 03, 2013 9:33 pm Post subject: |
|
|
I've been working on this off and on for so long, I'm probably not far away from giving up. I suspect that there are several courses of action I could try:
- Try downgrading udev and/or the kernel. This can't be a very good option. I can't hang on to some old version of udev and/or kernel forever.
- Reinstall from scratch. I really don't want to do that. Even if I reinstall from scratch, what would prevent this exact same problem from happening again?
- Try upgrading from a backup. I have a complete backup of my system. Unfortunately, that backup is a year old. (I'm currently running that backup at the moment.) Would it be practical to reload from a year-old backup and try updating it? There's also the risk that I'd run into the same problem after updating the kernel and/or udev, whichever it was that screwed up my system.
Do any of those options seem like a good idea? |
 |
The Doctor Moderator


Joined: 27 Jul 2010 Posts: 2678
|
Posted: Sat Aug 03, 2013 9:44 pm Post subject: |
|
|
Quote: | Do any of those options seem like a good idea? |
Not really. You can't update a year old install. Installing from scratch probably won't do it since it won't fix the problem. Downgrading the kernel may help. As I pointed out before, udev isn't a player if you can't mount your root since it resides there.
Better: unplug your external drives and see if that helps.
Or: Play with using a PARTUUID for root. As PaulBredbury said, it works and you can boot with it. You can use UUID for everything else.
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55162 Location: 56N 3W
|
Posted: Sat Aug 03, 2013 9:45 pm Post subject: |
|
|
exhausted,
Nope, none of those are good ideas, for the reasons you listed.
Updating a one year old Gentoo is an interesting intellectual exercise but a reinstall would be faster.
Explain what storage devices you have attached to your system and the physical attachment, e.g. USB, SATA, PATA, SCSI ...
Also post the output of /sbin/blkid
 |
ulenrich Veteran

Joined: 10 Oct 2010 Posts: 1483
|
Posted: Sat Aug 03, 2013 10:36 pm Post subject: |
|
|
Truly, this is an exhausting story. If your issue were correctly diagnosed, you would probably laugh out loud about it!
The cause could be a very minor behavioral error on your side: If you for example used some backup method on the partition level and duplicated UUIDs and labels by restoring to another partition,
or something like that ... |
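The duplicated-UUID theory is easy to check once blkid output is available. Here is a minimal Python sketch that scans text in the usual `device: KEY="value"` blkid layout and reports any filesystem UUID that shows up on more than one device. The function name is made up, and the sample input is invented for illustration (only the first UUID is the one posted earlier in this thread); it is not real output from the affected machine.

```python
import re
from collections import defaultdict


def find_duplicate_uuids(blkid_output: str) -> dict:
    """Map each filesystem UUID seen on more than one device to those devices."""
    devices_by_uuid = defaultdict(list)
    for line in blkid_output.splitlines():
        device, sep, tags = line.partition(":")
        if not sep:
            continue  # not a "device: tags" line
        # \b keeps PARTUUID="..." from being mistaken for UUID="...".
        match = re.search(r'\bUUID="([^"]+)"', tags)
        if match:
            devices_by_uuid[match.group(1)].append(device.strip())
    return {u: d for u, d in devices_by_uuid.items() if len(d) > 1}


# Invented sample: /dev/sdc1 stands in for a partition-level restore of sda5.
sample = """\
/dev/sda5: UUID="ebbc6ab0-0c0f-4d21-98e6-63ac2ee4d84d" TYPE="reiserfs"
/dev/sdb1: UUID="11111111-2222-3333-4444-555555555555" TYPE="ext4" PARTUUID="000aaaaa-01"
/dev/sdc1: UUID="ebbc6ab0-0c0f-4d21-98e6-63ac2ee4d84d" TYPE="reiserfs"
"""

print(find_duplicate_uuids(sample))
```

An empty result would rule this theory out; any entry in the result names the devices that share a UUID.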
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sat Aug 03, 2013 10:52 pm Post subject: |
|
|
The Doctor wrote: | unplug your external drives and see if that helps. |
That's a great suggestion. I did try it, though, to no avail.
The Doctor wrote: | Play with using a PARTUUID for root. |
If I understand correctly, I can try the following:
1. Recompile the 3.3.8 kernel on my working year-old system (installed from backup) to support PARTUUIDs.
2. Boot the recompiled kernel to find out what the PARTUUIDs are.
3. Chroot into the broken installation, recompile the broken installation's kernel to support PARTUUIDs.
4. Edit the broken installation's fstab to specify / by PARTUUID. (This step isn't necessary at all, is it?)
5. Edit grub's menu.lst to pass Code: | root=whatever the PARTUUID turns out to be | to the kernel via GRUB.
I'll wait a bit to see if anybody sees any flaws in this plan and then I'll try it, probably tomorrow morning.
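Step 5 of the plan above amounts to swapping the root= argument on the kernel line in menu.lst. A minimal Python sketch of that edit follows, assuming a GRUB legacy style kernel line. The function name, the kernel image path, and the PARTUUID value are all placeholders made up for illustration; the real PARTUUID has to come from the machine itself.

```python
def set_root_parameter(kernel_line: str, new_root: str) -> str:
    """Return the kernel line with its root= argument replaced (appended if absent)."""
    tokens = kernel_line.split()
    replaced = False
    for i, token in enumerate(tokens):
        # Matches only "root=...", not "rootfstype=..." or "rootdelay=...".
        if token.startswith("root="):
            tokens[i] = f"root={new_root}"
            replaced = True
    if not replaced:
        tokens.append(f"root={new_root}")
    return " ".join(tokens)


# Placeholder kernel line and PARTUUID, not taken from the real menu.lst.
line = "kernel /boot/kernel-3.8.13-gentoo root=/dev/sda5 rootfstype=reiserfs"
print(set_root_parameter(line, "PARTUUID=00020ed2-05"))
# kernel /boot/kernel-3.8.13-gentoo root=PARTUUID=00020ed2-05 rootfstype=reiserfs
```

Note that this sketch collapses runs of whitespace, so any indentation on the original line is lost.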
Quote: | Explain what storage devices you have attached to your system and the physical attachment, e.g. USB, SATA, PATA, SCSI ... |
I must be really thick. I've stated that I have two SSDs attached via SATA and two external HDDs attached via USB. This isn't the information you're asking for, is it? (I greatly appreciate your patience. I'm actually a computing technology veteran, but I'm definitely not the sharpest tool in the shed.) |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sat Aug 03, 2013 10:56 pm Post subject: |
|
|
ulenrich wrote: | Truly, this is an exhausting story. If your issue were correctly diagnosed, you would probably laugh out loud about it! |
Yes, I suspect that you are quite right--this problem might very well turn out to have a truly ridiculous cause. |
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Sat Aug 03, 2013 11:03 pm Post subject: |
|
|
It's got to be the kernel. Obviously, (as The Doctor has already pointed out) udev has nothing to do with this. It's got everything to do with how the newer kernel deals with the hardware. There were no hardware or firmware changes. I am absolutely certain of that. The kernel just isn't behaving the same when it comes to assigning bus names.
UPDATE:
I chrooted into the broken installation and compiled a 3.3.8-gentoo kernel for it. I built and installed the kernels and modules. I was able to boot the broken installation using the 3.3.8 kernel!
Everything seemed perfectly fine until about five or six minutes later: The system spontaneously rebooted. ARGH!
I have verified that I can boot the broken installation using an older kernel. Older kernels assign the expected /dev/sda5 bus name to the root partition. However, the system is apparently unstable when booted using an older kernel. It will work for a few minutes, then spontaneously reboot.
I checked my /var/log/messages file (I'm using syslog-ng). Everything looks normal to me except for a machine check error. Here's the last several lines of the log:
Code: | Aug 4 00:13:02 amd64-at login[2833]: ROOT LOGIN on '/dev/tty2'
Aug 4 00:14:21 amd64-at acpid: client connected from 2870[0:0]
Aug 4 00:14:21 amd64-at acpid: 1 client rule loaded
Aug 4 00:15:03 amd64-at ntpd_intres[2751]: host name not found: 4.ntp.bytestacker.com
Aug 4 00:15:03 amd64-at ntpd_intres[2751]: host name not found: 5.ticker.cis.sac.accd.edu
Aug 4 00:15:04 amd64-at ntpd_intres[2751]: host name not found: 6.sundial.cis.sac.accd.edu
Aug 4 00:15:04 amd64-at ntpd_intres[2751]: host name not found: 7.ntppub.tamu.edu
Aug 4 00:15:05 amd64-at ntpd_intres[2751]: host name not found: 8.chrono.cis.sac.accd.edu
Aug 4 00:15:05 amd64-at ntpd_intres[2751]: host name not found: 9.tick.jpunix.net
Aug 4 00:15:06 amd64-at ntpd_intres[2751]: host name not found: 10.ntp.tmc.edu
Aug 4 00:15:06 amd64-at ntpd_intres[2751]: host name not found: 11.ac-ntp1.net.cmu.edu
Aug 4 00:15:07 amd64-at ntpd_intres[2751]: host name not found: 12.ac-ntp0.net.cmu.edu
Aug 4 00:15:07 amd64-at ntpd_intres[2751]: host name not found: 13.ac-ntp2.net.cmu.edu
Aug 4 00:15:47 amd64-at kernel: mtrr: no MTRR for d0000000,10000000 found
Aug 4 00:15:58 amd64-at acpid: client 2870[0:0] has disconnected
Aug 4 00:15:58 amd64-at acpid: client connected from 2981[0:0]
Aug 4 00:15:58 amd64-at acpid: 1 client rule loaded
Aug 4 00:16:07 amd64-at ntpd_intres[2751]: parent died before we finished, exiting
Aug 4 00:17:26 amd64-at kernel: [Hardware Error]: Machine check events logged
Aug 4 00:18:15 amd64-at acpid: client 2981[0:0] has disconnected
Aug 4 00:18:15 amd64-at acpid: client connected from 3017[0:0]
Aug 4 00:18:15 amd64-at acpid: 1 client rule loaded
Aug 4 00:19:01 amd64-at acpid: client 3017[0:0] has disconnected
Aug 4 00:19:01 amd64-at acpid: client connected from 3044[0:0]
Aug 4 00:19:01 amd64-at acpid: 1 client rule loaded
Aug 4 00:19:38 amd64-at acpid: client 3044[0:0] has disconnected
Aug 4 00:19:38 amd64-at acpid: client connected from 3071[0:0]
Aug 4 00:19:38 amd64-at acpid: 1 client rule loaded
Aug 4 00:23:53 amd64-at kernel: mtrr: no MTRR for d0000000,10000000 found
Aug 4 00:24:00 amd64-at acpid: client 3071[0:0] has disconnected
Aug 4 00:24:00 amd64-at acpid: client connected from 3159[0:0]
Aug 4 00:24:00 amd64-at acpid: 1 client rule loaded
Aug 4 00:31:43 amd64-at acpid: client 3159[0:0] has disconnected
Aug 4 00:31:43 amd64-at acpid: client connected from 3186[0:0]
Aug 4 00:31:43 amd64-at acpid: 1 client rule loaded
Aug 4 00:32:50 amd64-at acpid: client 3186[0:0] has disconnected
Aug 4 00:32:50 amd64-at acpid: client connected from 3214[0:0]
Aug 4 00:32:50 amd64-at acpid: 1 client rule loaded |
What the heck? A machine check exception? I'm inclined to believe that this is not actually a hardware fault. This system has run nonstop for about a week using my backup Gentoo installation with no sign of any hardware problems. |
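The machine check line is easy to miss among the acpid and ntpd noise. A minimal Python sketch for pulling kernel-tagged hardware errors out of a syslog file is below. The function name is made up, and the sample is three lines copied from the log above; whether the event reflects a real hardware fault is a separate question this does not answer.

```python
def hardware_error_lines(log_text: str) -> list:
    """Return syslog lines that the kernel tagged as hardware errors."""
    return [
        line
        for line in log_text.splitlines()
        if "kernel:" in line and "[Hardware Error]" in line
    ]


# Three lines taken from the log excerpt above.
sample = """\
Aug 4 00:16:07 amd64-at ntpd_intres[2751]: parent died before we finished, exiting
Aug 4 00:17:26 amd64-at kernel: [Hardware Error]: Machine check events logged
Aug 4 00:18:15 amd64-at acpid: client 2981[0:0] has disconnected
"""

for line in hardware_error_lines(sample):
    print(line)
```

Running the same filter over the full /var/log/messages would show whether the machine check is a one-off or recurs before each spontaneous reboot.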
 |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 55162 Location: 56N 3W
|
Posted: Sun Aug 04, 2013 11:04 am Post subject: |
|
|
exhausted,
Your PARTUUID plan is sound, provided that 3.3.8 supports PARTUUIDs for anything other than GPT.
It's fairly new for MSDOS partition tables.
 |
exhausted n00b

Joined: 20 Jul 2013 Posts: 28 Location: Scanner Hell
|
Posted: Thu Apr 03, 2014 1:47 am Post subject: |
|
|
After many months of trying to solve this problem, I deem this unsolvable.
It's truly a freakish problem. The Linux kernel appears to be assigning device names unpredictably and changing up the names with every boot. Every boot, it's essentially a roll of the dice. This only affects newer versions of the kernel.
I was forced to wipe the SSD and install Gentoo from scratch, which probably turned out to be a great idea. The previous installation was from 2005 and had built up a great deal of cruft. There were lots of configuration files that are no longer used, different files used for some things, the location of some files has changed--there's just been a lot that's happened since 2005. Installing from scratch got me a much cleaner system.
The new installation uses the latest stable kernel from gentoo-sources with no problems. sda is always sda, sdb is always sdb, et cetera.
Many thanks to everybody for their help with this weird issue. |