linux_os2 Apprentice
Joined: 29 Aug 2018 Posts: 252 Location: Zedelgem Belgium
|
Posted: Mon Apr 29, 2024 2:43 pm Post subject: SOLVED Extremely slow startup of gnome. |
|
|
After adding a new Gentoo system on a new partition, it takes minutes until the login screen appears;
then, some time after I enter the password, the screen goes blank. Once I was able to log in via ssh and saw that gdm times out.
Motherboard ASUS Z10PE-D16 WS
128 GB ECC memory
2 SSDs of 1 TB in fake RAID 1
1 SSD of 1 TB as swap
1 NVMe of 1 TB
6 drives of 18 TB in RAID 5 (pure mdadm)
On the fake raid:
p1 Windows recovery
p2 EFI system partition (created by windows)
p3 Microsoft reserved partition
p4 Basic data partition, Windows 10
p5 Gentoo 1
p6 Gentoo 1
p7 Gentoo 1
p8 Bootable ISOs
p9 CloneZilla
p12 LFS
p13 LFS
p14 LFS
When a new partition is added for the new Gentoo, the same symptom occurs on all Gentoo partitions on this RAID array.
Later, when the new partition is deleted, the problem remains, even when all partitions from 6 upwards are deleted.
When the Windows partitions are deleted, the problem disappears.
When Windows is reinstalled, the problem comes back.
When a freshly taken Gentoo backup is restored to an NVMe partition and the UUID is adjusted, login works normally.
IMHO this is not a software problem.
Perhaps something is kept in the non-volatile memory of the motherboard.
Is it possible to manage the non-volatile memory?
A reset of the firmware settings has already been done.
Last edited by linux_os2 on Wed May 01, 2024 4:33 pm; edited 1 time in total |
linux_os2 Apprentice
Joined: 29 Aug 2018 Posts: 252 Location: Zedelgem Belgium
|
Posted: Mon Apr 29, 2024 5:06 pm Post subject: |
|
|
Did a BIOS update; same result: NOK |
linux_os2 Apprentice
Joined: 29 Aug 2018 Posts: 252 Location: Zedelgem Belgium
|
Posted: Mon Apr 29, 2024 6:51 pm Post subject: |
|
|
Also updated the firmware...still no luck |
linux_os2 Apprentice
Joined: 29 Aug 2018 Posts: 252 Location: Zedelgem Belgium
|
Posted: Tue Apr 30, 2024 9:02 am Post subject: |
|
|
Installed Ubuntu 14.04 on a new partition of the RAID 1 drive.
Surprisingly, this system behaves normally; login is possible. |
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2430
|
Posted: Tue Apr 30, 2024 9:45 am Post subject: |
|
|
Most likely a kernel configuration problem. I've requested that your thread be moved to Kernel & Hardware, as it'll get more attention from more of the right people there.
Best Regards,
Georgi |
linux_os2 Apprentice
Joined: 29 Aug 2018 Posts: 252 Location: Zedelgem Belgium
|
Posted: Tue Apr 30, 2024 11:19 am Post subject: |
|
|
Until now the kernel with kernel-config-6.8.3-gentoo-x86_64 was booted
tried an older one: kernel-config-6.8.0-gentoo-x86_64, same result
journalctl:
kernel-config-6.8.0-gentoo-x86_64:
kernel-config-6.8.3-gentoo-x86_64: |
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5109 Location: Bavaria
|
Posted: Tue Apr 30, 2024 1:15 pm Post subject: |
|
|
It is hard to check a .config without knowing the machine ... therefore we often ask for a (complete) "dmesg" (*) after booting the machine.
There is always the question of whether the .config was created for a kernel that runs natively or in a VM.
I couldn't find any faulty options quickly, except that this could be extremely slow:
Code: | # CONFIG_DRM_RADEON is not set
# CONFIG_DRM_AMDGPU is not set
# CONFIG_DRM_NOUVEAU is not set
# CONFIG_DRM_I915 is not set
# CONFIG_DRM_XE is not set
CONFIG_DRM_VGEM=y |
*) https://wiki.gentoo.org/wiki/User:Pietinger/Overview_of_System_Information _________________ https://wiki.gentoo.org/wiki/User:Pietinger |
linux_os2 Apprentice
Joined: 29 Aug 2018 Posts: 252 Location: Zedelgem Belgium
|
Posted: Tue Apr 30, 2024 5:21 pm Post subject: |
|
|
pietinger wrote: | It is hard to check a .config without knowing the machine ... therefore we often ask for a (complete) "dmesg" (*) after booting the machine.
There is always the question of whether the .config was created for a kernel that runs natively or in a VM. |
Gentoo runs directly on the hardware, no VM.
dmesg with the problem: https://bpa.st/ZAQQ
dmesg of the clone of the failing system on a different drive partition, on the same hardware of course: https://bpa.st/5JQA
The hardware is described in the first post.
The video card is an ASUS GTX 1080 Ti.
I use x11-drivers/nvidia-drivers-550.67 |
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5109 Location: Bavaria
|
Posted: Tue Apr 30, 2024 7:29 pm Post subject: |
|
|
To be honest, I don't quite understand what you mean by "Clone" ... and since I'm not familiar with systemd, I can't say anything about it. By "Clone", do you mean you are using the same system image on the same machine, only started from a different partition?
This doesn't look correct:
Code: | [ 3.420036] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 3.420039] GPT:1951170559 != 1953525167
[ 3.420041] GPT:Alternate GPT header not at the end of the disk.
[ 3.420042] GPT:1951170559 != 1953525167
[ 3.420043] GPT: Use GNU Parted to correct GPT errors.
[ 3.420050] sdd: sdd1 sdd2 sdd3 sdd4 sdd5 sdd6
[ 3.420104] sd 7:0:0:0: [sdh] supports TCG Opal
[ 3.420108] sd 7:0:0:0: [sdh] Attached SCSI disk
[ 3.420267] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 3.420270] GPT:1951170559 != 1953525167
[ 3.420272] GPT:Alternate GPT header not at the end of the disk.
[ 3.420273] GPT:1951170559 != 1953525167
[ 3.420274] GPT: Use GNU Parted to correct GPT errors.
[ 3.420282] sde: sde1 sde2 sde3 sde4 sde5 sde6
[ 3.421651] sd 5:0:0:0: [sdd] supports TCG Opal
[ 3.421654] sd 5:0:0:0: [sdd] Attached SCSI disk
[ 3.421822] sd 4:0:0:0: [sde] supports TCG Opal
[ 3.421825] sd 4:0:0:0: [sde] Attached SCSI disk
[ 3.449671] Alternate GPT is invalid, using primary GPT.
[ 3.449680] sdc: sdc1
[ 3.449802] sd 6:0:0:0: [sdc] Attached SCSI disk
[ 3.450835] Alternate GPT is invalid, using primary GPT.
[ 3.450839] sdg: sdg1
[ 3.450948] sd 8:0:0:0: [sdg] Attached SCSI disk |
If it is the same kernel configuration, then it SHOULD not be the kernel ... BUT ... you are using options which cost (a lot of) performance:
Code: | CONFIG_KGDB=y
CONFIG_DEBUG_VM=y
CONFIG_DEBUG_PREEMPT=y
CONFIG_SCHEDSTATS=y |
(Yes, I know CONFIG_DEBUG_PREEMPT=y was a default with 6.1, but now it is not anymore in the default config)
You have a Xeon CPU, so I think you will need NUMA, and you have enabled it with CONFIG_NUMA_BALANCING=y, but I don't understand this message:
Code: | [ 0.883158] pci_bus 0000:7f: Unknown NUMA node; performance will be reduced |
Maybe check which device is pci_bus 0000:7f
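One way to follow up on that suggestion is to read the PCI entries from sysfs. The following is only a sketch: it assumes the usual Linux layout under `/sys/bus/pci/devices`, where each function has `vendor`, `device` and `numa_node` attribute files, and the helper name `devices_on_bus` is made up for illustration.

```python
from pathlib import Path


def devices_on_bus(bus_prefix, sysfs_root="/sys/bus/pci/devices"):
    """List (address, vendor, device, numa_node) for PCI functions
    whose address starts with bus_prefix, e.g. "0000:7f:"."""
    root = Path(sysfs_root)
    rows = []
    if not root.is_dir():
        # No sysfs available (or a wrong path): nothing to report.
        return rows
    for entry in sorted(root.iterdir()):
        if not entry.name.startswith(bus_prefix):
            continue

        def read(attr, entry=entry):
            # Missing attribute files are reported as "?" instead of failing.
            path = entry / attr
            return path.read_text().strip() if path.is_file() else "?"

        rows.append((entry.name, read("vendor"), read("device"), read("numa_node")))
    return rows


if __name__ == "__main__":
    for row in devices_on_bus("0000:7f:"):
        print(*row)
```

A `numa_node` value of -1 means the kernel has no node assigned to that device, which would match the "Unknown NUMA node" message.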
I have also seen that you tried to search with tracing ... maybe disable it if it is not used anymore:
Code: | CONFIG_RCU_TORTURE_TEST=m
CONFIG_FTRACE=y
CONFIG_BOOTTIME_TRACING=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_STACK_TRACER=y
CONFIG_SCHED_TRACER=y
CONFIG_HWLAT_TRACER=y
CONFIG_MMIOTRACE=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_TRACER_SNAPSHOT=y
CONFIG_BRANCH_PROFILE_NONE=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_KPROBE_EVENTS=y
CONFIG_UPROBE_EVENTS=y
CONFIG_BPF_EVENTS=y
CONFIG_DYNAMIC_EVENTS=y
CONFIG_PROBE_EVENTS=y
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE_MCOUNT_USE_CC=y
CONFIG_TRACING_MAP=y
CONFIG_SYNTH_EVENTS=y
CONFIG_HIST_TRIGGERS=y
CONFIG_RING_BUFFER_BENCHMARK=m
CONFIG_TRACE_EVAL_MAP_FILE=y
CONFIG_RUNTIME_TESTING_MENU=y
CONFIG_ASYNC_RAID6_TEST=m |
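To check a kernel .config for the options named in this post, something like the following Python sketch could be used. The option set is copied from the lists above and is not an official list of "slow" options; the function name `enabled_options` is hypothetical.

```python
# Options named in this thread as debugging/tracing overhead.
# Taken from the post above; not an authoritative list.
COSTLY_OPTIONS = {
    "CONFIG_KGDB", "CONFIG_DEBUG_VM", "CONFIG_DEBUG_PREEMPT", "CONFIG_SCHEDSTATS",
    "CONFIG_RCU_TORTURE_TEST", "CONFIG_FTRACE", "CONFIG_BOOTTIME_TRACING",
    "CONFIG_FUNCTION_TRACER", "CONFIG_FUNCTION_GRAPH_TRACER",
    "CONFIG_FUNCTION_PROFILER", "CONFIG_STACK_TRACER", "CONFIG_SCHED_TRACER",
    "CONFIG_HWLAT_TRACER", "CONFIG_MMIOTRACE", "CONFIG_RING_BUFFER_BENCHMARK",
    "CONFIG_ASYNC_RAID6_TEST",
}


def enabled_options(config_text, wanted=COSTLY_OPTIONS):
    """Return {option: value} for wanted options set to 'y' or 'm'."""
    found = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        # "# CONFIG_FOO is not set" and blank lines carry no enabled option.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name in wanted and value in ("y", "m"):
            found[name] = value
    return found


if __name__ == "__main__":
    sample = "CONFIG_KGDB=y\n# CONFIG_DEBUG_VM is not set\nCONFIG_SCHEDSTATS=y\n"
    for name, value in sorted(enabled_options(sample).items()):
        print(f"{name}={value}")
```

To run it against a real configuration, pass the contents of the .config file (on Gentoo typically found under /usr/src/linux, which is an assumption about this system) to `enabled_options`.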
Last but not least, I would suggest removing all disks, installing only one, and testing this system booting from only this one disk. If it is okay, then install a second disk and test again ... a fake RAID5 using only 2 discs (with many partitions on these 2 discs) is never a good idea ... you will have too much concurrent traffic. |
linux_os2 Apprentice
Joined: 29 Aug 2018 Posts: 252 Location: Zedelgem Belgium
|
Posted: Wed May 01, 2024 8:40 am Post subject: |
|
|
pietinger wrote: | To be honest, I don't quite understand what you mean by "Clone" ... and since I'm not familiar with systemd, I can't say anything about it. Do you mean with "Clone" you are using the same system image on the same machine, but only started from a different partition ? |
My clone procedure is as follows:
take a backup of a system (in this case the operational one, where the problem does not exist) with partclone piped through zstd
restore the backup with zstd piped through partclone onto a partition of the disk (in this case on the fake RAID 1 where Windows is)
assign a new UUID to the new filesystem
adjust fstab on the new system
chroot into that system
run grub-mkconfig
run grub-mkconfig on the old system.
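The first three steps of this procedure could look roughly like the commands built by the sketch below. It is a dry run that only prints command strings and executes nothing. The exact commands used are not shown in the thread, so everything here is an assumption: an ext4 filesystem (hence `partclone.ext4` and `tune2fs`), placeholder device names, and a hypothetical helper `clone_plan`.

```python
import shlex


def clone_plan(source, target, image, fs="ext4"):
    """Return shell command strings for backup, restore and UUID change.

    Dry run only: the commands are built as text and never executed.
    fs="ext4" is an assumption; the thread does not name the filesystem.
    """
    tool = f"partclone.{fs}"
    src, dst, img = (shlex.quote(p) for p in (source, target, image))
    return [
        # Backup: partclone writes the image to stdout, zstd compresses it.
        f"{tool} -c -s {src} -o - | zstd -c > {img}",
        # Restore: zstd decompresses to stdout, partclone reads the image from stdin.
        f"zstd -dc {img} | {tool} -r -s - -o {dst}",
        # New UUID for the restored filesystem (tune2fs applies to ext2/3/4 only).
        f"tune2fs -U random {dst}",
    ]


if __name__ == "__main__":
    # /dev/sdX5 and /dev/sdY5 are placeholders, not devices from this thread.
    for command in clone_plan("/dev/sdX5", "/dev/sdY5", "gentoo.img.zst"):
        print(command)
```

The remaining steps (editing fstab, chroot, grub-mkconfig on both systems) depend on the mount layout and are left out of the sketch.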
pietinger wrote: | This doesn't look correct:
Code: |
[ 3.420036] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 3.420039] GPT:1951170559 != 1953525167
[ 3.420041] GPT:Alternate GPT header not at the end of the disk.
[ 3.420042] GPT:1951170559 != 1953525167
[ 3.420043] GPT: Use GNU Parted to correct GPT errors.
[ 3.420050] sdd: sdd1 sdd2 sdd3 sdd4 sdd5 sdd6
[ 3.420104] sd 7:0:0:0: [sdh] supports TCG Opal
[ 3.420108] sd 7:0:0:0: [sdh] Attached SCSI disk
[ 3.420267] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 3.420270] GPT:1951170559 != 1953525167
[ 3.420272] GPT:Alternate GPT header not at the end of the disk.
[ 3.420273] GPT:1951170559 != 1953525167
[ 3.420274] GPT: Use GNU Parted to correct GPT errors.
[ 3.420282] sde: sde1 sde2 sde3 sde4 sde5 sde6
[ 3.421651] sd 5:0:0:0: [sdd] supports TCG Opal
[ 3.421654] sd 5:0:0:0: [sdd] Attached SCSI disk
[ 3.421822] sd 4:0:0:0: [sde] supports TCG Opal
[ 3.421825] sd 4:0:0:0: [sde] Attached SCSI disk
[ 3.449671] Alternate GPT is invalid, using primary GPT.
[ 3.449680] sdc: sdc1
[ 3.449802] sd 6:0:0:0: [sdc] Attached SCSI disk
[ 3.450835] Alternate GPT is invalid, using primary GPT.
[ 3.450839] sdg: sdg1
[ 3.450948] sd 8:0:0:0: [sdg] Attached SCSI disk |
This is due to the fact that the RAIDs are fake. |
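The size of the mismatch can be read from the quoted log lines. In the kernel's `GPT:<a> != <b>` message, the first number is the backup-header LBA recorded in the primary header and the second is the last LBA of the disk. A small Python sketch, assuming 512-byte logical sectors (the log excerpt does not state the sector size) and using a hypothetical helper name `gpt_gap`:

```python
import re

# Matches the kernel's "GPT:<recorded alternate LBA> != <last LBA of disk>" line.
GPT_MISMATCH = re.compile(r"GPT:\s*(\d+)\s*!=\s*(\d+)")


def gpt_gap(line, sector_size=512):
    """Return (sectors, bytes) between the recorded backup-header LBA
    and the last LBA of the disk, or None if the line does not match.

    sector_size=512 is an assumption, not taken from the log.
    """
    match = GPT_MISMATCH.search(line)
    if match is None:
        return None
    recorded, last = (int(group) for group in match.groups())
    sectors = last - recorded
    return sectors, sectors * sector_size


if __name__ == "__main__":
    sample = "[ 3.420039] GPT:1951170559 != 1953525167"
    sectors, size = gpt_gap(sample)
    print(f"{sectors} sectors, {size / 2**20:.1f} MiB")
```

For the line from the log this gives a gap of 2354608 sectors, about 1.1 GiB at 512 bytes per sector. That would be consistent with a partition table written against the array device, which is smaller than the raw member disk.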
linux_os2 Apprentice
Joined: 29 Aug 2018 Posts: 252 Location: Zedelgem Belgium
|
Posted: Wed May 01, 2024 4:16 pm Post subject: |
|
|
pietinger wrote: |
Last but not least I would suggest to remove all disks, install only one and test this system booting from only this one disk. If it is okay then install a second disk and test again ... a fake RAID5 using only 2 discs (with many partitions on these 2 discs) is never a good idea ... you will have too much concurrent traffic. |
I did not remove any drives.
Deleted the fake array in the firmware setup.
Recreated one containing the 2 original SSDs.
Initialized the new array; this takes some time, so I think everything is wiped during the process.
Installed Windows 10.
Shrunk the Windows partition.
Restored the backup of the Gentoo partition and the EFI partition,
and hurrah, the Gentoo system on the RAID partition behaves normally!
I will refrain from making too many partitions on the RAID.
Another symptom was that grub-mkconfig did not detect the Windows partition on the RAID.
Now it does again.
It is very hard to tell what went wrong.
gdm was not the only thing that was slow;
logging in via ssh was also very slow. |
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5109 Location: Bavaria
|
Posted: Wed May 01, 2024 5:33 pm Post subject: |
|
|
linux_os2 wrote: | It is very hard to tell what went wrong. |
Yes, I believe this. I have seen how much you have already verified (I would not be able to check an ftrace). |