klammerj n00b
Joined: 19 Jan 2023 Posts: 18
|
Posted: Thu May 09, 2024 6:58 pm Post subject: How do I debug those openrc hangs on shutdown? |
|
|
Good evening,
Typically there's about a 10 percent chance that the box locks up on shutdown.
Often it's at or just after localmount or sysklogd, so the kernel logs won't be much help.
I'd need console output. However, occasional modesets during shutdown make it difficult to impossible
to tell which services are still running and which are not.
I've done a fair bit of guessing and changing the need <whatever> clauses.
So far without much effect.
I seem to have version 0.48 of openrc installed. |
|
|
|
ian.au l33t
Joined: 07 Apr 2011 Posts: 606 Location: Australia
|
Posted: Fri May 10, 2024 1:25 am Post subject: |
|
|
klammerj,
openrc stable is at 0.54 and 0.48 was replaced in January here, so I assume you're holding off updating your box? Why is that?
There isn't any useful info beyond that in your post, so everybody would be playing the same guessing game as you; your problem could be almost anything. If you've set your kernel for parallel execution, the observed location of the hang might be quite misleading.
Code: | gw-01 ~ # grep -i parallel /usr/src/linux/.config
CONFIG_HOTPLUG_PARALLEL=y |
I'd say the likely culprit is either network or ACPI configuration - if it's the latter there should be some failed services in dmesg. You can try
Code: |
gw-01 ~ # dmesg | grep -i error
|
and see if that helps.
If network, it will depend on your network management setup. If you're running any sort of server or connecting to remote fs via NFS or similar you should mention those.
You should try posting a description of your hardware, lspci etc, emerge --info, dmesg, network management, any remote fs services and kernel configs here for someone to take a look at if you really want help with this. |
|
|
|
klammerj n00b
Joined: 19 Jan 2023 Posts: 18
|
Posted: Fri May 10, 2024 8:18 am Post subject: |
|
|
ian.au wrote: | klammerj,
openrc stable is at 0.54 and 0.48 was replaced in January here, so I assume you're holding off updating your box? Why is that?
|
Last time I did a larger emerge it took a week or so to fix everything that got hosed in the process.
I'll update that pkg and see what happens...
ian.au wrote: |
There isn't any useful info beyond that in your post, so everybody would be playing the same guessing game as you; your problem could be almost anything. If you've set your kernel for parallel execution, the observed location of the hang might be quite misleading.
Code: | gw-01 ~ # grep -i parallel /usr/src/linux/.config
CONFIG_HOTPLUG_PARALLEL=y |
|
I'm using distribution kernel 6.1.60-gentoo-dist
grep -i parallel /usr/src/linux/.config
# Raw/parallel NAND flash controllers
# CONFIG_AD7606_IFACE_PARALLEL is not set
ian.au wrote: |
I'd say the likely culprit is either network or ACPI configuration - if it's the latter there should be some failed services in dmesg. You can try Code: | gw-01 ~ # dmesg | grep -i error | and see if that helps.
|
dmesg | grep -i error
[ 0.268797] acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_ERROR)
[ 1.013014] RAS: Correctable Errors collector initialized.
[ 22.396100] dracut: Mounting /dev/sda2 with -o defaults,noatime,lazytime,noiversion,inode_readahead_blks=2,delalloc,errors=remount-ro
[ 45.335372] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
ian.au wrote: |
If network, it will depend on your network management setup. If you're running any sort of server or connecting to remote fs via NFS or similar you should mention those.
You should try posting a description of your hardware, lspci etc, emerge --info, dmesg, network management, any remote fs services and kernel configs here for someone to take a look at if you really want help with this. |
Yes, there are network shares involved, plus Kerberos and gssd, but
I'd rather not divulge too much about my network setup.
I know it says `noob' there on the left, but that's just because I rarely post to forums. I've been using this stuff since about 1990.
What I'd like is just some way to print [123689] before every shutdown script to indicate which
services are still running (and a way to assign arbitrary numbers
to the individual service scripts). I'd be done debugging this in an hour,
instead of messing around for months as I have now. |
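A minimal sketch of the kind of tag asked for above; it is not an existing OpenRC feature. It assumes that OpenRC keeps one entry per started service under /run/openrc/started (check that path on your version). The function name svc_tag and the map file of "service number" pairs are made up for illustration, and single-digit numbers are assumed so that the tag stays unambiguous.

```shell
# svc_tag MAPFILE [STATEDIR]
# Print a compact tag such as "[26]" built from the numbers of all services
# in MAPFILE that still have an entry in STATEDIR.
# MAPFILE holds one "service number" pair per line; "#" starts a comment line.
# ASSUMPTION: started services appear as entries in /run/openrc/started.
svc_tag() {
    map=$1
    dir=${2:-/run/openrc/started}
    tag=
    while read -r name num; do
        # Skip blank lines and comment lines in the map file.
        [ -n "$name" ] || continue
        case $name in \#*) continue ;; esac
        # The entry may be a symlink, so test for both cases.
        if [ -e "$dir/$name" ] || [ -L "$dir/$name" ]; then
            tag="$tag$num"
        fi
    done < "$map"
    printf '[%s]\n' "$tag"
}
```

Where to call it is left open. One option might be a stop_pre() function in the init scripts of interest, with the output redirected to /dev/console so that it shows up next to the regular shutdown messages.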
|
|
|
klammerj n00b
Joined: 19 Jan 2023 Posts: 18
|
Posted: Fri Jun 14, 2024 4:09 am Post subject: |
|
|
It worked well for a while, but today it got stuck again.
The rc.log has not recorded the event (as usual).
These were the lines visible on the screen:
Code: |
display-manager | * Sending signal 15 to PID 2541 ... [ ok ]
acpid | * Will stop /usr/sbin/acpid
acpid | * Will stop processes of `/usr/sbin/acpid'
termencoding |termencoding | * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh /etc/init.d/termencoding stop
acpid | * Sending signal 15 to PID 2027 ...
display-manager-setup |display-manager-setup | * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh /etc/init.d/display-manager-setup stop
sysklogd |sysklogd | * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh /etc/init.d/sysklogd stop
sysklogd | * Stopping sysklogd ...
sysklogd | * Will stop /usr/sbin/syslogd
sysklogd | * Will stop PID 1894
netmount | * Failed to simply unmount filesystems
nfsclient |nfsclient | * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh /etc/init.d/nfsclient stop
rpc.gssd |rpc.gssd | * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh /etc/init.d/rpc.gssd stop
rpc.idmapd |rpc.idmapd | * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh /etc/init.d/rpc.idmapd stop
rpc.gssd | * Stopping gssd ...
rpc.gssd | * Will stop /usr/sbin/rpc.gssd
rpc.gssd | * Will stop processes of `/usr/sbin/rpc.gssd'
rpc.gssd | * Sending signal 15 to PID 2378 ...
rpc.idmapd | * Stopping idmapd ...
rpc.idmapd | * Will stop /usr/sbin/rpc.idmapd
rpc.idmapd | * Will stop processes of `/usr/sbin/rpc.idmapd'
rpc.idmapd | * Sending signal 15 to PID 2377 ...
rpcbind |rpcbind | * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh /etc/init.d/rpcbind stop
rpc.pipefs |rpc.pipefs | * Executing: /lib/rc/sh/openrc-run.sh /lib/rc/sh/openrc-run.sh /etc/init.d/rpc.pipefs stop
rpcbind | * Stopping rpcbind ...
rpcbind | * Will stop /sbin/rpcbind
rpc.pipefs | * Unmounting RPC pipefs ...
|
(then nothing)
[Administrator edit: changed [quote] tags to [code] tags to preserve output layout. -Hu] |
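When the last screen has to be copied by hand or photographed, a rough filter can narrow down which services were still in flight. The sketch below assumes the "service | * message" line format shown in the capture above and treats a line containing "[ ok ]" as a sign that the service got through its stop; that is a heuristic, not something OpenRC guarantees. The function name pending_stops is made up for illustration.

```shell
# pending_stops: read console lines of the form "service | * message" on stdin
# and print, in order of first appearance, every service that never produced
# a line containing "[ ok ]".
pending_stops() {
    awk -F'|' '
        {
            # The service name is the text before the first "|", trimmed.
            svc = $1
            gsub(/^[ \t]+|[ \t]+$/, "", svc)
        }
        # Ignore lines without a service prefix.
        svc == "" || NF < 2 { next }
        # Remember each service the first time it shows up.
        !(svc in seen) { seen[svc] = 1; order[++n] = svc }
        # Heuristic: "[ ok ]" on any line marks the service as finished.
        /\[ ok \]/ { finished[svc] = 1 }
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in finished)) print order[i]
        }
    '
}
```

Fed only the console lines from the capture above, it would list every service shown except display-manager, which is the only one with an "[ ok ]" on screen. That still leaves several candidates, but it at least confirms that the NFS and RPC services were all mid-stop when the output ended.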
|
|
|
C5ace Guru
Joined: 23 Dec 2013 Posts: 484 Location: Brisbane, Australia
|
Posted: Fri Jun 14, 2024 9:49 am Post subject: |
|
|
Maybe reading this helps:
https://wiki.gentoo.org/wiki/Nfs-utils#OpenRC
Quote: | Unresponsiveness of the system
The system may become unresponsive during shutdown when the NFS client attempts to unmount exported directories after udev has stopped. To prevent this a local.d script can be used to forcibly unmount the exported directories during shutdown.
Create the file nfs.stop:
FILE /etc/local.d/nfs.stop
/bin/umount -a -f -t nfs,nfs4
Set the according file bits:
root #chmod a+x /etc/local.d/nfs.stop |
_________________ Observation after 30 years working with computers:
All software has known and unknown bugs and vulnerabilities. Especially software written in complex, unstable and object oriented languages such as perl, python, C++, C#, Rust and the likes. |
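A slightly more defensive variant of the wiki's nfs.stop could look like the sketch below. It only runs the forced unmount when an NFS mount is actually present in /proc/mounts. The umount command itself is the one from the wiki quote; the function names and the UMOUNT override (there so the script can be dry-run) are assumptions added for illustration.

```shell
#!/bin/sh
# Sketch of /etc/local.d/nfs.stop; the helper names are hypothetical.

# nfs_mounts [TABLE]: print the mount points whose filesystem type is nfs or
# nfs4. TABLE defaults to /proc/mounts (fields: device, mount point, type, ...).
nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# force_unmount_nfs [TABLE]: run the forced unmount from the wiki quote, but
# only if at least one NFS mount is listed in TABLE.
# UMOUNT can be set to another command (for example echo) for a dry run.
force_unmount_nfs() {
    if [ -n "$(nfs_mounts "$1")" ]; then
        ${UMOUNT:-/bin/umount} -a -f -t nfs,nfs4
    fi
}
```

In the real file the last line would be a plain call to force_unmount_nfs, and the file still needs the executable bit as described in the quote. Running it with UMOUNT=echo first shows the command that would be executed without touching any mounts.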
|
|
|
klammerj n00b
Joined: 19 Jan 2023 Posts: 18
|
Posted: Sun Jun 16, 2024 5:05 pm Post subject: |
|
|
Thank you. I had not done that yet... |
|
|
|
|
|
|
|