eaf n00b
Joined: 27 Apr 2018 Posts: 12
|
Posted: Thu Dec 19, 2024 6:48 pm Post subject: GBs of memory wasted in Percpu thanks to stale cgroups |
|
|
Hi,
I've been hunting a significant memory leak on my system: every day the amount of used memory goes up by a few GB. I'm not talking about caches, buffers, ARC, etc.; I'm talking about Percpu in /proc/meminfo, which climbed all the way up to 50GB at some point.
I think I've traced it down to cgroups (because I also noticed that I had an explosion of them) and then to elogind and OpenRC.
Apparently, elogind creates a new cgroup for every new login. With cgroups v.1 it also set a per-cgroup release_agent to /lib64/elogind/elogind-cgroups-agent, which was supposed to be called when the corresponding cgroup became empty. That agent would then clean up the empty cgroup. On systemd installations the cleanup would be done by systemd.
With cgroups v.2 the cleanup mechanism has changed: someone is now supposed to monitor the corresponding cgroup.events file and, when that file has "populated 0" in it, get rid of the cgroup. I guess elogind does not support this cleanup mechanism, because tens of thousands of empty cgroups were left lying around on my system.
I think, and I may be totally wrong here, that the issue is that OpenRC by default mounts cgroups v.2 under /sys/fs/cgroup, and elogind doesn't know how to do cgroup cleanup for v.2.
Has anybody observed this pileup of unused cgroups and Percpu memory on their setups? Am I perhaps missing some sort of an /etc/init.d service that I neglected to activate and that would do this cleanup automatically thereby avoiding this pileup?
Thanks! |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1341 Location: Richmond Hill, Canada
|
Posted: Thu Dec 19, 2024 7:01 pm Post subject: |
|
|
How do I check for this "cgroup pile-up" symptom? Or, if I don't see it in an obvious way, does that mean I don't have this problem?
eaf n00b
Joined: 27 Apr 2018 Posts: 12
|
Posted: Thu Dec 19, 2024 7:24 pm Post subject: |
|
|
"grep Percpu /proc/meminfo" was showing tens of GB allocated by "per cpu" allocators.
"cat /proc/cgroups" was showing tens of thousands of groups on my setup. Once I noticed that, I looked for cgroups in /sys/fs/cgroup that had empty cgroup.procs file or "populated 0" in cgroup.events file. Most of those groups counted by /proc/cgroups were found empty. Upon destroying them, the Percpu in /proc/meminfo dropped from 50GB to 2GB.
This box sees a ton of ssh and sftp traffic, which I guess accounts for the rapid growth of abandoned per-session cgroups. |
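In case it helps anyone run the same checks, here is a rough sketch of them as shell functions. The function names are made up for illustration, the default paths assume cgroups v.2 mounted on /sys/fs/cgroup, and `-mindepth` assumes GNU find. Nothing here modifies anything; it only reads.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of any tool.

# Print the Percpu value (in kB) from a meminfo-style file.
percpu_kb() {
    awk '/^Percpu:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Count all cgroup directories below the given cgroup2 mount point.
count_cgroups() {
    find "${1:-/sys/fs/cgroup}" -mindepth 1 -type d | wc -l
}

# List cgroups whose cgroup.events contains the line "populated 0".
list_unpopulated() {
    find "${1:-/sys/fs/cgroup}" -type f -name cgroup.events |
    while IFS= read -r events; do
        if grep -qx 'populated 0' "$events" 2>/dev/null; then
            dirname "$events"
        fi
    done
}
```

Running `percpu_kb`, `count_cgroups` and `list_unpopulated | wc -l` with no arguments should give the three numbers to compare. If almost every cgroup shows up as unpopulated, that matches the symptom described above.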
Hu Administrator
Joined: 06 Mar 2007 Posts: 22876
|
Posted: Thu Dec 19, 2024 7:37 pm Post subject: |
|
|
With what version(s) of elogind did you observe this? The output of emerge --pretend --verbose sys-apps/openrc sys-auth/elogind might be useful. |
eaf n00b
Joined: 27 Apr 2018 Posts: 12
|
Posted: Thu Dec 19, 2024 7:42 pm Post subject: |
|
|
Code: | [ebuild R ] sys-apps/openrc-0.54.2::gentoo USE="netifrc pam sysvinit unicode -audit -bash -caps -debug -newnet -s6 (-selinux) -sysv-utils" 245 KiB
[ebuild R ] sys-auth/elogind-252.9-r2::gentoo USE="acl pam policykit -audit -cgroup-hybrid -debug -doc (-selinux) -test" 1,878 KiB |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1341 Location: Richmond Hill, Canada
|
Posted: Thu Dec 19, 2024 8:40 pm Post subject: |
|
|
eaf wrote: | "grep Percpu /proc/meminfo" was showing tens of GB allocated by "per cpu" allocators.
"cat /proc/cgroups" was showing tens of thousands of groups on my setup. Once I noticed that, I looked for cgroups in /sys/fs/cgroup that had empty cgroup.procs file or "populated 0" in cgroup.events file. Most of those groups counted by /proc/cgroups were found empty. Upon destroying them, the Percpu in /proc/meminfo dropped from 50GB to 2GB.
This box sees a ton of ssh and sftp traffic, which I guess accounts for the rapid growth of abandoned per-session cgroups. |
Thanks for the information.
Code: | me@rpi5 ~ $ cat /proc/meminfo |grep Per
Percpu: 1664 kB |
Code: | me@rpi5 ~ $ cat /proc/cgroups
#subsys_name hierarchy num_cgroups enabled
cpuset 0 93 1
cpu 0 93 1
cpuacct 0 93 1
blkio 0 93 1
memory 0 93 0
devices 0 93 1
freezer 0 93 1
net_cls 0 93 1
perf_event 0 93 1
net_prio 0 93 1
pids 0 93 1 |
Linux rpi5 6.6.31+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.6.31-1+rpt1 (2024-05-29) aarch64 GNU/Linux
This is on an RPI 5 with the rpi 16k-page kernel.
eaf n00b
Joined: 27 Apr 2018 Posts: 12
|
Posted: Thu Dec 19, 2024 8:48 pm Post subject: |
|
|
That's cool, and that's what I would expect to see too. But aren't you running Debian, and likely systemd too? I'm wondering, if perhaps I'm seeing some conflicting configuration on Gentoo where OpenRC mounts cgroups v.2 and elogind can't cope with it. But I didn't specially configure any of that, it's all default. |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1341 Location: Richmond Hill, Canada
|
Posted: Thu Dec 19, 2024 9:03 pm Post subject: |
|
|
eaf wrote: | That's cool, and that's what I would expect to see too. But aren't you running Debian, and likely systemd too? I'm wondering, if perhaps I'm seeing some conflicting configuration on Gentoo where OpenRC mounts cgroups v.2 and elogind can't cope with it. But I didn't specially configure any of that, it's all default. |
No, I am just using the RPI's kernel; my rootfs is Gentoo-based.
My make.profile is
Code: | make.profile -> ../../var/db/repos/gentoo/profiles/default/linux/arm64/23.0/desktop/gnome/systemd |
So yes, I am using systemd.
Code: | me@rpi5 ~ $ mount|grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot) |
sublogic Apprentice
Joined: 21 Mar 2022 Posts: 283 Location: Pennsylvania, USA
|
Posted: Thu Dec 19, 2024 11:38 pm Post subject: |
|
|
I see it too! But not on the same scale as eaf.
Code: | $ mount | grep cgroup
none on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
$ ls /sys/fs/cgroup
10 31 51 70 92 memory.stat
11 32 52 71 c1 openrc.apt-cacher-ng
12 33 53 72 c2 openrc.avahi-daemon
13 34 54 73 c3 openrc.bluetooth
14 35 55 76 c4 openrc.cronie
15 36 56 78 cgroup.controllers openrc.cupsd
16 37 57 79 cgroup.max.depth openrc.dbus
17 38 58 8 cgroup.max.descendants openrc.display-manager
18 39 59 80 cgroup.procs openrc.distccd
19 4 6 81 cgroup.stat openrc.net.wlp6s0
20 40 60 82 cgroup.subtree_control openrc.ntpd
21 41 61 83 cgroup.threads openrc.rasdaemon
22 42 62 84 cpu.stat openrc.rpc.idmapd
23 43 63 85 cpu.stat.local openrc.rpc.statd
24 45 64 86 cpuset.cpus.effective openrc.rpcbind
25 46 65 87 cpuset.mems.effective openrc.rsyncd
26 47 66 88 elogind openrc.sshd
27 48 67 89 io.cost.model openrc.sysklogd
28 49 68 9 io.cost.qos openrc.udev
29 5 69 90 io.stat
30 50 7 91 memory.reclaim |
Among the two-digit cgroups, 80 and c2 are my xfce4 session and a tigervnc session. The others are stale.
Code: | $ grep -l populated\ 1 /sys/fs/cgroup/??/cgroup.events
/sys/fs/cgroup/80/cgroup.events
/sys/fs/cgroup/c2/cgroup.events
$ grep -l populated\ 0 /sys/fs/cgroup/??/cgroup.events
/sys/fs/cgroup/10/cgroup.events
/sys/fs/cgroup/11/cgroup.events
...
/sys/fs/cgroup/91/cgroup.events
/sys/fs/cgroup/92/cgroup.events
/sys/fs/cgroup/c1/cgroup.events
/sys/fs/cgroup/c3/cgroup.events
/sys/fs/cgroup/c4/cgroup.events |
eaf n00b
Joined: 27 Apr 2018 Posts: 12
|
Posted: Fri Dec 20, 2024 1:50 am Post subject: |
|
|
It's definitely elogind that's creating these groups:
Code: | mkdir("/sys/fs/cgroup/4041", 0755) = 0 |
Interestingly, its source code does have some inotify handlers, and it should be able to recognize changes to cgroup.events and do the cleanup. Yet it doesn't.
Also, if I change /etc/rc.conf to mount /sys/fs/cgroup in "legacy" mode, then elogind starts creating cgroups in a different place, and the OpenRC controller takes care of the cleanup by running /lib/rc/sh/cgroup-release-agent.sh for each released group:
Code: | mkdir("/sys/fs/cgroup/openrc/5", 0755) = 0 |
I figure I'll open an issue with the elogind devs, and perhaps they'll tell me right off the bat what's missing here.
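For reference, the /etc/rc.conf change for "legacy" mode would look roughly like the fragment below. This assumes the setting is named `rc_cgroup_mode`, as in the comments of the stock OpenRC rc.conf; check the comments in your own /etc/rc.conf before relying on it.

```shell
# /etc/rc.conf (fragment)
# Assumption: rc_cgroup_mode is the knob documented in the stock rc.conf.
# "legacy" mounts cgroups v1 on /sys/fs/cgroup instead of cgroups v2.
rc_cgroup_mode="legacy"
```

The new mount mode presumably only applies after a reboot, since /sys/fs/cgroup is mounted early during boot.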
sam_ Developer
Joined: 14 Aug 2020 Posts: 2038
|
eaf n00b
Joined: 27 Apr 2018 Posts: 12
|
Posted: Sun Dec 22, 2024 6:14 pm Post subject: |
|
|
I poked around the elogind source code, and I kinda wish I hadn't. There's a lot of #if 0 sprinkled all over the place, hundreds of lines of commented-out code at a time, and the functions that are supposed to set up monitoring of cgroup.events files are never even called. It might be intentional; they do say right before a big chunk of disabled inotify code that "elogind is not init, and does not install the agent here." And I get it that elogind was extracted from systemd, so some scars are supposed to be present, but boy was that an invasive surgery, and things were just left patched and bandaged throughout the code. No reaction from the elogind folks about the issue. I'm starting to think that we're just lucky that whatever works works.
So, the options to avoid the leak appear to be:
- Switch to the original systemd;
- Change /etc/rc.conf to mount cgroups v.1, and then openrc will take care of the cleanup;
- Set up a cronjob to scan empty cgroups and delete them manually.
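A minimal sketch of what the cron option could look like, not a tested production script. The function name and the DRY_RUN variable are made up for illustration, and it assumes cgroups v.2 on /sys/fs/cgroup and GNU find. It relies on the kernel refusing rmdir on any cgroup that still has processes or child cgroups, and it walks depth-first so children go before their parents.

```shell
#!/bin/sh
# Sketch only: remove_unpopulated and DRY_RUN are hypothetical names.

# Remove every cgroup below the given mount point whose cgroup.events
# contains "populated 0". With DRY_RUN set, only print what would go.
remove_unpopulated() {
    find "${1:-/sys/fs/cgroup}" -mindepth 1 -depth -type d |
    while IFS= read -r cg; do
        # Skip cgroups that are populated or have no cgroup.events file.
        grep -qx 'populated 0' "$cg/cgroup.events" 2>/dev/null || continue
        if [ -n "${DRY_RUN:-}" ]; then
            printf 'would remove %s\n' "$cg"
        elif rmdir "$cg" 2>/dev/null; then
            printf 'removed %s\n' "$cg"
        fi
    done
}

# Example (run as root; try the dry run first):
#   DRY_RUN=1 remove_unpopulated /sys/fs/cgroup
```

Note that this does not distinguish stale session cgroups from other unpopulated ones (for example a stopped service's openrc.* group), so it may be worth narrowing the find to the session directories before putting it in root's crontab.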
sam_ Developer
Joined: 14 Aug 2020 Posts: 2038
|
Posted: Mon Dec 23, 2024 4:23 am Post subject: |
|
|
eaf wrote: | I poked around the elogind source code, and I kinda wish I hadn't. There's a lot of #if 0 sprinkled all over the place, hundreds of lines of commented-out code at a time, and the functions that are supposed to set up monitoring of cgroup.events files are never even called. It might be intentional; they do say right before a big chunk of disabled inotify code that "elogind is not init, and does not install the agent here." And I get it that elogind was extracted from systemd, so some scars are supposed to be present, but boy was that an invasive surgery, and things were just left patched and bandaged throughout the code. No reaction from the elogind folks about the issue. I'm starting to think that we're just lucky that whatever works works.
[...]
|
I'm afraid that I've held this opinion too for quite some time. |