causality Apprentice
Joined: 03 Jun 2006 Posts: 239
|
Posted: Tue Dec 10, 2024 11:54 pm Post subject: Fast Swap in unused VRAM |
|
|
So, here is the output of "swapon":
Code: | # swapon
NAME TYPE SIZE USED PRIO
/dev/sdb2 partition 2.1G 564.6M -2
/dev/loop16 partition 1.9G 1.2G 32
|
This was after several hours with vm.swappiness set to a very aggressive 133 (up from 20), just to see what would happen. It now sits at 80.
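For anyone wanting to try the same thing, here is a minimal sketch of reading the setting and, as root, changing it. As far as I know, kernels from 5.8 onward accept values from 0 to 200, which is why 133 is legal. The value 80 is simply the one mentioned above, not a recommendation.

```shell
#!/bin/bash
# Read the current swappiness; falls back to "unknown" where /proc is unavailable.
current=$(cat /proc/sys/vm/swappiness 2>/dev/null || echo unknown)
echo "vm.swappiness is currently ${current}"

# Changing it needs root and only lasts until reboot, so it is left commented out:
# sysctl -w vm.swappiness=80
```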
The swap partition at /dev/sdb2 is backed by spinning rust. On a good day, it might get 90-100MB/s.
The swap area currently at /dev/loop16 is video RAM. I have an NVIDIA GeForce GTX 1060 6GB card. Even when running 3D games under Wine, I never saw more than a few GB in use as reported by nvidia-smi. Here's my current (no 3D games) output:
Code: | # nvidia-smi
Tue Dec 10 18:35:20 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.135 Driver Version: 550.135 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 1060 6GB Off | 00000000:01:00.0 On | N/A |
| 0% 47C P8 6W / 120W | 2547MiB / 6144MiB | 8% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 2306 G /usr/bin/X 10MiB |
| 0 N/A N/A 2414 G /usr/bin/kwin_wayland 38MiB |
| 0 N/A N/A 2448 G /usr/libexec/xdg-desktop-portal-kde 1MiB |
| 0 N/A N/A 2483 G /usr/bin/Xwayland 9MiB |
| 0 N/A N/A 2537 G /usr/bin/kded6 1MiB |
| 0 N/A N/A 2543 G /usr/bin/ksmserver 1MiB |
| 0 N/A N/A 2549 G ...c/polkit-kde-authentication-agent-1 1MiB |
| 0 N/A N/A 2553 G /usr/libexec/org_kde_powerdevil 1MiB |
| 0 N/A N/A 2556 G /usr/bin/plasmashell 35MiB |
| 0 N/A N/A 2558 G /usr/bin/kaccess 1MiB |
| 0 N/A N/A 2566 G /usr/bin/kdeconnectd 1MiB |
| 0 N/A N/A 2602 G /usr/libexec/kactivitymanagerd 1MiB |
| 0 N/A N/A 2605 G /usr/bin/kmix 1MiB |
| 0 N/A N/A 2606 G /usr/bin/konsole 1MiB |
| 0 N/A N/A 2620 G /usr/bin/dolphin 1MiB |
| 0 N/A N/A 2622 G /usr/lib64/firefox/firefox 145MiB |
| 0 N/A N/A 2627 G /usr/lib64/thunderbird/thunderbird 56MiB |
| 0 N/A N/A 2668 G /usr/bin/python3.12 1MiB |
| 0 N/A N/A 2670 G /usr/bin/kclockd 1MiB |
| 0 N/A N/A 2951 G /usr/bin/kded5 1MiB |
| 0 N/A N/A 3427 C vramfs 2110MiB |
| 0 N/A N/A 6559 G /usr/bin/audacious 1MiB |
| 0 N/A N/A 12941 G ...er/plugins --ozone-platform=wayland 1MiB |
| 0 N/A N/A 15745 G /opt/vivaldi/vivaldi-bin 1MiB |
| 0 N/A N/A 15810 G ...yeDropper --variations-seed-version 38MiB |
+-----------------------------------------------------------------------------------------+
|
So, lots of room.
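To put a number on "lots of room", a small sketch that subtracts used from total VRAM. The function name vram_free_mib is made up for illustration; the figures fed to it are the ones from the table above, and the commented nvidia-smi query assumes the tool is on PATH.

```shell
#!/bin/bash
# Free VRAM in MiB from one "used, total" CSV line, e.g. "2547, 6144".
vram_free_mib() {
    local used total
    IFS=', ' read -r used total <<< "$1"
    echo $(( total - used ))
}

# Figures from the nvidia-smi table above.
vram_free_mib "2547, 6144"

# On a live system, assuming nvidia-smi is on PATH:
# vram_free_mib "$(nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits)"
```

Note that the "used" figure already includes the 2110MiB held by vramfs itself.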
I found an overlay called "myrvolay" available in the standard "eselect" options. I enabled that. Then I installed a program from this overlay called sys-fs/vramfs. I had to emerge it with the "no dependencies" option to get it to build (if you aren't comfortable with unusual or unstable software, don't do any of this).
The ebuild kept wanting to depend on "dev-libs/clhpp", but there is no such package. The same headers are packaged as "dev-cpp/clhpp", and that package works. So I installed "dev-cpp/clhpp" manually first, then emerged sys-fs/vramfs with the "no dependencies" option (emerge --nodeps).
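Roughly, the install steps look like the sketch below. It assumes the overlay is enabled through "eselect repository" (app-eselect/eselect-repository), which is my reading of "the standard eselect options". The run and install_vramfs helpers are made up for illustration; by default they only print the commands so they can be reviewed before being run for real with DRY_RUN=0 as root.

```shell
#!/bin/bash
# Print each command; only execute it when DRY_RUN=0 (the default just prints).
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

install_vramfs() {
    run eselect repository enable myrvolay   # assumes app-eselect/eselect-repository is installed
    run emaint sync -r myrvolay              # fetch the overlay
    run emerge --ask dev-cpp/clhpp           # the OpenCL C++ headers under their real category
    run emerge --ask --nodeps sys-fs/vramfs  # --nodeps skips the unsatisfiable dev-libs/clhpp dependency
}

install_vramfs
```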
This program uses FUSE and OpenCL to reserve a portion of video RAM and expose it as a mounted filesystem. It works with the proprietary nvidia drivers. You can then put a swap file on it, use it as a RAM disk, or whatever you like. I wanted faster swap. The swap file I created there was sparse, so I had to put a loop device on top of it so that "swapon" didn't complain about a file with holes.
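To show what "a file with holes" means, here is a small sketch using a scratch directory as a stand-in for the vramfs mount. truncate sets the file's logical size without writing any data, so far fewer blocks are allocated than the size suggests. The loop-device lines mirror the script further down and are commented out because they need root.

```shell
#!/bin/bash
# Create a 16 MiB sparse file in a scratch directory (a stand-in for /tmp/vram).
scratch=$(mktemp -d)
truncate -s 16M "${scratch}/swapfile"

# Logical size in bytes versus bytes actually backed by storage.
apparent=$(stat -c %s "${scratch}/swapfile")
allocated=$(( $(stat -c %b "${scratch}/swapfile") * $(stat -c %B "${scratch}/swapfile") ))
echo "apparent=${apparent} allocated=${allocated}"

# Wrapping the file in a loop device hides the holes from swapon (needs root):
# LOOPDEV=$(losetup -f)
# losetup "${LOOPDEV}" "${scratch}/swapfile"
# mkswap "${LOOPDEV}" && swapon -p 32 "${LOOPDEV}"

rm -rf "${scratch}"
```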
I get around 2.5 GB/s transfer rate from this, which is far better than the 90-100MB/s of the HDD-backed swap partition.
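As plain arithmetic on those figures, that works out to roughly 25 to 28 times the HDD's throughput. A tiny sketch, with a made-up helper name and both rates in MB/s:

```shell
#!/bin/bash
# Ratio of two throughputs given in MB/s, printed to one decimal place.
speedup() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.1f\n", fast / slow }'
}

speedup 2500 100   # against the HDD on a good day
speedup 2500 90    # against the lower HDD figure
```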
But if the vramfs process that holds the video RAM should itself get swapped out, you get a hard system lock, a perfect catch-22: the thing holding the swap space got swapped, so nothing can swap it back in to reach anything else. That cost me a press of the reset switch. So I placed it into its own cgroup and set it up never to be swapped. This has boosted my system performance; it's like a free RAM upgrade. I wrote a script and placed it into /etc/local.d/. This is that script:
Code: |
#!/bin/bash
#Treat unset variables as errors
set -u
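#Deactivate all swap first, so nothing is still swapped into VRAM when vramfs is killed
swapoff -a &> /dev/null
swapoff /dev/loop* &> /dev/null
#Detach any loop device still backed by the old swapfile (losetup -d takes the loop device, not the file)
for DEV in $(losetup -j /tmp/vram/swapfile 2> /dev/null | cut -d: -f1); do
    losetup -d "${DEV}" &> /dev/null
done
#Make sure we're starting from scratch
rm -f /tmp/vram/swapfile &> /dev/null
#Only now kill any previous vramfs process, so this script can be re-run if needed
killall vramfs &> /dev/null
#Activate disk (HDD) swap as specified in /etc/fstab
swapon -a &> /dev/null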
#Setup a swappable area allocated in VRAM
#This "2G" figure is decimal GB, which is 1970MiB, not 2048 MiB (2GiB), as confirmed by nvidia-smi
exec vramfs /tmp/vram 2G &> /dev/null &
#Use Linux cgroups to make sure that the vramfs process doesn't, itself, get swapped out
#because that would create a deadlock and likely freeze the system
cgcreate -g memory:vramfs
cgset -r memory.swap.max=0 vramfs
VRAMFS_PID=$(pgrep vramfs)
cgclassify -g memory:vramfs ${VRAMFS_PID}
#Create a swapfile in the space
#dd is a slower but more portable way to create a swapfile with no holes, as needed by swapon
#(instead of using "dd if=/dev/zero of=/tmp/vram/swapfile bs=1MiB count=$((2*1024))"
#Use truncate to create a sparse swap file for efficiency, and the loop device to present it to
#swapon as though it has no holes
#Find the next available kernel loop device
LOOPDEV=$(losetup -f)
#Create a swap file slightly smaller than the allocated vram -- has to be smaller than the allocated vramfs
truncate -s 1969MiB /tmp/vram/swapfile &> /dev/null || { echo "Error creating vram swap file"
exit 1
}
#Set recommended permissions
chmod 0600 /tmp/vram/swapfile
#Activate the loop device
losetup ${LOOPDEV} /tmp/vram/swapfile &> /dev/null || { echo "Error setting up vram loop device"
exit 1
}
#Create swap on the loop device
mkswap ${LOOPDEV} &> /dev/null
#Activate swap -- use a higher (arbitrary) priority of 32, as disk swap defaults to -2
swapon -p 32 ${LOOPDEV} &> /dev/null && echo "Activating vram swap" || { echo "Error enabling vram swap space"
exit 1
}
echo "Allocated ~2GB of vram swap space" |
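One thing the script does not guard against: vramfs is started in the background, so the pgrep and truncate steps can run before the mount is actually ready. A possible guard is a small polling helper like the sketch below. The name wait_for and the 10-second timeout are my own choices, not part of the original script. mountpoint comes from util-linux.

```shell
#!/bin/bash
# Poll a command once per second until it succeeds or the timeout (in seconds) expires.
wait_for() {
    local timeout=$1
    shift
    local waited=0
    until "$@"; do
        (( waited >= timeout )) && return 1
        sleep 1
        waited=$(( waited + 1 ))
    done
    return 0
}

# Possible use in the script, right after vramfs is started in the background:
# wait_for 10 mountpoint -q /tmp/vram || { echo "vramfs did not mount"; exit 1; }
```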
Moved from "Unsupported Software" to "Documentation, Tips & Tricks". -- Zucca |
|
mattst88 Developer
Joined: 28 Oct 2004 Posts: 423
|
Posted: Fri Jan 24, 2025 6:04 pm Post subject: Re: Fast Swap in unused VRAM |
|
|
causality wrote: | But, if the vramfs process that's holding the video RAM itself, should get swapped out, you get a hard system lock, a perfect catch-22. The thing holding the swap space got swapped, so there's nothing to unswap it so that it can access anything else. |
FWIW, it looks like this issue was fixed by this commit: https://github.com/Overv/vramfs/commit/829b1f2c259da2eb63ed3d4ddef0eeddb08b99e4 |
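Whether the lock-up is avoided by the cgroup or by a vramfs build that includes that commit, one rough way to check is to look at the VmSwap line in the process's /proc status file, which reports how much of it is currently swapped out. A minimal sketch follows. The helper name is made up, and the commented usage line assumes a single running vramfs process.

```shell
#!/bin/bash
# Print the VmSwap figure (in kB) from a /proc/<pid>/status-style file.
vmswap_kb() {
    awk '/^VmSwap:/ { print $2 }' "$1"
}

# On a live system, assuming a single vramfs process:
# vmswap_kb "/proc/$(pgrep -o vramfs)/status"
```

A value that stays at 0 while the rest of the system is swapping heavily suggests the process is staying resident.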
|