Author |
Message |
ritzmax72 Tux's lil' helper
Joined: 10 Aug 2014 Posts: 114
|
Posted: Sun Oct 27, 2024 8:59 am Post subject: AMD 7900 xtx poor performance on steam |
|
|
Yesterday I upgraded from an NVIDIA 2070 Super to a 7900 XTX. I got it working on Gentoo: I installed amdgpu and media-libs/mesa (USE="vulkan vulkan-overlay").
Then I ran Elden Ring from Steam using Proton GE 9-5. I am only getting 41-50 fps at "Best Quality", 2560x1440, fullscreen.
This is slightly worse than what I used to get with the 2070 Super.
Also, glxgears and vkgears are somehow stuck at 60 fps.
I will dump some info here:
Code: |
emerge --info media-libs/mesa
media-libs/mesa-24.1.7::gentoo was built with the following:
USE="X llvm lm-sensors (opengl) proprietary-codecs vaapi vdpau vulkan vulkan-overlay wayland zstd -d3d9 -debug -opencl -osmesa (-selinux) -test -unwind -valgrind -xa" ABI_X86="32 (64) (-x32)" CPU_FLAGS_X86="sse2" LLVM_SLOT="18 -15 -16 -17" VIDEO_CARDS="radeonsi -d3d12 (-freedreno) -intel -lavapipe (-lima) -nouveau -nvk (-panfrost) -r300 -r600 -radeon (-v3d) (-vc4) -virgl (-vivante) -vmware -zink"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync merge-wait multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
emerge --info x11-drivers/xf86-video-amdgpu
x11-drivers/xf86-video-amdgpu-23.0.0::gentoo was built with the following:
USE="-udev" ABI_X86="(64)"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync merge-wait multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-z,lazy"
|
I was expecting amdgpu + mesa with the vulkan USE flag to be enough to make Steam games work. Clearly either I am doing something wrong or something is wrong with my GPU. |
|
|
|
Maitreya Guru
Joined: 11 Jan 2006 Posts: 445
|
Posted: Sun Oct 27, 2024 10:46 am Post subject: |
|
|
It being stuck at 60 sounds an awful lot like it is locked to vsync.
BTW, what kernel version and firmware are you running?
And which icd is used from /usr/share/vulkan/icd.d/? |
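A quick way to test the vsync theory is to start the test programs with Mesa's swap overrides set. This is only a sketch and assumes the Mesa drivers are the ones in use: `vblank_mode=0` is Mesa's OpenGL override and `MESA_VK_WSI_PRESENT_MODE=immediate` its Vulkan WSI counterpart, while the helper name `run_unsynced` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a command with Mesa's vsync overrides set.
#   vblank_mode=0                       -> OpenGL: do not wait for vblank
#   MESA_VK_WSI_PRESENT_MODE=immediate  -> Vulkan: present without waiting
run_unsynced() {
    env vblank_mode=0 MESA_VK_WSI_PRESENT_MODE=immediate "$@"
}

# Examples (uncomment on a machine with a running display):
# run_unsynced glxgears
# run_unsynced vkgears
```

If the gears programs then climb well past 60 fps, the cap was vsync rather than a driver problem.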
|
|
|
ritzmax72 Tux's lil' helper
Joined: 10 Aug 2014 Posts: 114
|
Posted: Sun Oct 27, 2024 11:24 am Post subject: |
|
|
Maitreya wrote: | It being stuck at 60 sounds an awful lot like it is locked to vsync.
BTW, what kernel version and firmware are you running?
And which icd is used from /usr/share/vulkan/icd.d/? |
I am unchecking vsync in every game I try.
The kernel is 6.6.47-gentoo.
I have no idea how to choose an ICD. I just installed vulkan-loader with USE="layers".
I see two on my system:
Code: |
(.#): ls /usr/share/vulkan/icd.d/
radeon_icd.i686.json radeon_icd.x86_64.json
|
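For what it's worth, both files are manifests for the same Mesa Vulkan driver (RADV), one for 32-bit and one for 64-bit programs, which matches the ABI_X86="32 (64)" in the mesa build above. The loader reads every manifest it finds, so normally nothing has to be chosen. A small sketch to see which library each manifest points at; `list_icds` is a hypothetical helper, and the parsing assumes the `"library_path"` key and its value sit on one line.

```shell
#!/bin/sh
# Hypothetical helper: print "<manifest>: <driver library>" for each Vulkan
# ICD manifest in a directory (default: the system-wide manifest directory).
# Assumes the "library_path" key and its value sit on one line.
list_icds() (
    dir=${1:-/usr/share/vulkan/icd.d}
    for f in "$dir"/*.json; do
        [ -e "$f" ] || continue   # no manifests: the glob stayed literal
        lib=$(sed -n 's/.*"library_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$f")
        printf '%s: %s\n' "${f##*/}" "$lib"
    done
)

list_icds
```

Should a specific manifest ever need to be forced, the Vulkan loader honours `VK_DRIVER_FILES` (older loaders: `VK_ICD_FILENAMES`) set to the path of the JSON file.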
Sons of the Forest is giving me 40-50 fps.
Black Myth: Wukong is unplayable at 24 fps.
My emerge --info:
Code: |
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=znver2 -pipe -fno-expensive-optimizations"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK=*certificate information*
CXXFLAGS="-O2 -march=znver2 -pipe -fno-expensive-optimizations"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-O2 -march=znver2 -pipe -fno-expensive-optimizations"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg-live ccache config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync merge-wait multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -march=znver2 -pipe -fno-expensive-optimizations"
GENTOO_MIRRORS="*Repository link*"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs"
LEX="flex"
MAKEOPTS="-j32"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X acl alsa amd64 bzip2 cet crypt gdbm iconv ipv6 libtirpc lm-sensors multilib ncurses nls openmp pam pcre pulseaudio readline seccomp ssl test-rust unicode vaapi vdpau xattr zlib" ABI_X86="64" ADA_TARGET="gcc_12" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_anon authn_dbm authn_file authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias cgid cgi proxy proxy_html proxy_http proxy_http2 xml2enc proxy_scgi proxy_fcgi http2 proxy_wstunnel" APACHE2_MPMS="worker" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax navcom oceanserver oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 tsip tripmate tnt ublox" GRUB_PLATFORMS="efi-64" GUILE_SINGLE_TARGET="3-0" GUILE_TARGETS="3-0" INPUT_DEVICES="joystick libinput synaptics keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz glk hd44780 lb216 lcdm001 mtxorb text" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-1" POSTGRES_TARGETS="postgres16" PYTHON_SINGLE_TARGET="python3_12" PYTHON_TARGETS="python3_12" RUBY_TARGETS="ruby32" VIDEO_CARDS="amdgpu radeonsi" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipp2p iface geoip fuzzy condition tarpit sysrq proto logmark ipmark dhcpmac delude chaos account"
Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, LINGUAS, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
|
Most importantly Code: | VIDEO_CARDS="amdgpu radeonsi" |
|
|
|
|
ritzmax72 Tux's lil' helper
Joined: 10 Aug 2014 Posts: 114
|
Posted: Sun Oct 27, 2024 12:24 pm Post subject: |
|
|
Looking through ProtonDB for Wukong, I found this:
Code: | MESA_VK_DEVICE_SELECT_FORCE_DEFAULT_DEVICE=1 %command% |
With this the game runs at a solid 100-114 fps at 1440p, with the "Very High" graphics preset and FSR.
I just want to understand what this variable means and why a connotation like "*FORCE*" exists.
Was Steam using software Mesa for rendering? Why would that be the default when X is clearly rendering on the GPU?
Is it because I installed libdrm? Does X use it by default, or do I have to make that choice, and how?
x11-drivers/xf86-video-amdgpu does not explicitly put any conf files in /etc/X11/xorg.conf.d.
Such conf files are installed under /usr/share, though. Does X read from there?
Code: |
equery files x11-drivers/xf86-video-amdgpu
* Searching for xf86-video-amdgpu in x11-drivers ...
* Contents of x11-drivers/xf86-video-amdgpu-23.0.0:
/usr
/usr/lib64
/usr/lib64/xorg
/usr/lib64/xorg/modules
/usr/lib64/xorg/modules/drivers
/usr/lib64/xorg/modules/drivers/amdgpu_drv.so
/usr/share
/usr/share/X11
/usr/share/X11/xorg.conf.d
/usr/share/X11/xorg.conf.d/10-amdgpu.conf
/usr/share/doc
/usr/share/doc/xf86-video-amdgpu-23.0.0
/usr/share/doc/xf86-video-amdgpu-23.0.0/ChangeLog.bz2
/usr/share/doc/xf86-video-amdgpu-23.0.0/README.md.bz2
/usr/share/man
/usr/share/man/man4
/usr/share/man/man4/amdgpu.4.bz2
|
I have so many questions. The Gentoo pages for amdgpu are quite complicated and miss details. |
|
|
|
Ralphred l33t
Joined: 31 Dec 2013 Posts: 655
|
Posted: Sun Oct 27, 2024 2:58 pm Post subject: |
|
|
Yeah. So Wukong has/had an issue where it would select the wrong rendering device - I was never affected by it (vega64) but many were.
The use of "FORCE" in this case is just saying "no, you WILL use this rendering device".
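For context, the variable belongs to Mesa's Vulkan device-select layer. As far as the Mesa documentation describes it, `MESA_VK_DEVICE_SELECT_FORCE_DEFAULT_DEVICE=1` makes the layer report only the default device, so a game that picks badly has nothing else to pick. A minimal sketch of building the Steam launch option; `steam_launch_options` is a hypothetical helper, only the variable names and Steam's `%command%` placeholder are real.

```shell
#!/bin/sh
# Hypothetical helper: build a Steam "Launch Options" string from
# VAR=value pairs. Steam replaces %command% with the game's own command line.
steam_launch_options() {
    printf '%s ' "$@"
    printf '%s\n' '%command%'
}

steam_launch_options MESA_VK_DEVICE_SELECT_FORCE_DEFAULT_DEVICE=1

# To see which devices the layer can choose from, Mesa also accepts
# MESA_VK_DEVICE_SELECT=list in front of any Vulkan program, for example:
# MESA_VK_DEVICE_SELECT=list vulkaninfo
```

The printed line is what goes into the game's Properties > Launch Options field in Steam.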
The gfx driver stack consists of layers, at the bottom is your amdgpu kernel driver, on top of that your xorg-driver (xf86-video-amdgpu), then on top of that your OpenGL/Vulkan drivers (mesa and optionally amdgpu-pro-vulkan).
So dealing with each layer in turn:
AMDGPU kernel driver
Make sure the GPU got all the firmware it wanted at boot time with dmesg|grep -i firmware
Check it's in use with lspci -k|grep VGA -A3
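The "in use" check can also be read straight from sysfs, where each DRM card has a `device/driver` symlink pointing at the bound kernel driver. A sketch, assuming the GPU is `card0` (it may be `card1` on some systems); `drm_driver` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel driver bound to a DRM card by
# resolving the "device/driver" symlink in sysfs.
drm_driver() {
    card=${1:-/sys/class/drm/card0}
    link=$(readlink "$card/device/driver") || return 1
    printf '%s\n' "${link##*/}"
}

drm_driver || echo "no driver bound (or no such card)"
```

On a working setup for this card the output should be the driver name, amdgpu.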
X's AMDGPU driver
By default X looks at Code: | /etc/X11/xorg.conf
/etc/X11/xorg.conf.d/*.conf
/usr/share/X11/xorg.conf
/usr/share/X11/xorg.conf.d/*.conf | The default 10-amdgpu.conf in /usr doesn't do anything untoward.
You can turn vsync on and off with /etc/drirc or ~/.drirc, however I find mangohud is more user friendly for this. Setting "vsync=1" and "gl_vsync=0" in your mangohud config turns it off for Vulkan and OpenGL respectively; setting "vsync=0" and "gl_vsync=-1" sets both to adaptive sync.
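As a sketch of the vsync-off settings above, the following writes them to MangoHud's per-user config file. The default path `~/.config/MangoHud/MangoHud.conf` is MangoHud's usual location; the helper `write_mangohud_conf` is hypothetical and deliberately refuses to overwrite an existing file.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal MangoHud config that turns vsync off
# for Vulkan (vsync=1) and OpenGL (gl_vsync=0). Refuses to overwrite a file.
write_mangohud_conf() {
    conf=${1:-$HOME/.config/MangoHud/MangoHud.conf}
    if [ -e "$conf" ]; then
        echo "$conf already exists, not touching it" >&2
        return 1
    fi
    mkdir -p "$(dirname "$conf")" || return 1
    cat > "$conf" <<'EOF'
# Vulkan: 0 = adaptive, 1 = off
vsync=1
# OpenGL: -1 = adaptive, 0 = off
gl_vsync=0
EOF
}

# Example: write_mangohud_conf
```

A game then picks the settings up when started through MangoHud, for example with `mangohud %command%` as the Steam launch option.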
Vulkan stuff
Mesa produces vulkan drivers for amdgpu, and they work well. Gentoo also has an ebuild for amdgpu-pro-vulkan. Both can exist in harmony, and the environment variable(s) you use to switch between them are printed out by the amdgpu-pro-vulkan ebuild after installing. Whether you get better performance from either is based on game and/or card (the pro drivers are worse for me, but vega64 card so YMMV).
Mangohud (ebuild in the guru overlay, docs are here) is much more useful than just "an FPS monitor". When I get games that are CPU/GPU bound I use it to limit fps and normalise frametimes; this is much more reliable than using in-game fps limits etc. As I mention above, you can use it to enable/disable vsync/adaptive sync*. It can be used to dump a lot of live info as an overlay on your game too. You can turn the overlay on and off with keyboard shortcuts**, and force config reloads on the fly.
*Enabling/disabling vsync type requires a program restart to take effect
**I always forget the keybinds, so I keep a copy of them in ~/.config/MangoHud/keybinds for reference |
|
|
|
Naib Watchman
Joined: 21 May 2004 Posts: 6065 Location: Removed by Neddy
|
Posted: Sun Oct 27, 2024 3:42 pm Post subject: |
|
|
What kernel are you running?
There was an... unfortunate "feature" that got introduced in kernel-6.10.x that coincided with an update to mesa resulting in excessive VRAM flushes...
To give you an idea, my Dota 2 collapsed to 10 fps.
Two things fixed this:
1. The 6.11.x series resolved this
2. Enabling BAR in the BIOS to permit larger sizes to be copied to the GPU (instead of lots of little ones) |
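A small sketch for checking whether the running kernel falls in the 6.10.x series mentioned above; `kernel_series` is a hypothetical helper that only trims the release string reported by `uname -r`.

```shell
#!/bin/sh
# Hypothetical helper: reduce a kernel release string such as
# "6.6.47-gentoo" to its "major.minor" series ("6.6").
kernel_series() {
    printf '%s\n' "$1" | sed 's/^\([0-9]*\.[0-9]*\).*/\1/'
}

series=$(kernel_series "$(uname -r)")
case $series in
    6.10) echo "kernel $series: in the series described above" ;;
    *)    echo "kernel $series: not in the 6.10.x series" ;;
esac
```

For the BAR side, amdgpu normally logs the detected VRAM and BAR sizes at boot, so something like `dmesg | grep -i 'BAR='` should show whether the large BAR is active; treat the exact wording of that log line as an assumption.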
|
|
|
|
ritzmax72 Tux's lil' helper
Joined: 10 Aug 2014 Posts: 114
|
Posted: Mon Oct 28, 2024 5:59 am Post subject: |
|
|
Wow, thanks for the useful information. On the good side, we've got a lot to tweak and configure; on the bad side, these things are not well documented.
What's the difference between DRI and DRM? Are they complementary or supplementary?
What I read about Xorg 3D acceleration mentions setting "dri" with mode 0666 in the Xorg config files, but does not say what other modes are available. (Is 0666 just file permissions?)
Also, Code: | MESA_VK_DEVICE_SELECT_FORCE_DEFAULT_DEVICE=1 | does not help with other games:
Elden Ring still outputs 40 fps max (I also found that some people speculate that Elden Ring limits fps at 60 and 40 for some reason).
I am installing MangoHud, but it is quite unstable or risky. Hopefully the default configs have "hard limits" that do not exceed the GPU's capacity.
I'll try this. |
|
|
|
Ralphred l33t
Joined: 31 Dec 2013 Posts: 655
|
Posted: Mon Oct 28, 2024 2:03 pm Post subject: |
|
|
ritzmax72 wrote: | Also, the Code: | MESA_VK_DEVICE_SELECT_FORCE_DEFAULT_DEVICE=1 | does not work well with other games like
Elden Ring still outputs 40 fps max (I also found that some people speculates that Elden Ring limits fps at 60 and 40 for some reason). | Yeah, that environment variable is a fix for Wukong. The Elden Ring 60fps limit is a FromSoft thing, I'd completely forgotten about it because I was running a 60Hz monitor during most of my playthroughs. It's present in the Dark Souls games too. Now AFAIK there are 3rd party patched binaries that remove the fps limit, but I think you can expect to end up on the "naughty boy server"*.
ritzmax72 wrote: | I am installing mangohud but it is quite unstable or risky | Everything in the "guru" overlay is technically "testing" (hence the ~amd64), but MangoHud is fine IME. ritzmax72 wrote: | Hopefully the default configs have "hard limits" that does not exceed GPU's capacity. | By default it's just an "information overlay". Because of where it inserts itself within the gfx stack, it is also capable of setting FPS limits and switching vsync on/off, but that is all; it doesn't really hook into the hardware.
*Re: "naughty boy server". The guys that run the online servers don't ban accounts for EAC infractions; you just get "quarantined" for ~6 months on a server with other people who have also provoked EAC's ire for whatever reason. The only time it's an issue is if you expect to play with a friend who's on the "good boy server". |
|
|
|
ritzmax72 Tux's lil' helper
Joined: 10 Aug 2014 Posts: 114
|
Posted: Tue Oct 29, 2024 6:19 am Post subject: |
|
|
Code: | cat /sys/class/drm/card0/device/power_dpm_state
performance
|
power_dpm_state is set to performance. I realized that on newer cards amdgpu replaces the dpm_state mechanism with PowerPlay profiles, controlled through the Code: | /sys/class/drm/card0/device/pp_* | sysfs files.
I changed Code: | /sys/class/drm/card0/device/power_dpm_force_performance_level | from auto to high and still no luck.
After going through multiple sources, I believe this is the case: people are saying an older CPU, like the 3950X I have, might bottleneck the 7900 XTX. These people are just regular users and might not have technical expertise, but have come to believe this through popular opinion. Why do I say this?
First, architectural and design differences between older and newer CPUs might keep the 7900 XTX from reaching its potential; I say this because Elden Ring and Sons of the Forest don't go past 50% utilization no matter what I do. But this argument might be a little misleading: if an older-generation CPU imposes such bottlenecks, why are they not universal? Why do Black Myth: Wukong and Lies of P run very well at max settings? Why do these bottlenecks only affect certain games? Both Black Myth: Wukong and Elden Ring use VKD3D (the DX12 translation layer) and most settings are similar, except maybe FSR.
Both Black Myth: Wukong and Lies of P have dedicated FSR tuning, which I have enabled. Elden Ring has no such setting, but Sons of the Forest does. So I cannot say FSR is the factor, and I cannot say architectural and design differences between older and newer CPUs are the factor either.
I even ran Sons of the Forest on Windows; same result, it cannot run above 50 fps.
It is too messy and complicated. |
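One way to put numbers on the utilization question is to read the amdgpu sysfs files while a game is running. This is a sketch under the assumption that the card is `card0` and that the kernel exposes `gpu_busy_percent` for it; `gpu_status` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the amdgpu performance level and the current
# GPU load from sysfs. Pass the device directory as $1 to inspect another card.
gpu_status() {
    dev=${1:-/sys/class/drm/card0/device}
    for f in power_dpm_force_performance_level gpu_busy_percent; do
        if [ -r "$dev/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$dev/$f")"
        else
            printf '%s=unavailable\n' "$f"
        fi
    done
}

gpu_status
```

Running it in a loop during play (`while sleep 1; do gpu_status; done`) shows whether the GPU is ever kept busy. If it stays around 50% while a single CPU thread is saturated, that would point toward a CPU or engine limit in those particular games rather than a driver fault, which would also fit the identical result on Windows.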
|