Gentoo Forums
Multi-socket EPYC System can't reach above 50% load
ruxbat
n00b


Joined: 21 Oct 2024
Posts: 1

Posted: Mon Oct 21, 2024 6:30 pm    Post subject: Multi-socket EPYC System can't reach above 50% load

Hi everyone,

I've built a dual-socket EPYC 9654 system. When I run my workloads in SideFX Houdini, I can't get CPU load above 50%. All cores are active, but they only reach 50% usage. I tried turning off SMT in the BIOS (which halves the number of logical CPUs), but utilization stayed at 50%. I'm using the Pyro benchmark developed here (https://www.vfxarabia.co/post/houdini-benchmark-cores-vs-clockspeed-updated) as my baseline for testing. Since this is a dual-socket system, I'm wondering whether something missing from my kernel config is causing this. Any thoughts on where to start troubleshooting?

Some information about the system:
Code:

cat /proc/cpuinfo
processor   : 383
vendor_id   : AuthenticAMD
cpu family   : 25
model      : 17
model name   : AMD EPYC 9654 96-Core Processor
stepping   : 1
microcode   : 0xa101148
cpu MHz      : 400.000
cache size   : 1024 KB
physical id   : 1
siblings   : 192
core id      : 79
cpu cores   : 96
apicid      : 415
initial apicid   : 415
fpu      : yes
fpu_exception   : yes
cpuid level   : 16
wp      : yes
flags      : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rdpid overflow_recov succor smca fsrm flush_l1d debug_swap
bugs      : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips   : 4802.25
TLB size   : 3584 4K pages
clflush size   : 64
cache_alignment   : 64
address sizes   : 52 bits physical, 57 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]


Code:
rux@rux ~ $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
400000
rux@rux ~ $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
rux@rux ~ $ cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
performance
rux@rux ~ $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
amd-pstate-epp
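
The readings above cover cpu0 only. A minimal sketch, assuming the standard sysfs cpufreq layout (cpuN/cpufreq/scaling_cur_freq), that summarises the reported frequency across every logical CPU. The helper names read_cur_freqs and summarise are made up for illustration. Running it while the benchmark is active might show whether the cores actually clock up under load.

```python
import glob
import os


def read_cur_freqs(base="/sys/devices/system/cpu"):
    """Return {cpu_index: kHz} for every logical CPU that exposes cpufreq."""
    freqs = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq", "scaling_cur_freq")
    for path in glob.glob(pattern):
        # path looks like <base>/cpuN/cpufreq/scaling_cur_freq
        cpu_dir = os.path.basename(os.path.dirname(os.path.dirname(path)))
        index = cpu_dir[3:]
        if not index.isdigit():
            continue
        try:
            with open(path) as handle:
                freqs[int(index)] = int(handle.read().strip())
        except (OSError, ValueError):
            continue  # CPU offline or file unreadable; skip it
    return freqs


def summarise(freqs):
    """Return (min, mean, max) in MHz, or None when no data was found."""
    if not freqs:
        return None
    values = [khz / 1000 for khz in freqs.values()]
    return min(values), sum(values) / len(values), max(values)


if __name__ == "__main__":
    summary = summarise(read_cur_freqs())
    if summary is None:
        print("no cpufreq data found")
    else:
        print("min %.0f MHz, mean %.0f MHz, max %.0f MHz" % summary)
```

Taking one sample at idle and another during the Pyro run would give two summaries to compare.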



Code:
rux@rux ~ $ uname -a
Linux rux 6.6.52-gentoo-gentoo-dist #9 SMP PREEMPT_DYNAMIC Sun Oct 20 22:03:28 MDT 2024 x86_64 AMD EPYC 9654 96-Core Processor AuthenticAMD GNU/Linux


Code:
System Information
  Operating System              Gentoo Linux
  Kernel                        Linux 6.6.52-gentoo-gentoo-dist x86_64
  Model                         Giga Computing MZ73-LM0-000
  Motherboard                   Giga Computing MZ73-LM0-000
  BIOS                          GIGABYTE R04_F32

CPU Information
  Name                          AMD EPYC 9654
  Topology                      2 Processors, 192 Cores, 384 Threads
  Identifier                    AuthenticAMD Family 25 Model 17 Stepping 1
  Base Frequency                3.71 GHz
  L1 Instruction Cache          32.0 KB x 96
  L1 Data Cache                 32.0 KB x 96
  L2 Cache                      1.00 MB x 96
  L3 Cache                      16.0 MB x 12

Memory Information
  Size                          125 GB



The system is water-cooled; the CPUs sit at 56 °C at idle (and while running these tests).

I can reach 100% CPU usage with sysbench (sysbench cpu --threads=384 run), but Houdini stays stuck at 50%. I've asked the Houdini community for more information, but I've not had a lot of luck with their forums over the years. You can find that post here (https://www.sidefx.com/forum/topic/98414/?page=1#post-432197).
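
To tell "every CPU at roughly 50%" apart from "half the CPUs at 100%", per-CPU utilisation can be computed from two /proc/stat snapshots. This is only a sketch: the function names are hypothetical, the 90% threshold is arbitrary, and idle plus iowait is treated as non-busy time.

```python
import time


def parse_proc_stat(text):
    """Map 'cpuN' -> (busy_jiffies, total_jiffies) from /proc/stat content."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-CPU lines ('cpu0', 'cpu1', ...); skip the aggregate 'cpu' line.
        if len(fields) < 5 or not fields[0].startswith("cpu"):
            continue
        if not fields[0][3:].isdigit():
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        counters[fields[0]] = (total - idle, total)
    return counters


def busy_percent(before, after):
    """Return {'cpuN': percent busy} between two parsed snapshots."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        if cpu not in before:
            continue
        busy0, total0 = before[cpu]
        delta = total1 - total0
        if delta > 0:
            result[cpu] = 100.0 * (busy1 - busy0) / delta
    return result


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and compare them."""
    with open(path) as handle:
        first = parse_proc_stat(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        second = parse_proc_stat(handle.read())
    return busy_percent(first, second)


if __name__ == "__main__":
    try:
        usage = sample()
    except OSError as error:
        usage = {}
        print("could not read /proc/stat: %s" % error)
    if usage:
        saturated = sum(1 for pct in usage.values() if pct >= 90.0)
        mean = sum(usage.values()) / len(usage)
        print("%d CPUs sampled, %d at >= 90%% busy, mean %.1f%%"
              % (len(usage), saturated, mean))
```

Running it once during the sysbench run and once during the Houdini benchmark would give two distributions to compare.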

Any suggestions on how to troubleshoot this are much appreciated.

Thanks in advance!
~Rux
Hu
Administrator


Joined: 06 Mar 2007
Posts: 22538

Posted: Mon Oct 21, 2024 8:24 pm

Welcome to the forums. The first part of your post says you can't get above 50%, but near the end you say that sysbench can. If that is right, then this does not look like a kernel problem: if it were a kernel issue, I would expect no test to get above 50%. If you still suspect the kernel, I would suggest you provide your kernel configuration for community review. I cannot help you with specifics there, but we have several regular posters who often provide advice about kernel configuration.
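
For a configuration review, a short extract is often easier to read than the full file. Below is a rough sketch that pulls a few options out of the running kernel's configuration. It assumes the kernel exposes /proc/config.gz (which requires CONFIG_IKCONFIG_PROC), and the option list is an illustrative selection for a large multi-socket machine, not an authoritative checklist.

```python
import gzip

# Illustrative selection only; which options matter here is an assumption.
OPTIONS = (
    "CONFIG_NR_CPUS",
    "CONFIG_NUMA",
    "CONFIG_X86_64_ACPI_NUMA",
    "CONFIG_NUMA_BALANCING",
    "CONFIG_SCHED_MC",
    "CONFIG_SCHED_SMT",
    "CONFIG_X86_AMD_PSTATE",
)


def extract_options(config_text, names=OPTIONS):
    """Return {option: value}; options absent or commented out map to 'not set'."""
    found = {name: "not set" for name in names}
    for line in config_text.splitlines():
        line = line.strip()
        # '# CONFIG_FOO is not set' lines are comments and keep the default.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key in found:
            found[key] = value
    return found


def read_running_config(path="/proc/config.gz"):
    """Read the gzip-compressed configuration of the running kernel."""
    with gzip.open(path, "rt") as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        text = read_running_config()
    except OSError as error:
        print("could not read kernel config: %s" % error)
    else:
        for name, value in extract_options(text).items():
            print("%s: %s" % (name, value))
```

With 384 logical CPUs, CONFIG_NR_CPUS would need to be at least 384. The full configuration is still what reviewers will want to see.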

56C seems high for a system at idle. Even on a warm day, my system's idle temperature is 35C or below, and I use basic air cooling.
All times are GMT