I've built a dual-socket EPYC 9654 system. When I run my workloads, I can't get CPU load above 50%. All cores are active, but they only reach 50% usage. I tried turning off SMT in the BIOS (which halves the number of logical CPUs), but utilization stayed at 50%.

I'm using the Pyro benchmark developed here (https://www.vfxarabia.co/post/houdini-benchmark-cores-vs-clockspeed-updated) as my baseline for testing.

Since this is a dual-socket system, I'm wondering whether something missing from my kernel config is causing this. Any thoughts on where to start troubleshooting?

Some information about the system (only the last /proc/cpuinfo entry is shown):
rux@rux ~ $ cat /proc/cpuinfo
processor : 383
vendor_id : AuthenticAMD
cpu family : 25
model : 17
model name : AMD EPYC 9654 96-Core Processor
stepping : 1
microcode : 0xa101148
cpu MHz : 400.000
cache size : 1024 KB
physical id : 1
siblings : 192
core id : 79
cpu cores : 96
apicid : 415
initial apicid : 415
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rdpid overflow_recov succor smca fsrm flush_l1d debug_swap
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips : 4802.25
TLB size : 3584 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 52 bits physical, 57 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]
rux@rux ~ $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
400000
rux@rux ~ $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
rux@rux ~ $ cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
performance
rux@rux ~ $ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
amd-pstate-epp
rux@rux ~ $ uname -a
Linux rux 6.6.52-gentoo-gentoo-dist #9 SMP PREEMPT_DYNAMIC Sun Oct 20 22:03:28 MDT 2024 x86_64 AMD EPYC 9654 96-Core Processor AuthenticAMD GNU/Linux

System Information
  Operating System      : Gentoo Linux
  Kernel                : Linux 6.6.52-gentoo-gentoo-dist x86_64
  Model                 : Giga Computing MZ73-LM0-000
  Motherboard           : Giga Computing MZ73-LM0-000
  BIOS                  : GIGABYTE R04_F32

CPU Information
  Name                  : AMD EPYC 9654
  Topology              : 2 Processors, 192 Cores, 384 Threads
  Identifier            : AuthenticAMD Family 25 Model 17 Stepping 1
  Base Frequency        : 3.71 GHz
  L1 Instruction Cache  : 32.0 KB x 96
  L1 Data Cache         : 32.0 KB x 96
  L2 Cache              : 1.00 MB x 96
  L3 Cache              : 16.0 MB x 12

Memory Information
  Size                  : 125 GB

The system is water-cooled; the CPUs sit at 56 °C both at idle and while running these tests.
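
The cpufreq readings above are from cpu0 only. In case it helps anyone reproduce this, here is a minimal sketch that tallies `scaling_cur_freq` across every logical CPU, so it is easy to see whether all 384 threads sit at 400 MHz or only some of them. The function name `cpu_freq_histogram` and its optional root-directory argument are my own illustration, not part of any tool; the sysfs path is the same one used in the commands above.

```shell
#!/bin/sh
# Count how many logical CPUs currently report each frequency (in kHz).
# $1 (optional, hypothetical): alternative sysfs root, handy for trying the
# function against a scratch directory instead of the live system.
cpu_freq_histogram() {
    root="${1:-/sys/devices/system/cpu}"
    # cpu[0-9]* matches cpu0, cpu1, ... but not cpufreq/ or cpuidle/.
    cat "$root"/cpu[0-9]*/cpufreq/scaling_cur_freq 2>/dev/null |
        awk '{ count[$1]++ }
             END { for (f in count) printf "%d kHz: %d CPUs\n", f, count[f] }' |
        sort -n
}
```

Running `cpu_freq_histogram` once at idle and again while the benchmark is going should show whether the clocks actually ramp up under load on both sockets.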