SMT (Hyper-Threading) Runs Multiple Threads per Physical Core: Which Workloads Benefit and Which Don't


SMT (Simultaneous Multithreading) is a technology that runs multiple hardware threads on a single physical core, so that execution resources left idle by one thread can be used by another, improving CPU throughput. Intel calls it Hyper-Threading.

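Let’s benchmark on an m6i.xlarge instance, which has 2 physical cores x 2 threads per core = 4 vCPUs. In the lscpu output below, the CORE column shows that CPUs 0 and 2 share core 0, and CPUs 1 and 3 share core 1.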

$ lscpu --extended
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE
  0    0      0    0 0:0:0:0          yes
  1    0      0    1 1:1:1:0          yes
  2    0      0    0 0:0:0:0          yes
  3    0      0    1 1:1:1:0          yes
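The same sibling relationship can be read directly from sysfs. The following is a minimal sketch, assuming the standard Linux topology files under /sys/devices/system/cpu; the function name `list_siblings` and its optional directory argument are illustrative and not part of the original setup.

```shell
#!/bin/sh
# list_siblings [CPU_ROOT]: print each logical CPU together with the
# logical CPUs that share its physical core.
# CPU_ROOT is a hypothetical override for testing; it defaults to sysfs.
list_siblings() {
  cpu_root="${1:-/sys/devices/system/cpu}"
  for f in "$cpu_root"/cpu[0-9]*/topology/thread_siblings_list; do
    [ -r "$f" ] || continue          # skip if the glob matched nothing
    cpu="${f#"$cpu_root"/cpu}"       # drop the leading directory and "cpu"
    cpu="${cpu%%/*}"                 # drop everything after the CPU number
    printf 'cpu%s shares a core with: %s\n' "$cpu" "$(cat "$f")"
  done
}

list_siblings "$@"
```

On the instance above, CPUs 0 and 2 would be expected to list each other, as would CPUs 1 and 3.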

SMT can be enabled or disabled at runtime by writing on or off to /sys/devices/system/cpu/smt/control (root privileges required).
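Before toggling, it can be useful to check the current state. This is a small sketch, assuming the kernel exposes the smt/control and smt/active files; `smt_status` and its optional directory argument are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# smt_status [SMT_DIR]: print the SMT control setting and whether SMT is active.
# control is typically one of: on, off, forceoff, notsupported, notimplemented.
# active is 1 when sibling threads are currently online, otherwise 0.
smt_status() {
  smt_dir="${1:-/sys/devices/system/cpu/smt}"
  if [ -r "$smt_dir/control" ] && [ -r "$smt_dir/active" ]; then
    printf 'control=%s active=%s\n' \
      "$(cat "$smt_dir/control")" "$(cat "$smt_dir/active")"
  else
    echo "SMT control interface not available in $smt_dir" >&2
    return 1
  fi
}

smt_status "$@" || true
```

Reading these files does not require root; only writing to control does.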

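# Run as root: writing to smt/control requires root privileges.
run_suite () {
  local t; t=$(nproc)   # number of online vCPUs, used as the thread count
  echo "==== $1  online vCPU=$t ===="
  # CPU-bound: prime calculation
  sysbench cpu --cpu-max-prime=50000 --threads="$t" --time=15 run \
    | grep -E 'events per second|total number of events'
  # Memory-bound: read/write using all vm methods
  stress-ng --vm "$t" --vm-bytes 75% --vm-method all \
    --metrics-brief --timeout 15s 2>&1 | grep 'vm '
}

run_suite "SMT ON"
echo off > /sys/devices/system/cpu/smt/control   # take sibling threads offline
sleep 2
run_suite "SMT OFF"
echo on > /sys/devices/system/cpu/smt/control    # restore SMT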

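Enabling Hyper-Threading does not add physical cores, so throughput for a compute-bound task such as prime calculation barely improved (620.01 → 636.58 events per second, about 3%). The memory read/write workload, which spends time waiting on memory, improved significantly (12715.00 → 41799.76 bogo ops/s, about 3.3x).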

==== SMT ON  online vCPU=4 ====
    events per second:   636.58
    total number of events:              9552
stress-ng: metrc: [2086] vm  689759  16.50  31.22  33.47  41799.76  10663.04

==== SMT OFF  online vCPU=2 ====
    events per second:   620.01
    total number of events:              9302
stress-ng: metrc: [2110] vm  209813  16.50  12.73  19.31  12715.00   6550.20
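To put a number on each comparison, the SMT ON figure can be divided by the SMT OFF figure. The sketch below uses the values from the output above; `ratio` is a hypothetical helper, not part of sysbench or stress-ng.

```shell
#!/bin/sh
# ratio A B: print A divided by B, rounded to two decimals.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# events per second, SMT ON vs. SMT OFF
echo "sysbench cpu: $(ratio 636.58 620.01)x"      # prints "sysbench cpu: 1.03x"
# bogo ops/s (real time), SMT ON vs. SMT OFF
echo "stress-ng vm: $(ratio 41799.76 12715.00)x"  # prints "stress-ng vm: 3.29x"
```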

For matrix operations as well, running 2 threads on different physical cores (-c 0,1) nearly doubles throughput compared to a single thread (-c 0): 3619.18 → 7040.92 bogo ops/s. Two threads sharing one core (-c 0,2) reach only 5512.95 in total, and throughput per unit of CPU time drops to 2756.51.

$ taskset -c 0   stress-ng --matrix 1 --matrix-method all --metrics-brief --timeout 15s
# stress-ng: metrc: matrix    54290  15.00  15.00  0.00  3619.18  3619.27

$ taskset -c 0,2 stress-ng --matrix 2 --matrix-method all --metrics-brief --timeout 15s
# stress-ng: metrc: matrix    82706  15.00  30.00  0.00  5512.95  2756.51

$ taskset -c 0,1 stress-ng --matrix 2 --matrix-method all --metrics-brief --timeout 15s
# stress-ng: metrc: matrix   105624  15.00  30.00  0.00  7040.92  3520.54
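The three matrix results can be summarized as speedup over the single-thread run and as per-thread efficiency (speedup divided by thread count). This is a rough sketch using the real-time bogo ops/s values above; `scaling` is an illustrative helper name.

```shell
#!/bin/sh
# scaling BASE TOTAL THREADS: compare a multi-thread total against the
# single-thread baseline and report speedup and per-thread efficiency.
scaling() {
  awk -v base="$1" -v total="$2" -v n="$3" 'BEGIN {
    printf "speedup=%.2f efficiency=%.0f%%\n", total / base, 100 * total / (base * n)
  }'
}

scaling 3619.18 5512.95 2   # same core (-c 0,2):       speedup=1.52 efficiency=76%
scaling 3619.18 7040.92 2   # different cores (-c 0,1): speedup=1.95 efficiency=97%
```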