Hyper-Threading and energy - Processor energy configuration series - Part 1
- by Arne Tarara

This article is part of a multi-part series. Be sure to check out / stay tuned for the other parts! In this series we look at processor configuration options, set either from the OS side or directly in the MSRs of the CPU, and their effect on the power draw of the CPU.

Hyper-Threading is the branded name from Intel for a technology called Simultaneous Multi-Threading (SMT).

SMT is active on practically every modern Intel or AMD CPU. On Linux its status can be checked in /sys/devices/system/cpu/smt/active and changed by writing either off or forceoff to /sys/devices/system/cpu/smt/control:

  • off seemed to be enough in our cases, although the OS might turn it back on again
  • forceoff is safer, but needs a reboot to reactivate SMT
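As a minimal sketch, the two sysfs files mentioned above can also be read from a script before a measurement run. The function names and the `smt_dir` parameter are our own choices for illustration; only the two paths come from the text. A value of `None` is returned when a file cannot be read, for example on a system that does not expose this interface.

```python
from pathlib import Path

# Default location of the SMT files on Linux (paths as given in the article).
SMT_DIR = Path("/sys/devices/system/cpu/smt")


def read_smt_state(smt_dir=SMT_DIR):
    """Return the contents of the 'active' and 'control' files as a dict.

    A value of None means the file could not be read.
    """
    state = {}
    for name in ("active", "control"):
        try:
            state[name] = (Path(smt_dir) / name).read_text().strip()
        except OSError:
            state[name] = None
    return state


def smt_is_active(smt_dir=SMT_DIR):
    """Return True if 'active' reads 1, False if it reads 0, None if unknown."""
    active = read_smt_state(smt_dir)["active"]
    if active is None:
        return None
    return active == "1"


print(read_smt_state())
```

Reading needs no special privileges. Changing the state means writing to the control file as root, which this sketch deliberately leaves out.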

On macOS this can be set in the NVRAM settings:

  • Boot into macOS Recovery (https://support.apple.com/en-gb/HT210108)
  • nvram boot-args="cwae=2"
  • nvram SMTDisable=%01
  • Reset NVRAM to reset settings

SMT is generally a technology that optimizes multitasking. It offloads scheduling from the OS to the CPU, which can technically do it quicker, and in turn presents the OS with more cores than are physically available.

The questions that arise for someone doing research in software energy consumption are:

  • How much energy / power does Hyper-Threading / SMT need to provide its functionality?
  • What are the drawbacks of SMT?
  • Should I turn SMT off when energy is my primary concern?

Energy test

Our test machine is a MacBook Pro 13" 2015 model with an Intel Core i7-5557U CPU @ 3.1 GHz.

According to /proc/cpuinfo this chip has 2 physical cores (found by looking at max. core id number) and 4 threads (found by looking at max. processor number).

Looking at flags we see that ht is a feature, which corresponds to Hyper-Threading.
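The lookup in /proc/cpuinfo can be sketched in a few lines. This is a variation on the manual approach above: instead of taking the maximum core id, it counts distinct (physical id, core id) pairs, which also works if ids are not contiguous. The function name and the sample text are made up for illustration; the sample is only shaped like the output of a 2-core / 4-thread machine and is not a dump from our test bench.

```python
def summarize_cpuinfo(text):
    """Count logical processors and physical cores in /proc/cpuinfo text."""
    threads = 0
    cores = set()
    ht_flag = False
    # Each logical processor is one block, blocks are separated by blank lines.
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        threads += 1
        # Assumption: if 'core id' is missing, treat the processor as its own core.
        cores.add((fields.get("physical id", "0"),
                   fields.get("core id", fields["processor"])))
        if "ht" in fields.get("flags", "").split():
            ht_flag = True
    return {"threads": threads, "cores": len(cores), "ht_flag": ht_flag}


# Hypothetical, heavily shortened sample with 4 logical processors on 2 cores.
SAMPLE = "\n\n".join(
    f"processor\t: {cpu}\nphysical id\t: 0\ncore id\t\t: {cpu % 2}\nflags\t\t: fpu ht sse2"
    for cpu in range(4)
)

print(summarize_cpuinfo(SAMPLE))
# {'threads': 4, 'cores': 2, 'ht_flag': True}
```

On a real system the input would be the content of /proc/cpuinfo, for example `summarize_cpuinfo(open("/proc/cpuinfo").read())`.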

To get a first glimpse at the energy characteristics of this feature we use sysbench, which you can simply install through aptitude on Ubuntu 22.04.

The command we ran in sysbench is:

sudo perf stat -a -e power/energy-pkg/,power/energy-ram/ sysbench --cpu-max-prime=10000 --threads=X --test=cpu run

The command always runs for a fixed 10 s. What we modified between the runs is the --threads argument, as seen in the following table.
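Stepping through the thread counts by hand gets tedious, so a series of runs could be scripted roughly as follows. This is only a sketch: the helper names are ours, the argument list is copied from the command above, and it assumes that sudo, perf and sysbench are installed and that the RAPL energy events are available on the machine. Nothing is executed until run_series() is called.

```python
import subprocess


def build_command(threads):
    """Return the argument list for one measurement run with the given thread count."""
    return [
        "sudo", "perf", "stat", "-a",
        "-e", "power/energy-pkg/,power/energy-ram/",
        "sysbench", "--cpu-max-prime=10000", f"--threads={threads}",
        "--test=cpu", "run",
    ]


def run_series(thread_counts=range(1, 7), runner=subprocess.run):
    """Run the benchmark once per thread count and return the exit codes.

    The output of perf and sysbench is not captured, it goes to the terminal.
    """
    return {n: runner(build_command(n)).returncode for n in thread_counts}
```

Calling run_series() on the test machine runs the six configurations from the table one after another. The runner parameter only exists so that the loop can be tried out without actually starting perf.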

Results table

| Threads | HT Off - Events | HT Off - Energy [J] | HT Off - mJ / Ops | HT On - Events | HT On - Energy [J] | HT On - mJ / Ops |
|---------|-----------------|---------------------|-------------------|----------------|--------------------|------------------|
| Idle    | 0               | 29.18               | -                 | 0              | 29.18              | -                |
| 1       | 11601           | 106.83              | 9.21              | 11564          | 107.50             | 9.30             |
| 2       | 22922           | 160.40              | 7.00              | 23049          | 163.92             | 7.11             |
| 3       | 22924           | 161.87              | 7.06              | 29063          | 183.51             | 6.31             |
| 4       | 22900           | 161.58              | 7.06              | 34895          | 204.22             | 5.85             |
| 5       | 22908           | 162.76              | 7.10              | 34931          | 199.66             | 5.72             |
| 6       | 22902           | 162.16              | 7.08              | 34972          | 202.81             | 5.80             |
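The mJ / Ops columns follow directly from the other two columns of each configuration: total package energy of the run divided by the number of events, scaled to millijoules. The short sketch below recomputes them from the table values. The variable and function names are ours, and note that the idle energy is not subtracted.

```python
# (threads, events, energy in J) copied from the results table.
HT_OFF = [(1, 11601, 106.83), (2, 22922, 160.40), (3, 22924, 161.87),
          (4, 22900, 161.58), (5, 22908, 162.76), (6, 22902, 162.16)]
HT_ON = [(1, 11564, 107.50), (2, 23049, 163.92), (3, 29063, 183.51),
         (4, 34895, 204.22), (5, 34931, 199.66), (6, 34972, 202.81)]


def mj_per_op(energy_j, events):
    """Energy per operation in millijoules."""
    return energy_j * 1000.0 / events


for label, rows in (("HT Off", HT_OFF), ("HT On", HT_ON)):
    for threads, events, energy_j in rows:
        print(f"{label}, {threads} threads: {mj_per_op(energy_j, events):.2f} mJ / Ops")
```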

Graph

Hyper-Threading energy graph
As seen in the chart and the table, Hyper-Threading on our test-bench Intel CPU delivers more operations per 10 seconds as soon as more threads are requested than physical cores exist (3 threads or more), while the energy cost rises more slowly than the throughput. With 1 or 2 threads both configurations perform practically the same. So the result is: energy cost per operation drops when using Hyper-Threading, while throughput is retained and even increased.

With Hyper-Threading turned on, the total energy clearly exceeds that of the non-Hyper-Threading configuration when using 3 threads or more. This was not necessarily expected: the chip could also have throttled performance and stayed within a constant energy budget.

The other interesting metric is mJ / Ops. Here we can see that Hyper-Threading is actually more energy efficient per operation than running the system with physical cores only: at 4 threads it needs 5.85 mJ per operation instead of 7.06 mJ, roughly 17 % less.
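To put the trade-off into numbers, the 4-thread row of the table can be compared directly. The sketch below uses only the measured values from that row. The helper name is ours.

```python
def relative_change(new, old):
    """Relative change from old to new, e.g. 0.25 for +25 %."""
    return (new - old) / old


# 4-thread row of the results table: events and energy in J.
off_events, off_energy = 22900, 161.58
on_events, on_energy = 34895, 204.22

throughput = relative_change(on_events, off_events)
energy = relative_change(on_energy, off_energy)
per_op = relative_change(on_energy / on_events, off_energy / off_events)

print(f"throughput {throughput:+.1%}, energy {energy:+.1%}, energy per op {per_op:+.1%}")
```

For this row that is roughly 52 % more operations for roughly 26 % more energy, which works out to about 17 % less energy per operation.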

Discussion

The results are quite surprising, as Hyper-Threading used to have a bit of a bad rep.

For instance, this article from Percona comes to the conclusion that Hyper-Threading tends to have a throttling effect and is typically more suitable for low-utilization workloads.

Hyper-Threading also has potential security issues, although their current state and whether they are relevant in real-world setups is not quite clear to us.

Another factor to take into consideration is that Hyper-Threading in theory reduces the latency of your system when a task is picked up.
This makes perfect sense, as you introduce another scheduling layer.
However, since a normal Linux installation is not optimized for real-time workloads anyway, this factor might not weigh very heavily.

All in all we are very surprised by how energy friendly the feature is, especially for typical server workloads, which are rather multi-threaded and mostly idling. Since Hyper-Threading seems to have no effect on idle CPUs, it seems like a perfect fit.

Since Hyper-Threading is turned on by default, and every server in the SPECPower database also has it turned on, we see no reason to turn it off for benchmarks that are meant to reflect CPU capabilities.

Did we miss something? Please shoot us an email to [email protected]