2024-10-05
This post describes how to program the PMU through the x86 model specific registers (MSRs). This information is specific to Intel Skylake, but other microarchitectures are similar.
The cpuid instruction with eax = 10 returns PMU details in eax and edx.
#include <stdio.h>
#include <inttypes.h>

/* Register layouts for CPUID leaf 10; eax and edx use different fields. */
union cpuid {
    struct { /* edx */
        uint32_t nfixed_counters:5;
        uint32_t fixed_counter_width:8;
    };
    struct { /* eax */
        uint32_t pmu_version:8;
        uint32_t nprog_counters:8;
        uint32_t prog_counter_width:8;
    };
    uint32_t val;
};

int main() {
    union cpuid eax = {.val = 10}, edx;
    /* cpuid also overwrites ebx and ecx, so they are listed as clobbers. */
    __asm __volatile
        ("cpuid" : "+a"(eax.val), "=d"(edx.val)::"ebx", "ecx");
    printf("Version: %u\n"
           "Fixed counters: %u\n"
           "Fixed counter width: %u\n"
           "Programmable counters: %u\n"
           "Programmable counter width: %u\n",
           eax.pmu_version, edx.nfixed_counters,
           edx.fixed_counter_width, eax.nprog_counters,
           eax.prog_counter_width);
    return 0;
}
The rdmsr and wrmsr instructions are reserved for ring 0. On Linux, the msr kernel module (implemented in arch/x86/kernel/msr.c) can be loaded to give user space access to the MSRs.
$ modprobe msr
$ lsmod | grep msr
msr 16384 0
Reading or writing an MSR requires opening /dev/cpu/*/msr for a particular core and seeking to the register’s offset. For reads, the switch into the kernel can be avoided by using rdpmc to read performance counters from ring 3. This requires setting the PCE bit (bit 8) in cr4 from ring 0 once at setup time.
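The device-file access described above can be sketched as follows, assuming the msr module is loaded and the process is allowed to open the device. The helper names msr_open, msr_read and msr_write are hypothetical; pread and pwrite combine the seek and the 8-byte transfer into one call.

```c
#define _XOPEN_SOURCE 700 /* for pread/pwrite under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open the MSR device of one logical core.
 * Returns a file descriptor, or -1 on failure. */
static int msr_open(int cpu) {
    char path[64];
    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
    return open(path, O_RDWR);
}

/* Read the 64-bit MSR at offset `reg`. Returns 0 on success, -1 on failure. */
static int msr_read(int fd, uint32_t reg, uint64_t *val) {
    assert(val != NULL);
    return pread(fd, val, sizeof(*val), reg) == (ssize_t)sizeof(*val) ? 0 : -1;
}

/* Write the 64-bit MSR at offset `reg`. Returns 0 on success, -1 on failure. */
static int msr_write(int fd, uint32_t reg, uint64_t val) {
    return pwrite(fd, &val, sizeof(val), reg) == (ssize_t)sizeof(val) ? 0 : -1;
}
```

A caller would typically open the device once per core with msr_open(0), reuse the descriptor for every register access, and close it when tracing is finished.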
There are 3 fixed counters that are hardcoded to count the following events. These can be controlled using the IA32_FIXED_CTR_CTRL MSR at offset 0x38d.
Fixed Counter | Event | Offset |
---|---|---|
IA32_FIXED_CTR0 | INST_RETIRED.ANY | 0x309 |
IA32_FIXED_CTR1 | CPU_CLK_UNHALTED.THREAD | 0x30a |
IA32_FIXED_CTR2 | CPU_CLK_UNHALTED.REF_TSC | 0x30b |
There are 4 programmable counters per logical core (8 if hyperthreading is disabled). IA32_PMC{4-7} and IA32_PERFEVTSEL{4-7} are only available when hyperthreading is disabled.
Counter | Counter Offset | Control | Control Offset |
---|---|---|---|
IA32_PMC0 | 0xc1 | IA32_PERFEVTSEL0 | 0x186 |
IA32_PMC1 | 0xc2 | IA32_PERFEVTSEL1 | 0x187 |
IA32_PMC2 | 0xc3 | IA32_PERFEVTSEL2 | 0x188 |
IA32_PMC3 | 0xc4 | IA32_PERFEVTSEL3 | 0x189 |
IA32_PMC4 | 0xc5 | IA32_PERFEVTSEL4 | 0x18a |
IA32_PMC5 | 0xc6 | IA32_PERFEVTSEL5 | 0x18b |
IA32_PMC6 | 0xc7 | IA32_PERFEVTSEL6 | 0x18c |
IA32_PMC7 | 0xc8 | IA32_PERFEVTSEL7 | 0x18d |
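Because both columns of the table are contiguous, the offset of counter n and of its control register can be computed from the base offsets. A small sketch, with hypothetical helper names pmc_msr and perfevtsel_msr:

```c
#include <assert.h>
#include <stdint.h>

enum {
    IA32_PMC0         = 0xc1,
    IA32_PERFEVTSEL0  = 0x186,
    NUM_PROG_COUNTERS = 8, /* only 4 are usable with hyperthreading enabled */
};

/* Offset of programmable counter n (IA32_PMCn). */
static uint32_t pmc_msr(unsigned n) {
    assert(n < NUM_PROG_COUNTERS);
    return IA32_PMC0 + n;
}

/* Offset of the control register for counter n (IA32_PERFEVTSELn). */
static uint32_t perfevtsel_msr(unsigned n) {
    assert(n < NUM_PROG_COUNTERS);
    return IA32_PERFEVTSEL0 + n;
}
```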
When tracing application code it’s necessary to reset the PMU state beforehand. Writing 0 to the IA32_PERF_GLOBAL_CTRL and IA32_FIXED_CTR_CTRL registers will disable the counters so that they don’t immediately resume counting after being cleared. After disabling, writing 0 to the IA32_FIXED_CTR{0-2} and IA32_PMC{0-3} registers will clear the fixed and programmable counters respectively.
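The reset order above (disable first, then clear) can be sketched as a single function. This assumes fd is an open /dev/cpu/N/msr descriptor; pmu_reset and msr_write are hypothetical names, and the offset 0x38f for IA32_PERF_GLOBAL_CTRL comes from the Intel SDM rather than from the tables in this post.

```c
#define _XOPEN_SOURCE 700 /* for pwrite under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

enum {
    IA32_PMC0             = 0xc1,
    IA32_FIXED_CTR0       = 0x309,
    IA32_FIXED_CTR_CTRL   = 0x38d,
    IA32_PERF_GLOBAL_CTRL = 0x38f, /* assumption: offset taken from the Intel SDM */
};

static int msr_write(int fd, uint32_t reg, uint64_t val) {
    return pwrite(fd, &val, sizeof(val), reg) == (ssize_t)sizeof(val) ? 0 : -1;
}

/* Disable all counters, then zero them. Returns 0 on success, -1 on failure.
 * Usage: int fd = open("/dev/cpu/0/msr", O_RDWR); pmu_reset(fd); */
static int pmu_reset(int fd) {
    assert(fd >= 0);
    /* Disable first so the counters do not resume counting once cleared. */
    if (msr_write(fd, IA32_PERF_GLOBAL_CTRL, 0) < 0) return -1;
    if (msr_write(fd, IA32_FIXED_CTR_CTRL, 0) < 0) return -1;
    for (unsigned i = 0; i < 3; i++)   /* IA32_FIXED_CTR{0-2} */
        if (msr_write(fd, IA32_FIXED_CTR0 + i, 0) < 0) return -1;
    for (unsigned i = 0; i < 4; i++)   /* IA32_PMC{0-3} */
        if (msr_write(fd, IA32_PMC0 + i, 0) < 0) return -1;
    return 0;
}
```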
Writing a bitmask with bits 34-32 set for each fixed counter and low bits set for each programmable counter to the IA32_PERF_GLOBAL_CTRL register will enable the counters. The value 0x70000000f will enable all 3 fixed counters and all 4 programmable counters. The user bit also needs to be set for each fixed counter by writing to IA32_FIXED_CTR_CTRL.
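Both values can be derived rather than hardcoded. The sketch below uses hypothetical helper names and assumes the IA32_FIXED_CTR_CTRL layout described in the Intel SDM: one 4-bit field per fixed counter, with bit 0 of each field enabling ring 0 counting and bit 1 enabling user counting.

```c
#include <assert.h>
#include <stdint.h>

/* IA32_PERF_GLOBAL_CTRL: bit n enables programmable counter n,
 * bit 32 + n enables fixed counter n. */
static uint64_t global_ctrl_mask(unsigned nfixed, unsigned nprog) {
    assert(nfixed <= 3 && nprog <= 8);
    return (((1ULL << nfixed) - 1) << 32) | ((1ULL << nprog) - 1);
}

/* IA32_FIXED_CTR_CTRL: set only the user bit (bit 1) in each counter's
 * 4-bit field. The field layout is an assumption taken from the Intel SDM. */
static uint64_t fixed_ctrl_user(unsigned nfixed) {
    assert(nfixed <= 3);
    uint64_t val = 0;
    for (unsigned i = 0; i < nfixed; i++)
        val |= 0x2ULL << (4 * i);
    return val;
}
```

With 3 fixed and 4 programmable counters, global_ctrl_mask(3, 4) gives the 0x70000000f value quoted above, and fixed_ctrl_user(3) gives 0x222.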
Programmable counters can be configured by writing a bitmask to the IA32_PERFEVTSELx registers as shown below. The event_select, unit_mask, and cmask bits correspond to the event being programmed. Intel publishes a list of events for each microarchitecture. The enable_cntrs bit must also be set to enable the counter. For tracing applications in ring 3, it’s enough to additionally set the user_mode bit, leaving the remaining flag bits cleared.
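/* Layout of an IA32_PERFEVTSELx register, lowest bit first. */
union counter_config {
    struct {
        uint32_t event_select:8;     /* bits 0-7: event code */
        uint32_t unit_mask:8;        /* bits 8-15: event qualifier */
        uint32_t user_mode:1;        /* bit 16: count in ring 3 */
        uint32_t os_mode:1;          /* bit 17: count in ring 0 */
        uint32_t edge_detect:1;      /* bit 18 */
        uint32_t pin_control:1;      /* bit 19 */
        uint32_t interrupt:1;        /* bit 20: interrupt on overflow */
        uint32_t any_thread:1;       /* bit 21 */
        uint32_t enable_cntrs:1;     /* bit 22: enable the counter */
        uint32_t invert_cntr_mask:1; /* bit 23 */
        uint32_t cmask:8;            /* bits 24-31: counter mask */
        uint32_t reserved:32;        /* bits 32-63 */
    };
    uint64_t val;
};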
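As a worked example of filling in these fields, the sketch below builds the register value for a user-mode-only counter. It uses explicit shifts rather than the union, since bit-field layout is implementation-defined. The helper name perfevtsel_user is hypothetical, and the example event (event select 0x2e, unit mask 0x41, the architectural LLC misses event) is taken from the Intel SDM rather than from this post.

```c
#include <assert.h>
#include <stdint.h>

#define PERFEVTSEL_USR (1ULL << 16) /* user_mode */
#define PERFEVTSEL_EN  (1ULL << 22) /* enable_cntrs */

/* Build an IA32_PERFEVTSELx value that counts the given event in ring 3
 * only, with every other flag bit cleared. */
static uint64_t perfevtsel_user(uint8_t event_select, uint8_t unit_mask,
                                uint8_t cmask) {
    uint64_t val = (uint64_t)event_select
                 | ((uint64_t)unit_mask << 8)
                 | PERFEVTSEL_USR
                 | PERFEVTSEL_EN
                 | ((uint64_t)cmask << 24);
    assert((val >> 32) == 0); /* the reserved upper half stays clear */
    return val;
}
```

For the LLC misses event, perfevtsel_user(0x2e, 0x41, 0) evaluates to 0x41412e, which would then be written to the chosen IA32_PERFEVTSELx register.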
The counters will be enabled and the PMU will begin counting after the write completes. Reading from the corresponding IA32_PMCx register will give the current count for the programmed event.
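In practice a measurement is usually the difference between two reads taken around the code of interest. Since the counters are narrower than 64 bits, the subtraction should be masked to the counter width reported by cpuid. A minimal sketch with a hypothetical helper name; the width of 48 in the usage note is an assumption and should be replaced by the value cpuid reports.

```c
#include <assert.h>
#include <stdint.h>

/* Number of events between two reads of the same counter, allowing for
 * at most one wraparound of a `width`-bit counter. */
static uint64_t counter_delta(uint64_t start, uint64_t end, unsigned width) {
    assert(width >= 1 && width <= 64);
    uint64_t mask = width == 64 ? ~0ULL : (1ULL << width) - 1;
    return (end - start) & mask;
}
```

For example, with a 48-bit counter, counter_delta(0xffffffffffff, 5, 48) is 6: one event to wrap to zero plus five more.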