Modern Intel processors have a performance monitoring unit (PMU) that counts performance-related events during execution. The most common way to interact with the PMU on Linux is through the perf subsystem. This post describes how to program the PMU directly through the x86 model-specific registers (MSRs). The details are specific to Intel Skylake, but other microarchitectures are similar.

Probing the PMU

The cpuid instruction with eax = 10 (leaf 0x0a, architectural performance monitoring) returns PMU details in eax and edx.

#include <stdio.h>
#include <inttypes.h>

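/* Bit layouts of the cpuid leaf 0x0a output registers. */
union cpuid {
    struct { /* edx */
        uint32_t nfixed_counters:5;     /* bits 4:0 */
        uint32_t fixed_counter_width:8; /* bits 12:5 */
    };
    struct { /* eax */
        uint32_t pmu_version:8;         /* bits 7:0 */
        uint32_t nprog_counters:8;      /* bits 15:8, per logical core */
        uint32_t prog_counter_width:8;  /* bits 23:16 */
    };
    uint32_t val;
};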

int main() {
  union cpuid eax = {.val = 10}, edx;

  __asm __volatile("cpuid" : "+a"(eax.val), "=d"(edx.val)::"ebx", "ecx");

  printf("Version: %u\n"
         "Fixed counters: %u\n"
         "Fixed counter width: %u\n"
         "Programmable counters: %u\n"
         "Programmable counter width: %u\n",
         eax.pmu_version, edx.nfixed_counters,
         edx.fixed_counter_width, eax.nprog_counters,
         eax.prog_counter_width);

  return 0;
}

Accessing Hardware Counters

rdmsr and wrmsr can only be executed in ring 0. On Linux, loading the msr kernel module (implemented in arch/x86/kernel/msr.c) exposes the MSRs to user space through character devices.

$ sudo modprobe msr
$ lsmod | grep msr
msr                    16384  0

Reading or writing an MSR requires opening /dev/cpu/*/msr for a particular core and seeking to the register’s offset. Every such access costs a system call. For reads, this overhead can be avoided by using rdpmc to read performance counters directly from ring 3, which requires setting the PCE bit (bit 8) in cr4 from ring 0 once at setup time.
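As a minimal sketch of the device-file access described above, the helpers below use pread and pwrite with the register offset as the file offset, which folds the seek into the read or write. The names msr_open, msr_read and msr_write are made up for this example, and opening the device is assumed to need sufficient privileges and a loaded msr module.

```c
#define _XOPEN_SOURCE 700 /* for pread/pwrite under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open the MSR device of one logical core.
 * Returns a file descriptor, or -1 on failure. */
int msr_open(int cpu) {
    char path[64];
    assert(cpu >= 0);
    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
    return open(path, O_RDWR);
}

/* Read the 64-bit MSR at offset reg. Returns 0 on success, -1 on error. */
int msr_read(int fd, uint32_t reg, uint64_t *val) {
    ssize_t n = pread(fd, val, sizeof(*val), (off_t)reg);
    return n == (ssize_t)sizeof(*val) ? 0 : -1;
}

/* Write the 64-bit MSR at offset reg. Returns 0 on success, -1 on error. */
int msr_write(int fd, uint32_t reg, uint64_t val) {
    ssize_t n = pwrite(fd, &val, sizeof(val), (off_t)reg);
    return n == (ssize_t)sizeof(val) ? 0 : -1;
}
```

With a descriptor from msr_open(0), a call such as msr_read(fd, 0x309, &count) would fetch IA32_FIXED_CTR0 on core 0.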

Fixed Counters

There are 3 fixed counters that are hardcoded to count the following events. These can be controlled using the IA32_FIXED_CTR_CTRL MSR at offset 0x38d.

Fixed Counter     Event                      Offset
IA32_FIXED_CTR0   INST_RETIRED.ANY           0x309
IA32_FIXED_CTR1   CPU_CLK_UNHALTED.THREAD    0x30a
IA32_FIXED_CTR2   CPU_CLK_UNHALTED.REF_TSC   0x30b
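Each fixed counter owns a 4-bit field in IA32_FIXED_CTR_CTRL, where bit 0 of the field enables counting in ring 0 and bit 1 enables counting in ring 3. The sketch below shows one way the register value could be assembled; the enum and function names are illustrative and not taken from any header.

```c
#include <assert.h>
#include <stdint.h>

enum {
    IA32_FIXED_CTR0     = 0x309, /* counters 1 and 2 follow at +1, +2 */
    IA32_FIXED_CTR_CTRL = 0x38d,
    NUM_FIXED_COUNTERS  = 3,
};

/* Enable flags inside each counter's 4-bit control field. */
enum {
    FIXED_EN_OS  = 1u << 0, /* count while in ring 0 */
    FIXED_EN_USR = 1u << 1, /* count while in ring 3 */
};

/* Shift flags into the field that belongs to fixed counter idx. */
uint64_t fixed_ctrl_field(unsigned idx, unsigned flags) {
    assert(idx < NUM_FIXED_COUNTERS);
    return (uint64_t)(flags & 0xfu) << (4 * idx);
}

/* Value that enables user-mode counting on every fixed counter. */
uint64_t fixed_ctrl_all_user(void) {
    uint64_t val = 0;
    for (unsigned i = 0; i < NUM_FIXED_COUNTERS; i++)
        val |= fixed_ctrl_field(i, FIXED_EN_USR);
    return val;
}
```

fixed_ctrl_all_user() evaluates to 0x222, one user bit per counter.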

Programmable Counters

There are 4 programmable counters per logical core (8 if hyperthreading is disabled). IA32_PMC{4-7} and IA32_PERFEVTSEL{4-7} are only available when hyperthreading is disabled.

Counter MSR   Offset   Control MSR        Offset
IA32_PMC0     0xc1     IA32_PERFEVTSEL0   0x186
IA32_PMC1     0xc2     IA32_PERFEVTSEL1   0x187
IA32_PMC2     0xc3     IA32_PERFEVTSEL2   0x188
IA32_PMC3     0xc4     IA32_PERFEVTSEL3   0x189
IA32_PMC4     0xc5     IA32_PERFEVTSEL4   0x18a
IA32_PMC5     0xc6     IA32_PERFEVTSEL5   0x18b
IA32_PMC6     0xc7     IA32_PERFEVTSEL6   0x18c
IA32_PMC7     0xc8     IA32_PERFEVTSEL7   0x18d
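Because both register banks are contiguous, the offsets in the table can be computed from the counter index instead of being listed one by one. A small sketch, with hypothetical helper and macro names:

```c
#include <assert.h>
#include <stdint.h>

/* Base offsets from the table; register n sits at base + n. */
#define PMC_BASE          0xc1u
#define PERFEVTSEL_BASE   0x186u
#define MAX_PROG_COUNTERS 8u /* only 4 per logical core with hyperthreading on */

/* Offset of the counter register IA32_PMCn. */
uint32_t pmc_addr(unsigned n) {
    assert(n < MAX_PROG_COUNTERS);
    return PMC_BASE + n;
}

/* Offset of the matching control register IA32_PERFEVTSELn. */
uint32_t perfevtsel_addr(unsigned n) {
    assert(n < MAX_PROG_COUNTERS);
    return PERFEVTSEL_BASE + n;
}
```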

Resetting the PMU

When tracing application code, it’s necessary to reset the PMU state beforehand so that leftover configuration and stale counts don’t skew the measurement.

Disable Counters

Writing 0 to the IA32_PERF_GLOBAL_CTRL (offset 0x38f) and IA32_FIXED_CTR_CTRL registers disables the counters so that they don’t immediately resume counting after being cleared.

Clear Counters

After disabling, writing 0 to the IA32_FIXED_CTR{0-2} and IA32_PMC{0-3} registers will clear the fixed and programmable counters respectively.

Enable Counters

Writing a bitmask to the IA32_PERF_GLOBAL_CTRL register enables the counters: bits 32-34 enable the fixed counters and bits 0-3 enable the programmable counters. The value 0x70000000f enables all 3 fixed counters and all 4 programmable counters. Each fixed counter also needs its user bit set in IA32_FIXED_CTR_CTRL. Every fixed counter has a 4-bit field in that register, and bit 1 of the field enables counting in ring 3, so the value 0x222 sets the user bit for all three.
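The disable, clear, enable steps can be sketched as an ordered list of MSR writes. This assumes 3 fixed and 4 programmable counters, and the value 0x222 for IA32_FIXED_CTR_CTRL is an assumption derived from the register layout (the user bit is bit 1 of each counter's 4-bit field). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct msr_op {
    uint32_t reg; /* MSR offset */
    uint64_t val; /* value to write */
};

enum {
    MSR_PMC0             = 0xc1,
    MSR_FIXED_CTR0       = 0x309,
    MSR_FIXED_CTR_CTRL   = 0x38d,
    MSR_PERF_GLOBAL_CTRL = 0x38f,
    RESET_SEQ_LEN        = 11,
};

/* Fill out[] with the writes for disable -> clear -> enable.
 * Returns the number of entries written. */
size_t pmu_reset_sequence(struct msr_op out[RESET_SEQ_LEN]) {
    size_t n = 0;

    /* 1. Disable, so the counters stay at zero once cleared. */
    out[n++] = (struct msr_op){MSR_PERF_GLOBAL_CTRL, 0};
    out[n++] = (struct msr_op){MSR_FIXED_CTR_CTRL, 0};

    /* 2. Clear the fixed and programmable counters. */
    for (uint32_t i = 0; i < 3; i++)
        out[n++] = (struct msr_op){MSR_FIXED_CTR0 + i, 0};
    for (uint32_t i = 0; i < 4; i++)
        out[n++] = (struct msr_op){MSR_PMC0 + i, 0};

    /* 3. Enable: user bit per fixed counter, then the global mask last
     *    so that counting starts with the final write. */
    out[n++] = (struct msr_op){MSR_FIXED_CTR_CTRL, 0x222};
    out[n++] = (struct msr_op){MSR_PERF_GLOBAL_CTRL, 0x70000000fULL};

    assert(n == RESET_SEQ_LEN);
    return n;
}
```

Each entry can then be written in order to /dev/cpu/N/msr, using reg as the file offset.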

Configuring Events

Programmable counters are configured by writing a bitmask to the IA32_PERFEVTSELx registers, laid out as shown below. The event_select, unit_mask, and cmask fields identify the event being programmed; Intel publishes a list of events for each microarchitecture. The enable_cntrs bit must also be set to enable the counter. For tracing applications in ring 3, user_mode is the only other flag that needs to be set; the remaining flags can stay cleared.

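union counter_config {
    struct {
        uint32_t event_select:8;     /* bits 7:0, event code */
        uint32_t unit_mask:8;        /* bits 15:8, event qualifier */
        uint32_t user_mode:1;        /* bit 16, count in ring 3 */
        uint32_t os_mode:1;          /* bit 17, count in ring 0 */
        uint32_t edge_detect:1;      /* bit 18 */
        uint32_t pin_control:1;      /* bit 19 */
        uint32_t interrupt:1;        /* bit 20, interrupt on overflow */
        uint32_t any_thread:1;       /* bit 21 */
        uint32_t enable_cntrs:1;     /* bit 22, enable this counter */
        uint32_t invert_cntr_mask:1; /* bit 23 */
        uint32_t cmask:8;            /* bits 31:24, counter mask */
        uint32_t reserved:32;        /* bits 63:32 */
    };
    uint64_t val;
};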

Once the write completes and the counter’s bit is also set in IA32_PERF_GLOBAL_CTRL, the PMU begins counting. Reading the corresponding IA32_PMCx register gives the current count for the programmed event.
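To make the encoding concrete, here is a minimal sketch that builds an IA32_PERFEVTSELx value with shifts instead of bitfields, which avoids relying on compiler-specific bitfield ordering. The macro and function names are hypothetical. The example values assume the architectural LLC Misses event (event select 0x2e, unit mask 0x41); other events should be taken from Intel's event list for the target microarchitecture.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bits of IA32_PERFEVTSELx, matching the layout above. */
#define EVTSEL_USR       (1ull << 16)   /* count in ring 3 */
#define EVTSEL_OS        (1ull << 17)   /* count in ring 0 */
#define EVTSEL_EN        (1ull << 22)   /* enable the counter */
#define EVTSEL_FLAG_MASK (0xffull << 16) /* bits 23:16 hold all flags */

/* Combine event select, unit mask, counter mask and flags into one value. */
uint64_t evtsel_encode(uint8_t event, uint8_t umask, uint8_t cmask,
                       uint64_t flags) {
    assert((flags & ~EVTSEL_FLAG_MASK) == 0); /* flags must stay in 23:16 */
    return (uint64_t)event
         | ((uint64_t)umask << 8)
         | ((uint64_t)cmask << 24)
         | flags;
}

/* Example: user-mode LLC misses (assumed event 0x2e, unit mask 0x41). */
uint64_t evtsel_llc_misses_user(void) {
    return evtsel_encode(0x2e, 0x41, 0, EVTSEL_USR | EVTSEL_EN);
}
```

Writing the result of evtsel_llc_misses_user(), 0x41412e, to IA32_PERFEVTSEL0 at 0x186 would program the first counter, whose count can then be read from IA32_PMC0 at 0xc1.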

References

  1. Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B