PMC.CORE(3) FreeBSD Library Functions Manual PMC.CORE(3)
NAME
pmc.core - measurement events for Intel Core Solo and Core Duo family
CPUs
LIBRARY
Performance Counters Library (libpmc, -lpmc)
SYNOPSIS
#include <pmc.h>
DESCRIPTION
Intel Core Solo and Core Duo CPUs contain PMCs conforming to version 1 of
the Intel performance measurement architecture.
These PMCs are documented in Volume 3: System Programming Guide, IA-32
Intel(R) Architecture Software Developer's Manual, Order Number
253669-027US, Intel Corporation, July 2008.
PMC Features
CPUs conforming to version 1 of the Intel performance measurement
architecture contain two programmable PMCs of class PMC_CLASS_IAP. The
PMCs are 40 bits wide and offer the following capabilities:
Capability            Support
PMC_CAP_CASCADE       No
PMC_CAP_EDGE          Yes
PMC_CAP_INTERRUPT     Yes
PMC_CAP_INVERT        Yes
PMC_CAP_READ          Yes
PMC_CAP_PRECISE       No
PMC_CAP_SYSTEM        Yes
PMC_CAP_TAGGING       No
PMC_CAP_THRESHOLD     Yes
PMC_CAP_USER          Yes
PMC_CAP_WRITE         Yes
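Because the counters are 40 bits wide, a raw reading wraps to zero after 2^40 events. The following is a minimal sketch of how a difference between two raw readings could be computed under that width. The helper name core_pmc_delta is hypothetical and not part of libpmc, and the sketch assumes the counter wrapped at most once between the two readings.

```c
#include <assert.h>
#include <stdint.h>

/* Counter width stated in this manual page for these PMCs. */
#define CORE_PMC_WIDTH	40
#define CORE_PMC_MASK	((UINT64_C(1) << CORE_PMC_WIDTH) - 1)

/*
 * Hypothetical helper (not part of libpmc): number of events between two
 * raw readings of a 40-bit counter, assuming at most one wrap-around.
 */
static uint64_t
core_pmc_delta(uint64_t before, uint64_t after)
{
	/* Raw readings must fit in the counter width. */
	assert(before <= CORE_PMC_MASK && after <= CORE_PMC_MASK);
	/* Unsigned subtraction modulo 2^40 handles a single wrap. */
	return ((after - before) & CORE_PMC_MASK);
}
```

For example, a reading of CORE_PMC_MASK followed by a reading of 4 corresponds to 5 events under these assumptions.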
Event Qualifiers
Event specifiers for these PMCs support the following common qualifiers:
cmask=value
Configure the PMC to increment only if the number of configured
events measured in a cycle is greater than or equal to value.
edge Configure the PMC to count the number of de-asserted to asserted
transitions of the conditions expressed by the other qualifiers.
If specified, the counter will increment only once whenever a
condition becomes true, irrespective of the number of clocks
during which the condition remains true.
inv Invert the sense of comparison when the "cmask" qualifier is
present, making the counter increment when the number of events
per cycle is less than the value specified by the "cmask"
qualifier.
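os Configure the PMC to count events happening at processor
privilege level 0.
usr Configure the PMC to count events occurring at privilege levels
1, 2 or 3.
If neither of the "os" or "usr" qualifiers are specified, the default is
to enable both.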
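The interaction of the "cmask", "inv" and "edge" qualifiers described above can be illustrated with a small software model. This is only a sketch of the documented behavior, not libpmc or hardware code: the function names are hypothetical, and treating cmask=0 as "no threshold, count every event" (with "inv" ignored in that case) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: amount the counter advances in one cycle in which
 * `events` occurrences of the configured event were observed.
 */
static unsigned
cycle_increment(unsigned events, unsigned cmask, bool inv)
{
	bool hit;

	if (cmask == 0)		/* assumption: no threshold configured */
		return (events);
	/* "inv" flips the comparison from >= cmask to < cmask. */
	hit = inv ? (events < cmask) : (events >= cmask);
	return (hit ? 1 : 0);
}

/*
 * Hypothetical model: total count over a per-cycle trace.  With `edge`
 * set, only false-to-true transitions of the condition are counted.
 */
static uint64_t
count_trace(const unsigned *events, size_t n, unsigned cmask, bool inv,
    bool edge)
{
	uint64_t total = 0;
	bool prev = false;

	assert(events != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		unsigned inc = cycle_increment(events[i], cmask, inv);
		bool cond = (inc != 0);

		if (edge)
			total += (cond && !prev) ? 1 : 0;
		else
			total += inc;
		prev = cond;
	}
	return (total);
}
```

In this model a trace of per-cycle event counts 0, 2, 3, 3, 0, 1, 4 with cmask=2 counts 4 cycles, 2 when "edge" is added, and 3 when "inv" is used without "edge".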
Events that require a core qualifier to be specified use an additional
qualifier "core=value", where argument value is one of:
all Measure event conditions on all cores.
this Measure event conditions on this core.
The default is "this".
Events that require an agent qualifier to be specified use an additional
qualifier "agent=value", where argument value is one of:
this Measure events associated with this bus agent.
any Measure events caused by any bus agent.
The default is "this".
Events that require a hardware prefetch qualifier to be specified use an
additional qualifier "prefetch=value", where argument value is one of:
both Include all prefetches.
only Only count hardware prefetches.
exclude Exclude hardware prefetches.
The default is "both".
Events that require a cache coherence qualifier to be specified use an
additional qualifier "cachestate=value", where argument value contains
one or more of the following letters:
e Count cache lines in the exclusive state.
i Count cache lines in the invalid state.
m Count cache lines in the modified state.
s Count cache lines in the shared state.
The default is "eims".
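As an illustration of the "cachestate" qualifier, the sketch below turns a string of state letters into a unit-mask value. The bit assignments (I=01H, S=02H, E=04H, M=08H) are an assumption based on the MESI unit-mask encoding in the Intel manual and are not stated in this page; the function name is hypothetical and libpmc's own parser may behave differently.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed unit-mask bits for the MESI states (not stated in this page). */
#define UMASK_I	0x01
#define UMASK_S	0x02
#define UMASK_E	0x04
#define UMASK_M	0x08

/*
 * Hypothetical parser: returns the unit-mask bits for a "cachestate"
 * value, or -1 if the value contains an unknown letter.  An empty value
 * selects the documented default "eims".
 */
static int
cachestate_to_umask(const char *value)
{
	int mask = 0;

	assert(value != NULL);
	if (*value == '\0')
		value = "eims";
	for (; *value != '\0'; value++) {
		switch (*value) {
		case 'e': mask |= UMASK_E; break;
		case 'i': mask |= UMASK_I; break;
		case 'm': mask |= UMASK_M; break;
		case 's': mask |= UMASK_S; break;
		default:
			return (-1);
		}
	}
	return (mask);
}
```

Under these assumptions the default "eims" yields 0FH, which matches the low four bits of the unit mask listed for LLC_Reference (4FH).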
Event Specifiers
The following event names are case insensitive. Whitespace, hyphens and
underscore characters in these names are ignored.
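The matching rule above (case-insensitive, with whitespace, hyphens and underscores ignored) can be sketched as a comparison function. This is an illustrative model only; event_name_equal is a hypothetical name and libpmc's internal matcher may be implemented differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Characters this manual page says are ignored inside event names. */
static bool
is_ignored(unsigned char c)
{
	return (isspace(c) || c == '-' || c == '_');
}

/*
 * Hypothetical comparison: true if two event names are equal when case,
 * whitespace, hyphens and underscores are disregarded.
 */
static bool
event_name_equal(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	for (;;) {
		/* Skip ignored characters on both sides. */
		while (is_ignored((unsigned char)*a))
			a++;
		while (is_ignored((unsigned char)*b))
			b++;
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return (false);
		if (*a == '\0')		/* both strings ended together */
			return (true);
		a++;
		b++;
	}
}
```

With this model "Br_Instr_Ret", "br-instr ret" and "BRINSTRRET" all compare equal.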
Core PMCs support the following events:
BAClears
(Event E6H, Umask 00H) The number of BAClear conditions asserted.
BTB_Misses
(Event E2H, Umask 00H) The number of branches for which the
branch target buffer did not produce a prediction.
Br_BAC_Missp_Exec
(Event 8AH, Umask 00H) The number of branch instructions executed
that were mispredicted at the front end.
Br_Bogus
(Event E4H, Umask 00H) The number of bogus branches.
Br_Call_Exec
(Event 92H, Umask 00H) The number of CALL instructions executed.
Br_Call_Missp_Exec
(Event 93H, Umask 00H) The number of CALL instructions executed
that were mispredicted.
Br_Cnd_Exec
(Event 8BH, Umask 00H) The number of conditional branch
instructions executed.
Br_Cnd_Missp_Exec
Br_Ind_Exec
(Event 8DH, Umask 00H) The number of indirect branches executed.
Br_Ind_Missp_Exec
(Event 8EH, Umask 00H) The number of indirect branch instructions
executed that were mispredicted.
Br_Inst_Exec
(Event 88H, Umask 00H) The number of branch instructions executed
including speculative branches.
Br_Instr_Decoded
(Event E0H, Umask 00H) The number of branch instructions decoded.
Br_Instr_Ret
(Event C4H, Umask 00H) (Alias "Branch Instruction Retired") The
number of branch instructions retired. This is an architectural
performance event.
Br_MisPred_Ret
(Event C5H, Umask 00H) (Alias "Branch Misses Retired") The number
of mispredicted branch instructions retired. This is an
architectural performance event.
Br_MisPred_Taken_Ret
(Event CAH, Umask 00H) The number of taken and mispredicted
branches retired.
Br_Missp_Exec
(Event 89H, Umask 00H) The number of branch instructions executed
and mispredicted at execution including branches that were not
predicted.
Br_Ret_BAC_Missp_Exec
(Event 91H, Umask 00H) The number of return branch instructions
that were mispredicted at the front end.
Br_Ret_Exec
(Event 8FH, Umask 00H) The number of return branch instructions
executed.
Br_Ret_Missp_Exec
(Event 90H, Umask 00H) The number of return branch instructions
executed that were mispredicted.
Br_Taken_Ret
(Event C9H, Umask 00H) The number of taken branches retired.
Bus_BNR_Clocks
(Event 61H, Umask 00H) The number of external bus cycles while
BNR (bus not ready) was asserted.
Bus_DRDY_Clocks [,agent=agent]
(Event 62H, Umask 00H) The number of external bus cycles while
DRDY was asserted.
Bus_Data_Rcv
(Event 64H, Umask 40H) The number of cycles during which the
processor is busy receiving data.
from the core.
Bus_Req_Outstanding [,agent=agent] [,core=core]
(Event 60H) The weighted cycles of cacheable bus data read
requests from the data cache unit or hardware prefetcher.
Bus_Snoop_Stall
(Event 7EH, Umask 00H) The number of bus cycles while a bus snoop
is stalled.
Bus_Snoops [,agent=agent] [,cachestate=mesi]
(Event 77H) The number of snoop responses to bus transactions.
Bus_Trans_Any [,agent=agent]
(Event 70H) The number of completed bus transactions.
Bus_Trans_Brd [,core=core]
(Event 65H) The number of read bus transactions.
Bus_Trans_Burst [,agent=agent]
(Event 6EH) The number of completed burst transactions. Retried
transactions may be counted more than once.
Bus_Trans_Def [,core=core]
(Event 6DH) The number of completed deferred transactions.
Bus_Trans_IO [,agent=agent] [,core=core]
(Event 6CH) The number of completed I/O transactions counting
both reads and writes.
Bus_Trans_Ifetch [,agent=agent] [,core=core]
(Event 68H) The number of completed instruction fetch
transactions.
Bus_Trans_Inval [,agent=agent] [,core=core]
(Event 69H) The number of completed invalidate transactions.
Bus_Trans_Mem [,agent=agent]
(Event 6FH) The number of completed memory transactions.
Bus_Trans_P [,agent=agent] [,core=core]
(Event 6BH) The number of completed partial transactions.
Bus_Trans_Pwr [,agent=agent] [,core=core]
(Event 6AH) The number of completed partial write transactions.
Bus_Trans_RFO [,agent=agent] [,core=core]
(Event 66H) The number of completed read-for-ownership
transactions.
Bus_Trans_WB [,agent=agent]
(Event 67H) The number of completed write-back transactions from
the data cache unit, excluding L2 write-backs.
Cycles_Div_Busy
(Event 14H, Umask 00H) The number of cycles the divider is busy.
The event is only available on PMC0.
Cycles_Int_Masked
(Event C6H, Umask 00H) The number of cycles while interrupts were
(Event 78H) The number of data cache unit snoops to L1 cache
lines in the shared state.
DCache_Cache_Lock [,cachestate=mesi]
(Event 42H) The number of cacheable locked read operations to
invalid state.
DCache_Cache_LD [,cachestate=mesi]
(Event 40H) The number of cacheable L1 data read operations.
DCache_Cache_ST [,cachestate=mesi]
(Event 41H) The number of cacheable L1 data write operations.
DCache_M_Evict
(Event 47H, Umask 00H) The number of M state data cache lines
that were evicted.
DCache_M_Repl
(Event 46H, Umask 00H) The number of M state data cache lines
that were allocated.
DCache_Pend_Miss
(Event 48H, Umask 00H) The weighted cycles an L1 miss was
outstanding.
DCache_Repl
(Event 45H, Umask 0FH) The number of data cache line
replacements.
Data_Mem_Cache_Ref
(Event 44H, Umask 02H) The number of cacheable read and write
operations to L1 data cache.
Data_Mem_Ref
(Event 43H, Umask 01H) The number of L1 data reads and writes,
both cacheable and un-cacheable.
Dbus_Busy [,core=core]
(Event 22H) The number of core cycles during which the data bus
was busy.
Dbus_Busy_Rd [,core=core]
(Event 23H) The number of cycles during which the data bus was
busy transferring data to a core.
Div (Event 13H, Umask 00H) The number of divide operations including
speculative operations for integer and floating point divides.
This event can only be counted on PMC1.
Dtlb_Miss
(Event 49H, Umask 00H) The number of data references that missed
the TLB.
ESP_Uops
(Event D7H, Umask 00H) The number of ESP folding instructions
decoded.
EST_Trans [,trans=transition]
(Event 3AH) Count the number of Intel Enhanced SpeedStep
FP_Assist
(Event 11H, Umask 00H) The number of floating point operations
that required microcode assists. The event is only available on
PMC1.
FP_Comp_Instr_Ret
(Event C1H, Umask 00H) The number of X87 floating point compute
instructions retired. The event is only available on PMC0.
FP_Comps_Op_Exe
(Event 10H, Umask 00H) The number of floating point computational
instructions executed.
FP_MMX_Trans
(Event CCH, Umask 01H) The number of transitions from X87 to MMX.
Fused_Ld_Uops_Ret
(Event DAH, Umask 01H) The number of fused load uops retired.
Fused_St_Uops_Ret
(Event DAH, Umask 02H) The number of fused store uops retired.
Fused_Uops_Ret
(Event DAH, Umask 00H) The number of fused uops retired.
HW_Int_Rx
(Event C8H, Umask 00H) The number of hardware interrupts
received.
ICache_Misses
(Event 81H, Umask 00H) The number of instruction fetch misses in
the instruction cache and streaming buffers.
ICache_Reads
(Event 80H, Umask 00H) The number of instruction fetches from the
instruction cache and streaming buffers counting both cacheable
and un-cacheable fetches.
IFU_Mem_Stall
(Event 86H, Umask 00H) The number of cycles the instruction fetch
unit was stalled while waiting for data from memory.
ILD_Stall
(Event 87H, Umask 00H) The number of instruction length decoder
stalls.
ITLB_Misses
(Event 85H, Umask 00H) The number of instruction TLB misses.
Instr_Decoded
(Event D0H, Umask 00H) The number of instructions decoded.
Instr_Ret
(Event C0H, Umask 00H) (Alias "Instruction Retired") The number
of instructions retired. This is an architectural performance
event.
L1_Pref_Req
(Event 4FH, Umask 00H) The number of L1 prefetch request due to
fetch unit from L2 cache including speculative fetches.
L2_LD [,cachestate=mesi] [,core=core]
(Event 29H) The number of L2 cache reads.
L2_Lines_In [,core=core] [,prefetch=prefetch]
(Event 24H) The number of L2 cache lines allocated.
L2_Lines_Out [,core=core] [,prefetch=prefetch]
(Event 26H) The number of L2 cache lines evicted.
L2_M_Lines_In [,core=core]
(Event 25H) The number of L2 M state cache lines allocated.
L2_M_Lines_Out [,core=core] [,prefetch=prefetch]
(Event 27H) The number of L2 M state cache lines evicted.
L2_No_Request_Cycles [,cachestate=mesi] [,core=core] [,prefetch=prefetch]
(Event 32H) The number of cycles there was no request to access
L2 cache.
L2_Reject_Cycles [,cachestate=mesi] [,core=core] [,prefetch=prefetch]
(Event 30H) The number of cycles the L2 cache was busy and
rejecting new requests.
L2_Rqsts [,cachestate=mesi] [,core=core] [,prefetch=prefetch]
(Event 2EH) The number of L2 cache requests.
L2_ST [,cachestate=mesi] [,core=core]
(Event 2AH) The number of L2 cache writes including speculative
writes.
LD_Blocks
(Event 03H, Umask 00H) The number of load operations delayed due
to store buffer blocks.
LLC_Misses
(Event 2EH, Umask 41H) The number of cache misses for references
to the last level cache, excluding misses due to hardware
prefetches. This is an architectural performance event.
LLC_Reference
(Event 2EH, Umask 4FH) The number of references to the last level
cache, excluding those due to hardware prefetches. This is an
architectural performance event.
MMX_Assist
(Event CDH, Umask 00H) The number of EMMS instructions executed.
MMX_FP_Trans
(Event CCH, Umask 00H) The number of transitions from MMX to X87.
MMX_Instr_Exec
(Event B0H, Umask 00H) The number of MMX instructions executed
excluding MOVQ and MOVD stores.
MMX_Instr_Ret
(Event CEH, Umask 00H) The number of MMX instructions retired.
available on PMC1 only.
NonHlt_Ref_Cycles
(Event 3CH, Umask 01H) (Alias "Unhalted Reference Cycles") The
number of non-halted bus cycles. This is an architectural
performance event.
Pref_Rqsts_Dn
(Event F8H, Umask 00H) The number of hardware prefetch requests
issued in backward streams.
Pref_Rqsts_Up
(Event F0H, Umask 00H) The number of hardware prefetch requests
issued in forward streams.
Resource_Stall
(Event A2H, Umask 00H) The number of cycles where there is a
resource related stall.
SD_Drains
(Event 04H, Umask 00H) The number of cycles while draining store
buffers.
SIMD_FP_DP_P_Ret
(Event D8H, Umask 02H) The number of SSE/SSE2 packed double
precision instructions retired.
SIMD_FP_DP_P_Comp_Ret
(Event D9H, Umask 02H) The number of SSE/SSE2 packed double
precision compute instructions retired.
SIMD_FP_DP_S_Ret
(Event D8H, Umask 03H) The number of SSE/SSE2 scalar double
precision instructions retired.
SIMD_FP_DP_S_Comp_Ret
(Event D9H, Umask 03H) The number of SSE/SSE2 scalar double
precision compute instructions retired.
SIMD_FP_SP_P_Comp_Ret
(Event D9H, Umask 00H) The number of SSE/SSE2 packed single
precision compute instructions retired.
SIMD_FP_SP_Ret
(Event D8H, Umask 00H) The number of SSE/SSE2 single precision
instructions retired, both packed and scalar.
SIMD_FP_SP_S_Ret
(Event D8H, Umask 01H) The number of SSE/SSE2 scalar single
precision instructions retired.
SIMD_FP_SP_S_Comp_Ret
(Event D9H, Umask 01H) The number of SSE/SSE2 scalar single
precision compute instructions retired.
SIMD_Int_128_Ret
(Event D8H, Umask 04H) The number of SSE2 128-bit integer
instructions retired.
SIMD_Int_Plog_Exec
(Event B3H, Umask 10H) The number of SIMD integer packed logical
instructions executed.
SIMD_Int_Pmul_Exec
(Event B3H, Umask 01H) The number of SIMD integer packed multiply
instructions executed.
SIMD_Int_Psft_Exec
(Event B3H, Umask 02H) The number of SIMD integer packed shift
instructions executed.
SIMD_Int_Sat_Exec
(Event B1H, Umask 00H) The number of SIMD integer saturating
instructions executed.
SIMD_Int_Upck_Exec
(Event B3H, Umask 08H) The number of SIMD integer unpack
instructions executed.
SMC_Detected
(Event C3H, Umask 00H) The number of times self-modifying code
was detected.
SSE_NTStores_Miss
(Event 4BH, Umask 03H) The number of times an SSE streaming store
instruction missed all caches.
SSE_NTStores_Ret
(Event 07H, Umask 03H) The number of SSE streaming store
instructions executed.
SSE_PrefNta_Miss
(Event 4BH, Umask 00H) The number of times PREFETCHNTA missed all
caches.
SSE_PrefNta_Ret
(Event 07H, Umask 00H) The number of PREFETCHNTA instructions
retired.
SSE_PrefT1_Miss
(Event 4BH, Umask 01H) The number of times PREFETCHT1 missed all
caches.
SSE_PrefT1_Ret
(Event 07H, Umask 01H) The number of PREFETCHT1 instructions
retired.
SSE_PrefT2_Miss
(Event 4BH, Umask 02H) The number of times PREFETCHT2 missed all
caches.
SSE_PrefT2_Ret
(Event 07H, Umask 02H) The number of PREFETCHT2 instructions
retired.
Seg_Reg_Loads
(Event 06H, Umask 00H) The number of segment register loads.
the current core clock.
Unfusion
(Event DBH, Umask 00H) The number of unfusion events.
Unhalted_Core_Cycles
(Event 3CH, Umask 00H) The number of core clock cycles when the
clock signal on a specific core is not halted. This is an
architectural performance event.
Uops_Ret
(Event C2H, Umask 00H) The number of micro-ops retired.
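The event and unit-mask codes listed above, together with the common qualifiers, correspond to fields of a hardware event-select register. The sketch below shows one way such a value could be assembled. The bit positions follow the IA32_PERFEVTSELx layout in the Intel manual and are an assumption, since this page does not state them; core_evtsel and the EVTSEL_* names are hypothetical and do not describe how libpmc or hwpmc(4) programs the hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed IA32_PERFEVTSELx layout from the Intel manual; these bit
 * positions are not stated in this manual page.
 */
#define EVTSEL_USR	(UINT32_C(1) << 16)	/* count at privilege levels 1-3 */
#define EVTSEL_OS	(UINT32_C(1) << 17)	/* count at privilege level 0 */
#define EVTSEL_EDGE	(UINT32_C(1) << 18)	/* "edge" qualifier */
#define EVTSEL_EN	(UINT32_C(1) << 22)	/* counter enable */
#define EVTSEL_INV	(UINT32_C(1) << 23)	/* "inv" qualifier */
#define EVTSEL_FLAGS	(EVTSEL_USR | EVTSEL_OS | EVTSEL_EDGE | EVTSEL_INV)

/*
 * Hypothetical helper: combine an event code, unit mask, "cmask" value
 * and qualifier flags into an enabled event-select value.
 */
static uint32_t
core_evtsel(uint8_t event, uint8_t umask, uint8_t cmask, uint32_t flags)
{
	/* Only the qualifier bits defined above may be passed in. */
	assert((flags & ~EVTSEL_FLAGS) == 0);
	return ((uint32_t)event | ((uint32_t)umask << 8) | flags |
	    EVTSEL_EN | ((uint32_t)cmask << 24));
}
```

Under these assumptions, Instr_Ret (Event C0H, Umask 00H) counted at all privilege levels encodes as 004300C0H, and LLC_Misses (Event 2EH, Umask 41H) as 0043412EH.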
Event Name Aliases
The following table shows the mapping between the PMC-independent aliases
supported by Performance Counters Library (libpmc, -lpmc) and the
underlying hardware events used.
Alias                 Event
branches              Br_Instr_Ret
branch-mispredicts    Br_MisPred_Ret
dc-misses             (unsupported)
ic-misses             ICache_Misses
instructions          Instr_Ret
interrupts            HW_Int_Rx
unhalted-cycles       (unsupported)
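The alias table can be expressed as a simple lookup. The sketch below copies the rows of the table and is illustrative only: the structure and function names are hypothetical, and libpmc resolves aliases internally rather than through this interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct core_alias {
	const char *alias;	/* PMC-independent alias */
	const char *event;	/* hardware event, NULL if unsupported */
};

/* Rows copied from the alias table in this manual page. */
static const struct core_alias core_aliases[] = {
	{ "branches",		"Br_Instr_Ret" },
	{ "branch-mispredicts",	"Br_MisPred_Ret" },
	{ "dc-misses",		NULL },
	{ "ic-misses",		"ICache_Misses" },
	{ "instructions",	"Instr_Ret" },
	{ "interrupts",		"HW_Int_Rx" },
	{ "unhalted-cycles",	NULL },
};

/*
 * Hypothetical lookup: returns the event name for an alias, or NULL if
 * the alias is unknown or listed as unsupported.
 */
static const char *
core_alias_lookup(const char *alias)
{
	size_t n = sizeof(core_aliases) / sizeof(core_aliases[0]);

	assert(alias != NULL);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(alias, core_aliases[i].alias) == 0)
			return (core_aliases[i].event);
	}
	return (NULL);
}
```

A caller that needs to tell an unknown alias from an unsupported one would have to extend this sketch, since both cases return NULL here.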
PROCESSOR ERRATA
The following errata affect performance measurement on these processors.
These errata are documented in Intel(R) Core(TM) Duo Processor and
Intel(R) Core(TM) Solo Processor on 65 nm Process, Specification Update,
Order Number 309222-017, Intel Corporation, July 2008.
AE19 Data prefetch performance monitoring events can only be enabled
on a single core.
AE25 Performance monitoring counters that count external bus events
may report incorrect values after processor power state
transitions.
AE28 Performance monitoring events for retired floating point
operations (C1H) may not be accurate.
AE29 DR3 address match on MOVD/MOVQ/MOVNTQ memory store instruction
may incorrectly increment performance monitoring count for
saturating SIMD instructions retired (Event CFH).
AE33 Hardware prefetch performance monitoring events may be counted
inaccurately.
AE36 The CPU_CLK_UNHALTED performance monitoring event (Event 3CH)
counts clocks when the processor is in the C1/C2 processor power
states.
AE39 Certain performance monitoring counters related to bus, L2 cache
and power management are inaccurate.
AE51 Performance monitoring events for retired instructions (Event
C0H) may not be accurate.
AE67 Performance monitoring event FP_ASSIST may not be accurate.
AE78 Performance monitoring event for hardware prefetch requests
(Event 4EH) and hardware prefetch request cache misses (Event
4FH) may not be accurate.
AE82 Performance monitoring event FP_MMX_TRANS_TO_MMX may not count
some transitions.
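Several of the errata above name a specific event code. A measurement tool could use a small table to warn when such an event is selected. The sketch below includes only the errata for which this page gives an event code; the names are hypothetical, and errata described without a code (such as AE19 or AE39) are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

struct core_erratum {
	unsigned event;		/* event select code */
	const char *id;		/* erratum identifier */
};

/* Event codes named explicitly in the errata list of this page. */
static const struct core_erratum core_errata[] = {
	{ 0x3C, "AE36" },	/* CPU_CLK_UNHALTED counts in C1/C2 */
	{ 0x4E, "AE78" },	/* hardware prefetch requests */
	{ 0x4F, "AE78" },	/* hardware prefetch request cache misses */
	{ 0xC0, "AE51" },	/* retired instructions */
	{ 0xC1, "AE28" },	/* retired floating point operations */
	{ 0xCF, "AE29" },	/* saturating SIMD instructions retired */
};

/*
 * Hypothetical lookup: returns the erratum identifier that names the
 * given event code, or NULL if none of the listed errata names it.
 */
static const char *
core_event_erratum(unsigned event)
{
	size_t n = sizeof(core_errata) / sizeof(core_errata[0]);

	assert(event <= 0xFF);	/* event select codes are 8 bits wide */
	for (size_t i = 0; i < n; i++) {
		if (core_errata[i].event == event)
			return (core_errata[i].id);
	}
	return (NULL);
}
```

A NULL result only means that no erratum in this sketch names the code; it does not imply that the event is unaffected.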
SEE ALSO
AUTHORS
The Performance Counters Library (libpmc, -lpmc) was written by
Joseph Koshy <jkoshy@FreeBSD.org>.
FreeBSD 14.2-RELEASE November 12, 2008 FreeBSD 14.2-RELEASE