FreeBSD manual

PMC.CORE(3) FreeBSD Library Functions Manual PMC.CORE(3)
NAME pmc.core - measurement events for Intel Core Solo and Core Duo family CPUs
LIBRARY Performance Counters Library (libpmc, -lpmc)
SYNOPSIS #include <pmc.h>
DESCRIPTION Intel Core Solo and Core Duo CPUs contain PMCs conforming to version 1 of the Intel performance measurement architecture.
These PMCs are documented in Volume 3: System Programming Guide, IA-32 Intel(R) Architecture Software Developer's Manual, Order Number 253669-027US, Intel Corporation, July 2008.
PMC Features CPUs conforming to version 1 of the Intel performance measurement architecture contain two programmable PMCs of class PMC_CLASS_IAP. The PMCs are 40 bits wide and offer the following capabilities:
      Capability           Support
      PMC_CAP_CASCADE      No
      PMC_CAP_EDGE         Yes
      PMC_CAP_INTERRUPT    Yes
      PMC_CAP_INVERT       Yes
      PMC_CAP_READ         Yes
      PMC_CAP_PRECISE      No
      PMC_CAP_SYSTEM       Yes
      PMC_CAP_TAGGING      No
      PMC_CAP_THRESHOLD    Yes
      PMC_CAP_USER         Yes
      PMC_CAP_WRITE        Yes
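Because the counters are 40 bits wide, a raw hardware value wraps after 2^40 events. The following is a minimal sketch of wrap-safe subtraction between two raw readings. The helper name pmc_delta40 is hypothetical and is not part of libpmc, and whether a caller ever sees raw 40-bit values depends on the driver; the sketch only illustrates the arithmetic and assumes at most one wrap between readings.

```c
#include <assert.h>
#include <stdint.h>

/* Counter width as stated in the text above. */
#define PMC_WIDTH_BITS	40
#define PMC_VALUE_MASK	((UINT64_C(1) << PMC_WIDTH_BITS) - 1)

/*
 * Hypothetical helper (not a libpmc function): number of events between
 * two raw readings of the same 40-bit counter, assuming the counter
 * wrapped at most once between the readings.
 */
static uint64_t
pmc_delta40(uint64_t before, uint64_t after)
{
	assert(before <= PMC_VALUE_MASK && after <= PMC_VALUE_MASK);
	/* Unsigned subtraction modulo 2^64, then reduced modulo 2^40. */
	return ((after - before) & PMC_VALUE_MASK);
}
```

For example, a reading of 2 taken after a reading of 2^40 - 1 corresponds to three events, not a negative difference.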
Event Qualifiers Event specifiers for these PMCs support the following common qualifiers:
cmask=value Configure the PMC to increment only if the number of configured events measured in a cycle is greater than or equal to value.
edge Configure the PMC to count the number of de-asserted to asserted transitions of the conditions expressed by the other qualifiers. If specified, the counter will increment only once whenever a condition becomes true, irrespective of the number of clocks during which the condition remains true.
inv Invert the sense of comparison when the "cmask" qualifier is present, making the counter increment when the number of events per cycle is less than the value specified by the "cmask" qualifier.
os Configure the PMC to count events happening at processor privilege level 0.
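The interaction of the "cmask", "inv" and "edge" qualifiers can be illustrated with a small software model. This is a sketch of the semantics as described above, not of the hardware or of libpmc; the names pmc_quals and pmc_simulate are hypothetical. It assumes that with a non-zero "cmask" the counter advances by one per qualifying cycle, and that "edge" without "cmask" treats "at least one event" as the condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the qualifiers described above. */
struct pmc_quals {
	unsigned cmask;	/* 0 means no threshold */
	bool inv;	/* invert the cmask comparison */
	bool edge;	/* count false-to-true transitions only */
};

/*
 * Count over a trace of per-cycle event counts.  events[i] is the
 * number of events observed in cycle i.
 */
static uint64_t
pmc_simulate(const unsigned *events, size_t ncycles, struct pmc_quals q)
{
	uint64_t count = 0;
	bool prev = false;	/* condition in the previous cycle */
	size_t i;

	assert(events != NULL || ncycles == 0);
	for (i = 0; i < ncycles; i++) {
		unsigned n = events[i];
		bool cond;

		if (q.cmask == 0 && !q.edge) {
			count += n;	/* plain event counting */
			continue;
		}
		if (q.cmask == 0)
			cond = (n > 0);
		else
			cond = q.inv ? (n < q.cmask) : (n >= q.cmask);
		if (q.edge) {
			if (cond && !prev)
				count++;	/* rising edge only */
		} else if (cond) {
			count++;
		}
		prev = cond;
	}
	return (count);
}
```

With the trace 0, 2, 3, 1, 0, 2 and cmask=2, the model counts three cycles; adding "edge" reduces this to two, because cycles 1 and 2 form a single continuous run.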
Events that require core-specificity to be specified use an additional qualifier "core=value", where argument value is one of:
      all     Measure event conditions on all cores.
      this    Measure event conditions on this core.
The default is "this".
Events that require an agent qualifier to be specified use an additional qualifier "agent=value", where argument value is one of:
      this    Measure events associated with this bus agent.
      any     Measure events caused by any bus agent.
The default is "this".
Events that require a hardware prefetch qualifier to be specified use an additional qualifier "prefetch=value", where argument value is one of:
      both       Include all prefetches.
      only       Only count hardware prefetches.
      exclude    Exclude hardware prefetches.
The default is "both".
Events that require a cache coherence qualifier to be specified use an additional qualifier "cachestate=value", where argument value contains one or more of the following letters:
      e    Count cache lines in the exclusive state.
      i    Count cache lines in the invalid state.
      m    Count cache lines in the modified state.
      s    Count cache lines in the shared state.
The default is "eims".
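Events listed without a fixed Umask derive their unit mask from these qualifiers. The sketch below shows one way the "core" and "cachestate" qualifiers could be folded into a unit mask. The bit assignments (I=01H, S=02H, E=04H, M=08H, this core=40H, all cores=C0H) are assumptions taken from Intel's description of these processors, not from this manual page, and the function name is hypothetical. As a consistency check, the architectural events LLC_Misses (Umask 41H) and LLC_Reference (Umask 4FH) share Event 2EH with L2_Rqsts and match "core=this" combined with "cachestate=i" and "cachestate=mesi" respectively.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed unit mask bits; not stated in this manual page. */
#define UMASK_STATE_I	0x01
#define UMASK_STATE_S	0x02
#define UMASK_STATE_E	0x04
#define UMASK_STATE_M	0x08
#define UMASK_CORE_THIS	0x40
#define UMASK_CORE_ALL	0xC0

/*
 * Hypothetical helper: translate a "cachestate" argument such as "ms"
 * into unit mask bits.  An empty string selects the documented default
 * "eims".  Returns -1 if an unknown letter is present.
 */
static int
cachestate_to_umask(const char *letters)
{
	int mask = 0;

	assert(letters != NULL);
	if (*letters == '\0')
		return (UMASK_STATE_M | UMASK_STATE_E | UMASK_STATE_S |
		    UMASK_STATE_I);
	for (; *letters != '\0'; letters++) {
		switch (*letters) {
		case 'e': mask |= UMASK_STATE_E; break;
		case 'i': mask |= UMASK_STATE_I; break;
		case 'm': mask |= UMASK_STATE_M; break;
		case 's': mask |= UMASK_STATE_S; break;
		default: return (-1);
		}
	}
	return (mask);
}
```

Under these assumptions, UMASK_CORE_THIS combined with the result for "i" gives 41H, and combined with the result for "eims" gives 4FH.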
Event Specifiers The following event names are case insensitive. Whitespace, hyphens and underscore characters in these names are ignored.
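The matching rule above can be sketched as a comparison routine that skips whitespace, hyphens and underscores and folds case. This is an illustration of the rule only; the function name pmc_name_equal is hypothetical and the code is not libpmc's actual parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Characters that the naming rule above says are ignored. */
static bool
pmc_name_skip(unsigned char c)
{
	return (isspace(c) || c == '-' || c == '_');
}

/*
 * Hypothetical helper: compare two event names, ignoring case,
 * whitespace, hyphens and underscores.
 */
static bool
pmc_name_equal(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	for (;;) {
		while (pmc_name_skip((unsigned char)*a))
			a++;
		while (pmc_name_skip((unsigned char)*b))
			b++;
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return (false);
		if (*a == '\0')
			return (true);	/* both strings ended together */
		a++;
		b++;
	}
}
```

Under this rule "Br_Instr_Ret", "br-instr ret" and "BRINSTRRET" all name the same event.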
Core PMCs support the following events:
BAClears (Event E6H, Umask 00H) The number of BAClear conditions asserted.
BTB_Misses (Event E2H, Umask 00H) The number of branches for which the branch target buffer did not produce a prediction.
Br_BAC_Missp_Exec (Event 8AH, Umask 00H) The number of branch instructions executed that were mispredicted at the front end.
Br_Bogus (Event E4H, Umask 00H) The number of bogus branches.
Br_Call_Exec (Event 92H, Umask 00H) The number of CALL instructions executed.
Br_Call_Missp_Exec (Event 93H, Umask 00H) The number of CALL instructions executed that were mispredicted.
Br_Cnd_Exec (Event 8BH, Umask 00H) The number of conditional branch instructions executed.
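Br_Cnd_Missp_Exec (Event 8CH, Umask 00H) The number of conditional branch instructions executed that were mispredicted.
Br_Ind_Exec (Event 8DH, Umask 00H) The number of indirect branches executed.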
Br_Ind_Missp_Exec (Event 8EH, Umask 00H) The number of indirect branch instructions executed that were mispredicted.
Br_Inst_Exec (Event 88H, Umask 00H) The number of branch instructions executed including speculative branches.
Br_Instr_Decoded (Event E0H, Umask 00H) The number of branch instructions decoded.
Br_Instr_Ret (Event C4H, Umask 00H) (Alias "Branch Instruction Retired") The number of branch instructions retired. This is an architectural performance event.
Br_MisPred_Ret (Event C5H, Umask 00H) (Alias "Branch Misses Retired") The number of mispredicted branch instructions retired. This is an architectural performance event.
Br_MisPred_Taken_Ret (Event CAH, Umask 00H) The number of taken and mispredicted branches retired.
Br_Missp_Exec (Event 89H, Umask 00H) The number of branch instructions executed and mispredicted at execution including branches that were not predicted.
Br_Ret_BAC_Missp_Exec (Event 91H, Umask 00H) The number of return branch instructions that were mispredicted at the front end.
Br_Ret_Exec (Event 8FH, Umask 00H) The number of return branch instructions executed.
Br_Ret_Missp_Exec (Event 90H, Umask 00H) The number of return branch instructions executed that were mispredicted.
Br_Taken_Ret (Event C9H, Umask 00H) The number of taken branches retired.
Bus_BNR_Clocks (Event 61H, Umask 00H) The number of external bus cycles while BNR (bus not ready) was asserted.
Bus_DRDY_Clocks [,agent=agent] (Event 62H, Umask 00H) The number of external bus cycles while DRDY was asserted.
Bus_Data_Rcv (Event 64H, Umask 40H) The number of cycles during which the processor is busy receiving data.
Bus_Req_Outstanding [,agent=agent] [,core=core] (Event 60H) The weighted cycles of cacheable bus data read requests from the data cache unit or hardware prefetcher.
Bus_Snoop_Stall (Event 7EH, Umask 00H) The number of bus cycles while a bus snoop is stalled.
Bus_Snoops [,agent=agent] [,cachestate=mesi] (Event 77H) The number of snoop responses to bus transactions.
Bus_Trans_Any [,agent=agent] (Event 70H) The number of completed bus transactions.
Bus_Trans_Brd [,core=core] (Event 65H) The number of read bus transactions.
Bus_Trans_Burst [,agent=agent] (Event 6EH) The number of completed burst transactions. Retried transactions may be counted more than once.
Bus_Trans_Def [,core=core] (Event 6DH) The number of completed deferred transactions.
Bus_Trans_IO [,agent=agent] [,core=core] (Event 6CH) The number of completed I/O transactions counting both reads and writes.
Bus_Trans_Ifetch [,agent=agent] [,core=core] (Event 68H) Completed instruction fetch transactions.
Bus_Trans_Inval [,agent=agent] [,core=core] (Event 69H) The number of completed invalidate transactions.
Bus_Trans_Mem [,agent=agent] (Event 6FH) The number of completed memory transactions.
Bus_Trans_P [,agent=agent] [,core=core] (Event 6BH) The number of completed partial transactions.
Bus_Trans_Pwr [,agent=agent] [,core=core] (Event 6AH) The number of completed partial write transactions.
Bus_Trans_RFO [,agent=agent] [,core=core] (Event 66H) The number of completed read-for-ownership transactions.
Bus_Trans_WB [,agent=agent] (Event 67H) The number of completed write-back transactions from the data cache unit, excluding L2 write-backs.
Cycles_Div_Busy (Event 14H, Umask 00H) The number of cycles the divider is busy. The event is only available on PMC0.
Cycles_Int_Masked (Event C6H, Umask 00H) The number of cycles while interrupts were masked.
DCU_Snoop_To_Share [,core=core] (Event 78H) The number of data cache unit snoops to L1 cache lines in the shared state.
DCache_Cache_Lock [,cachestate=mesi] (Event 42H) The number of cacheable locked read operations to invalid state.
DCache_Cache_LD [,cachestate=mesi] (Event 40H) The number of cacheable L1 data read operations.
DCache_Cache_ST [,cachestate=mesi] (Event 41H) The number of cacheable L1 data write operations.
DCache_M_Evict (Event 47H, Umask 00H) The number of M state data cache lines that were evicted.
DCache_M_Repl (Event 46H, Umask 00H) The number of M state data cache lines that were allocated.
DCache_Pend_Miss (Event 48H, Umask 00H) The weighted cycles an L1 miss was outstanding.
DCache_Repl (Event 45H, Umask 0FH) The number of data cache line replacements.
Data_Mem_Cache_Ref (Event 44H, Umask 02H) The number of cacheable read and write operations to L1 data cache.
Data_Mem_Ref (Event 43H, Umask 01H) The number of L1 data reads and writes, both cacheable and un-cacheable.
Dbus_Busy [,core=core] (Event 22H) The number of core cycles during which the data bus was busy.
Dbus_Busy_Rd [,core=core] (Event 23H) The number of cycles during which the data bus was busy transferring data to a core.
Div (Event 13H, Umask 00H) The number of divide operations including speculative operations for integer and floating point divides. This event can only be counted on PMC1.
Dtlb_Miss (Event 49H, Umask 00H) The number of data references that missed the TLB.
ESP_Uops (Event D7H, Umask 00H) The number of ESP folding instructions decoded.
EST_Trans [,trans=transition] (Event 3AH) Count the number of Intel Enhanced SpeedStep transitions.
FP_Assist (Event 11H, Umask 00H) The number of floating point operations that required microcode assists. The event is only available on PMC1.
FP_Comp_Instr_Ret (Event C1H, Umask 00H) The number of X87 floating point compute instructions retired. The event is only available on PMC0.
FP_Comps_Op_Exe (Event 10H, Umask 00H) The number of floating point computational instructions executed.
FP_MMX_Trans (Event CCH, Umask 01H) The number of transitions from X87 to MMX.
Fused_Ld_Uops_Ret (Event DAH, Umask 01H) The number of fused load uops retired.
Fused_St_Uops_Ret (Event DAH, Umask 02H) The number of fused store uops retired.
Fused_Uops_Ret (Event DAH, Umask 00H) The number of fused uops retired.
HW_Int_Rx (Event C8H, Umask 00H) The number of hardware interrupts received.
ICache_Misses (Event 81H, Umask 00H) The number of instruction fetch misses in the instruction cache and streaming buffers.
ICache_Reads (Event 80H, Umask 00H) The number of instruction fetches from the instruction cache and streaming buffers counting both cacheable and un-cacheable fetches.
IFU_Mem_Stall (Event 86H, Umask 00H) The number of cycles the instruction fetch unit was stalled while waiting for data from memory.
ILD_Stall (Event 87H, Umask 00H) The number of instruction length decoder stalls.
ITLB_Misses (Event 85H, Umask 00H) The number of instruction TLB misses.
Instr_Decoded (Event D0H, Umask 00H) The number of instructions decoded.
Instr_Ret (Event C0H, Umask 00H) (Alias "Instruction Retired") The number of instructions retired. This is an architectural performance event.
L1_Pref_Req (Event 4FH, Umask 00H) The number of L1 prefetch requests due to fetch unit from L2 cache, including speculative fetches.
L2_LD [,cachestate=mesi] [,core=core] (Event 29H) The number of L2 cache reads.
L2_Lines_In [,core=core] [,prefetch=prefetch] (Event 24H) The number of L2 cache lines allocated.
L2_Lines_Out [,core=core] [,prefetch=prefetch] (Event 26H) The number of L2 cache lines evicted.
L2_M_Lines_In [,core=core] (Event 25H) The number of L2 M state cache lines allocated.
L2_M_Lines_Out [,core=core] [,prefetch=prefetch] (Event 27H) The number of L2 M state cache lines evicted.
L2_No_Request_Cycles [,cachestate=mesi] [,core=core] [,prefetch=prefetch] (Event 32H) The number of cycles there was no request to access L2 cache.
L2_Reject_Cycles [,cachestate=mesi] [,core=core] [,prefetch=prefetch] (Event 30H) The number of cycles the L2 cache was busy and rejecting new requests.
L2_Rqsts [,cachestate=mesi] [,core=core] [,prefetch=prefetch] (Event 2EH) The number of L2 cache requests.
L2_ST [,cachestate=mesi] [,core=core] (Event 2AH) The number of L2 cache writes including speculative writes.
LD_Blocks (Event 03H, Umask 00H) The number of load operations delayed due to store buffer blocks.
LLC_Misses (Event 2EH, Umask 41H) The number of cache misses for references to the last level cache, excluding misses due to hardware prefetches. This is an architectural performance event.
LLC_Reference (Event 2EH, Umask 4FH) The number of references to the last level cache, excluding those due to hardware prefetches. This is an architectural performance event.
MMX_Assist (Event CDH, Umask 00H) The number of EMMS instructions executed.
MMX_FP_Trans (Event CCH, Umask 00H) The number of transitions from MMX to X87.
MMX_Instr_Exec (Event B0H, Umask 00H) The number of MMX instructions executed excluding MOVQ and MOVD stores.
MMX_Instr_Ret (Event CEH, Umask 00H) The number of MMX instructions retired.
NonHlt_Ref_Cycles (Event 3CH, Umask 01H) (Alias "Unhalted Reference Cycles") The number of non-halted bus cycles. This is an architectural performance event.
Pref_Rqsts_Dn (Event F8H, Umask 00H) The number of hardware prefetch requests issued in backward streams.
Pref_Rqsts_Up (Event F0H, Umask 00H) The number of hardware prefetch requests issued in forward streams.
Resource_Stall (Event A2H, Umask 00H) The number of cycles where there is a resource related stall.
SD_Drains (Event 04H, Umask 00H) The number of cycles while draining store buffers.
SIMD_FP_DP_P_Ret (Event D8H, Umask 02H) The number of SSE/SSE2 packed double precision instructions retired.
SIMD_FP_DP_P_Comp_Ret (Event D9H, Umask 02H) The number of SSE/SSE2 packed double precision compute instructions retired.
SIMD_FP_DP_S_Ret (Event D8H, Umask 03H) The number of SSE/SSE2 scalar double precision instructions retired.
SIMD_FP_DP_S_Comp_Ret (Event D9H, Umask 03H) The number of SSE/SSE2 scalar double precision compute instructions retired.
SIMD_FP_SP_P_Comp_Ret (Event D9H, Umask 00H) The number of SSE/SSE2 packed single precision compute instructions retired.
SIMD_FP_SP_Ret (Event D8H, Umask 00H) The number of SSE/SSE2 single precision instructions retired, both packed and scalar.
SIMD_FP_SP_S_Ret (Event D8H, Umask 01H) The number of SSE/SSE2 scalar single precision instructions retired.
SIMD_FP_SP_S_Comp_Ret (Event D9H, Umask 01H) The number of SSE/SSE2 scalar single precision compute instructions retired.
SIMD_Int_128_Ret (Event D8H, Umask 04H) The number of SSE2 128-bit integer instructions retired.

SIMD_Int_Plog_Exec (Event B3H, Umask 10H) The number of SIMD integer packed logical instructions executed.
SIMD_Int_Pmul_Exec (Event B3H, Umask 01H) The number of SIMD integer packed multiply instructions executed.
SIMD_Int_Psft_Exec (Event B3H, Umask 02H) The number of SIMD integer packed shift instructions executed.
SIMD_Int_Sat_Exec (Event B1H, Umask 00H) The number of SIMD integer saturating instructions executed.
SIMD_Int_Upck_Exec (Event B3H, Umask 08H) The number of SIMD integer unpack instructions executed.
SMC_Detected (Event C3H, Umask 00H) The number of times self-modifying code was detected.
SSE_NTStores_Miss (Event 4BH, Umask 03H) The number of times an SSE streaming store instruction missed all caches.
SSE_NTStores_Ret (Event 07H, Umask 03H) The number of SSE streaming store instructions executed.
SSE_PrefNta_Miss (Event 4BH, Umask 00H) The number of times PREFETCHNTA missed all caches.
SSE_PrefNta_Ret (Event 07H, Umask 00H) The number of PREFETCHNTA instructions retired.
SSE_PrefT1_Miss (Event 4BH, Umask 01H) The number of times PREFETCHT1 missed all caches.
SSE_PrefT1_Ret (Event 07H, Umask 01H) The number of PREFETCHT1 instructions retired.
SSE_PrefT2_Miss (Event 4BH, Umask 02H) The number of times PREFETCHT2 missed all caches.
SSE_PrefT2_Ret (Event 07H, Umask 02H) The number of PREFETCHT2 instructions retired.
Seg_Reg_Loads (Event 06H, Umask 00H) The number of segment register loads.
Unfusion (Event DBH, Umask 00H) The number of unfusion events.
Unhalted_Core_Cycles (Event 3CH, Umask 00H) The number of core clock cycles when the clock signal on a specific core is not halted. This is an architectural performance event.
Uops_Ret (Event C2H, Umask 00H) The number of micro-ops retired.
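The "(Event xxH, Umask yyH)" notation used throughout this list gives the two byte-wide fields that select an event in hardware. As a sketch, the helpers below parse the trailing-H hexadecimal notation and combine the two fields into the low 16 bits of an event-select value, with the event code in bits 0-7 and the unit mask in bits 8-15. That layout is taken from Intel's documentation of the event-select registers rather than from this page, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a byte written as in this page, for
 * example "2EH".  Returns the value, or -1 if the string is malformed
 * or does not fit in eight bits.
 */
static int
parse_hex_h(const char *s)
{
	char *end;
	unsigned long v;

	assert(s != NULL);
	v = strtoul(s, &end, 16);
	if (end == s || (*end != 'H' && *end != 'h') || end[1] != '\0' ||
	    v > 0xFF)
		return (-1);
	return ((int)v);
}

/* Event code in bits 0-7, unit mask in bits 8-15 (assumed layout). */
static uint32_t
raw_event_select(int event, int umask)
{
	assert(event >= 0 && event <= 0xFF && umask >= 0 && umask <= 0xFF);
	return ((uint32_t)event | ((uint32_t)umask << 8));
}
```

For LLC_Misses (Event 2EH, Umask 41H) this yields 412EH.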
Event Name Aliases The following table shows the mapping between the PMC-independent aliases supported by Performance Counters Library (libpmc, -lpmc) and the underlying hardware events used.
      Alias                Event
      branches             Br_Instr_Ret
      branch-mispredicts   Br_MisPred_Ret
      dc-misses            (unsupported)
      ic-misses            ICache_Misses
      instructions         Instr_Ret
      interrupts           HW_Int_Rx
      unhalted-cycles      (unsupported)
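The alias table can be represented as a simple lookup structure. The following sketch copies the mappings listed above; the type and function names are hypothetical and do not reflect how libpmc stores its aliases. An alias that is listed as unsupported is represented by a NULL event name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical representation of one row of the alias table. */
struct core_alias {
	const char *alias;
	const char *event;	/* NULL if unsupported on these CPUs */
};

static const struct core_alias core_aliases[] = {
	{ "branches",		"Br_Instr_Ret" },
	{ "branch-mispredicts",	"Br_MisPred_Ret" },
	{ "dc-misses",		NULL },
	{ "ic-misses",		"ICache_Misses" },
	{ "instructions",	"Instr_Ret" },
	{ "interrupts",		"HW_Int_Rx" },
	{ "unhalted-cycles",	NULL },
};

/* Return the table row for an alias, or NULL if the alias is unknown. */
static const struct core_alias *
core_alias_lookup(const char *alias)
{
	size_t i;

	assert(alias != NULL);
	for (i = 0; i < sizeof(core_aliases) / sizeof(core_aliases[0]); i++)
		if (strcmp(core_aliases[i].alias, alias) == 0)
			return (&core_aliases[i]);
	return (NULL);
}
```

A caller can then distinguish an unknown alias (NULL row) from a known alias that has no underlying event on these processors (row with a NULL event).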
PROCESSOR ERRATA The following errata affect performance measurement on these processors. These errata are documented in Intel(R) Core(TM) Duo Processor and Intel(R) Core(TM) Solo Processor on 65 nm Process, Specification Update, Order Number 309222-017, Intel Corporation, July 2008.
      AE19    Data prefetch performance monitoring events can only be enabled on a single core.
      AE25    Performance monitoring counters that count external bus events may report incorrect values after processor power state transitions.
      AE28    Performance monitoring events for retired floating point operations (C1H) may not be accurate.
      AE29    DR3 address match on MOVD/MOVQ/MOVNTQ memory store instruction may incorrectly increment performance monitoring count for saturating SIMD instructions retired (Event CFH).
      AE33    Hardware prefetch performance monitoring events may be counted inaccurately.
      AE36    The CPU_CLK_UNHALTED performance monitoring event (Event 3CH) counts clocks when the processor is in the C1/C2 processor power states.
      AE39    Certain performance monitoring counters related to bus, L2 cache and power management are inaccurate.
      AE51    Performance monitoring events for retired instructions (Event C0H) may not be accurate.
      AE67    Performance monitoring event FP_ASSIST may not be accurate.
      AE78    Performance monitoring event for hardware prefetch requests (Event 4EH) and hardware prefetch request cache misses (Event 4FH) may not be accurate.
      AE82    Performance monitoring event FP_MMX_TRANS_TO_MMX may not count some transitions.
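Several of these errata name a specific event code, so a measurement tool could warn when such an event is selected. The sketch below encodes only the errata above that cite an event code explicitly; the table and function names are hypothetical, and errata described without an event code (such as AE25 or AE39) are deliberately left out rather than guessed at.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: event codes named explicitly in the errata list. */
struct core_erratum {
	unsigned event;
	const char *id;
};

static const struct core_erratum core_errata[] = {
	{ 0x3C, "AE36" },	/* CPU_CLK_UNHALTED counts in C1/C2 */
	{ 0x4E, "AE78" },	/* hardware prefetch requests */
	{ 0x4F, "AE78" },	/* hardware prefetch request cache misses */
	{ 0xC0, "AE51" },	/* retired instructions */
	{ 0xC1, "AE28" },	/* retired floating point operations */
	{ 0xCF, "AE29" },	/* saturating SIMD instructions retired */
};

/* Return the erratum identifier for an event code, or NULL if none. */
static const char *
core_erratum_for_event(unsigned event)
{
	size_t i;

	for (i = 0; i < sizeof(core_errata) / sizeof(core_errata[0]); i++)
		if (core_errata[i].event == event)
			return (core_errata[i].id);
	return (NULL);
}
```

A NULL result means only that no erratum in this list names the event code; it does not establish that the event is unaffected.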
AUTHORS The Performance Counters Library (libpmc, -lpmc) was written by Joseph Koshy <jkoshy@FreeBSD.org>.
FreeBSD 14.2-RELEASE November 12, 2008 FreeBSD 14.2-RELEASE