PMC.CMN-600(3) FreeBSD Library Functions Manual PMC.CMN-600(3)
NAME pmc.cmn-600 - Library for accessing the Arm CoreLink CMN-600 Coherent Mesh Network Controller performance counter events
LIBRARY Performance Counters Library (libpmc, -lpmc)
SYNOPSIS #include <pmc.h>
DESCRIPTION CMN-600 PMU counters may be configured to count any one of a defined set of hardware events. Unlike other performance counters, CMN-600 counters cannot be set up without the ID of the node to monitor.
Node ID information can currently be obtained in one of two ways. The first is to enable verbose boot messages by setting the debug.bootverbose sysctl to 1 and then loading the hwpmc(4) KLD module. The cmn600 module is loaded automatically as a dependency:
      $ sysctl debug.bootverbose=1
      $ kldload hwpmc
The second is to use sysctl to trigger a dump of the node tree to the system console:
      $ sysctl dev.cmn600.0.dump_nodes=1
On some dual-socket machines, the BIOS provides no NUMA domain information in ACPI. In such cases, to get more accurate event statistics, set the kernel environment variable hint.cmn600.{unit}.domain={value}, where {unit} is a cmn600 device unit number and {value} is the NUMA domain of the CPU package containing that CMN-600 controller. Then reload the modules so the hints take effect. Example:
      $ kenv hint.cmn600.0.domain=0
      $ kenv hint.cmn600.1.domain=1
      $ kldunload hwpmc cmn600
      $ kldload hwpmc
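The hint name follows a fixed pattern, so it can be generated for any unit and domain. The following is a minimal sketch. cmn600_domain_hint() is a hypothetical helper, not part of libpmc, and it only formats the string that would be passed to kenv(1).

```c
#include <stdio.h>

/*
 * Format the kenv(1) hint that assigns a NUMA domain to one cmn600 unit.
 * Returns the number of characters written (excluding the NUL), or -1 if
 * an argument is negative or the buffer is too small.
 * Hypothetical helper for illustration; not part of libpmc.
 */
static int
cmn600_domain_hint(char *buf, size_t len, int unit, int domain)
{
	int n;

	if (unit < 0 || domain < 0)
		return (-1);
	n = snprintf(buf, len, "hint.cmn600.%d.domain=%d", unit, domain);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	return (n);
}
```

Calling cmn600_domain_hint(buf, sizeof(buf), 1, 1) fills buf with the second hint from the example above.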
Arm CoreLink CMN-600 Coherent Mesh Network Controller performance counters are documented in Arm CoreLink CMN-600 Coherent Mesh Network Technical Reference Manual, Revision r3p2, ARM Limited, 2020.
PMC Capabilities CMN-600 PMU counters support the following capabilities:
      Capability           Support
      PMC_CAP_CASCADE      No
      PMC_CAP_EDGE         No
      PMC_CAP_INTERRUPT    Yes
      PMC_CAP_INVERT       No
      PMC_CAP_READ         Yes
      PMC_CAP_PRECISE      No
      PMC_CAP_SYSTEM       Yes
      PMC_CAP_TAGGING      No
      PMC_CAP_THRESHOLD    Yes
      PMC_CAP_USER         No
      PMC_CAP_WRITE        Yes
Event Qualifiers
xpport=port Count only events matched by port. (East, West, North, South, devport0, devport1 or numeric 0 to 5)
xpchannel=channel Filter events by XP node channel. (REQ, RSP, SNP, DAT or 0, 1, 2, 3)
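Both qualifiers accept either a symbolic name or a number. The sketch below shows one way to normalize a value to its numeric form. It assumes the numeric values follow the order in which the names are listed above (East is 0, devport1 is 5, REQ is 0, DAT is 3). That mapping, the case-insensitive matching, and all function names are illustrative assumptions, not libpmc's parser.

```c
#include <stdlib.h>
#include <strings.h>

/* Hypothetical lookup tables; the array index is the assumed numeric value. */
static const char *const xp_ports[] = {
	"East", "West", "North", "South", "devport0", "devport1"
};
static const char *const xp_channels[] = { "REQ", "RSP", "SNP", "DAT" };

/*
 * Return the index of 'val' in 'names' (matched case-insensitively), or
 * parse 'val' as a decimal number in [0, n). Returns -1 on failure.
 */
static int
lookup(const char *val, const char *const *names, int n)
{
	char *end;
	long num;
	int i;

	for (i = 0; i < n; i++)
		if (strcasecmp(val, names[i]) == 0)
			return (i);
	num = strtol(val, &end, 10);
	if (end == val || *end != '\0' || num < 0 || num >= n)
		return (-1);
	return ((int)num);
}

/* Normalize an xpport= value to 0..5, or -1 if it is not valid. */
static int
xpport_value(const char *val)
{
	return (lookup(val, xp_ports, 6));
}

/* Normalize an xpchannel= value to 0..3, or -1 if it is not valid. */
static int
xpchannel_value(const char *val)
{
	return (lookup(val, xp_channels, 4));
}
```

Under these assumptions, xpport_value("North") and xpport_value("2") yield the same result, and out-of-range input such as "6" is rejected.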
Class Name Prefix These PMCs are named using a class name prefix of "CMN600_PMU_".
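As an illustration of how the prefix and the qualifiers fit together, the sketch below assembles an event specifier string. It assumes the comma-separated "name,qualifier=value" form that libpmc event specifiers generally use. cmn600_event_spec() is a hypothetical helper, and it does not add the node ID that a real CMN-600 counter also needs, because the spelling of that qualifier is not given here.

```c
#include <stdio.h>

#define CMN600_PREFIX "CMN600_PMU_"

/*
 * Build "<prefix><event>[,xpport=<port>][,xpchannel=<channel>]".
 * 'port' and 'channel' may be NULL to omit the qualifier.
 * Returns 0 on success, -1 if the buffer is too small.
 * Hypothetical helper for illustration; not part of libpmc.
 */
static int
cmn600_event_spec(char *buf, size_t len, const char *event,
    const char *port, const char *channel)
{
	int n;

	n = snprintf(buf, len, "%s%s%s%s%s%s", CMN600_PREFIX, event,
	    port != NULL ? ",xpport=" : "", port != NULL ? port : "",
	    channel != NULL ? ",xpchannel=" : "",
	    channel != NULL ? channel : "");
	if (n < 0 || (size_t)n >= len)
		return (-1);
	return (0);
}
```

For example, passing "xp_txflit_valid", "East" and "REQ" produces the string CMN600_PMU_xp_txflit_valid,xpport=East,xpchannel=REQ.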
Event Specifiers The following PMC events are available:
DVM node events dn_rxreq_dvmop Number of DVMOP requests received. This includes all sub-types: TLB invalidate, branch predictor invalidate, and instruction cache (physical and virtual) invalidate.
dn_rxreq_dvmsync Number of DVM Sync requests received.
dn_rxreq_dvmop_vmid_filtered Number of incoming DVMOP requests that are subject to VMID based filtering. This is a measure of the effectiveness of VMID based filtering and potential reduction in DVM snoops.
dn_rxreq_retried Number of incoming requests that are retried. This is a measure of the retry rate.
dn_rxreq_trk_occupancy Counts the tracker occupancy in DN. Supports the "occupancy" argument, which accepts All, dvmop, or dvmsync.
dn_rxreq_tlbi_dvmop Number of DVMOP TLB invalidate requests received.
dn_rxreq_bpi_dvmop Number of DVMOP Branch predictor invalidate requests received.
dn_rxreq_pici_dvmop Number of DVMOP physical instruction cache invalidate requests received.
dn_rxreq_vivi_dvmop Number of DVMOP virtual instruction cache invalidate requests received.
dn_rxreq_dvmop_other_filtered Number of DVM op requests to RNDs, BPI or PICI/VICI, that were filtered
dn_rxreq_snp_sent Number of SNPs sent to RNs
dn_rxreq_snp_stalled
HN-F node events hnf_cache_miss Counts total cache misses in first lookup result (high priority)
hnf_slc_sf_cache_access Counts number of cache accesses in first access (high priority)
hnf_cache_fill Counts total allocations in HN SLC (all cache line allocations to SLC)
hnf_pocq_retry Counts number of retried requests
hnf_pocq_reqs_recvd Counts number of requests received by HN
hnf_sf_hit Counts number of SF hits
hnf_sf_evictions Counts number of SF eviction cache invalidations initiated
hnf_dir_snoops_sent Counts number of directed snoops sent (not including SF back invalidation)
hnf_brd_snoops_sent Counts number of multicast snoops sent (not including SF back invalidation)
hnf_slc_eviction Counts number of SLC evictions (dirty only)
hnf_slc_fill_invalid_way Counts number of SLC fills to an invalid way
hnf_mc_retries Counts number of retried transactions by the MC
hnf_mc_reqs Counts number of requests sent to MC
hnf_qos_hh_retry Counts number of times a HighHigh priority request is protocol-retried at the HN-F
hnf_qos_pocq Counts the POCQ occupancy in HN-F. Supports the "occupancy" argument, which accepts All, Read, Write, Atomic, or Stash. Default: All.
hnf_pocq_addrhaz Counts number of POCQ address hazards upon allocation
hnf_pocq_atomic_addrhaz Counts number of POCQ address hazards upon allocation for atomic operations
hnf_ld_st_swp_adq_full Counts number of times ADQ is full for Ld/St/SWP type atomic operations while POCQ has pending operations
hnf_txrsp_stall Counts number of times HN-F has a pending TXRSP flit but no credits to upload
hnf_seq_full Counts number of times requests are replayed in SLC pipe due to SEQ being full
hnf_seq_hit Counts number of times a request in SLC hit a pending SF eviction in SEQ
hnf_snp_sent Counts number of snoops sent including directed, multicast, and SF back invalidation
hnf_sfbi_dir_snp_sent Counts number of times directed snoops were sent due to SF back invalidation
hnf_sfbi_brd_snp_sent Counts number of times multicast snoops were sent due to SF back invalidation
hnf_snp_sent_untrk Counts number of times snoops were sent due to untracked RN-Fs
hnf_intv_dirty Counts number of times SF back invalidation resulted in dirty line intervention from the RN
hnf_stash_snp_sent Counts number of times stash snoops were sent
hnf_stash_data_pull Counts number of times stash snoops resulted in data pull from the RN
hnf_snp_fwded Counts number of times data forward snoops were sent
HN-I node events hni_rrt_rd_occ_cnt_ovfl RRT read occupancy count overflow
hni_rrt_wr_occ_cnt_ovfl RRT write occupancy count overflow
hni_rdt_rd_occ_cnt_ovfl RDT read occupancy count overflow
hni_rdt_wr_occ_cnt_ovfl RDT write occupancy count overflow
hni_wdb_occ_cnt_ovfl WDB occupancy count overflow
hni_rdt_rd_alloc RDT read allocation
hni_rdt_wr_alloc RDT write allocation
hni_wdb_alloc WDB allocation
hni_txrsp_retryack RETRYACK TXRSP flit sent
hni_arvalid_no_arready ARVALID set without ARREADY event
hni_arready_no_arvalid ARREADY set without ARVALID event
hni_awvalid_no_awready AWVALID set without AWREADY event
hni_awready_no_awvalid AWREADY set without AWVALID event
hni_wvalid_no_wready WVALID set without WREADY event
hni_txdat_stall TXDAT stall (TXDAT valid but no link credit available)
hni_nonpcie_serialization Non-PCIe serialization event
hni_pcie_serialization PCIe serialization event
XP node events xp_txflit_valid Number of flits transmitted on a specified port and CHI channel. This is a measure of the flit transfer bandwidth from an XP. Note: On device ports, this event also includes link flit transfers.
xp_txflit_stall Number of cycles when a flit is stalled at an XP waiting for link credits at a specified port and CHI channel. This is a measure of the flit traffic congestion on the mesh and at the flit download ports.
xp_partial_dat_flit Number of times a partial DAT flit is uploaded onto the mesh from an RN-F_CHIA port. Partial DAT flit transmission occurs when the XP is not able to combine two 128b DAT flits and send them over the 256b DAT channel. This can happen under two circumstances:
      1. Only one 128b DAT flit is received within a transmission time window.
      2. Two 128b DAT flits are received, but they are not the two halves of a single 256b word.
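The two circumstances that lead to a partial DAT flit can be expressed as a small predicate. This is a toy model only. The struct fields, the cycle-based window, and the use of address bits to decide whether two flits are halves of the same 256b word are all assumptions made for illustration, not a description of the hardware.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One 128b DAT flit, reduced to the fields this toy model needs. */
struct dat_flit {
	uint64_t addr;	/* byte address of the 16-byte half (hypothetical) */
	uint64_t cycle;	/* arrival cycle (hypothetical) */
};

/*
 * Return true when flit 'a' would go out as a partial flit, that is, it
 * cannot be paired with 'b'. Pass b == NULL when no second flit arrived.
 */
static bool
is_partial(const struct dat_flit *a, const struct dat_flit *b,
    uint64_t window)
{
	uint64_t gap;

	/* Circumstance 1: no second flit inside the transmission window. */
	if (b == NULL)
		return (true);
	gap = b->cycle >= a->cycle ? b->cycle - a->cycle : a->cycle - b->cycle;
	if (gap > window)
		return (true);
	/* Circumstance 2: not the two halves of one 32-byte (256b) word. */
	if ((a->addr >> 5) != (b->addr >> 5))
		return (true);
	/* Same word: the halves differ only when address bit 4 differs. */
	return (((a->addr ^ b->addr) & 0x10) == 0);
}
```

In this model, flits at addresses 0x1000 and 0x1010 that arrive one cycle apart can be combined, while 0x1000 and 0x1020 cannot.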
SBSX node events CMO request
sbsx_txrsp_retryack RETRYACK TXRSP flit sent
sbsx_txdat_flitv TXDAT flit seen
sbsx_txrsp_flitv TXRSP flit seen
sbsx_rd_req_trkr_occ_cnt_ovfl Read request tracker occupancy count overflow
sbsx_wr_req_trkr_occ_cnt_ovfl Write request tracker occupancy count overflow
sbsx_cmo_req_trkr_occ_cnt_ovfl CMO request tracker occupancy count overflow
sbsx_wdb_occ_cnt_ovfl WDB occupancy count overflow
sbsx_rd_axi_trkr_occ_cnt_ovfl Read AXI pending tracker occupancy count overflow
sbsx_cmo_axi_trkr_occ_cnt_ovfl CMO AXI pending tracker occupancy count overflow
sbsx_arvalid_no_arready ARVALID set without ARREADY
sbsx_awvalid_no_awready AWVALID set without AWREADY
sbsx_wvalid_no_wready WVALID set without WREADY
sbsx_txdat_stall TXDAT stall (TXDAT valid but no link credit available)
sbsx_txrsp_stall TXRSP stall (TXRSP valid but no link credit available)
RN-D node events rnd_s0_rdata_beats Number of RData beats, RVALID and RREADY, dispatched on port 0. This is a measure of the read bandwidth, including CMO responses.
rnd_s1_rdata_beats Number of RData beats, RVALID and RREADY, dispatched on port 1. This is a measure of the read bandwidth, including CMO responses.
rnd_s2_rdata_beats Number of RData beats, RVALID and RREADY, dispatched on port 2. This is a measure of the read bandwidth, including CMO responses.
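rnd_rxdat_flits Number of RXDAT flits received. This is a measure of the true read data bandwidth, excluding CMOs.
rnd_txdat_flits Number of TXDAT flits dispatched. This is a measure of the write bandwidth.
rnd_txreq_flits_total Number of TXREQ flits dispatched. This is a measure of the total request bandwidth.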
rnd_txreq_flits_retried Number of retried TXREQ flits dispatched. This is a measure of the retry rate.
rnd_rrt_occ_ovfl All entries in the read request tracker are occupied. This is a measure of oversubscription in the read request tracker.
rnd_wrt_occ_ovfl All entries in the write request tracker are occupied. This is a measure of oversubscription in the write request tracker.
rnd_txreq_flits_replayed Number of replayed TXREQ flits. This is the measure of replay rate.
rnd_wrcancel_sent Number of write data cancels sent. This is the measure of write cancel rate.
rnd_s0_wdata_beats Number of WData beats, WVALID and WREADY, dispatched on port 0. This is a measure of write bandwidth on AXI port 0.
rnd_s1_wdata_beats Number of WData beats, WVALID and WREADY, dispatched on port 1. This is a measure of write bandwidth on AXI port 1.
rnd_s2_wdata_beats Number of WData beats, WVALID and WREADY, dispatched on port 2. This is a measure of write bandwidth on AXI port 2.
rnd_rrt_alloc Number of allocations in the read request tracker. This is a measure of read transaction count.
rnd_wrt_alloc Number of allocations in the write request tracker. This is a measure of write transaction count.
rnd_rdb_unord Number of cycles for which Read Data Buffer state machine is in Unordered Mode.
rnd_rdb_replay Number of cycles for which Read Data Buffer state machine is in Replay mode
rnd_rdb_hybrid Number of cycles for which Read Data Buffer state machine is in hybrid mode. Hybrid mode is where there is mix of ordered/unordered traffic.
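rnd_rdb_ord Number of cycles for which Read Data Buffer state machine is in ordered Mode.
RN-I node events rni_s0_rdata_beats Number of RData beats, RVALID and RREADY, dispatched on port 0. This is a measure of the read bandwidth, including CMO responses.
rni_s1_rdata_beats Number of RData beats, RVALID and RREADY, dispatched on port 1. This is a measure of the read bandwidth, including CMO responses.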
rni_s2_rdata_beats Number of RData beats, RVALID and RREADY, dispatched on port 2. This is a measure of the read bandwidth, including CMO responses.
rni_rxdat_flits Number of RXDAT flits received. This is a measure of the true read data bandwidth, excluding CMOs.
rni_txdat_flits Number of TXDAT flits dispatched. This is a measure of the write bandwidth.
rni_txreq_flits_total Number of TXREQ flits dispatched. This is a measure of the total request bandwidth.
rni_txreq_flits_retried Number of retried TXREQ flits dispatched. This is a measure of the retry rate.
rni_rrt_occ_ovfl All entries in the read request tracker are occupied. This is a measure of oversubscription in the read request tracker.
rni_wrt_occ_ovfl All entries in the write request tracker are occupied. This is a measure of oversubscription in the write request tracker.
rni_txreq_flits_replayed Number of replayed TXREQ flits. This is the measure of replay rate.
rni_wrcancel_sent Number of write data cancels sent. This is the measure of write cancel rate
rni_s0_wdata_beats Number of WData beats, WVALID and WREADY, dispatched on port 0. This is a measure of write bandwidth on AXI port 0.
rni_s1_wdata_beats Number of WData beats, WVALID and WREADY, dispatched on port 1. This is a measure of write bandwidth on AXI port 1.
rni_s2_wdata_beats Number of WData beats, WVALID and WREADY, dispatched on port 2. This is a measure of write bandwidth on AXI port 2.
rni_rrt_alloc Number of allocations in the read request tracker. This is a measure of read transaction count.
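rni_wrt_alloc Number of allocations in the write request tracker. This is a measure of write transaction count.
rni_rdb_unord Number of cycles for which Read Data Buffer state machine is in Unordered Mode.
rni_rdb_replay Number of cycles for which Read Data Buffer state machine is in Replay mode.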
rni_rdb_hybrid Number of cycles for which Read Data Buffer state machine is in hybrid mode. Hybrid mode is where there is mix of ordered/unordered traffic.
rni_rdb_ord Number of cycles for which Read Data Buffer state machine is in ordered Mode.
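Several of the events above are described as measures of a rate. As a minimal sketch of how two raw counts might be combined, the function below derives a retry rate from readings of rni_txreq_flits_retried and rni_txreq_flits_total. It assumes the total count includes the retried flits. retry_rate() is a hypothetical helper, and the same ratio could be applied to the matching rnd_ events.

```c
#include <stdint.h>

/*
 * Fraction of dispatched TXREQ flits that were retries.
 * Returns 0.0 when no flits were dispatched, to avoid dividing by zero.
 * Hypothetical helper for illustration; not part of libpmc.
 */
static double
retry_rate(uint64_t flits_retried, uint64_t flits_total)
{
	if (flits_total == 0)
		return (0.0);
	return ((double)flits_retried / (double)flits_total);
}
```

Both counters should be sampled over the same interval for the ratio to be meaningful.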
CXHA node events (Note: CXHA event descriptions are guessed)
cxha_rddatbyp Number of Read DAT Bypass
cxha_chirsp_up_stall Number of CHI RSP up Stall
cxha_chidat_up_stall Number of CHI DAT up Stall
cxha_snppcrd_lnk0_stall Number of Snoop Pcrd Stall on Link 0
cxha_snppcrd_lnk1_stall Number of Snoop Pcrd Stall on Link 1
cxha_snppcrd_lnk2_stall Number of Snoop Pcrd Stall on Link 2
cxha_reqtrk_occ Request Tracker Occupancy
cxha_rdb_occ Read Data Buffer Occupancy
cxha_rdbbyp_occ Read Data Buffer Bypass Occupancy
cxha_wdb_occ Write Data Buffer Occupancy
cxha_snptrk_occ Snoop Tracker Occupancy
cxha_sdb_occ SDB Occupancy
cxha_snphaz_occ Snoop Hazard Occupancy
CXRA node events cxra_req_trk_occ Request tracker occupancy
cxra_snp_trk_occ Snoop tracker occupancy
Snoop sink buffer occupancy
cxra_snp_bcasts Snoop broadcasts
cxra_req_chains Number of request chains formed larger than one
cxra_req_chain_avg_len Average size of request chains, only for chain sizes larger than one
cxra_chi_rsp_upload_stalls Local RA upload stalls to CHI because of contention with HA
cxra_chi_dat_upload_stalls Local RA upload stalls to CHI because of contention with HA
cxra_dat_pcrd_stalls_lnk0 Memory Data Request available, but no DAT Pcrd to send over CCIX per LinkEnd 0
cxra_dat_pcrd_stalls_lnk1 Memory Data Request available, but no DAT Pcrd to send over CCIX per LinkEnd 1
cxra_dat_pcrd_stalls_lnk2 Memory Data Request available, but no DAT Pcrd to send over CCIX per LinkEnd 2
cxra_req_pcrd_stalls_lnk0 Memory Data Request available but no Req Pcrd to send over CCIX per LinkEnd 0
cxra_req_pcrd_stalls_lnk1 Memory Data Request available but no Req Pcrd to send over CCIX per LinkEnd 1
cxra_req_pcrd_stalls_lnk2 Memory Data Request available but no Req Pcrd to send over CCIX per LinkEnd 2
cxra_ext_rsp_stall CHI external RSP stall
cxra_ext_dat_stall CHI external DAT stall
CXLA node events cxla_rx_tlp_link0 RX TLP for Link 0
cxla_rx_tlp_link1 RX TLP for Link 1
cxla_rx_tlp_link2 RX TLP for Link 2
cxla_tx_tlp_link0 TX TLP for Link 0
cxla_tx_tlp_link1 TX TLP for Link 1
cxla_tx_tlp_link2 TX TLP for Link 2
cxla_rx_cxs_link0 RX CXS for Link 0
cxla_rx_cxs_link1 RX CXS for Link 1
cxla_rx_cxs_link2 RX CXS for Link 2
cxla_tx_cxs_link0 TX CXS for Link 0
cxla_tx_cxs_link1 TX CXS for Link 1
cxla_tx_cxs_link2 TX CXS for Link 2
cxla_avg_rx_tlp_sz_dws Average RX TLP size in DWs
cxla_avg_tx_tlp_sz_dws Average TX TLP size in DWs
cxla_avg_rx_tlp_sz_ccix_msg Average RX TLP size in CCIX messages
cxla_avg_tx_tlp_sz_ccix_msg Average TX TLP size in CCIX messages
cxla_avg_sz_rx_cxs_dw_beat Average size of RX CXS in DWs within a beat
cxla_avg_sz_tx_cxs_dw_beat Average size of TX CXS in DWs within a beat
cxla_tx_cxs_link_credit_backpressure TX CXS link credit backpressure
cxla_rx_tlp_buffer_full RX TLP buffer full and backpressured
cxla_tx_tlp_buffer_full TX TLP buffer full and backpressured
cxla_avg_latency_process_rx_tlp Average latency to process an RX TLP
cxla_avg_latency_form_tx_tlp Average latency to form a TX TLP
SEE ALSO pmc(3), pmc.atom(3), pmc.core(3), pmc.core2(3), pmc.corei7(3), pmc.corei7uc(3), pmc.iaf(3), pmc.k7(3), pmc.k8(3), pmc.soft(3), pmc.tsc(3), pmc.westmere(3), pmc.westmereuc(3), pmc_cpuinfo(3), pmclog(3), hwpmc(4)
HISTORY written by Oleksandr Rybalko <ray@FreeBSD.org>.
FreeBSD 14.0-RELEASE-p11 December 19, 2021 FreeBSD 14.0-RELEASE-p11