PMC.CMN-600(3) FreeBSD Library Functions Manual PMC.CMN-600(3)
NAME
pmc.cmn-600 - Library for accessing the Arm CoreLink CMN-600 Coherent
Mesh Network Controller performance counter events
LIBRARY
Performance Counters Library (libpmc, -lpmc)
SYNOPSIS
#include <pmc.h>
DESCRIPTION
CMN-600 PMU counters may be configured to count any one of a defined set
of hardware events. Unlike other performance counters, CMN-600 counters
require a node ID for setup.
Node ID information can currently be obtained in one of two ways. The
first is to enable verbose boot messages by setting the sysctl
debug.bootverbose=1 and then loading the hwpmc(4) kernel module. The
cmn600 module will be loaded automatically as a dependency:
$ sysctl debug.bootverbose=1
$ kldload hwpmc
The second is to use sysctl to trigger a dump of the node tree to the
system console:
$ sysctl dev.cmn600.0.dump_nodes=1
Some BIOS versions on dual-socket machines provide no NUMA domain
information in ACPI. In such cases, to get more accurate event
statistics, set the kernel environment variable
hint.cmn600.{unit}.domain={value}, where {unit} is a cmn600 device unit
number and {value} is the NUMA domain of the CPU package containing that
CMN-600 controller. Example:
$ kenv hint.cmn600.0.domain=0
$ kenv hint.cmn600.1.domain=1
$ kldunload hwpmc cmn600
$ kldload hwpmc
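The hint names above follow a fixed pattern, so a tool that sets them for several units can generate the strings. The following is a minimal sketch only; the function name cmn600_domain_hint is hypothetical and is not part of libpmc or the kernel.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the kernel environment assignment that ties a cmn600 unit to a
 * NUMA domain, for example "hint.cmn600.1.domain=1".
 * Hypothetical helper, not a libpmc function.
 * Returns the number of characters written, or -1 if an argument is
 * negative or the buffer is too small.
 */
int
cmn600_domain_hint(char *buf, size_t len, int unit, int domain)
{
	int n;

	assert(buf != NULL);
	if (unit < 0 || domain < 0)
		return (-1);
	n = snprintf(buf, len, "hint.cmn600.%d.domain=%d", unit, domain);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	return (n);
}
```

The resulting string can be passed to kenv as in the example above. The hwpmc and cmn600 modules still have to be reloaded for the hint to take effect.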
Arm CoreLink CMN-600 Coherent Mesh Network Controller performance
counters are documented in the Arm CoreLink CMN-600 Coherent Mesh
Network Technical Reference Manual, Revision r3p2, ARM Limited, 2020.
PMC Capabilities
CMN-600 PMU counters support the following capabilities:
Capability          Support
PMC_CAP_CASCADE     No
PMC_CAP_EDGE        No
PMC_CAP_INTERRUPT   Yes
PMC_CAP_INVERT      No
PMC_CAP_READ        Yes
PMC_CAP_PRECISE     No
PMC_CAP_SYSTEM      Yes
PMC_CAP_TAGGING     No
PMC_CAP_THRESHOLD   Yes
PMC_CAP_USER        No
PMC_CAP_WRITE       Yes
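The capability table can be mirrored in a small lookup for tools that want to reject unsupported requests early. This is a sketch under assumptions: the table below is copied from this manual and does not query the hardware, and the names cap_entry and cmn600_cap_supported are hypothetical. A real program would normally ask libpmc instead, for example with pmc_capabilities(3).

```c
#include <stddef.h>
#include <string.h>

/* One row of the capability table from this manual page. */
struct cap_entry {
	const char *name;
	int supported;		/* 1 = Yes, 0 = No */
};

/* Static copy of the documented table; not read from the PMU. */
static const struct cap_entry cmn600_caps[] = {
	{ "PMC_CAP_CASCADE",   0 },
	{ "PMC_CAP_EDGE",      0 },
	{ "PMC_CAP_INTERRUPT", 1 },
	{ "PMC_CAP_INVERT",    0 },
	{ "PMC_CAP_READ",      1 },
	{ "PMC_CAP_PRECISE",   0 },
	{ "PMC_CAP_SYSTEM",    1 },
	{ "PMC_CAP_TAGGING",   0 },
	{ "PMC_CAP_THRESHOLD", 1 },
	{ "PMC_CAP_USER",      0 },
	{ "PMC_CAP_WRITE",     1 },
};

/*
 * Look up a capability by name.
 * Returns 1 if documented as supported, 0 if not, -1 if the name is
 * not in the table.
 */
int
cmn600_cap_supported(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(cmn600_caps) / sizeof(cmn600_caps[0]); i++)
		if (strcmp(cmn600_caps[i].name, name) == 0)
			return (cmn600_caps[i].supported);
	return (-1);
}
```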
Event Qualifiers
xpport=port
Count only events that match port. Accepted values are East, West,
North, South, devport0, devport1, or a numeric value from 0 to 5.
xpchannel=channel
Filter events by XP node channel. Accepted values are REQ, RSP, SNP,
DAT, or a numeric value from 0 to 3.
Class Name Prefix
These PMCs are named using a class name prefix of "CMN600_PMU_".
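As an illustration of how the class name prefix and the XP qualifiers combine, the sketch below builds an event specifier string and rejects qualifier values outside the documented sets. Several points are assumptions: the comma-separated layout follows the general convention described in pmc(3), the helper names are hypothetical, names are matched case-sensitively, and the node ID that these counters require is not covered because this page does not document its qualifier.

```c
#include <stdio.h>
#include <string.h>

#define	CMN600_PREFIX	"CMN600_PMU_"
#define	NITEMS(a)	(sizeof(a) / sizeof((a)[0]))

/* Symbolic values documented for xpport= and xpchannel=. */
static const char *const xp_ports[] = {
	"East", "West", "North", "South", "devport0", "devport1"
};
static const char *const xp_channels[] = { "REQ", "RSP", "SNP", "DAT" };

/*
 * Return 1 if value is one of the n symbolic names or a single decimal
 * digit in the range 0 to n - 1, otherwise 0.
 */
static int
qualifier_ok(const char *value, const char *const *names, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(value, names[i]) == 0)
			return (1);
	return (value[0] >= '0' && (size_t)(value[0] - '0') < n &&
	    value[1] == '\0');
}

/*
 * Build "CMN600_PMU_<event>,xpport=<port>,xpchannel=<channel>".
 * Returns the string length, or -1 on an invalid qualifier or a buffer
 * that is too small.
 */
int
cmn600_xp_event_spec(char *buf, size_t len, const char *event,
    const char *port, const char *channel)
{
	int n;

	if (!qualifier_ok(port, xp_ports, NITEMS(xp_ports)) ||
	    !qualifier_ok(channel, xp_channels, NITEMS(xp_channels)))
		return (-1);
	n = snprintf(buf, len, "%s%s,xpport=%s,xpchannel=%s",
	    CMN600_PREFIX, event, port, channel);
	if (n < 0 || (size_t)n >= len)
		return (-1);
	return (n);
}
```

Called with the event xp_txflit_valid, port East and channel REQ, the helper produces CMN600_PMU_xp_txflit_valid,xpport=East,xpchannel=REQ.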
Event Specifiers
The following PMC events are available:
DVM node events
dn_rxreq_dvmop
Number of DVMOP requests received. This includes all sub-types: TLB
invalidate, branch predictor invalidate, and instruction cache
(physical and virtual) invalidate.
dn_rxreq_dvmsync
Number of DVM Sync requests received.
dn_rxreq_dvmop_vmid_filtered
Number of incoming DVMOP requests that are subject to VMID based
filtering. This is a measure of the effectiveness of VMID based
filtering and potential reduction in DVM snoops.
dn_rxreq_retried
Number of incoming requests that are retried. This is a measure
of the retry rate.
dn_rxreq_trk_occupancy
Counts the tracker occupancy in DN. Supports the "occupancy"
argument, which accepts All, dvmop, or dvmsync.
dn_rxreq_tlbi_dvmop
Number of DVMOP TLB invalidate requests received.
dn_rxreq_bpi_dvmop
Number of DVMOP Branch predictor invalidate requests received.
dn_rxreq_pici_dvmop
Number of DVMOP physical instruction cache invalidate requests
received.
dn_rxreq_vivi_dvmop
Number of DVMOP virtual instruction cache invalidate requests
received.
dn_rxreq_dvmop_other_filtered
Number of DVM op requests to RN-Ds (BPI or PICI/VICI) that were
filtered.
dn_rxreq_snp_sent
Number of SNPs sent to RNs
dn_rxreq_snp_stalled
HN-F node events
Counts total cache misses in first lookup result (high priority)
hnf_slc_sf_cache_access
Counts number of cache accesses in first access (high priority)
hnf_cache_fill
Counts total allocations in HN SLC (all cache line allocations to
SLC)
hnf_pocq_retry
Counts number of retried requests
hnf_pocq_reqs_recvd
Counts number of requests received by HN
hnf_sf_hit
Counts number of SF hits
hnf_sf_evictions
Counts number of SF eviction cache invalidations initiated
hnf_dir_snoops_sent
Counts number of directed snoops sent (not including SF back
invalidation)
hnf_brd_snoops_sent
Counts number of multicast snoops sent (not including SF back
invalidation)
hnf_slc_eviction
Counts number of SLC evictions (dirty only)
hnf_slc_fill_invalid_way
Counts number of SLC fills to an invalid way
hnf_mc_retries
Counts number of retried transactions by the MC
hnf_mc_reqs
Counts number of requests sent to MC
hnf_qos_hh_retry
Counts number of times a HighHigh priority request is protocol-
retried at the HN-F
hnf_qos_pocq
Counts the POCQ occupancy in HN-F. Supports the "occupancy"
argument, which accepts All, Read, Write, Atomic, or Stash.
Default: All.
hnf_pocq_addrhaz
Counts number of POCQ address hazards upon allocation
hnf_pocq_atomic_addrhaz
Counts number of POCQ address hazards upon allocation for atomic
operations
hnf_ld_st_swp_adq_full
Counts number of times ADQ is full for Ld/St/SWP type atomic
operations while POCQ has pending operations
credits to upload
hnf_txrsp_stall
Counts number of times HN-F has a pending TXRSP flit but no
credits to upload
hnf_seq_full
Counts number of times requests are replayed in SLC pipe due to
SEQ being full
hnf_seq_hit
Counts number of times a request in SLC hit a pending SF eviction
in SEQ
hnf_snp_sent
Counts number of snoops sent including directed, multicast, and
SF back invalidation
hnf_sfbi_dir_snp_sent
Counts number of times directed snoops were sent due to SF back
invalidation
hnf_sfbi_brd_snp_sent
Counts number of times multicast snoops were sent due to SF back
invalidation
hnf_snp_sent_untrk
Counts number of times snoops were sent due to untracked RN-Fs
hnf_intv_dirty
Counts number of times SF back invalidation resulted in dirty
line intervention from the RN
hnf_stash_snp_sent
Counts number of times stash snoops were sent
hnf_stash_data_pull
Counts number of times stash snoops resulted in data pull from
the RN
hnf_snp_fwded
Counts number of times data forward snoops were sent
HN-I node events
hni_rrt_rd_occ_cnt_ovfl
RRT read occupancy count overflow
hni_rrt_wr_occ_cnt_ovfl
RRT write occupancy count overflow
hni_rdt_rd_occ_cnt_ovfl
RDT read occupancy count overflow
hni_rdt_wr_occ_cnt_ovfl
RDT write occupancy count overflow
hni_wdb_occ_cnt_ovfl
WDB occupancy count overflow
RDT read allocation
hni_rdt_wr_alloc
RDT write allocation
hni_wdb_alloc
WDB allocation
hni_txrsp_retryack
RETRYACK TXRSP flit sent
hni_arvalid_no_arready
ARVALID set without ARREADY event
hni_arready_no_arvalid
ARREADY set without ARVALID event
hni_awvalid_no_awready
AWVALID set without AWREADY event
hni_awready_no_awvalid
AWREADY set without AWVALID event
hni_wvalid_no_wready
WVALID set without WREADY event
hni_txdat_stall
TXDAT stall (TXDAT valid but no link credit available)
hni_nonpcie_serialization
Non-PCIe serialization event
hni_pcie_serialization
PCIe serialization event
XP node events
xp_txflit_valid
Number of flits transmitted on a specified port and CHI channel.
This is a measure of the flit transfer bandwidth from an XP.
Note: On device ports, this event also includes link flit
transfers.
xp_txflit_stall
Number of cycles when a flit is stalled at an XP waiting for link
credits at a specified port and CHI channel. This is a measure
of the flit traffic congestion on the mesh and at the flit
download ports.
xp_partial_dat_flit
Number of times a partial DAT flit is uploaded onto the mesh from
an RN-F_CHIA port. Partial DAT flit transmission occurs when the
XP is not able to combine two 128b DAT flits and send them over
the 256b DAT channel. This can happen under two circumstances:
1. Only one 128b DAT flit is received within a transmission time
window.
2. Two 128b DAT flits are received but they are not two halves of
a single 256b word.
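Because xp_txflit_valid is described as a measure of flit transfer bandwidth, a rate can be derived from two readings of the counter taken a known interval apart. The sketch below assumes the readings are 64-bit values, as returned for example by pmc_read(3), and that the caller measures the interval in seconds. The function name flit_rate is hypothetical.

```c
#include <stdint.h>

/*
 * Flits per second between two readings of a flit counter.
 * Unsigned subtraction keeps the delta correct across a single
 * wraparound of the 64-bit value.
 * Returns 0.0 if the interval is not positive.
 */
double
flit_rate(uint64_t prev, uint64_t cur, double seconds)
{
	if (seconds <= 0.0)
		return (0.0);
	return ((double)(cur - prev) / seconds);
}
```

The same calculation applies to xp_txflit_stall, where the result is stalled cycles per second rather than flits per second.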
SBSX node events
CMO request
sbsx_txrsp_retryack
RETRYACK TXRSP flit sent
sbsx_txdat_flitv
TXDAT flit seen
sbsx_txrsp_flitv
TXRSP flit seen
sbsx_rd_req_trkr_occ_cnt_ovfl
Read request tracker occupancy count overflow
sbsx_wr_req_trkr_occ_cnt_ovfl
Write request tracker occupancy count overflow
sbsx_cmo_req_trkr_occ_cnt_ovfl
CMO request tracker occupancy count overflow
sbsx_wdb_occ_cnt_ovfl
WDB occupancy count overflow
sbsx_rd_axi_trkr_occ_cnt_ovfl
Read AXI pending tracker occupancy count overflow
sbsx_cmo_axi_trkr_occ_cnt_ovfl
CMO AXI pending tracker occupancy count overflow
sbsx_arvalid_no_arready
ARVALID set without ARREADY
sbsx_awvalid_no_awready
AWVALID set without AWREADY
sbsx_wvalid_no_wready
WVALID set without WREADY
sbsx_txdat_stall
TXDAT stall (TXDAT valid but no link credit available)
sbsx_txrsp_stall
TXRSP stall (TXRSP valid but no link credit available)
RN-D node events
rnd_s0_rdata_beats
Number of RData beats, RVALID and RREADY, dispatched on port 0.
This is a measure of the read bandwidth, including CMO responses.
rnd_s1_rdata_beats
Number of RData beats, RVALID and RREADY, dispatched on port 1.
This is a measure of the read bandwidth, including CMO responses.
rnd_s2_rdata_beats
Number of RData beats, RVALID and RREADY, dispatched on port 2.
This is a measure of the read bandwidth, including CMO responses.
rnd_rxdat_flits
Number of RXDAT flits received. This is a measure of the true
read data bandwidth, excluding CMOs.
Number of TXREQ flits dispatched. This is a measure of the total
request bandwidth.
rnd_txreq_flits_retried
Number of retried TXREQ flits dispatched. This is a measure of
the retry rate.
rnd_rrt_occ_ovfl
All entries in the read request tracker are occupied. This is a
measure of oversubscription in the read request tracker.
rnd_wrt_occ_ovfl
All entries in the write request tracker are occupied. This is a
measure of oversubscription in the write request tracker.
rnd_txreq_flits_replayed
Number of replayed TXREQ flits. This is a measure of the replay
rate.
rnd_wrcancel_sent
Number of write data cancels sent. This is a measure of the write
cancel rate.
rnd_s0_wdata_beats
Number of WData beats, WVALID and WREADY, dispatched on port 0.
This is a measure of write bandwidth on AXI port 0.
rnd_s1_wdata_beats
Number of WData beats, WVALID and WREADY, dispatched on port 1.
This is a measure of write bandwidth on AXI port 1.
rnd_s2_wdata_beats
Number of WData beats, WVALID and WREADY, dispatched on port 2.
This is a measure of write bandwidth on AXI port 2.
rnd_rrt_alloc
Number of allocations in the read request tracker. This is a
measure of read transaction count.
rnd_wrt_alloc
Number of allocations in the write request tracker. This is a
measure of write transaction count.
rnd_rdb_unord
Number of cycles for which Read Data Buffer state machine is in
unordered mode.
rnd_rdb_replay
Number of cycles for which Read Data Buffer state machine is in
replay mode.
rnd_rdb_hybrid
Number of cycles for which Read Data Buffer state machine is in
hybrid mode. Hybrid mode is where there is mix of
ordered/unordered traffic.
rnd_rdb_ord
Number of cycles for which Read Data Buffer state machine is in
ordered mode.
RN-I node events
Number of RData beats, RVALID and RREADY, dispatched on port 1.
This is a measure of the read bandwidth, including CMO responses.
rni_s2_rdata_beats
Number of RData beats, RVALID and RREADY, dispatched on port 2.
This is a measure of the read bandwidth, including CMO responses.
rni_rxdat_flits
Number of RXDAT flits received. This is a measure of the true
read data bandwidth, excluding CMOs.
rni_txdat_flits
Number of TXDAT flits dispatched. This is a measure of the write
bandwidth.
rni_txreq_flits_total
Number of TXREQ flits dispatched. This is a measure of the total
request bandwidth.
rni_txreq_flits_retried
Number of retried TXREQ flits dispatched. This is a measure of
the retry rate.
rni_rrt_occ_ovfl
All entries in the read request tracker are occupied. This is a
measure of oversubscription in the read request tracker.
rni_wrt_occ_ovfl
All entries in the write request tracker are occupied. This is a
measure of oversubscription in the write request tracker.
rni_txreq_flits_replayed
Number of replayed TXREQ flits. This is a measure of the replay
rate.
rni_wrcancel_sent
Number of write data cancels sent. This is a measure of the write
cancel rate.
rni_s0_wdata_beats
Number of WData beats, WVALID and WREADY, dispatched on port 0.
This is a measure of write bandwidth on AXI port 0.
rni_s1_wdata_beats
Number of WData beats, WVALID and WREADY, dispatched on port 1.
This is a measure of write bandwidth on AXI port 1.
rni_s2_wdata_beats
Number of WData beats, WVALID and WREADY, dispatched on port 2.
This is a measure of write bandwidth on AXI port 2.
rni_rrt_alloc
Number of allocations in the read request tracker. This is a
measure of read transaction count.
rni_wrt_alloc
Number of allocations in the write request tracker. This is a
measure of write transaction count.
rni_rdb_hybrid
Number of cycles for which Read Data Buffer state machine is in
hybrid mode. Hybrid mode is where there is mix of
ordered/unordered traffic.
rni_rdb_ord
Number of cycles for which Read Data Buffer state machine is in
ordered mode.
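Several of the events above are described as measures of a rate. One way to turn them into a number is to divide one counter by another, for example rni_txreq_flits_retried by rni_txreq_flits_total. The sketch below shows that division. Treating this ratio as the retry rate is an interpretation of the descriptions above, not a definition taken from the CMN-600 documentation, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Fraction of dispatched TXREQ flits that were retries.
 * retried: reading of rni_txreq_flits_retried
 * total:   reading of rni_txreq_flits_total over the same interval
 * Returns 0.0 when no flits were dispatched.
 */
double
txreq_retry_ratio(uint64_t retried, uint64_t total)
{
	if (total == 0)
		return (0.0);
	return ((double)retried / (double)total);
}
```

Both counters should be sampled over the same interval and on the same node for the ratio to be meaningful.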
CXHA node events
(Note: CXHA event descriptions are guessed.)
cxha_rddatbyp
Number of Read DAT Bypass
cxha_chirsp_up_stall
Number of CHI RSP up Stall
cxha_chidat_up_stall
Number of CHI DAT up Stall
cxha_snppcrd_lnk0_stall
Number of Snoop Pcrd Stall on Link 0
cxha_snppcrd_lnk1_stall
Number of Snoop Pcrd Stall on Link 1
cxha_snppcrd_lnk2_stall
Number of Snoop Pcrd Stall on Link 2
cxha_reqtrk_occ
Request Tracker Occupancy
cxha_rdb_occ
Read Data Buffer Occupancy
cxha_rdbbyp_occ
Read Data Buffer Bypass Occupancy
cxha_wdb_occ
Write Data Buffer Occupancy
cxha_snptrk_occ
Snoop Tracker Occupancy
cxha_sdb_occ
SDB Occupancy
cxha_snphaz_occ
Snoop Hazard Occupancy
CXRA node events
cxra_req_trk_occ
Request tracker occupancy
cxra_snp_trk_occ
Snoop tracker occupancy
Snoop sink buffer occupancy
cxra_snp_bcasts
Snoop broadcasts
cxra_req_chains
Number of request chains formed larger than one
cxra_req_chain_avg_len
Average size of request chains, only for chain sizes larger than
one
cxra_chi_rsp_upload_stalls
Local RA upload stalls to CHI because of contention with HA
cxra_chi_dat_upload_stalls
Local RA upload stalls to CHI because of contention with HA
cxra_dat_pcrd_stalls_lnk0
Memory Data Request available, but no DAT Pcrd to send over CCIX
per LinkEnd 0
cxra_dat_pcrd_stalls_lnk1
Memory Data Request available, but no DAT Pcrd to send over CCIX
per LinkEnd 1
cxra_dat_pcrd_stalls_lnk2
Memory Data Request available, but no DAT Pcrd to send over CCIX
per LinkEnd 2
cxra_req_pcrd_stalls_lnk0
Memory Data Request available but no Req Pcrd to send over CCIX
per LinkEnd 0
cxra_req_pcrd_stalls_lnk1
Memory Data Request available but no Req Pcrd to send over CCIX
per LinkEnd 1
cxra_req_pcrd_stalls_lnk2
Memory Data Request available but no Req Pcrd to send over CCIX
per LinkEnd 2
cxra_ext_rsp_stall
CHI external RSP stall
cxra_ext_dat_stall
CHI external DAT stall
CXLA node events
cxla_rx_tlp_link0
RX TLP for Link 0
cxla_rx_tlp_link1
RX TLP for Link 1
cxla_rx_tlp_link2
RX TLP for Link 2
cxla_tx_tlp_link0
cxla_rx_cxs_link0
RX CXS for Link 0
cxla_rx_cxs_link1
RX CXS for Link 1
cxla_rx_cxs_link2
RX CXS for Link 2
cxla_tx_cxs_link0
TX CXS for Link 0
cxla_tx_cxs_link1
TX CXS for Link 1
cxla_tx_cxs_link2
TX CXS for Link 2
cxla_avg_rx_tlp_sz_dws
Average RX TLP size in DWs
cxla_avg_tx_tlp_sz_dws
Average TX TLP size in DWs
cxla_avg_rx_tlp_sz_ccix_msg
Average RX TLP size in CCIX messages
cxla_avg_tx_tlp_sz_ccix_msg
Average TX TLP size in CCIX messages
cxla_avg_sz_rx_cxs_dw_beat
Average size of RX CXS in DWs within a beat
cxla_avg_sz_tx_cxs_dw_beat
Average size of TX CXS in DWs within a beat
cxla_tx_cxs_link_credit_backpressure
TX CXS link credit backpressure
cxla_rx_tlp_buffer_full
RX TLP buffer full and backpressured
cxla_tx_tlp_buffer_full
TX TLP buffer full and backpressured
cxla_avg_latency_process_rx_tlp
Average latency to process an RX TLP
cxla_avg_latency_form_tx_tlp
Average latency to form a TX TLP
SEE ALSO
pmc(3), pmc.atom(3), pmc.core(3), pmc.core2(3), pmc.corei7(3),
pmc.corei7uc(3), pmc.iaf(3), pmc.k7(3), pmc.k8(3), pmc.soft(3),
pmc.tsc(3), pmc.westmere(3), pmc.westmereuc(3), pmc_cpuinfo(3),
pmclog(3), hwpmc(4)
HISTORY
Written by Oleksandr Rybalko <ray@FreeBSD.org>.
FreeBSD 14.0-RELEASE-p11 December 19, 2021 FreeBSD 14.0-RELEASE-p11