Kubernetes Metrics Reference (2023)

aggregator_discovery_aggregation_count_total | ALPHA | Counter | Counter of number of times discovery was aggregated

aggregator_openapi_v2_regeneration_count | ALPHA | Counter | Counter of OpenAPI v2 spec regeneration count broken down by causing APIService name and reason.

apiservice

reason

aggregator_openapi_v2_regeneration_duration | ALPHA | Gauge | Gauge of OpenAPI v2 spec regeneration duration in seconds.

reason

aggregator_unavailable_apiservice | ALPHA | Custom | Gauge of APIServices which are marked as unavailable broken down by APIService name.

name

aggregator_unavailable_apiservice_total | ALPHA | Counter | Counter of APIServices which are marked as unavailable broken down by APIService name and reason.

name

reason

apiextensions_openapi_v2_regeneration_count | ALPHA | Counter | Counter of OpenAPI v2 spec regeneration count broken down by causing CRD name and reason.

crd

reason

apiextensions_openapi_v3_regeneration_count | ALPHA | Counter | Counter of OpenAPI v3 spec regeneration count broken down by group, version, causing CRD and reason.

crd

group

reason

version

apiserver_admission_match_condition_evaluation_errors_total | ALPHA | Counter | Admission match condition evaluation errors count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).

kind

name

operation

type

apiserver_admission_match_condition_evaluation_seconds | ALPHA | Histogram | Admission match condition evaluation time in seconds, identified by name and broken out for each kind containing matchConditions (webhook or policy), operation and type (validate or admit).

kind

name

operation

type

apiserver_admission_match_condition_exclusions_total | ALPHA | Counter | Admission match condition evaluation exclusions count, identified by name of resource containing the match condition and broken out for each kind containing matchConditions (webhook or policy), operation and admission type (validate or admit).

kind

name

operation

type

apiserver_admission_step_admission_duration_seconds_summary | ALPHA | Summary | Admission sub-step latency summary in seconds, broken out for each operation and API resource and step type (validate or admit).

operation

rejected

type

apiserver_admission_webhook_fail_open_count | ALPHA | Counter | Admission webhook fail open count, identified by name and broken out for each admission type (validating or mutating).

name

type

apiserver_admission_webhook_rejection_count | ALPHA | Counter | Admission webhook rejection count, identified by name and broken out for each admission type (validating or admit) and operation. Additional labels specify an error type (calling_webhook_error or apiserver_internal_error if an error occurred; no_error otherwise) and optionally a non-zero rejection code if the webhook rejects the request with an HTTP status code (honored by the apiserver when the code is greater than or equal to 400). Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.

error_type

name

operation

rejection_code

type

apiserver_admission_webhook_request_total | ALPHA | Counter | Admission webhook request total, identified by name and broken out for each admission type (validating or mutating) and operation. Additional labels specify whether the request was rejected or not and an HTTP status code. Codes greater than 600 are truncated to 600, to keep the metrics cardinality bounded.

code

name

operation

rejected

type
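The truncation rule stated for the two webhook counters above can be illustrated with a short sketch. This is not the apiserver's implementation: the helper name `bounded_code_label` is hypothetical, and the snippet only shows why capping codes at 600 keeps the number of distinct label values bounded.

```python
def bounded_code_label(code: int) -> str:
    """Return a label value for an HTTP status code, capping values above 600.

    Hypothetical helper: it mirrors the documented rule that codes greater
    than 600 are truncated to 600 so the label cannot grow without bound.
    """
    return str(min(code, 600))


# However many distinct codes webhooks return, everything above 600
# collapses into the single label value "600".
observed = [200, 403, 500, 601, 999, 70000]
labels = sorted({bounded_code_label(c) for c in observed})
print(labels)
```

With the sample input, six distinct codes produce only four label values.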

apiserver_audit_error_total | ALPHA | Counter | Counter of audit events that failed to be audited properly. Plugin identifies the plugin affected by the error.

plugin

apiserver_audit_event_total | ALPHA | Counter | Counter of audit events generated and sent to the audit backend.

apiserver_audit_level_total | ALPHA | Counter | Counter of policy levels for audit events (1 per request).

level

apiserver_audit_requests_rejected_total | ALPHA | Counter | Counter of apiserver requests rejected due to an error in audit logging backend.

apiserver_cache_list_fetched_objects_total | ALPHA | Counter | Number of objects read from watch cache in the course of serving a LIST request

index

resource_prefix

apiserver_cache_list_returned_objects_total | ALPHA | Counter | Number of objects returned for a LIST request from watch cache

resource_prefix

apiserver_cache_list_total | ALPHA | Counter | Number of LIST requests served from watch cache

index

resource_prefix

apiserver_cel_compilation_duration_seconds | ALPHA | Histogram | CEL compilation time in seconds.

apiserver_cel_evaluation_duration_seconds | ALPHA | Histogram | CEL evaluation time in seconds.

apiserver_certificates_registry_csr_honored_duration_total | ALPHA | Counter | Total number of issued CSRs with a requested duration that was honored, sliced by signer (only kubernetes.io signer names are specifically identified)

signerName

apiserver_certificates_registry_csr_requested_duration_total | ALPHA | Counter | Total number of issued CSRs with a requested duration, sliced by signer (only kubernetes.io signer names are specifically identified)

signerName

apiserver_client_certificate_expiration_seconds | ALPHA | Histogram | Distribution of the remaining lifetime on the certificate used to authenticate a request.

apiserver_conversion_webhook_duration_seconds | ALPHA | Histogram | Conversion webhook request latency

failure_type

result

apiserver_conversion_webhook_request_total | ALPHA | Counter | Counter for conversion webhook requests with success/failure and failure error type

failure_type

result

apiserver_crd_conversion_webhook_duration_seconds | ALPHA | Histogram | CRD webhook conversion duration in seconds

crd_name

from_version

succeeded

to_version

apiserver_current_inqueue_requests | ALPHA | Gauge | Maximal number of queued requests in this apiserver per request kind in last second.

request_kind

apiserver_delegated_authn_request_duration_seconds | ALPHA | Histogram | Request latency in seconds. Broken down by status code.

code

apiserver_delegated_authn_request_total | ALPHA | Counter | Number of HTTP requests partitioned by status code.

code

apiserver_delegated_authz_request_duration_seconds | ALPHA | Histogram | Request latency in seconds. Broken down by status code.

code

apiserver_delegated_authz_request_total | ALPHA | Counter | Number of HTTP requests partitioned by status code.

code

apiserver_egress_dialer_dial_duration_seconds | ALPHA | Histogram | Dial latency histogram in seconds, labeled by the protocol (http-connect or grpc), transport (tcp or uds)

protocol

transport

apiserver_egress_dialer_dial_failure_count | ALPHA | Counter | Dial failure count, labeled by the protocol (http-connect or grpc), transport (tcp or uds), and stage (connect or proxy). The stage indicates at which stage the dial failed

protocol

stage

transport

apiserver_egress_dialer_dial_start_total | ALPHA | Counter | Dial starts, labeled by the protocol (http-connect or grpc) and transport (tcp or uds).

protocol

transport

apiserver_encryption_config_controller_automatic_reload_failures_total | ALPHA | Counter | Total number of failed automatic reloads of encryption configuration.

apiserver_encryption_config_controller_automatic_reload_last_timestamp_seconds | ALPHA | Gauge | Timestamp of the last successful or failed automatic reload of encryption configuration.

status

apiserver_encryption_config_controller_automatic_reload_success_total | ALPHA | Counter | Total number of successful automatic reloads of encryption configuration.

apiserver_envelope_encryption_dek_cache_fill_percent | ALPHA | Gauge | Percent of the cache slots currently occupied by cached DEKs.

apiserver_envelope_encryption_dek_cache_inter_arrival_time_seconds | ALPHA | Histogram | Time (in seconds) of inter arrival of transformation requests.

transformation_type

apiserver_envelope_encryption_invalid_key_id_from_status_total | ALPHA | Counter | Number of times an invalid keyID is returned by the Status RPC call split by error.

error

provider_name

apiserver_envelope_encryption_key_id_hash_last_timestamp_seconds | ALPHA | Gauge | The last time in seconds when a keyID was used.

key_id_hash

provider_name

transformation_type

apiserver_envelope_encryption_key_id_hash_status_last_timestamp_seconds | ALPHA | Gauge | The last time in seconds when a keyID was returned by the Status RPC call.

key_id_hash

provider_name

apiserver_envelope_encryption_key_id_hash_total | ALPHA | Counter | Number of times a keyID is used split by transformation type and provider.

key_id_hash

provider_name

transformation_type

apiserver_envelope_encryption_kms_operations_latency_seconds | ALPHA | Histogram | KMS operation duration with gRPC error code status total.

grpc_status_code

method_name

provider_name

apiserver_flowcontrol_current_limit_seats | ALPHA | Gauge | current derived number of execution seats available to each priority level

priority_level

apiserver_flowcontrol_current_r | ALPHA | Gauge | R(time of last change)

priority_level

apiserver_flowcontrol_demand_seats | ALPHA | TimingRatioHistogram | Observations, at the end of every nanosecond, of (the number of seats each priority level could use) / (nominal number of seats for that level)

priority_level

apiserver_flowcontrol_demand_seats_average | ALPHA | Gauge | Time-weighted average, over last adjustment period, of demand_seats

priority_level

apiserver_flowcontrol_demand_seats_high_watermark | ALPHA | Gauge | High watermark, over last adjustment period, of demand_seats

priority_level

apiserver_flowcontrol_demand_seats_smoothed | ALPHA | Gauge | Smoothed seat demands

priority_level

apiserver_flowcontrol_demand_seats_stdev | ALPHA | Gauge | Time-weighted standard deviation, over last adjustment period, of demand_seats

priority_level
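The average, high watermark and standard deviation gauges above are described as statistics of demand_seats over an adjustment period, with the average and standard deviation being time-weighted. The following minimal sketch shows what time-weighting means for a piecewise-constant signal. It assumes observations arrive as hypothetical `(value, duration_seconds)` pairs and is not the API Priority and Fairness implementation.

```python
import math


def time_weighted_stats(samples):
    """Return (average, high_watermark, stdev) for (value, duration) pairs.

    Each value counts in proportion to how long it was held, so a brief
    spike moves the average less than a long plateau does.
    """
    total = sum(duration for _, duration in samples)
    if total <= 0:
        raise ValueError("total duration must be positive")
    average = sum(value * duration for value, duration in samples) / total
    variance = sum(duration * (value - average) ** 2
                   for value, duration in samples) / total
    high_watermark = max(value for value, _ in samples)
    return average, high_watermark, math.sqrt(variance)


# Hypothetical period: demand of 2 seats for 3 seconds, then 6 seats for 1 second.
avg, high, stdev = time_weighted_stats([(2.0, 3.0), (6.0, 1.0)])
print(avg, high, stdev)
```

An unweighted mean of the two values would be 4.0; weighting by duration gives 3.0 because the lower demand was held three times as long.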

apiserver_flowcontrol_dispatch_r | ALPHA | Gauge | R(time of last dispatch)

priority_level

apiserver_flowcontrol_epoch_advance_total | ALPHA | Counter | Number of times the queueset's progress meter jumped backward

priority_level

success

apiserver_flowcontrol_latest_s | ALPHA | Gauge | S(most recently dispatched request)

priority_level

apiserver_flowcontrol_lower_limit_seats | ALPHA | Gauge | Configured lower bound on number of execution seats available to each priority level

priority_level

apiserver_flowcontrol_next_discounted_s_bounds | ALPHA | Gauge | min and max, over queues, of S(oldest waiting request in queue) - estimated work in progress

bound

priority_level

apiserver_flowcontrol_next_s_bounds | ALPHA | Gauge | min and max, over queues, of S(oldest waiting request in queue)

bound

priority_level

apiserver_flowcontrol_priority_level_request_utilization | ALPHA | TimingRatioHistogram | Observations, at the end of every nanosecond, of number of requests (as a fraction of the relevant limit) waiting or in any stage of execution (but only initial stage for WATCHes)

phase

priority_level

apiserver_flowcontrol_priority_level_seat_utilization | ALPHA | TimingRatioHistogram | Observations, at the end of every nanosecond, of utilization of seats for any stage of execution (but only initial stage for WATCHes)

priority_level

phase:executing

apiserver_flowcontrol_read_vs_write_current_requests | ALPHA | TimingRatioHistogram | Observations, at the end of every nanosecond, of the number of requests (as a fraction of the relevant limit) waiting or in regular stage of execution

phase

request_kind

apiserver_flowcontrol_request_concurrency_in_use | ALPHA | Gauge | Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem

flow_schema

priority_level

Deprecated version: 1.31.0

apiserver_flowcontrol_request_concurrency_limit | ALPHA | Gauge | Nominal number of execution seats configured for each priority level

priority_level

Deprecated version: 1.30.0

apiserver_flowcontrol_request_dispatch_no_accommodation_total | ALPHA | Counter | Number of times a dispatch attempt resulted in a non accommodation due to lack of available seats

flow_schema

priority_level

apiserver_flowcontrol_request_execution_seconds | ALPHA | Histogram | Duration of initial stage (for a WATCH) or any (for a non-WATCH) stage of request execution in the API Priority and Fairness subsystem

flow_schema

priority_level

type

apiserver_flowcontrol_request_queue_length_after_enqueue | ALPHA | Histogram | Length of queue in the API Priority and Fairness subsystem, as seen by each request after it is enqueued

flow_schema

priority_level

apiserver_flowcontrol_seat_fair_frac | ALPHA | Gauge | Fair fraction of server's concurrency to allocate to each priority level that can use it

apiserver_flowcontrol_target_seats | ALPHA | Gauge | Seat allocation targets

priority_level

apiserver_flowcontrol_upper_limit_seats | ALPHA | Gauge | Configured upper bound on number of execution seats available to each priority level

priority_level

apiserver_flowcontrol_watch_count_samples | ALPHA | Histogram | count of watchers for mutating requests in API Priority and Fairness

flow_schema

priority_level

apiserver_flowcontrol_work_estimated_seats | ALPHA | Histogram | Number of estimated seats (maximum of initial and final seats) associated with requests in API Priority and Fairness

flow_schema

priority_level

apiserver_init_events_total | ALPHA | Counter | Counter of init events processed in watch cache broken down by resource type.

resource

apiserver_kube_aggregator_x509_insecure_sha1_total | ALPHA | Counter | Counts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)

apiserver_kube_aggregator_x509_missing_san_total | ALPHA | Counter | Counts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the missing x509 certificate SAN extension (either/or, based on the runtime environment)

apiserver_request_aborts_total | ALPHA | Counter | Number of requests which apiserver aborted possibly due to a timeout, for each group, version, verb, resource, subresource and scope

group

resource

scope

subresource

verb

version

apiserver_request_body_sizes | ALPHA | Histogram | Apiserver request body sizes broken out by size.

resource

verb

apiserver_request_filter_duration_seconds | ALPHA | Histogram | Request filter latency distribution in seconds, for each filter type

filter

apiserver_request_post_timeout_total | ALPHA | Counter | Tracks the activity of the request handlers after the associated requests have been timed out by the apiserver

source

status

apiserver_request_sli_duration_seconds | ALPHA | Histogram | Response latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.

component

group

resource

scope

subresource

verb

version
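Histogram metrics such as the latency distribution above are typically consumed as quantile estimates. The sketch below assumes the Prometheus convention of cumulative buckets keyed by an upper bound, with `+Inf` as the last bucket, and estimates a quantile by linear interpolation within the matching bucket. The function name and the bucket values are hypothetical, and this is a simplified illustration rather than the PromQL implementation.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs sorted by upper
    bound, with math.inf as the last upper bound.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile falls in the overflow bucket; report the
                # highest finite bound rather than infinity.
                return lower_bound
            if count == lower_count:
                return upper_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical bucket counts in seconds, not real measurements.
buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p95 = estimate_quantile(0.95, buckets)
print(p95)
```

The estimate is only as precise as the bucket boundaries: any quantile that lands between 0.5 s and 1.0 s is interpolated, not measured.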

apiserver_request_slo_duration_seconds | ALPHA | Histogram | Response latency distribution (not counting webhook duration and priority & fairness queue wait times) in seconds for each verb, group, version, resource, subresource, scope and component.

component

group

resource

scope

subresource

verb

version

Deprecated version: 1.27.0

apiserver_request_terminations_total | ALPHA | Counter | Number of requests which apiserver terminated in self-defense.

code

component

group

resource

scope

subresource

verb

version

apiserver_request_timestamp_comparison_time | ALPHA | Histogram | Time taken for comparison of old vs new objects in UPDATE or PATCH requests

code_path

apiserver_rerouted_request_total | ALPHA | Counter | Total number of requests that were proxied to a peer kube apiserver because the local apiserver was not capable of serving it

code

apiserver_selfrequest_total | ALPHA | Counter | Counter of apiserver self-requests broken out for each verb, API resource and subresource.

resource

subresource

verb

apiserver_storage_data_key_generation_duration_seconds | ALPHA | Histogram | Latencies in seconds of data encryption key (DEK) generation operations.

apiserver_storage_data_key_generation_failures_total | ALPHA | Counter | Total number of failed data encryption key (DEK) generation operations.

apiserver_storage_db_total_size_in_bytes | ALPHA | Gauge | Total size of the storage database file physically allocated in bytes.

endpoint

Deprecated version: 1.28.0

apiserver_storage_decode_errors_total | ALPHA | Counter | Number of stored object decode errors split by object type

resource

apiserver_storage_envelope_transformation_cache_misses_total | ALPHA | Counter | Total number of cache misses while accessing key decryption key (KEK).

apiserver_storage_events_received_total | ALPHA | Counter | Number of etcd events received split by kind.

resource

apiserver_storage_list_evaluated_objects_total | ALPHA | Counter | Number of objects tested in the course of serving a LIST request from storage

resource

apiserver_storage_list_fetched_objects_total | ALPHA | Counter | Number of objects read from storage in the course of serving a LIST request

resource

apiserver_storage_list_returned_objects_total | ALPHA | Counter | Number of objects returned for a LIST request from storage

resource

apiserver_storage_list_total | ALPHA | Counter | Number of LIST requests served from storage

resource

apiserver_storage_size_bytes | ALPHA | Custom | Size of the storage database file physically allocated in bytes.

cluster

apiserver_storage_transformation_duration_seconds | ALPHA | Histogram | Latencies in seconds of value transformation operations.

transformation_type

transformer_prefix

apiserver_storage_transformation_operations_total | ALPHA | Counter | Total number of transformations. A successful transformation has the status 'OK'; a failed transformation has a varied status string. The status and transformation_type fields may be used for alerting on encryption/decryption failure, using transformation_type from_storage for decryption and to_storage for encryption

status

transformation_type

transformer_prefix
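The alerting idea in the description above (any status other than 'OK', split by transformation_type) can be sketched as a small aggregation over scraped samples. The sample structure, the `transformer_prefix` values and the failure status string below are hypothetical; only the label names and the 'OK', from_storage and to_storage values come from the description.

```python
from collections import defaultdict


def failed_transformations(samples):
    """Sum counter values whose status is not 'OK', keyed by transformation_type.

    samples: iterable of (labels, value) pairs, where labels is a dict with
    the status, transformation_type and transformer_prefix labels.
    """
    failures = defaultdict(float)
    for labels, value in samples:
        if labels.get("status") != "OK":
            failures[labels.get("transformation_type", "")] += value
    return dict(failures)


# Hypothetical scrape of apiserver_storage_transformation_operations_total.
samples = [
    ({"status": "OK", "transformation_type": "to_storage",
      "transformer_prefix": "example-prefix:"}, 120.0),
    ({"status": "OK", "transformation_type": "from_storage",
      "transformer_prefix": "example-prefix:"}, 300.0),
    ({"status": "Unknown", "transformation_type": "from_storage",
      "transformer_prefix": "example-prefix:"}, 4.0),
]
print(failed_transformations(samples))
```

Because the metric is a cumulative counter, an alert would normally compare these sums between scrapes rather than fire on the absolute value.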

apiserver_terminated_watchers_total | ALPHA | Counter | Counter of watchers closed due to unresponsiveness broken down by resource type.

resource

apiserver_tls_handshake_errors_total | ALPHA | Counter | Number of requests dropped with 'TLS handshake error from' error

apiserver_validating_admission_policy_check_duration_seconds | ALPHA | Histogram | Validation admission latency for individual validation expressions in seconds, labeled by policy and further including binding, state and enforcement action taken.

enforcement_action

policy

policy_binding

state

apiserver_validating_admission_policy_check_total | ALPHA | Counter | Validation admission policy check total, labeled by policy and further identified by binding, enforcement action taken, and state.

enforcement_action

policy

policy_binding

state

apiserver_validating_admission_policy_definition_total | ALPHA | Counter | Validation admission policy count total, labeled by state and enforcement action.

enforcement_action

state

apiserver_watch_cache_events_dispatched_total | ALPHA | Counter | Counter of events dispatched in watch cache broken down by resource type.

resource

apiserver_watch_cache_events_received_total | ALPHA | Counter | Counter of events received in watch cache broken down by resource type.

resource

apiserver_watch_cache_initializations_total | ALPHA | Counter | Counter of watch cache initializations broken down by resource type.

resource

apiserver_watch_events_sizes | ALPHA | Histogram | Watch event size distribution in bytes

group

kind

version

apiserver_watch_events_total | ALPHA | Counter | Number of events sent to watch clients

group

kind

version

apiserver_webhooks_x509_insecure_sha1_total | ALPHA | Counter | Counts the number of requests to servers with insecure SHA1 signatures in their serving certificate OR the number of connection failures due to the insecure SHA1 signatures (either/or, based on the runtime environment)

apiserver_webhooks_x509_missing_san_total | ALPHA | Counter | Counts the number of requests to servers missing SAN extension in their serving certificate OR the number of connection failures due to the missing x509 certificate SAN extension (either/or, based on the runtime environment)

attach_detach_controller_attachdetach_controller_forced_detaches | ALPHA | Counter | Number of times the A/D Controller performed a forced detach

reason

attachdetach_controller_total_volumes | ALPHA | Custom | Number of volumes in A/D Controller

plugin_name

state

authenticated_user_requests | ALPHA | Counter | Counter of authenticated requests broken out by username.

username

authentication_attempts | ALPHA | Counter | Counter of authentication attempts.

result

authentication_duration_seconds | ALPHA | Histogram | Authentication duration in seconds broken out by result.

result

authentication_token_cache_active_fetch_count | ALPHA | Gauge

status

authentication_token_cache_fetch_total | ALPHA | Counter

status

authentication_token_cache_request_duration_seconds | ALPHA | Histogram

status

authentication_token_cache_request_total | ALPHA | Counter

status

authorization_attempts_total | ALPHA | Counter | Counter of authorization attempts broken down by result. It can be either 'allowed', 'denied', 'no-opinion' or 'error'.

result

authorization_duration_seconds | ALPHA | Histogram | Authorization duration in seconds broken out by result.

result

cloud_provider_webhook_request_duration_seconds | ALPHA | Histogram | Request latency in seconds. Broken down by status code.

code

webhook

cloud_provider_webhook_request_total | ALPHA | Counter | Number of HTTP requests partitioned by status code.

code

webhook

cloudprovider_azure_api_request_duration_seconds | ALPHA | Histogram | Latency of an Azure API call

request

resource_group

source

subscription_id

cloudprovider_azure_api_request_errors | ALPHA | Counter | Number of errors for an Azure API call

request

resource_group

source

subscription_id

cloudprovider_azure_api_request_ratelimited_count | ALPHA | Counter | Number of rate limited Azure API calls

request

resource_group

source

subscription_id

cloudprovider_azure_api_request_throttled_count | ALPHA | Counter | Number of throttled Azure API calls

request

resource_group

source

subscription_id

cloudprovider_azure_op_duration_seconds | ALPHA | Histogram | Latency of an Azure service operation

request

resource_group

source

subscription_id

cloudprovider_azure_op_failure_count | ALPHA | Counter | Number of failed Azure service operations

request

resource_group

source

subscription_id

cloudprovider_gce_api_request_duration_seconds | ALPHA | Histogram | Latency of a GCE API call

region

request

version

zone

cloudprovider_gce_api_request_errors | ALPHA | Counter | Number of errors for an API call

region

request

version

zone

cloudprovider_vsphere_api_request_duration_seconds | ALPHA | Histogram | Latency of vSphere API call

request

cloudprovider_vsphere_api_request_errors | ALPHA | Counter | vSphere API errors

request

cloudprovider_vsphere_operation_duration_seconds | ALPHA | Histogram | Latency of vSphere operation call

operation

cloudprovider_vsphere_operation_errors | ALPHA | Counter | vSphere operation errors

operation

cloudprovider_vsphere_vcenter_versions | ALPHA | Custom | Versions for connected vSphere vCenters

hostname

version

build

container_cpu_usage_seconds_total | ALPHA | Custom | Cumulative CPU time consumed by the container in core-seconds

container

pod

namespace

container_memory_working_set_bytes | ALPHA | Custom | Current working set of the container in bytes

container

pod

namespace

container_start_time_seconds | ALPHA | Custom | Start time of the container since unix epoch in seconds

container

pod

namespace

container_swap_usage_bytes | ALPHA | Custom | Current amount of the container swap usage in bytes. Reported only on non-windows systems

container

pod

namespace

csi_operations_seconds | ALPHA | Histogram | Container Storage Interface operation duration with gRPC error code status total

driver_name

grpc_status_code

method_name

migrated

endpoint_slice_controller_changes | ALPHA | Counter | Number of EndpointSlice changes

operation

endpoint_slice_controller_desired_endpoint_slices | ALPHA | Gauge | Number of EndpointSlices that would exist with perfect endpoint allocation

endpoint_slice_controller_endpoints_added_per_sync | ALPHA | Histogram | Number of endpoints added on each Service sync

endpoint_slice_controller_endpoints_desired | ALPHA | Gauge | Number of endpoints desired

endpoint_slice_controller_endpoints_removed_per_sync | ALPHA | Histogram | Number of endpoints removed on each Service sync

endpoint_slice_controller_endpointslices_changed_per_sync | ALPHA | Histogram | Number of EndpointSlices changed on each Service sync

topology

endpoint_slice_controller_num_endpoint_slices | ALPHA | Gauge | Number of EndpointSlices

endpoint_slice_controller_syncs | ALPHA | Counter | Number of EndpointSlice syncs

result

endpoint_slice_mirroring_controller_addresses_skipped_per_sync | ALPHA | Histogram | Number of addresses skipped on each Endpoints sync due to being invalid or exceeding MaxEndpointsPerSubset

endpoint_slice_mirroring_controller_changes | ALPHA | Counter | Number of EndpointSlice changes

operation

endpoint_slice_mirroring_controller_desired_endpoint_slices | ALPHA | Gauge | Number of EndpointSlices that would exist with perfect endpoint allocation

endpoint_slice_mirroring_controller_endpoints_added_per_sync | ALPHA | Histogram | Number of endpoints added on each Endpoints sync

endpoint_slice_mirroring_controller_endpoints_desired | ALPHA | Gauge | Number of endpoints desired

endpoint_slice_mirroring_controller_endpoints_removed_per_sync | ALPHA | Histogram | Number of endpoints removed on each Endpoints sync

endpoint_slice_mirroring_controller_endpoints_sync_duration | ALPHA | Histogram | Duration of syncEndpoints() in seconds

endpoint_slice_mirroring_controller_endpoints_updated_per_sync | ALPHA | Histogram | Number of endpoints updated on each Endpoints sync

endpoint_slice_mirroring_controller_num_endpoint_slices | ALPHA | Gauge | Number of EndpointSlices

ephemeral_volume_controller_create_failures_total | ALPHA | Counter | Number of PersistentVolumeClaims creation requests

ephemeral_volume_controller_create_total | ALPHA | Counter | Number of PersistentVolumeClaims creation requests

etcd_bookmark_counts | ALPHA | Gauge | Number of etcd bookmarks (progress notify events) split by kind.

resource

etcd_lease_object_counts | ALPHA | Histogram | Number of objects attached to a single etcd lease.

etcd_request_duration_seconds | ALPHA | Histogram | Etcd request latency in seconds for each operation and object type.

operation

type

etcd_request_errors_total | ALPHA | Counter | Etcd failed request counts for each operation and object type.

operation

type

etcd_requests_total | ALPHA | Counter | Etcd request counts for each operation and object type.

operation

type
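Because etcd_requests_total and etcd_request_errors_total share the same labels, the two counters can be combined into an error ratio for a given operation and type. The sketch below assumes two consecutive scrapes of each counter and treats a decrease as a counter reset (for example, after a process restart). The function names and sample values are hypothetical.

```python
def counter_increase(previous: float, current: float) -> float:
    """Increase of a cumulative counter between two scrapes.

    A drop in value is treated as a reset to zero, so the increase is the
    current value itself.
    """
    return current if current < previous else current - previous


def error_ratio(requests_prev, requests_curr, errors_prev, errors_curr):
    """Fraction of requests that failed between two scrapes (0.0 if idle)."""
    total = counter_increase(requests_prev, requests_curr)
    if total == 0:
        return 0.0
    return counter_increase(errors_prev, errors_curr) / total


# Hypothetical scrapes: 200 new requests, 5 of which failed.
ratio = error_ratio(1000.0, 1200.0, 10.0, 15.0)
print(ratio)
```

Dividing increases rather than raw counter values keeps the ratio meaningful for the scrape interval instead of the whole process lifetime.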

etcd_version_info | ALPHA | Gauge | Etcd server's binary version

binary_version

field_validation_request_duration_seconds | ALPHA | Histogram | Response latency distribution in seconds for each field validation value

field_validation

force_cleaned_failed_volume_operation_errors_total | ALPHA | Counter | The number of volumes that failed force cleanup after their reconstruction failed during kubelet startup.

force_cleaned_failed_volume_operations_total | ALPHA | Counter | The number of volumes that were force cleaned after their reconstruction failed during kubelet startup. This includes both successful and failed cleanups.

garbagecollector_controller_resources_sync_error_total | ALPHA | Counter | Number of garbage collector resources sync errors

get_token_count | ALPHA | Counter | Counter of total Token() requests to the alternate token source

get_token_fail_count | ALPHA | Counter | Counter of failed Token() requests to the alternate token source

horizontal_pod_autoscaler_controller_metric_computation_duration_seconds | ALPHA | Histogram | The time (in seconds) that the HPA controller takes to calculate one metric. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. The label 'error' should be either 'spec', 'internal', or 'none'. The label 'metric_type' corresponds to HPA.spec.metrics[*].type

action

error

metric_type

horizontal_pod_autoscaler_controller_metric_computation_total | ALPHA | Counter | Number of metric computations. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. The label 'metric_type' corresponds to HPA.spec.metrics[*].type

action

error

metric_type

horizontal_pod_autoscaler_controller_reconciliation_duration_seconds | ALPHA | Histogram | The time (in seconds) that the HPA controller takes to reconcile once. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in the `error` label.

action

error

horizontal_pod_autoscaler_controller_reconciliations_total | ALPHA | Counter | Number of reconciliations of HPA controller. The label 'action' should be either 'scale_down', 'scale_up', or 'none'. Also, the label 'error' should be either 'spec', 'internal', or 'none'. Note that if both spec and internal errors happen during a reconciliation, the first one to occur is reported in the `error` label.

action

error

job_controller_pod_failures_handled_by_failure_policy_total | ALPHA | Counter | The number of failed Pods handled by failure policy with respect to the failure policy action applied based on the matched rule. Possible values of the action label correspond to the possible values for the failure policy rule action, which are: "FailJob", "Ignore" and "Count".

action

job_controller_terminated_pods_tracking_finalizer_total | ALPHA | Counter | The number of terminated pods (phase=Failed|Succeeded) that have the finalizer batch.kubernetes.io/job-tracking. The event label can be "add" or "delete".

event

kube_apiserver_clusterip_allocator_allocated_ips | ALPHA | Gauge | Gauge measuring the number of allocated IPs for Services

cidr

kube_apiserver_clusterip_allocator_allocation_errors_total | ALPHA | Counter | Number of errors trying to allocate Cluster IPs

cidr

scope

kube_apiserver_clusterip_allocator_allocation_total | ALPHA | Counter | Number of Cluster IPs allocations

cidr

scope

kube_apiserver_clusterip_allocator_available_ips | ALPHA | Gauge | Gauge measuring the number of available IPs for Services

cidr
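The allocated_ips and available_ips gauges above share the cidr label, so they can be combined into a utilization figure per Service CIDR. The sketch below assumes that allocated plus available equals the usable size of the range, which the descriptions imply but do not state outright. The function name and the sample numbers are hypothetical.

```python
def clusterip_utilization(allocated: float, available: float) -> float:
    """Fraction of a Service CIDR that is in use, from the two gauges.

    Assumes allocated + available is the usable size of the range.
    Returns 0.0 when both gauges are zero.
    """
    total = allocated + available
    return allocated / total if total else 0.0


# Hypothetical gauge values for a single cidr label value.
utilization = clusterip_utilization(allocated=64, available=192)
print(f"{utilization:.0%}")
```

A value approaching 1.0 means new Services may soon fail ClusterIP allocation for that range.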

kube_apiserver_nodeport_allocator_allocated_ports | ALPHA | Gauge | Gauge measuring the number of allocated NodePorts for Services

kube_apiserver_nodeport_allocator_available_ports | ALPHA | Gauge | Gauge measuring the number of available NodePorts for Services

kube_apiserver_pod_logs_backend_tls_failure_total | ALPHA | Counter | Total number of requests for pods/logs that failed due to kubelet server TLS verification

kube_apiserver_pod_logs_insecure_backend_total | ALPHA | Counter | Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied

usage

kube_apiserver_pod_logs_pods_logs_backend_tls_failure_total | ALPHA | Counter | Total number of requests for pods/logs that failed due to kubelet server TLS verification

Deprecated version: 1.27.0

kube_apiserver_pod_logs_pods_logs_insecure_backend_total | ALPHA | Counter | Total number of requests for pods/logs sliced by usage type: enforce_tls, skip_tls_allowed, skip_tls_denied

usage

Deprecated version: 1.27.0

kubelet_active_pods | ALPHA | Gauge | The number of pods the kubelet considers active and which are being considered when admitting new pods. static is true if the pod is not from the apiserver.

static

kubelet_certificate_manager_client_expiration_renew_errors | ALPHA | Counter | Counter of certificate renewal errors.

kubelet_certificate_manager_client_ttl_seconds | ALPHA | Gauge | Gauge of the TTL (time-to-live) of the Kubelet's client certificate. The value is in seconds until certificate expiry (negative if already expired). If client certificate is invalid or unused, the value will be +INF.

kubelet_certificate_manager_server_rotation_seconds | ALPHA | Histogram | Histogram of the number of seconds the previous certificate lived before being rotated.

kubelet_certificate_manager_server_ttl_seconds | ALPHA | Gauge | Gauge of the shortest TTL (time-to-live) of the Kubelet's serving certificate. The value is in seconds until certificate expiry (negative if already expired). If serving certificate is invalid or unused, the value will be +INF.

kubelet_cgroup_manager_duration_seconds | ALPHA | Histogram | Duration in seconds for cgroup manager operations. Broken down by method.

operation_type
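The value semantics of the two certificate TTL gauges listed above (kubelet_certificate_manager_client_ttl_seconds and kubelet_certificate_manager_server_ttl_seconds) can be sketched as follows: seconds until expiry, negative once expired, and +INF when no valid certificate is in use. The function name and the use of `None` for a missing certificate are assumptions for illustration; this is not the kubelet's code.

```python
import math
from datetime import datetime, timedelta, timezone


def certificate_ttl_seconds(not_after, now):
    """Seconds until a certificate expires, following the documented rules.

    not_after: expiry time of the certificate, or None when the certificate
    is invalid or unused (hypothetical convention for this sketch).
    """
    if not_after is None:
        return math.inf
    return (not_after - now).total_seconds()


now = datetime(2023, 6, 1, tzinfo=timezone.utc)
valid = certificate_ttl_seconds(now + timedelta(hours=1), now)
expired = certificate_ttl_seconds(now - timedelta(minutes=5), now)
unused = certificate_ttl_seconds(None, now)
print(valid, expired, unused)
```

An alert on these gauges would therefore need to treat +INF as "no certificate" rather than "plenty of time left".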

kubelet_container_log_filesystem_used_bytesALPHACustomBytes used by the container's logs on the filesystem.

uid

namespace

pod

container

kubelet_containers_per_pod_countALPHAHistogramThe number of containers per pod.kubelet_cpu_manager_pinning_errors_totalALPHACounterThe number of cpu core allocations which required pinning failed.kubelet_cpu_manager_pinning_requests_totalALPHACounterThe number of cpu core allocations which required pinning.kubelet_credential_provider_plugin_durationALPHAHistogramDuration of execution in seconds for credential provider plugin

plugin_name

kubelet_credential_provider_plugin_errors ALPHA Counter Number of errors from credential provider plugin

plugin_name

kubelet_desired_pods ALPHA Gauge The number of pods the kubelet is being instructed to run. static is true if the pod is not from the apiserver.

static

kubelet_device_plugin_alloc_duration_seconds ALPHA Histogram Duration in seconds to serve a device plugin Allocation request. Broken down by resource name.

resource_name

kubelet_device_plugin_registration_total ALPHA Counter Cumulative number of device plugin registrations. Broken down by resource name.

resource_name

kubelet_evented_pleg_connection_error_count ALPHA Counter The number of errors encountered during the establishment of streaming connection with the CRI runtime.

kubelet_evented_pleg_connection_latency_seconds ALPHA Histogram The latency of streaming connection with the CRI runtime, measured in seconds.

kubelet_evented_pleg_connection_success_count ALPHA Counter The number of times a streaming client was obtained to receive CRI Events.

kubelet_eviction_stats_age_seconds ALPHA Histogram Time between when stats are collected, and when pod is evicted based on those stats by eviction signal

eviction_signal

kubelet_evictions ALPHA Counter Cumulative number of pod evictions by eviction signal

eviction_signal

kubelet_graceful_shutdown_end_time_seconds ALPHA Gauge Last graceful shutdown end time since unix epoch in seconds

kubelet_graceful_shutdown_start_time_seconds ALPHA Gauge Last graceful shutdown start time since unix epoch in seconds

kubelet_http_inflight_requests ALPHA Gauge Number of the inflight http requests

long_running

method

path

server_type

kubelet_http_requests_duration_seconds ALPHA Histogram Duration in seconds to serve http requests

long_running

method

path

server_type

kubelet_http_requests_total ALPHA Counter Number of the http requests received since the server started

long_running

method

path

server_type

kubelet_lifecycle_handler_http_fallbacks_total ALPHA Counter The number of times lifecycle handlers successfully fell back to http from https.

kubelet_managed_ephemeral_containers ALPHA Gauge Current number of ephemeral containers in pods managed by this kubelet.

kubelet_mirror_pods ALPHA Gauge The number of mirror pods the kubelet will try to create (one per admitted static pod)

kubelet_node_name ALPHA Gauge The node's name. The count is always 1.

node

kubelet_orphan_pod_cleaned_volumes ALPHA Gauge The total number of orphaned Pods whose volumes were cleaned in the last periodic sweep.

kubelet_orphan_pod_cleaned_volumes_errors ALPHA Gauge The number of orphaned Pods whose volumes failed to be cleaned in the last periodic sweep.

kubelet_orphaned_runtime_pods_total ALPHA Counter Number of pods that have been detected in the container runtime without being already known to the pod worker. This typically indicates the kubelet was restarted while a pod was force deleted in the API or in the local configuration, which is unusual.

kubelet_pleg_discard_events ALPHA Counter The number of discard events in PLEG.

kubelet_pleg_last_seen_seconds ALPHA Gauge Timestamp in seconds when PLEG was last seen active.

kubelet_pleg_relist_duration_seconds ALPHA Histogram Duration in seconds for relisting pods in PLEG.

kubelet_pleg_relist_interval_seconds ALPHA Histogram Interval in seconds between relisting in PLEG.

kubelet_pod_resources_endpoint_errors_get ALPHA Counter Number of requests to the PodResource Get endpoint which returned error. Broken down by server api version.

server_api_version

kubelet_pod_resources_endpoint_errors_get_allocatable ALPHA Counter Number of requests to the PodResource GetAllocatableResources endpoint which returned error. Broken down by server api version.

server_api_version

kubelet_pod_resources_endpoint_errors_list ALPHA Counter Number of requests to the PodResource List endpoint which returned error. Broken down by server api version.

server_api_version

kubelet_pod_resources_endpoint_requests_get ALPHA Counter Number of requests to the PodResource Get endpoint. Broken down by server api version.

server_api_version

kubelet_pod_resources_endpoint_requests_get_allocatable ALPHA Counter Number of requests to the PodResource GetAllocatableResources endpoint. Broken down by server api version.

server_api_version

kubelet_pod_resources_endpoint_requests_list ALPHA Counter Number of requests to the PodResource List endpoint. Broken down by server api version.

server_api_version

kubelet_pod_resources_endpoint_requests_total ALPHA Counter Cumulative number of requests to the PodResource endpoint. Broken down by server api version.

server_api_version

kubelet_pod_start_duration_seconds ALPHA Histogram Duration in seconds from kubelet seeing a pod for the first time to the pod starting to run

kubelet_pod_start_sli_duration_seconds ALPHA Histogram Duration in seconds to start a pod, excluding time to pull images and run init containers, measured from pod creation timestamp to when all its containers are reported as started and observed via watch

kubelet_pod_status_sync_duration_seconds ALPHA Histogram Duration in seconds to sync a pod status update. Measures time from detection of a change to pod status until the API is successfully updated for that pod, even if multiple intervening changes to pod status occur.

kubelet_pod_worker_duration_seconds ALPHA Histogram Duration in seconds to sync a single pod. Broken down by operation type: create, update, or sync

operation_type

kubelet_pod_worker_start_duration_seconds ALPHA Histogram Duration in seconds from kubelet seeing a pod to starting a worker.

kubelet_preemptions ALPHA Counter Cumulative number of pod preemptions by preemption resource

preemption_signal

kubelet_restarted_pods_total ALPHA Counter Number of pods that have been restarted because they were deleted and recreated with the same UID while the kubelet was watching them (common for static pods, extremely uncommon for API pods)

static

kubelet_run_podsandbox_duration_seconds ALPHA Histogram Duration in seconds of the run_podsandbox operations. Broken down by RuntimeClass.Handler.

runtime_handler

kubelet_run_podsandbox_errors_total ALPHA Counter Cumulative number of the run_podsandbox operation errors by RuntimeClass.Handler.

runtime_handler

kubelet_running_containers ALPHA Gauge Number of containers currently running

container_state

kubelet_running_pods ALPHA Gauge Number of pods that have a running pod sandbox

kubelet_runtime_operations_duration_seconds ALPHA Histogram Duration in seconds of runtime operations. Broken down by operation type.

operation_type

kubelet_runtime_operations_errors_total ALPHA Counter Cumulative number of runtime operation errors by operation type.

operation_type

kubelet_runtime_operations_total ALPHA Counter Cumulative number of runtime operations by operation type.

operation_type

kubelet_server_expiration_renew_errors ALPHA Counter Counter of certificate renewal errors.

kubelet_started_containers_errors_total ALPHA Counter Cumulative number of errors when starting containers

code

container_type

kubelet_started_containers_total ALPHA Counter Cumulative number of containers started

container_type

kubelet_started_host_process_containers_errors_total ALPHA Counter Cumulative number of errors when starting hostprocess containers. This metric will only be collected on Windows.

code

container_type

kubelet_started_host_process_containers_total ALPHA Counter Cumulative number of hostprocess containers started. This metric will only be collected on Windows.

container_type

kubelet_started_pods_errors_total ALPHA Counter Cumulative number of errors when starting pods

kubelet_started_pods_total ALPHA Counter Cumulative number of pods started

kubelet_topology_manager_admission_duration_ms ALPHA Histogram Duration in milliseconds to serve a pod admission request.

kubelet_topology_manager_admission_errors_total ALPHA Counter The number of admission request failures where resources could not be aligned.

kubelet_topology_manager_admission_requests_total ALPHA Counter The number of admission requests where resources have to be aligned.

kubelet_volume_metric_collection_duration_seconds ALPHA Histogram Duration in seconds to calculate volume stats

metric_source

kubelet_volume_stats_available_bytes ALPHA Custom Number of available bytes in the volume

namespace

persistentvolumeclaim

kubelet_volume_stats_capacity_bytes ALPHA Custom Capacity in bytes of the volume

namespace

persistentvolumeclaim

kubelet_volume_stats_health_status_abnormal ALPHA Custom Abnormal volume health status. The count is either 1 or 0. 1 indicates the volume is unhealthy, 0 indicates volume is healthy

namespace

persistentvolumeclaim

kubelet_volume_stats_inodes ALPHA Custom Maximum number of inodes in the volume

namespace

persistentvolumeclaim

kubelet_volume_stats_inodes_free ALPHA Custom Number of free inodes in the volume

namespace

persistentvolumeclaim

kubelet_volume_stats_inodes_used ALPHA Custom Number of used inodes in the volume

namespace

persistentvolumeclaim

kubelet_volume_stats_used_bytes ALPHA Custom Number of used bytes in the volume

namespace

persistentvolumeclaim
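
The kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes gauges share the namespace and persistentvolumeclaim labels, so samples for the same claim can be combined into a fill percentage. The sketch below shows that arithmetic only; the function name is hypothetical, and it assumes both samples were scraped for the same claim at roughly the same time.

```python
def volume_usage_percent(used_bytes: float, capacity_bytes: float) -> float:
    """Return the used share of a volume as a percentage (hypothetical helper)."""
    if capacity_bytes <= 0:
        # A non-positive capacity cannot be turned into a meaningful ratio.
        raise ValueError("capacity_bytes must be positive")
    return 100.0 * used_bytes / capacity_bytes


# Example: 50 bytes used on a 200-byte volume.
print(volume_usage_percent(50, 200))
```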

kubelet_working_pods ALPHA Gauge Number of pods the kubelet is actually running, broken down by lifecycle phase, whether the pod is desired, orphaned, or runtime only (also orphaned), and whether the pod is static. An orphaned pod has been removed from local configuration or force deleted in the API and consumes resources that are not otherwise visible.

config

lifecycle

static

kubeproxy_network_programming_duration_seconds ALPHA Histogram In Cluster Network Programming Latency in seconds

kubeproxy_proxy_healthz_total ALPHA Counter Cumulative proxy healthz HTTP status

code

kubeproxy_proxy_livez_total ALPHA Counter Cumulative proxy livez HTTP status

code

kubeproxy_sync_full_proxy_rules_duration_seconds ALPHA Histogram SyncProxyRules latency in seconds for full resyncs

kubeproxy_sync_partial_proxy_rules_duration_seconds ALPHA Histogram SyncProxyRules latency in seconds for partial resyncs

kubeproxy_sync_proxy_rules_duration_seconds ALPHA Histogram SyncProxyRules latency in seconds

kubeproxy_sync_proxy_rules_endpoint_changes_pending ALPHA Gauge Pending proxy rules Endpoint changes

kubeproxy_sync_proxy_rules_endpoint_changes_total ALPHA Counter Cumulative proxy rules Endpoint changes

kubeproxy_sync_proxy_rules_iptables_last ALPHA Gauge Number of iptables rules written by kube-proxy in last sync

table

kubeproxy_sync_proxy_rules_iptables_partial_restore_failures_total ALPHA Counter Cumulative proxy iptables partial restore failures

kubeproxy_sync_proxy_rules_iptables_restore_failures_total ALPHA Counter Cumulative proxy iptables restore failures

kubeproxy_sync_proxy_rules_iptables_total ALPHA Gauge Total number of iptables rules owned by kube-proxy

table

kubeproxy_sync_proxy_rules_last_queued_timestamp_seconds ALPHA Gauge The last time a sync of proxy rules was queued

kubeproxy_sync_proxy_rules_last_timestamp_seconds ALPHA Gauge The last time proxy rules were successfully synced

kubeproxy_sync_proxy_rules_no_local_endpoints_total ALPHA Gauge Number of services with a Local traffic policy and no endpoints

traffic_policy

kubeproxy_sync_proxy_rules_service_changes_pending ALPHA Gauge Pending proxy rules Service changes

kubeproxy_sync_proxy_rules_service_changes_total ALPHA Counter Cumulative proxy rules Service changes

kubernetes_build_info ALPHA Gauge A metric with a constant '1' value labeled by major, minor, git version, git commit, git tree state, build date, Go version, and compiler from which Kubernetes was built, and platform on which it is running.

build_date

compiler

git_commit

git_tree_state

git_version

go_version

major

minor

platform

leader_election_master_status ALPHA Gauge Gauge indicating whether the reporting system is master of the relevant lease: 0 indicates backup, 1 indicates master. 'name' is the string used to identify the lease. Please make sure to group by name.

name
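
The advice to group leader_election_master_status by name matters because several components can each hold a different lease, and summing across leases hides which instance leads which one. A minimal sketch of that grouping over already-scraped samples follows; the tuple layout (lease name, reporting instance, value) and the instance names are assumptions made for illustration.

```python
from collections import defaultdict


def leaders_by_lease(samples):
    """Group instances reporting value 1 under the lease name they lead.

    samples: iterable of (lease_name, instance, value) tuples (assumed layout).
    """
    leaders = defaultdict(list)
    for name, instance, value in samples:
        if value == 1:  # 1 indicates master, 0 indicates backup
            leaders[name].append(instance)
    return dict(leaders)


# Hypothetical samples from two components with two replicas each.
samples = [
    ("kube-scheduler", "node-a", 1),
    ("kube-scheduler", "node-b", 0),
    ("kube-controller-manager", "node-a", 0),
    ("kube-controller-manager", "node-b", 1),
]
print(leaders_by_lease(samples))
```

A lease that is missing from the result, or that lists more than one instance, would be worth investigating.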

node_authorizer_graph_actions_duration_seconds ALPHA Histogram Histogram of duration of graph actions in node authorizer.

operation

node_collector_unhealthy_nodes_in_zone ALPHA Gauge Gauge measuring number of not Ready Nodes per zone.

zone

node_collector_update_all_nodes_health_duration_seconds ALPHA Histogram Duration in seconds for NodeController to update the health of all nodes.

node_collector_update_node_health_duration_seconds ALPHA Histogram Duration in seconds for NodeController to update the health of a single node.

node_collector_zone_health ALPHA Gauge Gauge measuring percentage of healthy nodes per zone.

zone

node_collector_zone_size ALPHA Gauge Gauge measuring number of registered Nodes per zone.

zone
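
The three per-zone gauges above (node_collector_zone_size, node_collector_unhealthy_nodes_in_zone, and node_collector_zone_health) describe the same population from different angles. The sketch below shows one way the percentage could be derived from the two counts. It assumes that "healthy" means "registered and not counted as not Ready"; the reference text does not state the controller's exact formula, so treat this as an illustration rather than the implementation.

```python
def zone_health_percent(zone_size: int, unhealthy_nodes: int) -> float:
    """Percentage of healthy nodes in a zone (assumed relationship, hypothetical helper)."""
    if zone_size <= 0:
        # An empty zone has no meaningful health percentage.
        raise ValueError("zone_size must be positive")
    return 100.0 * (zone_size - unhealthy_nodes) / zone_size


# Example: 10 registered nodes, 2 of them not Ready.
print(zone_health_percent(10, 2))
```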

node_controller_cloud_provider_taint_removal_delay_seconds ALPHA Histogram Number of seconds after node creation when NodeController removed the cloud-provider taint of a single node.

node_controller_initial_node_sync_delay_seconds ALPHA Histogram Number of seconds after node creation when NodeController finished the initial synchronization of a single node.

node_cpu_usage_seconds_total ALPHA Custom Cumulative cpu time consumed by the node in core-seconds

node_ipam_controller_cidrset_allocation_tries_per_request ALPHA Histogram Histogram measuring CIDR allocation tries per request.

clusterCIDR

node_ipam_controller_cidrset_cidrs_allocations_total ALPHA Counter Counter measuring total number of CIDR allocations.

clusterCIDR

node_ipam_controller_cidrset_cidrs_releases_total ALPHA Counter Counter measuring total number of CIDR releases.

clusterCIDR

node_ipam_controller_cidrset_usage_cidrs ALPHA Gauge Gauge measuring percentage of allocated CIDRs.

clusterCIDR

node_ipam_controller_cirdset_max_cidrs ALPHA Gauge Maximum number of CIDRs that can be allocated.

clusterCIDR

node_ipam_controller_multicidrset_allocation_tries_per_request ALPHA Histogram Histogram measuring CIDR allocation tries per request.

clusterCIDR

node_ipam_controller_multicidrset_cidrs_allocations_total ALPHA Counter Counter measuring total number of CIDR allocations.

clusterCIDR

node_ipam_controller_multicidrset_cidrs_releases_total ALPHA Counter Counter measuring total number of CIDR releases.

clusterCIDR

node_ipam_controller_multicidrset_usage_cidrs ALPHA Gauge Gauge measuring percentage of allocated CIDRs.

clusterCIDR

node_ipam_controller_multicirdset_max_cidrs ALPHA Gauge Maximum number of CIDRs that can be allocated.

clusterCIDR

node_memory_working_set_bytes ALPHA Custom Current working set of the node in bytes

node_swap_usage_bytes ALPHA Custom Current swap usage of the node in bytes. Reported only on non-windows systems

number_of_l4_ilbs ALPHA Gauge Number of L4 ILBs

feature

plugin_manager_total_plugins ALPHA Custom Number of plugins in Plugin Manager

socket_path

state

pod_cpu_usage_seconds_total ALPHA Custom Cumulative cpu time consumed by the pod in core-seconds

pod

namespace

pod_gc_collector_force_delete_pod_errors_total ALPHA Counter Number of errors encountered when forcefully deleting the pods since the Pod GC Controller started.

namespace

reason

pod_gc_collector_force_delete_pods_total ALPHA Counter Number of pods that are being forcefully deleted since the Pod GC Controller started.

namespace

reason

pod_memory_working_set_bytes ALPHA Custom Current working set of the pod in bytes

pod

namespace

pod_security_errors_total ALPHA Counter Number of errors preventing normal evaluation. Non-fatal errors may result in the latest restricted profile being used for evaluation.

fatal

request_operation

resource

subresource

pod_security_evaluations_total ALPHA Counter Number of policy evaluations that occurred, not counting ignored or exempt requests.

decision

mode

policy_level

policy_version

request_operation

resource

subresource

pod_security_exemptions_total ALPHA Counter Number of exempt requests, not counting ignored or out of scope requests.

request_operation

resource

subresource

pod_swap_usage_bytes ALPHA Custom Current amount of the pod swap usage in bytes. Reported only on non-windows systems

pod

namespace

prober_probe_duration_seconds ALPHA Histogram Duration in seconds for a probe response.

container

namespace

pod

probe_type

prober_probe_total ALPHA Counter Cumulative number of liveness, readiness or startup probes for a container by result.

container

namespace

pod

pod_uid

probe_type

result

pv_collector_bound_pv_count ALPHA Custom Gauge measuring number of persistent volumes currently bound

storage_class

pv_collector_bound_pvc_count ALPHA Custom Gauge measuring number of persistent volume claims currently bound

namespace

pv_collector_total_pv_count ALPHA Custom Gauge measuring total number of persistent volumes

plugin_name

volume_mode

pv_collector_unbound_pv_count ALPHA Custom Gauge measuring number of persistent volumes currently unbound

storage_class

pv_collector_unbound_pvc_count ALPHA Custom Gauge measuring number of persistent volume claims currently unbound

namespace

reconstruct_volume_operations_errors_total ALPHA Counter The number of volumes that failed reconstruction from the operating system during kubelet startup.

reconstruct_volume_operations_total ALPHA Counter The number of volumes that were attempted to be reconstructed from the operating system during kubelet startup. This includes both successful and failed reconstruction.

replicaset_controller_sorting_deletion_age_ratio ALPHA Histogram The ratio of chosen deleted pod's ages to the current youngest pod's age (at the time). Should be <2. The intent of this metric is to measure the rough efficacy of the LogarithmicScaleDown feature gate's effect on the sorting (and deletion) of pods when a replicaset scales down. This only considers Ready pods when calculating and reporting.

resourceclaim_controller_create_attempts_total ALPHA Counter Number of ResourceClaims creation requests

resourceclaim_controller_create_failures_total ALPHA Counter Number of ResourceClaims creation request failures

rest_client_dns_resolution_duration_seconds ALPHA Histogram DNS resolver latency in seconds. Broken down by host.

host
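
The replicaset_controller_sorting_deletion_age_ratio entry above describes a ratio between the age of the pod chosen for deletion and the age of the youngest pod at that moment, with values below 2 expected. The sketch below works through that arithmetic for a single scale-down decision; the function name and the example ages are hypothetical, and the controller's own bookkeeping may differ.

```python
def deletion_age_ratio(deleted_pod_age_s: float, youngest_pod_age_s: float) -> float:
    """Ratio of the deleted pod's age to the youngest pod's age (hypothetical helper)."""
    if youngest_pod_age_s <= 0:
        # Guard against division by zero for a pod created this instant.
        raise ValueError("youngest_pod_age_s must be positive")
    return deleted_pod_age_s / youngest_pod_age_s


# Example: the deleted pod was 90 s old while the youngest pod was 60 s old.
ratio = deletion_age_ratio(90.0, 60.0)
print(ratio, ratio < 2)
```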

rest_client_exec_plugin_call_total ALPHA Counter Number of calls to an exec plugin, partitioned by the type of event encountered (no_error, plugin_execution_error, plugin_not_found_error, client_internal_error) and an optional exit code. The exit code will be set to 0 if and only if the plugin call was successful.

call_status

code

rest_client_exec_plugin_certificate_rotation_age ALPHA Histogram Histogram of the number of seconds the last auth exec plugin client certificate lived before being rotated. If auth exec plugin client certificates are unused, histogram will contain no data.

rest_client_exec_plugin_ttl_seconds ALPHA Gauge Gauge of the shortest TTL (time-to-live) of the client certificate(s) managed by the auth exec plugin. The value is in seconds until certificate expiry (negative if already expired). If auth exec plugins are unused or manage no TLS certificates, the value will be +INF.

rest_client_rate_limiter_duration_seconds ALPHA Histogram Client side rate limiter latency in seconds. Broken down by verb, and host.

host

verb

rest_client_request_duration_seconds ALPHA Histogram Request latency in seconds. Broken down by verb, and host.

host

verb

rest_client_request_retries_total ALPHA Counter Number of request retries, partitioned by status code, verb, and host.

code

host

verb

rest_client_request_size_bytes ALPHA Histogram Request size in bytes. Broken down by verb and host.

host

verb

rest_client_requests_total ALPHA Counter Number of HTTP requests, partitioned by status code, method, and host.

code

host

method

rest_client_response_size_bytes ALPHA Histogram Response size in bytes. Broken down by verb and host.

host

verb

rest_client_transport_cache_entries ALPHA Gauge Number of transport entries in the internal cache.

rest_client_transport_create_calls_total ALPHA Counter Number of calls to get a new transport, partitioned by the result of the operation: hit (obtained from the cache), miss (created and added to the cache), uncacheable (created and not cached).

result
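
Because rest_client_transport_create_calls_total is split by the result label into hit, miss, and uncacheable, the share of calls served from the cache can be derived from the three counter values. The sketch below assumes the counts have already been read into a dictionary keyed by result; the function name and the sample numbers are hypothetical.

```python
def transport_cache_hit_ratio(counts: dict) -> float:
    """Fraction of transport create calls answered from the cache (hypothetical helper).

    counts maps each value of the result label to its counter value.
    """
    hits = counts.get("hit", 0)
    total = hits + counts.get("miss", 0) + counts.get("uncacheable", 0)
    if total == 0:
        # No calls recorded yet, so report 0.0 rather than dividing by zero.
        return 0.0
    return hits / total


# Hypothetical counter values after some client activity.
print(transport_cache_hit_ratio({"hit": 90, "miss": 8, "uncacheable": 2}))
```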

retroactive_storageclass_errors_total ALPHA Counter Total number of failed retroactive StorageClass assignments to persistent volume claim

retroactive_storageclass_total ALPHA Counter Total number of retroactive StorageClass assignments to persistent volume claim

root_ca_cert_publisher_sync_duration_seconds ALPHA Histogram Duration in seconds of namespace syncs in root ca cert publisher.

code

root_ca_cert_publisher_sync_total ALPHA Counter Number of namespace syncs happened in root ca cert publisher.

code

running_managed_controllers ALPHA Gauge Indicates where instances of a controller are currently running

manager

name

scheduler_goroutines ALPHA Gauge Number of running goroutines split by the work they do such as binding.

operation

scheduler_permit_wait_duration_seconds ALPHA Histogram Duration of waiting on permit.

result

scheduler_plugin_evaluation_total ALPHA Counter Number of attempts to schedule pods by each plugin and the extension point (available only in PreFilter and Filter).

extension_point

plugin

profile

scheduler_plugin_execution_duration_seconds ALPHA Histogram Duration for running a plugin at a specific extension point.

extension_point

plugin

status

scheduler_scheduler_cache_size ALPHA Gauge Number of nodes, pods, and assumed (bound) pods in the scheduler cache.

type

scheduler_scheduling_algorithm_duration_seconds ALPHA Histogram Scheduling algorithm latency in seconds

scheduler_unschedulable_pods ALPHA Gauge The number of unschedulable pods broken down by plugin name. A pod will increment the gauge for all plugins that caused it to not schedule, so this metric has meaning only when broken down by plugin.

plugin

profile

scheduler_volume_binder_cache_requests_total ALPHA Counter Total number for request volume binding cache

operation

scheduler_volume_scheduling_stage_error_total ALPHA Counter Volume scheduling stage error count

operation

scrape_error ALPHA Custom 1 if there was an error while getting container metrics, 0 otherwise

service_controller_loadbalancer_sync_total ALPHA Counter A metric counting the number of times any load balancer has been configured, as an effect of service/node changes on the cluster

service_controller_nodesync_error_total ALPHA Counter A metric counting the number of times any load balancer has been configured and errored, as an effect of node changes on the cluster

service_controller_nodesync_latency_seconds ALPHA Histogram A metric measuring the latency for nodesync which updates loadbalancer hosts on cluster node updates.

service_controller_update_loadbalancer_host_latency_seconds ALPHA Histogram A metric measuring the latency for updating each load balancer's hosts.

serviceaccount_legacy_tokens_total ALPHA Counter Cumulative legacy service account tokens used

serviceaccount_stale_tokens_total ALPHA Counter Cumulative stale projected service account tokens used

serviceaccount_valid_tokens_total ALPHA Counter Cumulative valid projected service account tokens used

storage_count_attachable_volumes_in_use ALPHA Custom Measure number of volumes in use

node

volume_plugin

storage_operation_duration_seconds ALPHA Histogram Storage operation duration

migrated

operation_name

status

volume_plugin

ttl_after_finished_controller_job_deletion_duration_seconds ALPHA Histogram The time it took to delete the job since it became eligible for deletion

volume_manager_selinux_container_errors_total ALPHA Gauge Number of errors when kubelet cannot compute SELinux context for a container. Kubelet can't start such a Pod then and it will retry, therefore value of this metric may not represent the actual number of containers.

volume_manager_selinux_container_warnings_total ALPHA Gauge Number of errors when kubelet cannot compute SELinux context for a container that are ignored. They will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.

volume_manager_selinux_pod_context_mismatch_errors_total ALPHA Gauge Number of errors when a Pod defines different SELinux contexts for its containers that use the same volume. Kubelet can't start such a Pod then and it will retry, therefore value of this metric may not represent the actual number of Pods.

volume_manager_selinux_pod_context_mismatch_warnings_total ALPHA Gauge Number of errors when a Pod defines different SELinux contexts for its containers that use the same volume. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.

volume_manager_selinux_volume_context_mismatch_errors_total ALPHA Gauge Number of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. Kubelet can't start such a Pod then and it will retry, therefore value of this metric may not represent the actual number of Pods.

volume_manager_selinux_volume_context_mismatch_warnings_total ALPHA Gauge Number of errors when a Pod uses a volume that is already mounted with a different SELinux context than the Pod needs. They are not errors yet, but they will become real errors when SELinuxMountReadWriteOncePod feature is expanded to all volume access modes.

volume_manager_selinux_volumes_admitted_total ALPHA Gauge Number of volumes whose SELinux context was fine and will be mounted with mount -o context option.

volume_manager_total_volumes ALPHA Custom Number of volumes in Volume Manager

plugin_name

state

volume_operation_total_errors ALPHA Counter Total volume operation errors

operation_name

plugin_name

volume_operation_total_seconds ALPHA Histogram Storage operation end to end duration in seconds

operation_name

plugin_name

watch_cache_capacity ALPHA Gauge Total capacity of watch cache broken down by resource type.

resource

watch_cache_capacity_decrease_total ALPHA Counter Total number of watch cache capacity decrease events broken down by resource type.

resource

watch_cache_capacity_increase_total ALPHA Counter Total number of watch cache capacity increase events broken down by resource type.

resource

workqueue_adds_total ALPHA Counter Total number of adds handled by workqueue

name

workqueue_depth ALPHA Gauge Current depth of workqueue

name

workqueue_longest_running_processor_seconds ALPHA Gauge How many seconds has the longest running processor for workqueue been running.

name

workqueue_queue_duration_seconds ALPHA Histogram How long in seconds an item stays in workqueue before being requested.

name

workqueue_retries_total ALPHA Counter Total number of retries handled by workqueue

name

workqueue_unfinished_work_seconds ALPHA Gauge How many seconds of work has been done that is still in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.

name
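
The note on workqueue_unfinished_work_seconds about deducing stuck threads from the rate of increase follows from simple arithmetic: every item that stays in progress adds one second of unfinished work per elapsed second, so a gauge growing by about N per second suggests roughly N stuck workers. The sketch below computes that slope from two samples; the function name and the sample values are hypothetical, and the estimate only holds while no in-progress item completes between the two samples.

```python
def estimate_stuck_workers(t0: float, v0: float, t1: float, v1: float) -> float:
    """Estimate stuck workers as the gauge's growth per second (hypothetical helper).

    (t0, v0) and (t1, v1) are (unix time, gauge value) samples for one queue name.
    """
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (v1 - v0) / (t1 - t0)


# Example: the gauge grew from 100 s to 220 s over a 60 s window.
print(estimate_stuck_workers(0.0, 100.0, 60.0, 220.0))
```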

workqueue_work_duration_seconds ALPHA Histogram How long in seconds processing an item from workqueue takes.

name

Article information

Author: Rev. Leonie Wyman

Last Updated: 10/08/2023
