Kubernetes

DESCRIPTION

SignalFx monitors applications, services, and infrastructure in your Kubernetes environment using the SignalFx Smart Agent for Kubernetes. By default, the Smart Agent for Kubernetes is packaged in a container for deployment as a DaemonSet on each node in your Kubernetes cluster. Install the SignalFx Smart Agent on all Linux-hosted Kubernetes nodes for which you want to collect metrics.

The Smart Agent is installed with a set of pre-configured monitors that collect metrics from the software and services it discovers on the nodes where it is installed. Metrics from the Smart Agent kubernetes-cluster and kubelet-stats monitors automatically populate built-in dashboards in SignalFx.

INSTALLATION

Requirements

  • Kubernetes 1.11 or higher
  • OpenShift 3.10 or higher
  • Linux-hosted Kubernetes nodes

Dependencies

  • Install and configure the Helm client. For details, see the Helm installation documentation.
  • If you use Helm 2, install the Tiller component on your Kubernetes cluster. Helm 3 does not use Tiller.

We recommend that you use Helm to install and configure the SignalFx Smart Agent in your Kubernetes environment. Helm allows multiple resources (that is, the DaemonSet, ConfigMap, ClusterRole, and ClusterRoleBinding) to be configured by a single command. If you want to install the SignalFx Smart Agent on your Kubernetes cluster using kubectl instead, see Kubernetes Advanced Installation.

Install the SignalFx Smart Agent on your Kubernetes cluster using Helm

  1. Add the SignalFx Helm chart repository to Helm.
$ helm repo add signalfx https://dl.signalfx.com/helm-repo
  2. Make sure your local copy of the repository is up to date.
$ helm repo update
  3. Install the Smart Agent chart with the necessary configuration values for the chart.
$ helm install --set signalFxAccessToken=<YOUR_ACCESS_TOKEN> --set clusterName=<YOUR_CLUSTER_NAME> --set agentVersion=<VERSION_NUMBER> --set signalFxRealm=<YOUR_SIGNALFX_REALM> <SIGNALFX_AGENT_APPNAME> signalfx/signalfx-agent
Config options (each entry lists whether the option is required or optional, followed by its description):

  • YOUR_ACCESS_TOKEN (Required): The token used to authenticate your connection to SignalFx.
  • YOUR_CLUSTER_NAME (Required, unless you override the Smart Agent config template and provide your own cluster name): A name that is applied as the kubernetes-cluster dimension to any metric originating in this cluster.
  • VERSION_NUMBER (Optional): The version to install. By default, the latest released version is installed.
  • YOUR_SIGNALFX_REALM (Required): The name of the realm in which your organization is hosted. The realm name is shown on your profile page in the SignalFx web application.
  • SIGNALFX_AGENT_APPNAME (Optional): A name to identify the Smart Agent. Alternatively, specify --generate-name to generate an identifier for the Smart Agent automatically.
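
The same settings can also be kept in a values file and passed to Helm with -f, which may be easier to review than a long list of --set flags. The following is a minimal sketch, not part of the official instructions: the helper name write_values_file and the file name values.yaml are assumptions for illustration, and only keys that appear in the command above are used. The script prints the install command instead of running it.

```shell
#!/bin/sh
# Sketch: store the chart settings from the --set flags in a values file.
# write_values_file is a hypothetical helper; the values are placeholders
# that must be replaced before installing.
write_values_file() {
  # $1 = path of the values file to create
  cat > "$1" <<'EOF'
signalFxAccessToken: "<YOUR_ACCESS_TOKEN>"
clusterName: "<YOUR_CLUSTER_NAME>"
signalFxRealm: "<YOUR_SIGNALFX_REALM>"
EOF
}

write_values_file values.yaml

# Print the matching install command so the file can be reviewed first.
echo "helm install -f values.yaml <SIGNALFX_AGENT_APPNAME> signalfx/signalfx-agent"
```

Each top-level key in the file corresponds to one --set key from the command in step 3.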

Optionally, specify agentConfig if you want to provide your own agent configuration.

If you are using OpenShift, set kubernetesDistro to openshift to get OpenShift-specific functionality.

$ helm install --set signalFxAccessToken=<YOUR_ACCESS_TOKEN> --set clusterName=<YOUR_CLUSTER_NAME> --set agentVersion=<VERSION_NUMBER> --set signalFxRealm=<YOUR_SIGNALFX_REALM> --set kubernetesDistro=openshift <SIGNALFX_AGENT_APPNAME> signalfx/signalfx-agent

Your installation is complete.

FEATURES

Use the Kubernetes integration to monitor the health and performance of your microservices, the Kubernetes orchestration services, and the infrastructure that they are running on.

  • Discover and automatically configure the monitoring of supported services running in the containers
  • Use the built-in dashboards to view key metrics that are indicators of the health of your infrastructure and the orchestrator

Kubernetes Navigator

The new SignalFx Kubernetes Navigator gives you a real-time, at-a-glance view of the overall health and performance of your Kubernetes environment. It also provides visibility all the way through the stack as you drill down and across elements of your environment, reflecting the fact that the infrastructure, Kubernetes control plane, containers, applications, and services are all related layers, not just individual system components.

The Kubernetes Navigator selection bar provides several tabs for viewing information about your clusters, nodes, pods, containers, and workloads. Examples are shown below.

  • Node Detail: The Node Detail tab displays detailed information about a selected node, including additional properties, the workloads running on the node, and the containers on the node. The properties in the upper left are metadata about the node. If desired, you can specify a different cluster or node. The workload status indicates whether the workloads on the node are healthy.

  • Pod Detail: The Pod Detail tab displays detailed information about a selected pod, including its containers. Use this view to track the activity on one pod or across all pods in your cluster. The properties in the upper left are metadata about the pod. If desired, you can specify a different cluster, node, or pod.

Learning More

Once data is flowing, explore the Kubernetes Navigator to get familiar with the ways to visualize data from your nodes, pods, containers, and network.

METRICS

Each entry lists the metric name, its type in parentheses, and its description.

  • container_cpu_system_seconds_total (counter): Cumulative system CPU time consumed, in nanoseconds.
  • container_cpu_usage_seconds_total (counter): Cumulative CPU time consumed per CPU, in nanoseconds.
  • container_cpu_user_seconds_total (counter): Cumulative user CPU time consumed, in nanoseconds.
  • container_cpu_utilization (counter): Cumulative CPU utilization, as a percentage.
  • container_fs_io_current (gauge): Number of I/Os currently in progress.
  • container_fs_io_time_seconds_total (counter): Cumulative count of seconds spent doing I/Os.
  • container_fs_io_time_weighted_seconds_total (counter): Cumulative weighted I/O time, in seconds.
  • container_fs_limit_bytes (gauge): Number of bytes that the container may occupy on this filesystem.
  • container_fs_read_seconds_total (counter): Cumulative count of seconds spent reading.
  • container_fs_reads_merged_total (counter): Cumulative count of reads merged.
  • container_fs_reads_total (counter): Cumulative count of reads completed.
  • container_fs_sector_reads_total (counter): Cumulative count of sector reads completed.
  • container_fs_sector_writes_total (counter): Cumulative count of sector writes completed.
  • container_fs_usage_bytes (gauge): Number of bytes that are consumed by the container on this filesystem.
  • container_fs_write_seconds_total (counter): Cumulative count of seconds spent writing.
  • container_fs_writes_merged_total (counter): Cumulative count of writes merged.
  • container_fs_writes_total (counter): Cumulative count of writes completed.
  • container_last_seen (gauge): Last time a container was seen by the exporter.
  • container_memory_failcnt (counter): Number of times memory usage hit the limit.
  • container_memory_failures_total (counter): Cumulative count of memory allocation failures.
  • container_memory_usage_bytes (gauge): Current memory usage, in bytes.
  • container_memory_working_set_bytes (gauge): Current working set, in bytes.
  • container_network_receive_bytes_total (counter): Cumulative count of bytes received.
  • container_network_receive_errors_total (counter): Cumulative count of errors encountered while receiving.
  • container_network_receive_packets_dropped_total (counter): Cumulative count of packets dropped while receiving.
  • container_network_receive_packets_total (counter): Cumulative count of packets received.
  • container_network_transmit_bytes_total (counter): Cumulative count of bytes transmitted.
  • container_network_transmit_errors_total (counter): Cumulative count of errors encountered while transmitting.
  • container_network_transmit_packets_dropped_total (counter): Cumulative count of packets dropped while transmitting.
  • container_network_transmit_packets_total (counter): Cumulative count of packets transmitted.
  • container_spec_cpu_shares (gauge): CPU share of the container.
  • container_spec_memory_limit_bytes (gauge): Memory limit for the container.
  • container_spec_memory_swap_limit_bytes (gauge): Memory swap limit for the container.
  • container_start_time_seconds (gauge): Start time of the container, in seconds since the Unix epoch.
  • container_tasks_state (gauge): Number of tasks in a given state.
  • kubernetes.daemon_set.current_scheduled (gauge): The number of nodes that are running at least one daemon pod and are supposed to run the daemon pod.
  • kubernetes.daemon_set.desired_scheduled (gauge): The total number of nodes that should be running the daemon pod (including nodes currently running the daemon pod).
  • kubernetes.daemon_set.misscheduled (gauge): The number of nodes that are running the daemon pod but are not supposed to run it.
  • kubernetes.daemon_set.ready (gauge): The number of nodes that should be running the daemon pod and have one or more of the daemon pods running and ready.
  • kubernetes.deployment.available (gauge): Total number of available pods (ready for at least minReadySeconds) targeted by this deployment.
  • kubernetes.deployment.desired (gauge): Number of desired pods.
  • kubernetes.replica_set.available (gauge): Total number of available pods (ready for at least minReadySeconds) targeted by this replica set.
  • kubernetes.replica_set.desired (gauge): Number of desired pods.
  • kubernetes.replication_controller.available (gauge): Total number of available pods (ready for at least minReadySeconds) targeted by this replication controller.
  • kubernetes.replication_controller.desired (gauge): Number of desired pods.
  • kubernetes.container_restart_count (gauge): How many times the container has restarted (capped at 5 due to K8s GC).
  • kubernetes.node_ready (gauge): Whether this node is ready (1), not ready (0), or in an unknown state (-1).
  • kubernetes.pod_phase (gauge): Current phase of the pod (1 - Pending, 2 - Running, 3 - Succeeded, 4 - Failed, 5 - Unknown).
  • machine_cpu_cores (gauge): Number of CPU cores on the node.
  • machine_cpu_frequency_khz (gauge): Node's CPU frequency.
  • machine_memory_bytes (gauge): Amount of memory installed on the node.
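
The numeric values of kubernetes.pod_phase and kubernetes.node_ready listed above can be translated back into labels when post-processing exported datapoints in a script. The following is a minimal sketch; the function names pod_phase_name and node_ready_name are hypothetical helpers for illustration and are not provided by the Smart Agent.

```shell
#!/bin/sh
# Sketch: map gauge values to the labels given in the metric descriptions.
# pod_phase_name and node_ready_name are hypothetical helper names.
pod_phase_name() {
  # $1 = value of kubernetes.pod_phase
  case "$1" in
    1) echo "Pending" ;;
    2) echo "Running" ;;
    3) echo "Succeeded" ;;
    4) echo "Failed" ;;
    5) echo "Unknown" ;;
    *) echo "unrecognized value: $1" >&2; return 1 ;;
  esac
}

node_ready_name() {
  # $1 = value of kubernetes.node_ready
  case "$1" in
    1)  echo "ready" ;;
    0)  echo "not ready" ;;
    -1) echo "unknown" ;;
    *)  echo "unrecognized value: $1" >&2; return 1 ;;
  esac
}

pod_phase_name 2      # prints "Running"
node_ready_name -1    # prints "unknown"
```

Values outside the documented ranges return a non-zero status so that a calling script can detect them.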

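container_cpu_utilization

counter

Cumulative CPU utilization, as a percentage. This metric is container_cpu_usage_seconds_total/10000000. Because container_cpu_usage_seconds_total is reported in nanoseconds, its per-second rate divided by 10000000 is the percentage of one CPU in use. The metric has the counter type because users are typically interested in its rate of change (rollup: rate/sec).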
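
As a worked example of the division above, the sketch below takes two samples of container_cpu_usage_seconds_total (in nanoseconds, as stated in its description), converts the difference to a per-second rate, and divides by 10000000 to get a percentage. The function name cpu_percent and the sample values are assumptions for illustration, not part of the Smart Agent.

```shell
#!/bin/sh
# Sketch: percent CPU from two cumulative samples of
# container_cpu_usage_seconds_total, which is reported in nanoseconds.
# cpu_percent is a hypothetical helper, not a Smart Agent command.
cpu_percent() {
  # $1 = earlier sample (ns), $2 = later sample (ns), $3 = seconds between samples
  awk -v a="$1" -v b="$2" -v dt="$3" 'BEGIN {
    rate = (b - a) / dt               # nanoseconds of CPU time used per second
    printf "%.1f\n", rate / 10000000  # same divisor as in the metric definition
  }'
}

# 5,000,000,000 ns of CPU time over 10 s averages half of one CPU.
cpu_percent 0 5000000000 10   # prints 50.0
```

A container that keeps one CPU fully busy accumulates 1,000,000,000 ns per second, which this calculation reports as 100.0.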
