Consul

DESCRIPTION

This integration primarily consists of the Smart Agent monitor collectd/consul. Below is an overview of that monitor.

Smart Agent Monitor

Monitors the Consul data store by using the Consul collectd Python plugin, which collects metrics from Consul instances by querying their HTTP API endpoints.

Supports Consul 0.7.0+.

INSTALLATION

This integration is part of the SignalFx Smart Agent as the collectd/consul monitor. You should first deploy the Smart Agent to the same host as the service you want to monitor, and then continue with the configuration instructions below.

CONFIGURATION

To activate this monitor in the Smart Agent, add the following to your agent config:

monitors:  # All monitor config goes under this key
 - type: collectd/consul
   ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.
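
As a starting point, a minimal sketch of a complete monitor entry might look like the following. It sets the two required options, host and port, from the options table that follows. The address assumes a Consul agent running on the same host and serving its HTTP API on port 8500 (Consul's default HTTP port); both values are illustrative. intervalSeconds is included only as an example of a common option placed alongside the monitor-specific ones.

```yaml
monitors:
  - type: collectd/consul
    host: 127.0.0.1      # assumed: Consul agent runs on the same host
    port: 8500           # assumed: Consul's default HTTP API port
    intervalSeconds: 10  # common monitor option; illustrative value
```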

| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| pythonBinary | no | string | Path to a python binary that should be used to execute the Python code. If not set, a built-in runtime will be used. Can include arguments to the binary as well. |
| host | yes | string | |
| port | yes | integer | |
| aclToken | no | string | Consul ACL token |
| useHTTPS | no | bool | Set to true to connect to Consul using HTTPS. You can configure the certificate for the server with the caCertificate config option. (default: false) |
| telemetryServer | no | bool | (default: false) |
| telemetryHost | no | string | IP address or DNS name to which Consul is configured to send telemetry UDP packets. Relevant only if telemetryServer is set to true. (default: 0.0.0.0) |
| telemetryPort | no | integer | Port to which Consul is configured to send telemetry UDP packets. Relevant only if telemetryServer is set to true. (default: 8125) |
| enhancedMetrics | no | bool | Set to true to enable collecting all metrics from Consul's runtime telemetry sent via UDP or from the /agent/metrics endpoint. (default: false) |
| caCertificate | no | string | If the Consul server has HTTPS enabled for the API, specifies the path to the CA's certificate. |
| clientCertificate | no | string | If client-side authentication is enabled, specifies the path to the certificate file. |
| clientKey | no | string | If client-side authentication is enabled, specifies the path to the key file. |
| signalFxAccessToken | no | string | |
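
The TLS, ACL, and telemetry options can be combined in a single monitor entry. The sketch below is illustrative only: the hostname, the HTTPS port, the file paths, and the token placeholder are all assumptions. Because telemetryServer carries no description in the table, its exact behavior should be confirmed for your agent version before relying on it.

```yaml
monitors:
  - type: collectd/consul
    host: consul.example.internal                     # hypothetical hostname
    port: 8501                                        # hypothetical HTTPS API port
    useHTTPS: true
    caCertificate: /etc/consul/tls/ca.pem             # hypothetical path to the CA certificate
    clientCertificate: /etc/consul/tls/client.pem     # only if client-side authentication is enabled
    clientKey: /etc/consul/tls/client-key.pem         # only if client-side authentication is enabled
    aclToken: REPLACE_WITH_ACL_TOKEN                  # placeholder, not a real token
    telemetryServer: true
    telemetryHost: 0.0.0.0                            # default value from the table
    telemetryPort: 8125                               # default value from the table
    enhancedMetrics: true
```

Note that telemetryHost and telemetryPort describe where Consul sends its telemetry UDP packets, so Consul itself has to be configured separately to send to that address.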

METRICS

| Metric Name | Description | Type |
| --- | --- | --- |
| consul.dns.stale_queries | Number of times an agent serves a DNS query based on information from a server that is more than 5 seconds out of date | gauge |
| consul.memberlist.msg.suspect | This increments when an agent suspects another as failed when executing random probes as part of the gossip protocol | gauge |
| consul.serf.member.flap | This metric increments when an agent is marked dead and then recovers within a short time period | gauge |
| gauge.consul.catalog.nodes.total | The total number of nodes in the Consul datacenter | gauge |
| gauge.consul.catalog.nodes_by_service | Number of nodes providing a given service | gauge |
| gauge.consul.catalog.services.total | The total number of services registered with Consul in the given datacenter | gauge |
| gauge.consul.catalog.services_by_node | Number of services registered with a node | gauge |
| gauge.consul.consul.dns.domain_query.AGENT.avg | This tracks how long it takes to service forward DNS lookups on the given Consul agent | gauge |
| gauge.consul.consul.dns.domain_query.AGENT.max | This tracks the maximum time it takes to service forward DNS lookups on the given Consul agent | gauge |
| gauge.consul.consul.dns.domain_query.AGENT.min | This tracks the minimum time it takes to service forward DNS lookups on the given Consul agent | gauge |
| gauge.consul.consul.dns.ptr_query.AGENT.avg | This tracks the average time it takes to service reverse DNS lookups on the given Consul agent | gauge |
| gauge.consul.consul.dns.ptr_query.AGENT.max | This tracks the maximum time it takes to service reverse DNS lookups on the given Consul agent | gauge |
| gauge.consul.consul.dns.ptr_query.AGENT.min | This tracks the minimum time it takes to service reverse DNS lookups on the given Consul agent | gauge |
| gauge.consul.consul.leader.reconcile.avg | Time it takes the leader to reconcile the differences between Serf membership and Consul's store | gauge |
| gauge.consul.consul.rpc.query | A general measure of all read volume | gauge |
| gauge.consul.health.nodes.critical | Number of nodes for which health checks are reporting Critical state | gauge |
| gauge.consul.health.nodes.passing | Number of nodes for which health checks are reporting Passing state | gauge |
| gauge.consul.health.nodes.warning | Number of nodes for which health checks are reporting Warning state | gauge |
| gauge.consul.health.services.critical | Number of services for which health checks are reporting Critical state | gauge |
| gauge.consul.health.services.passing | Number of services for which health checks are reporting Passing state | gauge |
| gauge.consul.health.services.warning | Number of services for which health checks are reporting Warning state | gauge |
| gauge.consul.is_leader | Indicates whether the Consul server is in the leader or follower state | gauge |
| gauge.consul.network.dc.latency.avg | Average datacenter latency between 2 datacenters | gauge |
| gauge.consul.network.dc.latency.max | Maximum datacenter latency between 2 datacenters | gauge |
| gauge.consul.network.dc.latency.min | Minimum datacenter latency between 2 datacenters | gauge |
| gauge.consul.network.node.latency.avg | Average network latency between the given node and other nodes in the datacenter | gauge |
| gauge.consul.network.node.latency.max | Maximum network latency between the given node and other nodes in the datacenter | gauge |
| gauge.consul.network.node.latency.min | Minimum network latency between the given node and other nodes in the datacenter | gauge |
| gauge.consul.peers | Number of Consul Raft peers (Consul agents in server mode) in a given datacenter | gauge |
| gauge.consul.raft.apply | This metric is a general indicator of the write load on the Consul servers | gauge |
| gauge.consul.raft.commitTime.avg | This measures the mean time it takes to commit a new entry to the Raft log on the leader | gauge |
| gauge.consul.raft.commitTime.max | This measures the maximum time it takes to commit a new entry to the Raft log on the leader | gauge |
| gauge.consul.raft.commitTime.min | This measures the minimum time it takes to commit a new entry to the Raft log on the leader | gauge |
| gauge.consul.raft.leader.dispatchLog.avg | This measures the mean time it takes for the leader to write log entries to disk | gauge |
| gauge.consul.raft.leader.dispatchLog.max | This measures the maximum time it takes for the leader to write log entries to disk | gauge |
| gauge.consul.raft.leader.dispatchLog.min | This measures the minimum time it takes for the leader to write log entries to disk | gauge |
| gauge.consul.raft.leader.lastContact.avg | This measures the time since the leader was last able to contact the follower nodes when checking its leader lease | gauge |
| gauge.consul.raft.leader.lastContact.max | This measures the maximum time since the leader was last able to contact the follower nodes when checking its leader lease | gauge |
| gauge.consul.raft.leader.lastContact.min | This measures the minimum time since the leader was last able to contact the follower nodes when checking its leader lease | gauge |
| gauge.consul.raft.replication.appendEntries.rpc.AGENT.avg | This measures the time it takes to replicate log entries to followers | gauge |
| gauge.consul.raft.replication.appendEntries.rpc.AGENT.max | This measures the maximum time it takes to replicate log entries to followers | gauge |
| gauge.consul.raft.replication.appendEntries.rpc.AGENT.min | This measures the minimum time it takes to replicate log entries to followers | gauge |
| gauge.consul.raft.state.candidate | Tracks the number of times a given node enters the candidate state, i.e., the number of times the Consul server starts a leader election | gauge |
| gauge.consul.raft.state.leader | This metric increments whenever a Consul server becomes a leader | gauge |
| gauge.consul.rpc.query | | gauge |
| gauge.consul.runtime.alloc_bytes | Number of bytes allocated to the Consul process on the node | gauge |
| gauge.consul.runtime.heap_objects | Number of heap objects allocated to Consul; indicates memory pressure on a Consul agent | gauge |
| gauge.consul.runtime.num_goroutines | Number of goroutines run by the Consul process on the node | gauge |
| gauge.consul.serf.events | Number of serf events processed by Consul | gauge |
| gauge.consul.serf.events.consul:new-leader | | gauge |
| gauge.consul.serf.member.join | This metric tracks successful node joins to the Serf memberlist | gauge |
| gauge.consul.serf.member.left | This metric tracks successful node leaves from the Serf memberlist | gauge |
| gauge.consul.serf.queue.Event.avg | Average number of serf events in queue yet to be processed by the Consul agent | gauge |
| gauge.consul.serf.queue.Event.max | Maximum number of serf events in queue yet to be processed by the Consul agent | gauge |
| gauge.consul.serf.queue.Event.min | Minimum number of serf events in queue yet to be processed by the Consul agent | gauge |
| gauge.consul.serf.queue.Query.avg | Average number of serf queries in queue yet to be processed by the Consul agent | gauge |
| gauge.consul.serf.queue.Query.max | Maximum number of serf queries in queue yet to be processed by the Consul agent | gauge |
| gauge.consul.serf.queue.Query.min | Minimum number of serf queries in queue yet to be processed by the Consul agent | gauge |
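
If only part of this list is of interest, the agent's common filtering options may be able to drop the rest before it is sent. The following is a sketch under assumptions rather than a confirmed recipe: it assumes that the common monitor option datapointsToExclude, with glob-style matching in metricNames, is available in your Smart Agent version (check Common Configuration), and the host, port, and chosen metric patterns are purely illustrative.

```yaml
monitors:
  - type: collectd/consul
    host: 127.0.0.1   # illustrative value
    port: 8500        # illustrative value
    # Assumed common option; verify name and matching rules for your agent version.
    datapointsToExclude:
      - metricNames:
          - gauge.consul.serf.queue.*          # drop the serf queue min/avg/max gauges
          - gauge.consul.network.dc.latency.*  # drop cross-datacenter latency gauges
```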

consul.dns.stale_queries

gauge

Number of times an agent serves a DNS query based on information from a server that is more than 5 seconds out of date. This metric has the dimensions datacenter, consul_node and consul_mode.

consul.memberlist.msg.suspect

gauge

This increments when an agent suspects another as failed when executing random probes as part of the gossip protocol. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. This metric has the dimensions datacenter, consul_node and consul_mode.

consul.serf.member.flap

gauge

This metric increments when an agent is marked dead and then recovers within a short time period. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.catalog.nodes.total

gauge

The total number of nodes in the Consul datacenter. This metric is common to the cluster and is therefore reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode; consul_mode indicates whether the reporting Consul agent runs in server or client mode.

gauge.consul.catalog.nodes_by_service

gauge

Number of nodes providing a given service. This metric is reported by the leader only. The dimension consul_service indicates which service the metric corresponds to. The metric also has the datacenter and consul_mode dimensions.

gauge.consul.catalog.services.total

gauge

The total number of services registered with Consul in the given datacenter. This metric is common to the cluster and is therefore reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode; consul_mode indicates whether the reporting Consul agent runs in server or client mode.

gauge.consul.catalog.services_by_node

gauge

Number of services registered with a node. This metric is reported by the leader only. The dimension consul_node indicates which node the metric corresponds to. The metric also has the datacenter and consul_mode dimensions.

gauge.consul.consul.dns.domain_query.AGENT.avg

gauge

This tracks how long it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.domain_query.AGENT.max

gauge

This tracks the maximum time it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.domain_query.AGENT.min

gauge

This tracks the minimum time it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.ptr_query.AGENT.avg

gauge

This tracks the average time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.ptr_query.AGENT.max

gauge

This tracks the maximum time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.ptr_query.AGENT.min

gauge

This tracks the minimum time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.leader.reconcile.avg

gauge

Time it takes the leader to reconcile the differences between Serf membership and Consul's store. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.rpc.query

gauge

A general measure of all read volume. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.nodes.critical

gauge

Number of nodes for which health checks are reporting Critical state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.nodes.passing

gauge

Number of nodes for which health checks are reporting Passing state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.nodes.warning

gauge

Number of nodes for which health checks are reporting Warning state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.services.critical

gauge

Number of services for which health checks are reporting Critical state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.services.passing

gauge

Number of services for which health checks are reporting Passing state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.services.warning

gauge

Number of services for which health checks are reporting Warning state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.

gauge.consul.is_leader

gauge

Indicates whether the Consul server is in the leader or follower state. A follower instance returns a value of 0 and the leader returns a value of 1. A Heat Map in the dashboard uses this metric to make the leader easy to distinguish visually from the followers. This metric comes with the dimension consul_server_state, which can be either leader or follower. It also has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.network.dc.latency.avg

gauge

Average datacenter latency between 2 datacenters. This metric has the additional dimension destination_dc. The latency is calculated between this destination datacenter and the agent's datacenter given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.

gauge.consul.network.dc.latency.max

gauge

Maximum datacenter latency between 2 datacenters. This metric has the additional dimension destination_dc. The latency is calculated between this destination datacenter and the agent's datacenter given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.

gauge.consul.network.dc.latency.min

gauge

Minimum datacenter latency between 2 datacenters. This metric has the additional dimension destination_dc. The latency is calculated between this destination datacenter and the agent's datacenter given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.

gauge.consul.network.node.latency.avg

gauge

Average network latency between the given node and other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.

gauge.consul.network.node.latency.max

gauge

Maximum network latency between the given node and other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.

gauge.consul.network.node.latency.min

gauge

Minimum network latency between the given node and other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.

gauge.consul.peers

gauge

Number of Consul Raft peers (Consul agents in server mode) in a given datacenter. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.apply

gauge

This metric is a general indicator of the write load on the Consul servers. This metric has the global dimensions consul_node, consul_mode and datacenter.

gauge.consul.raft.commitTime.avg

gauge

This measures the mean time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.commitTime.max

gauge

This measures the maximum time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.commitTime.min

gauge

This measures the minimum time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.dispatchLog.avg

gauge

This measures the mean time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.dispatchLog.max

gauge

This measures the maximum time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.dispatchLog.min

gauge

This measures the minimum time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.lastContact.avg

gauge

This measures the time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.lastContact.max

gauge

This measures the maximum time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.lastContact.min

gauge

This measures the minimum time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.replication.appendEntries.rpc.AGENT.avg

gauge

This measures the time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower's IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.replication.appendEntries.rpc.AGENT.max

gauge

This measures the maximum time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower's IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.replication.appendEntries.rpc.AGENT.min

gauge

This measures the minimum time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower's IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.state.candidate

gauge

Tracks the number of times a given node enters the candidate state, i.e., the number of times the Consul server starts a leader election. If this increments without a leadership change occurring, it could indicate that a single server is overloaded or is experiencing network connectivity issues. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.state.leader

gauge

This metric increments whenever a Consul server becomes a leader. If there are frequent leadership changes, this may be an indication that the servers are overloaded and aren't meeting the soft real-time requirements for Raft, or that there are networking problems between the servers. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.rpc.query

gauge

gauge.consul.runtime.alloc_bytes

gauge

Number of bytes allocated to the Consul process on the node. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.runtime.heap_objects

gauge

Number of heap objects allocated to Consul; indicates memory pressure on a Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.runtime.num_goroutines

gauge

Number of goroutines run by the Consul process on the node. Serves as a general indicator of load pressure on the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.events

gauge

Number of serf events processed by Consul. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.events.consul:new-leader

gauge

gauge.consul.serf.member.join

gauge

This metric tracks successful node joins to the Serf memberlist. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.member.left

gauge

This metric tracks successful node leaves from the Serf memberlist. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Event.avg

gauge

Average number of serf events in queue yet to be processed by the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Event.max

gauge

Maximum number of serf events in queue yet to be processed by the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Event.min

gauge

Minimum number of serf events in queue yet to be processed by the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Query.avg

gauge

Average number of serf queries in queue yet to be processed by the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Query.max

gauge

Maximum number of serf queries in queue yet to be processed by the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Query.min

gauge

Minimum number of serf queries in queue yet to be processed by the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

Metrics that are categorized as container/host (default) are in bold and italics in the list below.

These are the metrics available for this integration.

  • consul.dns.stale_queries (gauge)
    Number of times an agent serves a DNS query based on information from a server that is more than 5 seconds out of date. This metric has the dimensions datacenter, consul_node and consul_mode.
  • consul.memberlist.msg.suspect (gauge)
    This increments when an agent suspects another as failed when executing random probes as part of the gossip protocol. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. This metric has the dimensions datacenter, consul_node and consul_mode.
  • consul.serf.member.flap (gauge)
    This metric increments when an agent is marked dead and then recovers within a short time period. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.catalog.nodes.total (gauge)
    The total number of nodes in the Consul datacenter. This metric is common to the cluster and is therefore reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode; consul_mode indicates whether the reporting Consul agent runs in server or client mode.
  • gauge.consul.catalog.nodes_by_service (gauge)
    Number of nodes providing a given service. This metric is reported by the leader only. The dimension consul_service indicates which service the metric corresponds to. The metric also has the datacenter and consul_mode dimensions.
  • gauge.consul.catalog.services.total (gauge)
    The total number of services registered with Consul in the given datacenter. This metric is common to the cluster and is therefore reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode; consul_mode indicates whether the reporting Consul agent runs in server or client mode.
  • gauge.consul.catalog.services_by_node (gauge)
    Number of services registered with a node. This metric is reported by the leader only. The dimension consul_node indicates which node the metric corresponds to. The metric also has the datacenter and consul_mode dimensions.
  • gauge.consul.consul.dns.domain_query.AGENT.avg (gauge)
    This tracks how long it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.consul.dns.domain_query.AGENT.max (gauge)
    This tracks the maximum time it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.consul.dns.domain_query.AGENT.min (gauge)
    This tracks the minimum time it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.consul.dns.ptr_query.AGENT.avg (gauge)
    This tracks the average time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.consul.dns.ptr_query.AGENT.max (gauge)
    This tracks the maximum time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.consul.dns.ptr_query.AGENT.min (gauge)
    This tracks the minimum time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.consul.leader.reconcile.avg (gauge)
    Time it takes the leader to reconcile the differences between Serf membership and Consul's store. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.consul.rpc.query (gauge)
    A general measure of all read volume. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.health.nodes.critical (gauge)
    Number of nodes for which health checks are reporting Critical state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.health.nodes.passing (gauge)
    Number of nodes for which health checks are reporting Passing state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.health.nodes.warning (gauge)
    Number of nodes for which health checks are reporting Warning state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.health.services.critical (gauge)
    Number of services for which health checks are reporting Critical state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.health.services.passing (gauge)
    Number of services for which health checks are reporting Passing state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.health.services.warning (gauge)
    Number of services for which health checks are reporting Warning state. This metric is reported by the leader only. It is reported with the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.is_leader (gauge)
    Indicates whether the Consul server is in the leader or follower state. A follower instance returns a value of 0 and the leader returns a value of 1. A Heat Map in the dashboard uses this metric to make the leader easy to distinguish visually from the followers. This metric comes with the dimension consul_server_state, which can be either leader or follower. It also has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.network.dc.latency.avg (gauge)
    Average datacenter latency between 2 datacenters. This metric has the additional dimension destination_dc dimension. The latency is calculated between this destination datacenter and the agent’s datacenter given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.
  • gauge.consul.network.dc.latency.max (gauge)
    Maximum datacenter latency between 2 datacenters. This metric has the additional dimension destination_dc dimension. The latency is calculated between this destination datacenter and the agent’s datacenter given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.
  • gauge.consul.network.dc.latency.min (gauge)
    Minimum datacenter latency between 2 datacenters. This metric has the additional dimension destination_dc dimension. The latency is calculated between this destination datacenter and the agent’s datacenter given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.
  • gauge.consul.network.node.latency.avg (gauge)
    Average network latency between the given node and the other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.
  • gauge.consul.network.node.latency.max (gauge)
    Maximum network latency between the given node and the other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.
  • gauge.consul.network.node.latency.min (gauge)
    Minimum network latency between the given node and the other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.
  • gauge.consul.peers (gauge)
    Number of Consul Raft peers (Consul agents in server mode) in a given datacenter. This metric is reported by the leader only, with the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.apply (gauge)
    This metric is a general indicator of the write load on the Consul servers. This metric has the global dimensions consul_node, consul_mode and datacenter.
  • gauge.consul.raft.commitTime.avg (gauge)
    This measures the mean time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.commitTime.max (gauge)
    This measures the maximum time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.commitTime.min (gauge)
    This measures the minimum time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.leader.dispatchLog.avg (gauge)
    This measures the mean time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.leader.dispatchLog.max (gauge)
    This measures the maximum time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.leader.dispatchLog.min (gauge)
    This measures the minimum time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.leader.lastContact.avg (gauge)
    This measures the mean time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.leader.lastContact.max (gauge)
    This measures the maximum time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.leader.lastContact.min (gauge)
    This measures the minimum time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.replication.appendEntries.rpc.AGENT.avg (gauge)
    This measures the mean time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower’s IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.replication.appendEntries.rpc.AGENT.max (gauge)
    This measures the maximum time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower’s IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.replication.appendEntries.rpc.AGENT.min (gauge)
    This measures the minimum time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower’s IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.state.candidate (gauge)
    Tracks the number of times a given node enters the candidate state, i.e., the number of times the Consul server starts a leader election. If this increments without a leadership change occurring, it could indicate that a single server is overloaded or is experiencing network connectivity issues. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.raft.state.leader (gauge)
    This metric increments whenever a Consul server becomes a leader. Frequent leadership changes may be an indication that the servers are overloaded and aren’t meeting the soft real-time requirements for Raft, or that there are networking problems between the servers. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.rpc.query (gauge)
  • gauge.consul.runtime.alloc_bytes (gauge)
    Number of bytes allocated to the Consul process on the node. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.runtime.heap_objects (gauge)
    Number of heap objects allocated to Consul; indicates memory pressure on a Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.runtime.num_goroutines (gauge)
    Number of goroutines run by the Consul process on the node. Serves as a general load pressure indicator for the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.events (gauge)
    Number of serf events processed by Consul. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.events.consul:new-leader (gauge)
  • gauge.consul.serf.member.join (gauge)
    This metric tracks successful node joins to the Serf memberlist. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.member.left (gauge)
    This metric tracks successful node leaves from the Serf memberlist. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.queue.Event.avg (gauge)
    Average number of serf events in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.queue.Event.max (gauge)
    Maximum number of serf events in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.queue.Event.min (gauge)
    Minimum number of serf events in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.queue.Query.avg (gauge)
    Average number of serf queries in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.queue.Query.max (gauge)
    Maximum number of serf queries in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.
  • gauge.consul.serf.queue.Query.min (gauge)
    Minimum number of serf queries in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

Non-default metrics (version 4.7.0+)

The following information applies to agent versions 4.7.0+ that have enableBuiltInFiltering: true set at the top level of the agent config.

To emit metrics that are not default, you can add those metrics in the generic monitor-level extraMetrics config option. Metrics that are derived from specific configuration options that do not appear in the above list of metrics do not need to be added to extraMetrics.
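As a minimal sketch of the paragraph above, the following agent config enables built-in filtering and requests extra metrics through extraMetrics. The host and port values are placeholders, and the two metric names are taken from the list above purely for illustration; whether a given metric is already emitted by default depends on your agent version.

```yaml
enableBuiltInFiltering: true  # top-level setting, agent 4.7.0+

monitors:
  - type: collectd/consul
    host: localhost  # placeholder; use your Consul agent's address
    port: 8500       # placeholder; use your Consul HTTP API port
    extraMetrics:
      # Illustrative choices only; list the non-default metrics you need.
      - gauge.consul.serf.queue.Query.max
      - gauge.consul.raft.leader.dispatchLog.max
```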

To see a list of the metrics that will be emitted, run agent-status monitors after configuring this monitor in a running agent instance.

Legacy non-default metrics (version < 4.7.0)

The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set enableBuiltInFiltering: true at the top level of your agent config, see the section above. See upgrade instructions in Old-style inclusion list filtering.

If your agent’s top-level metricsToExclude config option references whitelist.json and you want to emit metrics that are not in that allow list, add an item to the top-level metricsToInclude config option to override the allow list (see Inclusion filtering). Alternatively, copy whitelist.json, modify it, and reference the modified copy in metricsToExclude.
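The override described above might look like the following sketch for agent versions prior to 4.7.0. It assumes that metricsToInclude items accept the metricNames and monitorType keys; the metric name is an illustrative pick from the list above, and your existing metricsToExclude entry stays as it is.

```yaml
# Top level of the agent config (legacy filtering, agent < 4.7.0).
metricsToInclude:
  - metricNames:
      # Illustrative choice only; list the metrics you want to let through.
      - gauge.consul.serf.queue.Query.max
    monitorType: collectd/consul
```

Restricting the item with monitorType keeps the override scoped to this monitor, so the allow list still applies to every other monitor.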