
image0 Consul

Metadata associated with the consul plugin for collectd can be found here. The relevant code for the plugin can be found here.

DESCRIPTION

This is the SignalFx Consul plugin. Follow these instructions to install the Consul plugin for collectd.

The consul-collectd plugin collects metrics from Consul instances by hitting these endpoints:

FEATURES

Built-in dashboards

  • CONSUL CLUSTER: Provides a high-level overview of metrics for a single Consul cluster.

image1

image2

  • CONSUL HEALTH: Provides key metrics for monitoring Consul’s performance.

image3

image4

image5

  • CONSUL SERVER: Provides server-specific metrics.

image6

image7

  • CONSUL CLIENT: Provides client-specific metrics.

image8

REQUIREMENTS AND DEPENDENCIES

Version information

Software | Version
collectd | 4.9 or later
python | 2.6 or later
Consul | 0.7.0 or later
Python plugin for collectd | (included with SignalFx collectd agent)

INSTALLATION

If you are using the new Smart Agent, see the docs for the collectd/consul monitor for more information. The configuration documentation below may be helpful as well, but consult the Smart Agent repo’s docs for the exact schema.

  1. Download collectd-consul. Place the consul_plugin.py and urllib_ssl_handler.py files in /usr/share/collectd/collectd-consul
  2. Place the sample configuration file for this plugin in /etc/collectd/managed_config
  3. Modify the sample configuration file as described in Configuration, below
  4. Install the Python requirements with sudo pip install -r requirements.txt
  5. Restart collectd

CONFIGURATION

If you are running a Consul version below 0.9.1, configure the Consul agents that are to be monitored to send telemetry by adding the configuration below to each agent’s configuration file.

{"telemetry":
  {"statsd_address": "host:port"}
}

The plugin will start a UDP server listening at the host and port given above.
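To make the telemetry path concrete, here is a minimal sketch of a UDP listener that accepts statsd-style lines of the form name:value|type. It is not the plugin's actual code: the function names parse_statsd_packet and receive_one_packet are hypothetical, and the default host and port simply mirror the TelemetryHost and TelemetryPort defaults from the configuration options.

```python
import socket


def parse_statsd_packet(payload):
    """Parse one statsd datagram into (name, value, type) tuples.

    A datagram may carry several newline-separated samples,
    each shaped like "name:value|type".
    """
    samples = []
    for line in payload.decode("utf-8", "replace").splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            name, rest = line.split(":", 1)
            fields = rest.split("|")
            value, kind = float(fields[0]), fields[1]
        except (ValueError, IndexError):
            # Skip malformed lines instead of crashing the listener.
            continue
        samples.append((name, value, kind))
    return samples


def receive_one_packet(host="127.0.0.1", port=8125, bufsize=65535):
    """Bind a UDP socket, wait for one datagram, and return its samples.

    Hypothetical helper for illustration; a real listener would loop.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    try:
        data, _addr = sock.recvfrom(bufsize)
        return parse_statsd_packet(data)
    finally:
        sock.close()
```

The parser is kept separate from the socket handling so it can be exercised without opening a port.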

Using the example configuration file 10-consul.conf as a guide, provide values for the configuration options listed below that make sense for your environment and allow you to connect to the Consul members.

Configuration Option | Description | Default Value
ApiHost | IP address or DNS name to which the Consul HTTP/HTTPS server binds on the instance to be monitored. | localhost
ApiPort | Port to which the Consul HTTP/HTTPS server binds on the instance to be monitored. | 8500
ApiProtocol | Possible values: http or https. | http
AclToken | Consul ACL token. | None
TelemetryServer | Possible values: true or false. Set to true to enable collecting Consul’s internal metrics via UDP from Consul’s telemetry. If set to false and the Consul version is 0.9.1 or above, the metrics will be collected from the API. If set to false and the Consul version is below 0.9.1, Consul’s internal metrics will not be available. | false
TelemetryHost | IP address or DNS name to which Consul is configured to send telemetry UDP packets. Relevant if TelemetryServer is set to true. | localhost
TelemetryPort | Port to which Consul is configured to send telemetry UDP packets. Relevant if TelemetryServer is set to true. | 8125
EnhancedMetrics | Possible values: true or false. Set to true to enable collecting all metrics from Consul’s runtime telemetry sent via UDP or from the /agent/metrics endpoint. | false
ExcludeMetric | Blocks metrics by prefix matching, if EnhancedMetrics is true. This can be used to exclude metrics sent from the /agent/metrics endpoint or from Consul’s runtime telemetry sent via UDP. | None
IncludeMetric | Allows metrics by prefix matching, if EnhancedMetrics is false. This can be used to include metrics sent from the /agent/metrics endpoint or from Consul’s runtime telemetry sent via UDP. | None
SfxToken | SignalFx org access token. If added to the config, an event is sent to SignalFx on leader transition and can be viewed on the Consul dashboard. | None
Dimension | Adds a single custom global dimension to your metrics, formatted as “key=value”. | None
Dimensions | Adds multiple global dimensions, formatted as “key1=value1,key2=value2,...”. | None
CaCertificate | If the Consul server has HTTPS enabled for the API, provide the path to the CA certificate. | None
ClientCertificate | If client-side authentication is enabled, provide the path to the certificate file. | None
ClientKey | If client-side authentication is enabled, provide the path to the key file. | None
Debug | Possible values: true or false. | false
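The Dimension and Dimensions formats can be illustrated with a small parser. This is a sketch only: parse_dimensions is a hypothetical helper, and the way it handles duplicates and malformed pairs is an assumption, not documented plugin behavior.

```python
def parse_dimensions(dimension_values=(), dimensions_value=None):
    """Merge "key=value" strings into one dict of global dimensions.

    dimension_values: iterable of single "key=value" strings
                      (one per Dimension option).
    dimensions_value: one "key1=value1,key2=value2" string
                      (the Dimensions option), or None.
    """
    pairs = list(dimension_values)
    if dimensions_value:
        pairs.extend(dimensions_value.split(","))

    dims = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or not key:
            # Assumption: silently ignore entries that are not key=value.
            continue
        # Assumption: a later occurrence of a key overwrites an earlier one.
        dims[key] = value
    return dims
```

With the values used in the example configuration, parse_dimensions(["foo=bar"], "foo=bar,bar=baz") yields a dict with the keys foo and bar.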

Example configuration:

LoadPlugin python

<Plugin python>
  ModulePath "/usr/share/collectd/collectd-consul"

  Import consul_plugin
  <Module consul_plugin>
    ApiHost "server-1"
    ApiPort 8500
    ApiProtocol "http"
    AclToken "token"
    SfxToken "SignalFX_token"
    TelemetryServer true
    TelemetryHost "17.2.3.4"
    TelemetryPort 8125
    EnhancedMetrics true
    ExcludeMetric "consul.consul.http"
    ExcludeMetric "consul.memberlist"
    Dimension "foo=bar"
    Dimensions "foo=bar,bar=baz"
    CaCertificate "path/to/ca_cert"
    ClientKey "path/to/client_key"
    ClientCertificate "path/to/client/certificate"
    Debug true
  </Module>
</Plugin>
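As a rough illustration of how ApiProtocol, ApiHost, ApiPort and AclToken combine, the sketch below builds a URL and headers for a Consul HTTP API call. The helper name build_api_request is hypothetical. Sending the token in the X-Consul-Token header is one of the mechanisms Consul supports, and is an assumption about this plugin rather than something this document states.

```python
def build_api_request(path, api_host="localhost", api_port=8500,
                      api_protocol="http", acl_token=None):
    """Return (url, headers) for a Consul HTTP API endpoint.

    path is relative to the /v1 API root, e.g. "agent/metrics".
    """
    url = "{0}://{1}:{2}/v1/{3}".format(
        api_protocol, api_host, api_port, path.lstrip("/"))
    headers = {}
    if acl_token:
        # Assumption: pass the ACL token as a header, not a query parameter.
        headers["X-Consul-Token"] = acl_token
    return url, headers
```

For the example configuration above, build_api_request("agent/metrics", "server-1", 8500, "http", "token") would target http://server-1:8500/v1/agent/metrics.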

The plugin can be configured to collect metrics from multiple instances in the following manner.

LoadPlugin python

<Plugin python>
  ModulePath "/usr/share/collectd/collectd-consul"

  Import consul_plugin
  <Module consul_plugin>
    ApiHost "server-1"
    ApiPort 8500
    ApiProtocol "http"
    AclToken "token"
    SfxToken "SignalFX_token"
    TelemetryServer true
    TelemetryHost "17.2.3.4"
    TelemetryPort 8125
    EnhancedMetrics true
    ExcludeMetric "consul.consul.http"
    ExcludeMetric "consul.memberlist"
    Dimension "foo=bar"
    Debug true
  </Module>
  <Module consul_plugin>
    ApiHost "server-2"
    ApiPort 8500
    ApiProtocol "http"
    IncludeMetric "consul.fsm"
    Dimensions "foo=bar,bar=baz"
    TelemetryServer false
  </Module>
</Plugin>

USAGE

Interpreting Built-in dashboards

  • CONSUL CLUSTER:
  • Total Services: Shows the total number of services registered with the Consul cluster.

image9

  • Total Nodes: Shows the total number of nodes in the Consul cluster’s catalog. Nodes include instances running consul agent in either client or server mode and external nodes registered with the Consul store.

image10

  • Number of services by node: Descending list showing the number of services that are registered with a given node. The node name displayed is the Consul NodeName config value.

image11

  • Number of Nodes by Service: Descending list showing the number of nodes that are providing a given service in the datacenter.

image12

  • Service health check results: A list showing the results of service health checks that are registered with Consul. Checks can result in 3 states - passing, warning and critical.

image13

  • Node health check results: Node checks are done on the individual host level. If a host fails a check, all services registered with it are marked as failed and Consul no longer returns the node in service discovery requests. The chart is a list showing the results of node health checks. Checks can result in 3 states - passing, warning and critical.

image14

  • Total Peers: Number of consul Raft peers or consul agents in server mode in a given datacenter.

image15

  • Consul Server Map: Displays the followers and leader in a given datacenter.

image16

  • Mean node network latency: Shows the average latency of a given node from other nodes in the Consul cluster. The dimension consul_node corresponds to the source node. The maximum and minimum values for this metric are also available.

image17

  • Mean datacenter latency: Average latency between 2 datacenters. This metric has the additional dimension destination_dc. The latency is calculated between this destination datacenter and the agent’s datacenter, given by the datacenter dimension. The maximum and minimum values for this metric are also available.

image18

  • CONSUL HEALTH:
  • Leadership Change Event: Event feed showing leader transition events. The event has the new and old leader node names as dimensions.

image19

  • Leadership Transitions: Tracks number of leadership transitions. If there are frequent leadership changes this may be an indication that the servers are overloaded and aren’t meeting the soft real-time requirements for Raft, or that there are networking problems between the servers.

image20

  • Leader last contact with followers: This shows the time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease.

image21

  • Leader latency to commit to disk: Time it takes for the leader to write log entries to disk.

image22

  • Raft commit time: Time it takes to commit a new entry to the Raft log on the leader.

image23

  • Number of Raft Transactions: This is a general indicator of the write load on the Consul servers.

image24

  • Leader Time to Append Entries: This measures the time it takes the leader to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers.

image25

  • Number of RPC queries: Total number of RPC queries per interval. This is a general measure of all read volume.

image26

  • Cluster Joins and Leaves: This chart tracks successful node joins and leaves in the Serf memberlist.

image27

  • Leader time to reconcile: Shows the time it takes for the leader to reconcile Serf membership and what is reflected in Consul’s store.

image28

  • Serf Events: Consul provides an event feature by which custom events can be propagated across your entire datacenter. This chart shows the number of events processed by Consul agents per interval. Using this chart you can track whether triggered events were processed by a Consul node. Additionally, you can easily set up a chart to track events for a selected node in the CLIENT and SERVER dashboards.

image29

  • Serf Event Queue: Shows the average and maximum number of Serf events backlogged in the queues of Consul agents.

image30

  • CONSUL CLIENT
  • Number of allocated heap objects: Gives the number of heap objects allocated to the consul process. Indicates memory pressure on a Consul node.

image31

  • Allocated Bytes: Number of allocated bytes to the Consul process.

image32

  • Number of GO routines: The number of goroutines Consul is running. This is a general load pressure indicator for the Consul agent.

image33

  • Network Latency: Shows the avg, max and min network latency between the node and other nodes in the datacenter.

image34

  • Time to service DNS queries: Consul provides both DNS and HTTP interfaces for service discovery. This shows the time it takes to service forward and reverse DNS lookups by the selected node.

image35

  • CONSUL SERVER
    All charts mentioned in the Client dashboard are also present in the Server dashboard. In addition to those, the following charts are present:
  • Raft candidate state: This chart tracks if the selected Consul server starts an election. If this metric increments without a leadership change occurring it could indicate that a single server is overloaded or is experiencing network connectivity issues.

image36

All metrics reported by the Consul collectd plugin will contain the following dimensions by default:

  • datacenter: the datacenter to which the Consul agent belongs. The value for this dimension is read from the agent’s configuration.
  • consul_node: the Consul node name as seen in the Consul agent’s configuration.
  • consul_mode: whether the Consul agent is in client or server mode.

The metric consul.is_leader is reported by Consul servers and has the dimension consul_server_state, which can be either leader or follower.

Additional default metrics to track

  • consul.memberlist.msg.suspect - This metric counts the number of times an agent suspects another as failed when executing random probes as part of the gossip protocol. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports.
  • consul.serf.member.flap - This metric tracks when an agent is marked dead and then recovers within a short time period. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports.
  • consul.dns.stale_queries - This metric tracks when an agent serves a DNS query based on information from a server that is more than 5 seconds out of date.

A few other details:

  • plugin is always set to consul
  • To add additional metrics from the telemetry stream or the /agent/metrics endpoint, use the IncludeMetric and ExcludeMetric options described in Configuration. When including metrics individually, make sure to give valid prefixes. For example, Consul emits metrics that track the time taken to serve HTTP requests in the form consul.http.<verb>.<path>. To enable the metrics that track the time taken to service GET requests on the Key/Value endpoint, add consul.http.GET.v1.kv to the IncludeMetric configuration. To allow the metrics that track the time taken to service all GET requests, add consul.http.GET instead. When enhanced metrics are enabled, you can block metrics in the same manner with ExcludeMetric.
  • The metrics from the /agent/metrics endpoint are aggregated over an interval of 10 seconds. Keep this in mind when changing the collectd interval from its default of 10 seconds.
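The prefix matching described for IncludeMetric and ExcludeMetric can be sketched as follows. This is an illustration under stated assumptions, not the plugin's implementation: should_report and DEFAULT_PREFIXES are hypothetical names, the default list is abbreviated, and the rule that default metrics always pass when EnhancedMetrics is false is inferred from the option descriptions.

```python
# Abbreviated, illustrative subset of the default metric prefixes.
DEFAULT_PREFIXES = (
    "consul.raft.state.leader",
    "consul.raft.apply",
    "consul.rpc.query",
    "consul.serf.events",
)


def should_report(metric, enhanced_metrics=False, include=(), exclude=(),
                  defaults=DEFAULT_PREFIXES):
    """Decide whether a telemetry metric name should be reported.

    enhanced_metrics=True : report everything except ExcludeMetric prefixes.
    enhanced_metrics=False: report only defaults plus IncludeMetric prefixes.
    """
    if enhanced_metrics:
        return not any(metric.startswith(prefix) for prefix in exclude)
    allowed = tuple(defaults) + tuple(include)
    return any(metric.startswith(prefix) for prefix in allowed)
```

Because matching is by prefix, including consul.http.GET admits every GET timing metric, while consul.http.GET.v1.kv admits only those for the Key/Value endpoint.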

METRICS

List of default metrics collected from the telemetry stream or the /agent/metrics endpoint:

  • consul.raft.state.leader
  • consul.raft.state.candidate
  • consul.raft.leader.lastContact
  • consul.raft.leader.dispatchLog
  • consul.raft.commitTime
  • consul.raft.apply
  • consul.raft.replication.appendEntries.rpc.
  • consul.rpc.query
  • consul.consul.leader.reconcile
  • consul.serf.events
  • consul.serf.queue.Event
  • consul.serf.queue.Query
  • consul.serf.member.join
  • consul.serf.member.left
  • consul.runtime.heap_objects
  • consul.runtime.alloc_bytes
  • consul.runtime.num_goroutines
  • consul.dns.domain_query.
  • consul.dns.ptr_query.
  • consul.dns.stale_queries.
  • consul.serf.member.flap
  • consul.memberlist.msg.suspect

List of default metrics collected from additional endpoints:

  • consul.is_leader
  • consul.peers
  • consul.catalog.nodes.total
  • consul.catalog.services.total
  • consul.catalog.nodes_by_service
  • consul.catalog.services_by_node
  • consul.health.nodes.passing
  • consul.health.nodes.warning
  • consul.health.nodes.critical
  • consul.health.services.passing
  • consul.health.services.warning
  • consul.health.services.critical
  • consul.network.node.latency
  • consul.network.dc.latency

Below is a list of all metrics.

Metric Name | Brief | Type
consul.dns.stale_queries | Number of times an agent serves a DNS query with stale information | gauge
consul.memberlist.msg.suspect | Number of suspect messages received per interval | gauge
consul.serf.member.flap | Tracks flapping agents | gauge
gauge.consul.catalog.nodes.total | Number of nodes in the Consul datacenter | gauge
gauge.consul.catalog.nodes_by_service | Number of nodes providing a given service | gauge
gauge.consul.catalog.services.total | Total number of services registered with Consul in the given datacenter | gauge
gauge.consul.catalog.services_by_node | Number of services registered with a node | gauge
gauge.consul.consul.dns.domain_query.AGENT.avg | Average time to complete a forward DNS query | gauge
gauge.consul.consul.dns.domain_query.AGENT.max | Max time to complete a forward DNS query | gauge
gauge.consul.consul.dns.domain_query.AGENT.min | Min time to complete a forward DNS query | gauge
gauge.consul.consul.dns.ptr_query.AGENT.avg | Average time to complete a reverse DNS query | gauge
gauge.consul.consul.dns.ptr_query.AGENT.max | Max time to complete a reverse DNS query | gauge
gauge.consul.consul.dns.ptr_query.AGENT.min | Min time to complete a reverse DNS query | gauge
gauge.consul.consul.leader.reconcile.avg | Leader time to reconcile the differences between Serf membership and Consul’s store | gauge
gauge.consul.consul.rpc.query | A general measure of all read volume | gauge
gauge.consul.health.nodes.critical | Number of nodes for which health checks are reporting Critical state | gauge
gauge.consul.health.nodes.passing | Number of nodes for which health checks are reporting Passing state | gauge
gauge.consul.health.nodes.warning | Number of nodes for which health checks are reporting Warning state | gauge
gauge.consul.health.services.critical | Number of services for which health checks are reporting Critical state | gauge
gauge.consul.health.services.passing | Number of services for which health checks are reporting Passing state | gauge
gauge.consul.health.services.warning | Number of services for which health checks are reporting Warning state | gauge
gauge.consul.is_leader | Maps Consul servers to leader or follower state | gauge
gauge.consul.network.dc.latency.avg | Average network latency between 2 datacenters | gauge
gauge.consul.network.dc.latency.max | Maximum network latency between 2 datacenters | gauge
gauge.consul.network.dc.latency.min | Minimum network latency between 2 datacenters | gauge
gauge.consul.network.node.latency.avg | Average network latency between given node and other nodes in the datacenter | gauge
gauge.consul.network.node.latency.max | Maximum network latency between given node and other nodes in the datacenter | gauge
gauge.consul.network.node.latency.min | Minimum network latency between given node and other nodes in the datacenter | gauge
gauge.consul.peers | Number of Raft peers in Consul datacenter | gauge
gauge.consul.raft.apply | Number of Raft transactions | gauge
gauge.consul.raft.commitTime.avg | Average of the time it takes to commit an entry on the leader | gauge
gauge.consul.raft.commitTime.max | Max of the time it takes to commit an entry on the leader | gauge
gauge.consul.raft.commitTime.min | Minimum of the time it takes to commit an entry on the leader | gauge
gauge.consul.raft.leader.dispatchLog.avg | Average of the time it takes for the leader to write log entries to disk | gauge
gauge.consul.raft.leader.dispatchLog.max | Maximum of the time it takes for the leader to write log entries to disk | gauge
gauge.consul.raft.leader.dispatchLog.min | Minimum of the time it takes for the leader to write log entries to disk | gauge
gauge.consul.raft.leader.lastContact.avg | Mean of the time since the leader was last able to contact follower nodes | gauge
gauge.consul.raft.leader.lastContact.max | Max of the time since the leader was last able to contact follower nodes | gauge
gauge.consul.raft.leader.lastContact.min | Min of the time since the leader was last able to contact follower nodes | gauge
gauge.consul.raft.replication.appendEntries.rpc.AGENT.avg | Mean time taken to complete the AppendEntries RPC | gauge
gauge.consul.raft.replication.appendEntries.rpc.AGENT.max | Max time taken to complete the AppendEntries RPC | gauge
gauge.consul.raft.replication.appendEntries.rpc.AGENT.min | Min time taken to complete the AppendEntries RPC | gauge
gauge.consul.raft.state.candidate | Tracks the number of times a given node enters the candidate state | gauge
gauge.consul.raft.state.leader | Tracks the number of leadership transitions per interval | gauge
gauge.consul.runtime.alloc_bytes | Number of bytes allocated to the Consul process on the node | gauge
gauge.consul.runtime.heap_objects | Number of heap objects allocated to Consul | gauge
gauge.consul.runtime.num_goroutines | Number of goroutines run by the Consul process | gauge
gauge.consul.serf.events | Number of Serf events processed | gauge
gauge.consul.serf.member.join | Tracks successful node joins | gauge
gauge.consul.serf.member.left | Tracks successful node leaves | gauge
gauge.consul.serf.queue.Event.avg | Average number of Serf events in queue yet to be processed during the interval | gauge
gauge.consul.serf.queue.Event.max | Maximum number of Serf events in queue yet to be processed during the interval | gauge
gauge.consul.serf.queue.Event.min | Minimum number of Serf events in queue yet to be processed during the interval | gauge
gauge.consul.serf.queue.Query.avg | Average number of Serf queries in queue yet to be processed during the interval | gauge
gauge.consul.serf.queue.Query.max | Maximum number of Serf queries in queue yet to be processed during the interval | gauge
gauge.consul.serf.queue.Query.min | Minimum number of Serf queries in queue yet to be processed during the interval | gauge

consul.dns.stale_queries

gauge

Number of times an agent serves a DNS query based on information from a server that is more than 5 seconds out of date. This metric has the dimensions datacenter, consul_node and consul_mode.

consul.memberlist.msg.suspect

gauge

This increments when an agent suspects another as failed when executing random probes as part of the gossip protocol. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. This metric has the dimensions datacenter, consul_node and consul_mode.

consul.serf.member.flap

gauge

This metric increments when an agent is marked dead and then recovers within a short time period. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.catalog.nodes.total

gauge

The total number of nodes in the Consul datacenter. This metric is common to the cluster and is therefore reported by the leader only. It has the dimensions datacenter, consul_node and consul_mode; consul_mode indicates whether the reporting Consul agent is in server or client mode.

gauge.consul.catalog.nodes_by_service

gauge

Number of nodes providing a given service. This metric is reported by the leader only. The dimension consul_service indicates which service the metric corresponds to. The metric also has the datacenter and consul_mode dimensions.

gauge.consul.catalog.services.total

gauge

The total number of services registered with Consul in the given datacenter. This metric is common to the cluster and is therefore reported by the leader only. It has the dimensions datacenter, consul_node and consul_mode; consul_mode indicates whether the reporting Consul agent is in server or client mode.

gauge.consul.catalog.services_by_node

gauge

Number of services registered with a node. This metric is reported by the leader only. The dimension consul_node indicates which node the metric corresponds to. The metric also has the datacenter and consul_mode dimensions.

gauge.consul.consul.dns.domain_query.AGENT.avg

gauge

This tracks how long it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.domain_query.AGENT.max

gauge

This tracks the maximum time it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.domain_query.AGENT.min

gauge

This tracks minimum time it takes to service forward DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.ptr_query.AGENT.avg

gauge

This tracks average time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.ptr_query.AGENT.max

gauge

This tracks maximum time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.dns.ptr_query.AGENT.min

gauge

This tracks minimum time it takes to service reverse DNS lookups on the given Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.leader.reconcile.avg

gauge

Time it takes the leader to reconcile the differences between Serf membership and Consul’s store. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.consul.rpc.query

gauge

A general measure of all read volume. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.nodes.critical

gauge

Number of nodes for which health checks are reporting Critical state. This metric is reported by leader only. This metric is reported with the dimension datacenter, consul_node name and consul_mode.

gauge.consul.health.nodes.passing

gauge

Number of nodes for which health checks are reporting Passing state. This metric is reported by the leader only. It has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.nodes.warning

gauge

Number of nodes for which health checks are reporting Warning state. This metric is reported by the leader only. It has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.services.critical

gauge

Number of services for which health checks are reporting Critical state. This metric is reported by leader only. This metric is reported with the dimension datacenter, consul_node name and consul_mode.

gauge.consul.health.services.passing

gauge

Number of services for which health checks are reporting Passing state. This metric is reported by the leader only. It has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.health.services.warning

gauge

Number of services for which health checks are reporting Warning state. This metric is reported by the leader only. It has the dimensions datacenter, consul_node and consul_mode.
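The six health metrics above could be derived from a list of check results along these lines. This is a sketch, not the plugin's code. It assumes input shaped like the entries returned by Consul's health-state API, where each check carries Node, Status and ServiceID fields and node-level checks have an empty ServiceID. The helper name summarize_health and the choice to count distinct nodes and services are assumptions.

```python
from collections import defaultdict

STATES = ("passing", "warning", "critical")


def summarize_health(checks):
    """Count distinct nodes and services per health state.

    checks: list of dicts with "Node", "Status" and "ServiceID" keys.
    Returns a dict keyed by metric name (without the "gauge." prefix).
    """
    nodes = defaultdict(set)
    services = defaultdict(set)
    for check in checks:
        status = check.get("Status")
        if status not in STATES:
            continue
        if check.get("ServiceID"):
            # Service-level check: identify the service instance by node + ID.
            services[status].add((check.get("Node"), check["ServiceID"]))
        else:
            # Node-level check: ServiceID is empty.
            nodes[status].add(check.get("Node"))

    summary = {}
    for status in STATES:
        summary["consul.health.nodes." + status] = len(nodes[status])
        summary["consul.health.services." + status] = len(services[status])
    return summary
```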

gauge.consul.is_leader

gauge

Maps Consul servers to leader or follower state. A follower instance returns a value of 0 and the leader returns a value of 1. The metric is used by a heat map in the dashboard, which makes it easy to tell the leader from the followers visually. It has the dimension consul_server_state, which can be either leader or follower, as well as the dimensions datacenter, consul_node and consul_mode.
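The value and dimension pairing for this metric can be shown with a short sketch. The helper name leader_datapoint is hypothetical, and comparing the server's own address with the cluster leader's address is an assumed way of deciding leadership, not something this document specifies.

```python
def leader_datapoint(leader_addr, own_addr):
    """Return (value, dimensions) for the is_leader metric.

    leader_addr: address of the current Raft leader, e.g. "10.0.0.1:8300",
                 or an empty string when there is no leader.
    own_addr:    address of the server being monitored.
    """
    is_leader = bool(leader_addr) and leader_addr == own_addr
    state = "leader" if is_leader else "follower"
    # The leader reports 1 and followers report 0, as described above.
    return (1 if is_leader else 0), {"consul_server_state": state}
```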

gauge.consul.network.dc.latency.avg

gauge

Average latency between 2 datacenters. This metric has the additional dimension destination_dc. The latency is calculated between this destination datacenter and the agent’s datacenter, given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.

gauge.consul.network.dc.latency.max

gauge

Maximum latency between 2 datacenters. This metric has the additional dimension destination_dc. The latency is calculated between this destination datacenter and the agent’s datacenter, given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.

gauge.consul.network.dc.latency.min

gauge

Minimum latency between 2 datacenters. This metric has the additional dimension destination_dc. The latency is calculated between this destination datacenter and the agent’s datacenter, given by the datacenter dimension. Only the leader in the source datacenter calculates this metric. The metric also has the dimensions consul_mode and consul_node.

gauge.consul.network.node.latency.avg

gauge

Average network latency between given node and other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.

gauge.consul.network.node.latency.max

gauge

Maximum network latency between given node and other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.

gauge.consul.network.node.latency.min

gauge

Minimum network latency between given node and other nodes in the datacenter. The dimension consul_node corresponds to the source node. The metric also has the dimensions datacenter and consul_mode.
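This document does not say how the node latency values are computed. One plausible approach, sketched here as an assumption, estimates round-trip times from Consul's network coordinates (dicts with Vec, Height and Adjustment fields) using the distance calculation described in Consul's network coordinates documentation. It then reduces the estimates to avg, max and min. The function names are hypothetical.

```python
import math


def estimate_rtt_seconds(a, b):
    """Estimate the round-trip time in seconds between two coordinates."""
    if len(a["Vec"]) != len(b["Vec"]):
        raise ValueError("coordinate dimensions are not compatible")
    # Euclidean distance between the vectors, plus both heights.
    sumsq = sum((x - y) ** 2 for x, y in zip(a["Vec"], b["Vec"]))
    rtt = math.sqrt(sumsq) + a["Height"] + b["Height"]
    # Apply the adjustment terms, guarding against a non-positive result.
    adjusted = rtt + a["Adjustment"] + b["Adjustment"]
    return adjusted if adjusted > 0.0 else rtt


def latency_summary(source, others):
    """Return avg/max/min estimated RTT from source to each other node."""
    rtts = [estimate_rtt_seconds(source, other) for other in others]
    if not rtts:
        return None
    return {"avg": sum(rtts) / len(rtts), "max": max(rtts), "min": min(rtts)}
```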

gauge.consul.peers

gauge

Number of Consul Raft peers, that is, Consul agents in server mode, in a given datacenter. This metric is reported by the leader only. It has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.apply

gauge

This metric is a general indicator of the write load on the Consul servers. This metric has the global dimensions consul_node, consul_mode and datacenter.

gauge.consul.raft.commitTime.avg

gauge

This measures the mean time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.commitTime.max

gauge

This measures the max time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.commitTime.min

gauge

This measures the minimum time it takes to commit a new entry to the Raft log on the leader. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.dispatchLog.avg

gauge

This measures the mean time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.dispatchLog.max

gauge

This measures the maximum time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.dispatchLog.min

gauge

This measures the minimum time it takes for the leader to write log entries to disk. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.lastContact.avg

gauge

This measures the time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.lastContact.max

gauge

This measures the maximum time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.leader.lastContact.min

gauge

This measures the minimum time since the leader was last able to contact the follower nodes when checking its leader lease. It can be used as a measure for how stable the Raft timing is and how close the leader is to timing out its lease. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.replication.appendEntries.rpc.AGENT.avg

gauge

This measures the mean time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower’s IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.replication.appendEntries.rpc.AGENT.max

gauge

This measures the maximum time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower’s IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.replication.appendEntries.rpc.AGENT.min

gauge

This measures the minimum time it takes to replicate log entries to followers. This is a general indicator of the load pressure on the Consul servers, as well as the performance of the communication between the servers. This metric is sent by the leader for each follower. The follower’s IP address or hostname is added to the metric name. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.state.candidate

gauge

Tracks the number of times a given node enters the candidate state, i.e., the number of times the Consul server starts a leader election. If this increments without a leadership change occurring, it could indicate that a single server is overloaded or is experiencing network connectivity issues. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.raft.state.leader

gauge

This metric increments whenever a Consul server becomes a leader. Frequent leadership changes may be an indication that the servers are overloaded and aren’t meeting the soft real-time requirements for Raft, or that there are networking problems between the servers. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.runtime.alloc_bytes

gauge

Number of bytes allocated to Consul process on the node. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.runtime.heap_objects

gauge

Number of heap objects allocated to Consul. This indicates memory pressure on a Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.runtime.num_goroutines

gauge

Number of goroutines run by the Consul process on the node. This is a general load pressure indicator for the Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.events

gauge

Number of serf events processed by Consul. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.member.join

gauge

This metric tracks successful node joins to the Serf memberlist. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.member.left

gauge

This metric tracks successful node leaves from the Serf memberlist. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Event.avg

gauge

Average number of serf events in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Event.max

gauge

Maximum number of serf events in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Event.min

gauge

Minimum number of serf events in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Query.avg

gauge

Average number of serf queries in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Query.max

gauge

Maximum number of serf queries in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.

gauge.consul.serf.queue.Query.min

gauge

Minimum number of serf queries in queue yet to be processed by Consul agent. This metric has the dimensions datacenter, consul_node and consul_mode.