Use the SignalFx collectd Agent


The Smart Agent supersedes the SignalFx collectd agent. The Smart Agent wraps collectd and provides auto-discovery of services, among other features. You should use the Smart Agent instead of collectd, which is described below for legacy purposes.

The SignalFx collectd agent is based on collectd, an open source daemon that collects statistics from a system and publishes them to a destination of your choice. You can use the SignalFx collectd agent out of the box to monitor infrastructure metrics, and extend it to monitor a wide range of software by installing collectd plugins. It’s fast, performs well at scale, and enjoys great community support.

Sending data using the SignalFx collectd agent allows you to take advantage of extensive collectd support:

  • The Infrastructure Navigator in Splunk Infrastructure Monitoring visualizes hosts that are monitored using the SignalFx collectd agent.
  • Infrastructure Monitoring provides built-in dashboards to show infrastructure metrics as reported by the collectd agent and other software or service metrics as reported by collectd plugins.
  • Validated plugins for collectd help you monitor specific software in your environment. Browse plugins that have been validated on GitHub, or on the Integrations page in Splunk Infrastructure Monitoring.
  • The SignalFx metadata plugin for collectd enriches your data by sending metadata about your hosts to Splunk Infrastructure Monitoring. This plugin is included by default in SignalFx collectd packages.

Set up the SignalFx collectd agent

To get the most value out of Infrastructure Monitoring, you will most likely want to install the SignalFx collectd agent along with a number of collectd plugins. If you have stringent networking security requirements, you may want to send the collectd traffic through an HTTP proxy or the SignalFx Gateway (formerly called the metric proxy).

Installing the collectd agent

You can install the SignalFx collectd agent using any of the following methods:

If you don’t know which of these is most appropriate to your environment, see the directions for using the shell script. Before installation, make sure your systems meet the necessary requirements and dependencies.

Configuring the collectd agent

The collectd agent ships with a default configuration file, collectd.conf, that does not need modification for the agent to function. An example configuration file for the SignalFx collectd agent can be found in our GitHub repository. If you plan to use additional collectd plugins, you need to modify collectd.conf to load and configure them.

Using collectd through the SignalFx Gateway

If instances of the SignalFx collectd agent cannot transmit outside the network, the SignalFx Gateway (formerly called the metric proxy) can receive connections from many instances of collectd and forward their transmissions to Infrastructure Monitoring over a single outgoing HTTP connection. This is suitable for environments in which outbound network traffic is highly restricted. For details, see the SignalFx Gateway documentation.

Transmitting through an existing HTTP proxy

The collectd agent can be configured to use an HTTP proxy if needed. Make the change in the file that the init scripts source. Modify or create the file for your distribution:

On CentOS/RHEL: /etc/sysconfig/collectd

On Debian/Ubuntu: /etc/default/collectd

Sample contents of the file:

export http_proxy="http://HTTP_PROXY:PROXY_PORT"
export https_proxy="https://HTTPS_PROXY:PROXY_PORT"

Replace HTTP_PROXY and HTTPS_PROXY with the hostname of the HTTP proxy to be used, and PROXY_PORT with the port at which to access it.

Upgrading or uninstalling the collectd agent

For the most part, you upgrade or uninstall the SignalFx collectd agent using the native package mechanism of the Linux distribution you are using. Note that the installation mechanisms above make use of a specific repository; upgrades should be done using the same repository to avoid conflicts between the SignalFx collectd agent and the open source version of collectd.

Install collectd plugins

The collectd community has developed plugins for collecting and sending data from a wide range of infrastructures and applications. Before being acquired by Splunk, SignalFx validated a specific set of these plugins. In addition, SignalFx created a set of built-in dashboards for metrics sent from collectd plugins.

The collectd agent includes a default set of plugins for gathering basic infrastructure metrics and the SignalFx metadata plugin. The SignalFx metadata plugin enriches your collectd data by sending metadata about your collectd hosts to Infrastructure Monitoring.

To add plugins for monitoring software, systems and infrastructure in your environment, browse available plugins and find installation instructions on the Integrations page in Splunk Infrastructure Monitoring, or search for them on the community site.

Using collectd metrics

Metric names in collectd

Metrics collected by collectd plugins generally have dot-delimited names, such as:

  • network.usage.tx_packets
  • gauge.kafka-underreplicated-partitions

In some cases, the names include a metric type (“gauge”) and/or the name of the software or infrastructure that is being measured (“kafka”). Because these conventions are not applied consistently across collectd plugins, the name alone is not a reliable guide. Instead, use one of the following ways to find the metric that you want to chart:

  • In the built-in dashboards, there are graphs for the most common metrics for each piece of software or infrastructure. By clicking on the name of the relevant chart on those dashboards to enter the chart builder, you can see the names of the metrics in use.
  • Using the Metadata Catalog, you can select or add a filter for the plugin used to monitor the desired software or infrastructure, then view all of the associated metrics.
  • Metric descriptions are documented on a per-plugin basis, and are accessible via the Metadata Catalog by selecting the metric, then clicking on the More details link under the name of the metric.

Recognizing metric names

If you know what metadata is being published by a collectd plugin, then you can infer how its metrics will be displayed. Each metric that collectd publishes includes a predefined set of metadata: host, plugin, plugin_instance, type, type_instance, and dsnames.

Infrastructure Monitoring creates metric names using type, type_instance and dsnames. If any of these metadata fields is empty, it is not included in the name. If there are multiple values for any of these fields, then each valid combination of values generates a distinct metric name.

A sample collectd load average metric submission might look like the following:

{
    "dsnames": ["shortterm", "midterm", "longterm"],
    "dstypes": ["gauge", "gauge", "gauge"],
    "host": "i-b13d1e5f",
    "interval": 10.0,
    "plugin": "load",
    "plugin_instance": "",
    "time": 1415062577.4960001,
    "type": "load",
    "type_instance": "",
    "values": [0.37, 0.60999999999999999, 0.76000000000000001]
}

Given the foregoing data, three datapoints are created, one for each combination of type and dsnames: load.shortterm, load.midterm and load.longterm. Each datapoint will have the dimensions host : “i-b13d1e5f” and plugin : “load”. We don’t create a type dimension because it is already being used in the metric name.
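The naming rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the service's implementation: the function names metric_names and dimensions are hypothetical, and joining the non-empty parts with dots is an assumption based on the load.shortterm example. It covers only the rule as stated here, with a single type and type_instance per submission.

```python
def metric_names(submission):
    """Return one metric name per dsname, leaving out empty fields.

    Assumption: names are the non-empty values of type, type_instance
    and dsname joined with dots, as in the load.shortterm example.
    """
    names = []
    for dsname in submission.get("dsnames") or [""]:
        parts = [
            submission.get("type", ""),
            submission.get("type_instance", ""),
            dsname,
        ]
        names.append(".".join(p for p in parts if p))
    return names


def dimensions(submission):
    """Return the non-empty metadata fields that are not part of the name."""
    keys = ("host", "plugin", "plugin_instance")
    return {k: submission[k] for k in keys if submission.get(k)}


sample = {
    "dsnames": ["shortterm", "midterm", "longterm"],
    "dstypes": ["gauge", "gauge", "gauge"],
    "host": "i-b13d1e5f",
    "interval": 10.0,
    "plugin": "load",
    "plugin_instance": "",
    "time": 1415062577.4960001,
    "type": "load",
    "type_instance": "",
    "values": [0.37, 0.61, 0.76],
}

print(metric_names(sample))  # ['load.shortterm', 'load.midterm', 'load.longterm']
print(dimensions(sample))    # {'host': 'i-b13d1e5f', 'plugin': 'load'}
```

Because plugin_instance and type_instance are empty in the sample, they appear in neither the names nor the dimensions.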

Mapping GenericJMX metrics into SignalFx

collectd includes many “generic” plugins. Generic plugins do not have a default behavior, and require some configuration before they can be used. One of the more common generic plugins used with Infrastructure Monitoring is the GenericJMX plugin, which is good for sending in metrics for Kafka, Zookeeper, Cassandra and many other Java-based open source projects.

As with standard collectd plugins, Splunk Infrastructure Monitoring relies on properties set within plugins to derive metric names. A sample configuration of the GenericJMX plugin used for Kafka might look like the following:

<Plugin java>
   <Plugin "GenericJMX">
       <MBean "kafka-all-messages">
           <Value>
               Type "counter"
               Table false
               Attribute "Count"
               InstancePrefix "kafka-all-messages-in"
           </Value>
       </MBean>
   </Plugin>
</Plugin>

In this case, the “Type” and “InstancePrefix” for each <Value> stanza are used to create a metric called counter.kafka-all-messages-in.

If the metric is tied to an “instance” of something, such as a database, you will typically want the name of that instance to be a dimension for your metric. A sample configuration of the GenericJMX plugin used for Kafka metrics where you care about a subset might look like this:

<MBean "kafka-all-messages">
    InstancePrefix "mytopicname"
    <Value>
        Type "counter"
        Table false
        Attribute "Count"
        InstancePrefix "kafka-messages-in"
    </Value>
</MBean>

Because the name of the instance is given as an “InstancePrefix” directly in the <MBean> stanza, Infrastructure Monitoring creates a dimension called “plugin_instance” whose value is that name. In this case, the result is a metric called “counter.kafka-messages-in” with a dimension “plugin_instance” whose value is “mytopicname”.

Note that you can only create a single “plugin_instance” dimension per metric this way.
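The mapping described in this section can be summarized with a small Python sketch. It is hypothetical and only mirrors the prose: the function name genericjmx_metric is invented for illustration, and it assumes that the second example is named the same way as the first, with the Type followed by the InstancePrefix of the <Value> stanza.

```python
def genericjmx_metric(value_type, value_instance_prefix, mbean_instance_prefix=""):
    """Map GenericJMX settings to a (metric name, dimensions) pair.

    Assumptions: the name is "<Type>.<Value InstancePrefix>", and an
    InstancePrefix set on the MBean becomes the plugin_instance dimension.
    """
    name = f"{value_type}.{value_instance_prefix}"
    dims = {}
    if mbean_instance_prefix:
        # Only one plugin_instance dimension can be set this way.
        dims["plugin_instance"] = mbean_instance_prefix
    return name, dims


# First example: no InstancePrefix on the MBean, so no extra dimension.
print(genericjmx_metric("counter", "kafka-all-messages-in"))

# Second example: the MBean-level InstancePrefix becomes a dimension.
print(genericjmx_metric("counter", "kafka-messages-in", "mytopicname"))
```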

Metric metadata in collectd

Metadata from collectd that is not used in the metric name is imported as dimensions for use in filtering and aggregation. In addition, you can add custom dimensions to your metrics or events through the plugin_instance or type_instance fields, which many plugins allow you to customize. Dimensions encoded in the format [key=value,key2=value2] within an instance name are extracted and attached to the metric as dimensions.

For example, type_instance:gc[level=full,time=cpu]-eden

The type_instance that would be used to create the metric is gc-eden, and the dimensions of level=full and time=cpu would be added to those metrics.
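To make the encoding concrete, the following Python sketch splits an instance string into the cleaned name and its embedded dimensions. It illustrates the format only and is not the code that Infrastructure Monitoring runs; the function name split_instance is hypothetical, and the handling of malformed pairs (they are skipped) is an assumption.

```python
import re

# Matches a single bracketed group such as "[level=full,time=cpu]".
_DIMS = re.compile(r"\[([^\[\]]*)\]")


def split_instance(instance):
    """Split an instance string into (name without brackets, dimensions)."""
    dims = {}
    match = _DIMS.search(instance)
    if not match:
        return instance, dims
    for pair in match.group(1).split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            dims[key.strip()] = value.strip()
    # Remove the bracketed group and keep the text on either side of it.
    name = instance[:match.start()] + instance[match.end():]
    return name, dims


print(split_instance("gc[level=full,time=cpu]-eden"))
# ('gc-eden', {'level': 'full', 'time': 'cpu'})
```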

The SignalFx collectd agent allows up to 1024 bytes in each of the type_instance and plugin_instance fields to accommodate custom dimensions. Values that exceed this limit are truncated.

SignalFx metadata plugin

The SignalFx metadata plugin augments the base collectd agent with:

  • A number of aggregate metrics that are used in the Infrastructure Navigator, such as cpu.utilization
  • The ability to accept and forward on metrics from the DogStatsD variant of StatsD
  • Data about processes running on the host
  • The ability to collect and send events to Infrastructure Monitoring through the collectd agent

Differences between proprietary and community versions of collectd

The SignalFx collectd agent introduces the following changes from community collectd:

  • Increased character limit: The number of characters that can be used in dimension key-value pairs has been increased to 1024, up from 64. In practice, this leaves room for many more dimensions per metric.
  • Buffer Flushing: To ensure that metrics always arrive in a timely manner, the SignalFx collectd agent includes a timer to ensure that data is transmitted either when the data buffer is full or when a time limit is reached, whichever happens first. This capability is particularly useful if you are only collecting small quantities of time-sensitive metrics.
  • HTTP Error Logging: The agent provides greater visibility into how collectd itself is functioning by logging HTTP codes from unsuccessful data transmissions by the write_http plugin, and by logging the name of every plugin that is loaded at startup.

All changes have been submitted back to the collectd project for the benefit of the community at large.
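The buffer-flushing rule described above (transmit when the buffer is full or when a time limit is reached, whichever happens first) can be sketched in Python as follows. This is a generic illustration of the technique, not the agent's actual code: the class name FlushingBuffer, its parameters, and the limits used are all hypothetical, and the time check runs only when tick() or add() is called rather than on a real timer.

```python
import time


class FlushingBuffer:
    """Collect items and send them when a size or age limit is reached."""

    def __init__(self, send, max_items=3, max_age_seconds=10.0,
                 clock=time.monotonic):
        self._send = send              # callable that receives a list of items
        self._max_items = max_items
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the example is deterministic
        self._items = []
        self._oldest = None            # time at which the first item was buffered

    def add(self, item):
        if not self._items:
            self._oldest = self._clock()
        self._items.append(item)
        self.tick()

    def tick(self):
        """Flush if either limit is reached; call this periodically."""
        if not self._items:
            return
        full = len(self._items) >= self._max_items
        stale = self._clock() - self._oldest >= self._max_age
        if full or stale:
            self.flush()

    def flush(self):
        batch, self._items = self._items, []
        self._oldest = None
        if batch:
            self._send(batch)


sent = []
now = [0.0]  # fake clock, advanced by hand
buf = FlushingBuffer(sent.append, max_items=3, max_age_seconds=10.0,
                     clock=lambda: now[0])
buf.add("a")
buf.add("b")          # still below both limits
now[0] = 10.0
buf.tick()            # time limit reached: sends ['a', 'b']
for item in "cde":
    buf.add(item)     # size limit reached on 'e': sends ['c', 'd', 'e']
print(sent)           # [['a', 'b'], ['c', 'd', 'e']]
```

The time limit is what keeps a small trickle of metrics from waiting indefinitely for the buffer to fill.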