Jenkins

This directory consolidates all the metadata associated with the Jenkins plugin for collectd. The relevant code for the plugin is distributed as collectd-jenkins (see Installation).

DESCRIPTION

This is the SignalFx Jenkins plugin. Follow these instructions to install the Jenkins plugin for collectd.

The collectd-jenkins plugin collects metrics from Jenkins instances by querying these endpoints: ../api/json (job metrics) and metrics/<MetricsKey>/.. (default and optional Codahale/Dropwizard JVM metrics).
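
To make the two endpoints concrete, here is a minimal sketch that builds both URLs for one instance. The function name, the http scheme default, and the full /metrics/<MetricsKey>/metrics path (taken from the Configuration section) are illustrative assumptions, not the plugin's own code.

```python
from urllib.parse import quote


def jenkins_endpoints(host, port, metrics_key, scheme="http"):
    """Return the job and Dropwizard metrics URLs for one Jenkins instance.

    Hypothetical helper: the plugin may assemble its URLs differently.
    """
    base = f"{scheme}://{host}:{port}"
    return {
        # Job metrics come from the Jenkins JSON API.
        "jobs": f"{base}/api/json",
        # Codahale/Dropwizard metrics are keyed by the Metrics Plugin access key.
        "metrics": f"{base}/metrics/{quote(metrics_key, safe='')}/metrics",
    }


urls = jenkins_endpoints("127.0.0.1", 8080, "6ZHwGBkGR91dxbFenpfz_g2h0-ocmK-CvdHLdmg")
print(urls["jobs"])
print(urls["metrics"])
```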

FEATURES

Built-in dashboards

  • Jenkins: Provides a high-level overview of metrics for a Jenkins cluster.

  • Jenkins Master: Provides metrics from Jenkins instance(s) on a particular host.

REQUIREMENTS AND DEPENDENCIES

Version information

Software | Version
collectd | 4.9 or later
Python | 2.6 or later
Jenkins | 1.580.3 or later
Python plugin for collectd | (included with SignalFx collectd agent)

INSTALLATION

  1. Download collectd-jenkins. Place the jenkins.py file in /usr/share/collectd/collectd-jenkins.
  2. Copy the sample configuration file for this plugin to /etc/collectd/managed_config.
  3. Modify the sample configuration file as described in Configuration, below.
  4. Install the Metrics Plugin in Jenkins: Manage Jenkins > Manage Plugins > Available > search for "Metrics Plugin".
  5. Install the Python requirements: sudo pip install -r requirements.txt
  6. Restart collectd.
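
Before restarting collectd, it can help to confirm that the files from steps 1 and 2 are where the steps put them. The sketch below is a hypothetical check, not part of the plugin; it assumes the sample configuration keeps the name 10-jenkins.conf used in the Configuration section.

```python
import os


def missing_install_files(module_dir="/usr/share/collectd/collectd-jenkins",
                          config_dir="/etc/collectd/managed_config",
                          config_name="10-jenkins.conf"):
    """Return the expected plugin files that are not present on disk."""
    expected = [
        os.path.join(module_dir, "jenkins.py"),   # step 1
        os.path.join(config_dir, config_name),    # step 2 (file name assumed)
    ]
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == "__main__":
    for path in missing_install_files():
        print("missing:", path)
```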

CONFIGURATION

Using the example configuration file 10-jenkins.conf as a guide, provide values for the configuration options listed below that make sense for your environment and allow you to connect to the Jenkins instances.

Metrics from the /metrics/<MetricsKey>/metrics endpoint can be activated through the configuration file. Note that SignalFx does not support the histogram, meter, and timer metric types, as they are too verbose in Jenkins, nor values of type string or list; such metrics are skipped if they are named in the configuration.
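
The skipping rule can be sketched as a small filter. This assumes the endpoint returns the usual Dropwizard JSON layout, with top-level sections such as gauges, counters, histograms, meters, and timers; the function name and the sample payload (apart from vm.daemon.count) are made up for illustration and are not the plugin's implementation.

```python
SKIPPED_SECTIONS = {"histograms", "meters", "timers"}


def reportable_metrics(payload):
    """Yield (name, value) pairs, skipping unsupported types and values."""
    for section, metrics in payload.items():
        # Skip unsupported sections and non-metric keys such as "version".
        if section in SKIPPED_SECTIONS or not isinstance(metrics, dict):
            continue
        for name, fields in metrics.items():
            if not isinstance(fields, dict):
                continue
            # Gauges carry "value"; counters carry "count" (assumed layout).
            value = fields.get("value", fields.get("count"))
            # Keep numbers only, so strings and lists are dropped.
            if isinstance(value, (int, float)):
                yield name, value


sample = {
    "version": "3.0.0",
    "gauges": {
        "vm.daemon.count": {"value": 12},
        "example.list.gauge": {"value": []},       # hypothetical, skipped
        "example.string.gauge": {"value": "2.0"},  # hypothetical, skipped
    },
    "counters": {"example.requests.active": {"count": 1}},  # hypothetical
    "histograms": {"example.histogram": {"count": 5}},      # skipped section
}
print(list(reportable_metrics(sample)))
```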

Configuration option | Definition | Example value
ModulePath | Path on disk where collectd can find this module. | "/usr/share/collectd/collectd-jenkins/"
Host | Host name of the Jenkins instance | "localhost"
Port | Port at which the instance can be reached | "8080"
MetricsKey | Access key required to fetch Codahale metrics | "6ZHwGBkGR91dxbFenpfz_g2h0-ocmK-CvdHLdmg"
Username | User with security access, if configured | "admin"
APIToken | API token of the user | "f04fff7c860d884f2ef00a2b2d481c2f"
EnhancedMetrics | Boolean indicating whether advanced stats from /metrics/<MetricsKey>/metrics are needed | "false"
IncludeMetric | Metric name from the /metrics/<MetricsKey>/metrics endpoint to include (valid when EnhancedMetrics is "false") | "vm.daemon.count"
ExcludeMetric | Metric name from the /metrics/<MetricsKey>/metrics endpoint to exclude (valid when EnhancedMetrics is "true") | "vm.terminated.count"
Dimension | Space-separated key-value pair for a user-defined dimension | dimension_name dimension_value
Interval | Number of seconds between calls to the Jenkins API | 10
ssl_keyfile | Path to the key file | "path/to/file"
ssl_certificate | Path to the certificate | "path/to/file"
ssl_ca_certs | Path to the CA file | "path/to/file"
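
One way to read the interplay of EnhancedMetrics, IncludeMetric, and ExcludeMetric is sketched below. It assumes that EnhancedMetrics true means "report every optional metric except the excluded ones" and false means "report only the included ones"; that reading is inferred from the table, and the function is hypothetical rather than the plugin's code.

```python
def should_report(metric, enhanced_metrics, include=(), exclude=()):
    """Decide whether an optional Dropwizard metric is reported.

    Assumed semantics, inferred from the option descriptions:
    - EnhancedMetrics true: everything except ExcludeMetric entries.
    - EnhancedMetrics false: only IncludeMetric entries.
    """
    if enhanced_metrics:
        return metric not in exclude
    return metric in include


# EnhancedMetrics False with IncludeMetric "vm.daemon.count"
print(should_report("vm.daemon.count", False, include=["vm.daemon.count"]))
# EnhancedMetrics True with ExcludeMetric "vm.terminated.count"
print(should_report("vm.terminated.count", True, exclude=["vm.terminated.count"]))
```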

Example configuration:

LoadPlugin python
<Plugin python>
    ModulePath "/usr/share/collectd/collectd-jenkins"
    Import jenkins
    <Module jenkins>
        Host "127.0.0.1"
        Port "8080"
        Username "admin"
        APIToken "f04fff7c860d884f2ef00a2b2d481c2f"
        MetricsKey "6ZHwGBkGR91dxbFenpfz_g2h0-ocmK-CvdHLdmg"
        Interval 60
        ssl_keyfile "/etc/cert/jenkins.key"
        ssl_certificate "/etc/cert/jenkins.crt"
        ssl_ca_certs "/etc/cert/ca.crt"
    </Module>
</Plugin>
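
Jenkins accepts a user name together with an API token as HTTP basic authentication credentials, which is what the Username and APIToken options supply. The sketch below shows how such a header is formed; whether the plugin builds it exactly this way is an assumption, and the function name is hypothetical.

```python
import base64


def basic_auth_header(username, api_token):
    """Build an HTTP basic auth header from a Jenkins user and API token."""
    raw = f"{username}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Values taken from the example configuration above.
header = basic_auth_header("admin", "f04fff7c860d884f2ef00a2b2d481c2f")
print(header["Authorization"].split()[0])
```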

The plugin can be configured to collect metrics from multiple instances in the following manner.

LoadPlugin python
<Plugin python>
    ModulePath "/usr/share/collectd/collectd-jenkins"
    Import jenkins
    <Module jenkins>
        Host "127.0.0.1"
        Port "8080"
        Username "admin"
        APIToken "f04fff7c860d884f2ef00a2b2d481c2f"
        MetricsKey "6ZHwGBkGR91dxbFenpfz_g2h0-ocmK-CvdHLdmg"
        Interval 10
    </Module>
    <Module jenkins>
        Host "127.0.0.1"
        Port "8010"
        Username "admin"
        APIToken "f04bbb7c860d8b4f1ef00a2b2d481c2f"
        MetricsKey "6Z76HwGBHOj4uBOlsxbFenpfz_g2UAh0-ocmK-CvdHLSRdmg"
        EnhancedMetrics False
        IncludeMetric "vm.daemon.count"
        IncludeMetric "vm.terminated.count"
    </Module>
    <Module jenkins>
        Host "127.0.0.1"
        Port "8000"
        MetricsKey "6Z95HwOj4uBOakGR91dxbFenpfz_g2wBlUAh0-ocmK-CvdSvE1LGRdmg"
        EnhancedMetrics True
        ExcludeMetric "vm.terminated.count"
        ExcludeMetric "vm.daemon.count"
        Dimension foo bar
    </Module>
</Plugin>
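
When many instances are monitored, the repeated <Module jenkins> blocks can be generated rather than written by hand. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, options are passed as (key, value) pairs so that keys such as IncludeMetric can repeat, strings are quoted, and booleans and integers are written bare as in the example above. It does not handle the two-token Dimension option.

```python
def render_value(value):
    """Format one option value the way the example configuration does."""
    if isinstance(value, bool):
        return "True" if value else "False"
    if isinstance(value, int):
        return str(value)
    return f'"{value}"'


def render_config(instances, module_path="/usr/share/collectd/collectd-jenkins"):
    """Render a collectd python-plugin config with one Module per instance."""
    lines = [
        "LoadPlugin python",
        "<Plugin python>",
        f'    ModulePath "{module_path}"',
        "    Import jenkins",
    ]
    for options in instances:
        lines.append("    <Module jenkins>")
        for key, value in options:
            lines.append(f"        {key} {render_value(value)}")
        lines.append("    </Module>")
    lines.append("</Plugin>")
    return "\n".join(lines)


instances = [
    [("Host", "127.0.0.1"), ("Port", "8080"),
     ("MetricsKey", "<MetricsKey>"), ("Interval", 10)],
    [("Host", "127.0.0.1"), ("Port", "8010"),
     ("MetricsKey", "<MetricsKey>"), ("EnhancedMetrics", False),
     ("IncludeMetric", "vm.daemon.count"),
     ("IncludeMetric", "vm.terminated.count")],
]
config_text = render_config(instances)
print(config_text)
```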

USAGE

Interpreting Built-in dashboards

  • Jenkins:
  • Alive Status: Shows the number of Jenkins Masters that are alive.

  • Health Score: Shows the mean health score of each Jenkins instance on all hosts.

  • Job Failure Rate: Shows the rate of job failures over the past day.

  • Executor Usage: Shows the usage pattern of the executors, giving an overview of the load on the Jenkins instances.

  • Top 5 Failed Jobs: Shows the top 5 failed jobs over the past day, ranked by total failure count.

  • Busy Executors vs Pending Jobs: Line graph comparing in-use executors with pending jobs in the queue. Comparing this chart with the two above helps narrow down the reason for job failures quickly.

  • Average Duration - Past Day: Shows the average duration of the 5 jobs that take the most time.

  • Slave Status: Shows the number of slave agents that are alive.

  • VM Memory Utilization: Area graph of the memory used by each Jenkins JVM.

  • Heap Usage: Line graph of the percentage of heap memory used by each Jenkins instance.

  • Non-Heap Used: Line graph of the non-heap memory used by each Jenkins instance.

  • Jenkins Master:
  • Top 5 Failed Jobs: Shows the top 5 failed jobs on the instance(s) over the past day, ranked by total failure count.

  • Health Checks: The status of each health check as reported by DropWizard Metrics. This gives a quick overview of what’s wrong with the instance.

  • Slave Status: Shows the number of slave agents of the instance(s) that are alive.

By default, the DropWizard metrics reported by the Jenkins collectd plugin carry no dimensions, whereas the job metrics carry the following dimensions:
  • Job, name of the job
  • Result, the status of the job

A few other details:

  • plugin is always set to jenkins
  • plugin_instance will contain the IP address and the port of the instance given in the configuration
  • To add metrics from the /metrics/<MetricsKey>/metrics endpoint, use the configuration options described in Configuration. If metrics are being included individually, make sure to give valid names, for example vm.daemon.count or vm.terminated.count
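
The details above can be pictured as the record attached to one job datapoint. This is an illustrative sketch only: the dictionary layout, the "host:port" form of plugin_instance, the job name, and the helper name are assumptions, not the plugin's wire format.

```python
def job_datapoint(host, port, job, result, duration_ms, extra_dimensions=None):
    """Assemble an illustrative job datapoint with its default dimensions."""
    dimensions = {"Job": job, "Result": result}
    # User-defined dimensions come from the Dimension configuration option.
    dimensions.update(extra_dimensions or {})
    return {
        "plugin": "jenkins",                   # always set to jenkins
        "plugin_instance": f"{host}:{port}",   # assumed "IP:port" format
        "metric": "gauge.jenkins.job.duration",
        "value": duration_ms,
        "dimensions": dimensions,
    }


# "nightly-build" is a made-up job name; "foo": "bar" mirrors "Dimension foo bar".
point = job_datapoint("127.0.0.1", 8080, "nightly-build", "SUCCESS", 5321,
                      {"foo": "bar"})
print(point)
```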

METRICS

By default, metrics about jobs and instances are provided; see the list below for details. Metrics from the /metrics/<MetricsKey>/metrics endpoint can be activated through the configuration file. Note that SignalFx does not support the histogram, meter, and timer metric types, as they are too verbose in Jenkins, nor values of type string or list; such metrics are skipped if they are named in the configuration. See Usage for details.

Metric naming

Default metric names reported by the plugin follow the format <metric type>.jenkins.node.<name of metric>. Optional metrics are named as they appear at the /metrics/<MetricsKey>/metrics endpoint.
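
As a small illustration of the naming format, the hypothetical helper below assembles a default metric name. The scope parameter is an assumption added because the job duration metric in the list below uses job where the instance metrics use node.

```python
def default_metric_name(metric_type, name, scope="node"):
    """Build a default metric name: <metric type>.jenkins.<scope>.<name>."""
    return f"{metric_type}.jenkins.{scope}.{name}"


print(default_metric_name("gauge", "queue.size.value"))
print(default_metric_name("gauge", "duration", scope="job"))
```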

Below is a list of all metrics.

Metric Name | Brief | Type
gauge.jenkins.job.duration | Time taken to complete the job in ms. | gauge
gauge.jenkins.node.executor.count.value | Total number of executors in an instance | gauge
gauge.jenkins.node.executor.in-use.value | Total number of executors being used in an instance | gauge
gauge.jenkins.node.health-check.score | Mean health score of an instance | gauge
gauge.jenkins.node.health.disk.space | Binary value of disk space health | gauge
gauge.jenkins.node.health.plugins | Boolean value indicating state of plugins | gauge
gauge.jenkins.node.health.temporary.space | Binary value of temporary space health | gauge
gauge.jenkins.node.health.thread-deadlock | Boolean value indicating a deadlock | gauge
gauge.jenkins.node.online.status | Boolean value indicating whether the instance is reachable | gauge
gauge.jenkins.node.queue.size.value | Total number of pending jobs in queue | gauge
gauge.jenkins.node.slave.online.status | Boolean value indicating whether the slave is reachable | gauge
gauge.jenkins.node.vm.memory.heap.usage | Percent utilization of the heap memory | gauge
gauge.jenkins.node.vm.memory.non-heap.used | Total amount of non-heap memory used | gauge
gauge.jenkins.node.vm.memory.total.used | Total memory used by instance | gauge

gauge.jenkins.job.duration

gauge

The total time taken to complete a job or the time the job ran for before being aborted.

gauge.jenkins.node.executor.count.value

gauge

The total number of executors present in an instance.

gauge.jenkins.node.executor.in-use.value

gauge

The total number of executors being used currently in an instance.

gauge.jenkins.node.health-check.score

gauge

The mean of the successful health checks of an instance.
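
To illustrate what a mean of successful health checks looks like, the sketch below computes the fraction of checks that report healthy. It assumes a Dropwizard-style health check result that maps each check name to an object with a healthy flag; the check names and the function are hypothetical, and the plugin may simply read a score that Jenkins already reports.

```python
def health_score(checks):
    """Return the fraction of health checks that are healthy, or None if empty."""
    if not checks:
        return None
    healthy = sum(1 for result in checks.values() if result.get("healthy"))
    return healthy / len(checks)


# Hypothetical health check results for one instance.
checks = {
    "disk-space": {"healthy": True},
    "plugins": {"healthy": True},
    "temporary-space": {"healthy": False},
    "thread-deadlock": {"healthy": True},
}
print(health_score(checks))
```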

gauge.jenkins.node.health.disk.space

gauge

The health of disk space is represented by a 1 or 0 indicating whether the disk space available is within threshold limits.

gauge.jenkins.node.health.plugins

gauge

The value represents whether all plugins have been loaded successfully or not.

gauge.jenkins.node.health.temporary.space

gauge

The health of temporary space is represented by a 1 or 0 indicating whether the temporary space available is within threshold limits.

gauge.jenkins.node.health.thread-deadlock

gauge

The presence of a thread-deadlock is represented by a boolean value.

gauge.jenkins.node.online.status

gauge

The reachability of the instance is represented as a boolean value.

gauge.jenkins.node.queue.size.value

gauge

The total number of pending jobs in queue.

gauge.jenkins.node.slave.online.status

gauge

The reachability of the slave agents is represented as a boolean value.

gauge.jenkins.node.vm.memory.heap.usage

gauge

The percentage of heap memory utilized by the instance.

gauge.jenkins.node.vm.memory.non-heap.used

gauge

The total amount of non-heap memory used by an instance.

gauge.jenkins.node.vm.memory.total.used

gauge

The total memory used by the Jenkins instance in bytes. This is a Dropwizard metric reported via the Metrics Plugin.