Marathon

Metadata associated with the Marathon plugin for collectd can be found here. The relevant code for the plugin can be found here.

DESCRIPTION

The collectd-marathon plugin collects metrics about Marathon applications and tasks.

Features

Built-in dashboards

  • Marathon: Overview of the Marathon environment.
  • Marathon Application: Focus on Marathon applications.
  • Marathon Resources: Focus on Marathon resource allocation.
  • Marathon Task: Focus on a Marathon task.

REQUIREMENTS AND DEPENDENCIES

Version information

| Software | Version |
|---|---|
| collectd | 5.0 or later |
| Python | 2.6 or later |
| Marathon | 1.1.1 or later |
| Python plugin for collectd | (included with SignalFx collectd agent) |

INSTALLATION

If you are using the new Smart Agent, see the docs for the collectd/marathon monitor for more information. The configuration documentation below may also be helpful, but consult the Smart Agent repo's docs for the exact schema.

  1. Download the collectd-marathon Python module onto a host that has access to the Marathon API.

  2. Run the following command to install the module's dependencies using pip, replacing the example path with the download location of the collectd-marathon module:

    sudo pip install -r /path/to/collectd-marathon/requirements.txt
    
  3. Download SignalFx's sample configuration file for this plugin to /etc/collectd/managed_config.

  4. Modify the configuration file to provide values that make sense for your environment, as described in Configuration below.

  5. Restart collectd.

CONFIGURATION

Using the sample configuration file 20-collectd-marathon.conf as a guide, provide values for the configuration options listed below that make sense for your environment.

| Configuration option | Definition | Default value |
|---|---|---|
| ModulePath | Path on disk where collectd can find this module. | "/usr/share/collectd/collectd-marathon" |
| Import | Name of the Python module to import, without the .py extension. | marathon |
| LogTraces | Logs traces from the plugin's execution. | true |
| verbose | Turns on verbose log statements. | False |
| host | A Python list of the form ["<scheme>", "<host>", "<port>", "<username>", "<password>", "<dcos_auth_url>"]. scheme is either "http" or "https". username and password are required only for Basic Authentication with the Marathon API. dcos_auth_url is the DC/OS authentication URL from which the plugin obtains authentication tokens. When operating DC/OS in strict mode, set scheme to "https" and dcos_auth_url to "https://leader.mesos/acs/api/v1/auth/login" (the default DNS entry provided by DC/OS). | no default |

Note: Metrics from the /metrics endpoint are not available while operating in DC/OS strict mode.

An example configuration would look like the following:

<LoadPlugin "python">
  Globals true
</LoadPlugin>

<Plugin "python">
  ModulePath "/usr/share/collectd/collectd-marathon"
  Import "marathon"
  LogTraces true
  <Module "marathon">
    # Note that the last config option can also be set to the base URL of the
    # DC/OS UI, and /acs/api/v1/auth/login is the authentication endpoint the
    # plugin uses to obtain a token for subsequent requests.
    host  ["https", "localhost", "8443", "username", "password", "https://leader.mesos/acs/api/v1/auth/login"]
    verbose False
  </Module>
</Plugin>
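
Because the host option is positional, it is easy to misplace an entry. The following Python sketch is not the plugin's own code. It only illustrates how the six positions described above map to named fields and how the Marathon base URL could be assembled from them. The names MarathonHost, parse_host, and base_url are hypothetical, and treating the last three entries as optional is an assumption made for the sketch.

```python
from collections import namedtuple

# Hypothetical container; field names mirror the option description above.
MarathonHost = namedtuple(
    "MarathonHost",
    ["scheme", "host", "port", "username", "password", "dcos_auth_url"],
)


def parse_host(values):
    """Validate a `host` list and return it as a MarathonHost.

    Assumption: trailing entries (username, password, dcos_auth_url)
    may be omitted and then default to None.
    """
    if not 3 <= len(values) <= 6:
        raise ValueError("expected 3 to 6 entries, got %d" % len(values))
    scheme, host, port = values[:3]
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be 'http' or 'https'")
    if not str(port).isdigit():
        raise ValueError("port must be numeric")
    # Pad the optional tail with None so all six fields are always present.
    optional = list(values[3:]) + [None] * (6 - len(values))
    return MarathonHost(scheme, host, str(port), *optional)


def base_url(cfg):
    """Build the base URL from scheme, host, and port."""
    return "%s://%s:%s" % (cfg.scheme, cfg.host, cfg.port)


cfg = parse_host([
    "https", "localhost", "8443", "username", "password",
    "https://leader.mesos/acs/api/v1/auth/login",
])
print(base_url(cfg))
```

Checking the list this way before restarting collectd can catch a swapped port or a misspelled scheme early.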

USAGE

All metrics reported by the Marathon collectd plugin will contain the following dimensions:

  • host will contain the hostname (as known by collectd) of the machine reporting the metrics.
  • plugin is always set to marathon.
  • plugin_instance is always marathon concatenated with . and the Mesos agent ID, for example marathon.<mesos agent id>.
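
As a rough illustration of the three dimensions listed above, the Python sketch below assembles them into a dictionary. The function name build_dimensions and the sample hostname and agent ID are hypothetical. The plugin's actual implementation may differ.

```python
def build_dimensions(collectd_hostname, mesos_agent_id):
    """Return the dimensions described above for one reported metric."""
    return {
        # Hostname as known by collectd on the reporting machine.
        "host": collectd_hostname,
        # Always the literal string "marathon".
        "plugin": "marathon",
        # "marathon" + "." + Mesos agent ID.
        "plugin_instance": "marathon." + mesos_agent_id,
    }


# Sample values are made up for illustration.
dims = build_dimensions("node-1.example", "agent-S0")
print(dims["plugin_instance"])
```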

[Figure: sample of a built-in dashboard in SignalFx]

METRICS

Below is a list of all metrics.

| Metric name | Description | Type |
|---|---|---|
| gauge.marathon-api-metric | Metrics reported by the Marathon Metrics API | gauge |
| gauge.marathon.app.cpu.allocated | Number of CPUs allocated to an application | gauge |
| gauge.marathon.app.cpu.allocated.per.instance | Configured number of CPUs allocated to each application instance | gauge |
| gauge.marathon.app.delayed | Indicates whether the application is delayed | gauge |
| gauge.marathon.app.deployments.total | Number of application deployments | gauge |
| gauge.marathon.app.disk.allocated | Storage allocated to a Marathon application | gauge |
| gauge.marathon.app.disk.allocated.per.instance | Configured storage allocated to each application instance | gauge |
| gauge.marathon.app.gpu.allocated | GPUs allocated to a Marathon application | gauge |
| gauge.marathon.app.gpu.allocated.per.instance | Configured number of GPUs allocated to each application instance | gauge |
| gauge.marathon.app.instances.total | Number of application instances | gauge |
| gauge.marathon.app.memory.allocated | Memory allocated to a Marathon application | gauge |
| gauge.marathon.app.memory.allocated.per.instance | Configured amount of memory allocated to each application instance | gauge |
| gauge.marathon.app.tasks.running | Number of tasks running for an application | gauge |
| gauge.marathon.app.tasks.staged | Number of tasks staged for an application | gauge |
| gauge.marathon.app.tasks.unhealthy | Number of unhealthy tasks for an application | gauge |
| gauge.marathon.task.healthchecks.failing.total | The number of failing health checks for a task | gauge |
| gauge.marathon.task.healthchecks.passing.total | The number of passing health checks for a task | gauge |
| gauge.marathon.task.staged.time.elapsed | The amount of time the task spent in staging | gauge |
| gauge.marathon.task.start.time.elapsed | Time elapsed since the task started | gauge |

gauge.marathon-api-metric

gauge

Metrics reported by the Marathon Metrics API

The Marathon API offers a set of metrics that are reported by this plugin.
API Endpoint: /metrics

Counters

This plugin reports all “counters” as gauge.<metric name>. The Marathon API does the counting for the plugin, so the plugin reads the counts and reports them as gauge values.

Gauges

This plugin reports all gauge metrics as gauge.<metric name>.

Meters

This plugin reports all metrics listed under “meters” as
gauge.<metric name>.<unit>.per.<unit>.
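
The naming rules for counters, gauges, and meters can be sketched in Python as shown below. This is an illustration under stated assumptions, not the plugin's code. It assumes the /metrics response is a JSON object with "counters", "gauges", and "meters" sections, and that each meter carries a "units" string such as "events/second" whose slash becomes ".per." in the reported name. The function name gauge_names and the metric names in the sample payload are hypothetical.

```python
def gauge_names(metrics):
    """Map a /metrics-style payload to the gauge names described above."""
    names = []
    # Counters and gauges are both reported as gauge.<metric name>.
    for section in ("counters", "gauges"):
        for name in sorted(metrics.get(section, {})):
            names.append("gauge." + name)
    # Meters are reported as gauge.<metric name>.<unit>.per.<unit>.
    # Assumption: "units" looks like "events/second".
    for name, meter in sorted(metrics.get("meters", {}).items()):
        unit = meter["units"].replace("/", ".per.")
        names.append("gauge.%s.%s" % (name, unit))
    return names


# Made-up payload for illustration only.
sample = {
    "counters": {"example.counter": {"count": 3}},
    "gauges": {"example.gauge": {"value": 1.5}},
    "meters": {"example.meter": {"count": 10, "units": "events/second"}},
}
for metric_name in gauge_names(sample):
    print(metric_name)
```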

gauge.marathon.app.cpu.allocated

gauge

Number of CPUs allocated to an application

Represents the configured instance allocation multiplied by the number of
instances.
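
The same "per-instance value multiplied by instance count" rule applies to the CPU, memory, disk, and GPU allocation metrics. The Python sketch below works through that arithmetic. It assumes an app definition that exposes cpus, mem, disk, gpus, and instances fields, and the helper name allocation_metrics is hypothetical. It is meant as a worked example and does not reproduce the plugin's implementation.

```python
# Metric name fragment -> assumed field in the Marathon app definition.
RESOURCES = {"cpu": "cpus", "memory": "mem", "disk": "disk", "gpu": "gpus"}


def allocation_metrics(app):
    """Compute per-instance and total allocation gauges for one app."""
    instances = app.get("instances", 0)
    metrics = {}
    for fragment, field in RESOURCES.items():
        per_instance = app.get(field, 0)
        prefix = "gauge.marathon.app.%s.allocated" % fragment
        metrics[prefix + ".per.instance"] = per_instance
        # Total allocation = configured per-instance value * instance count.
        metrics[prefix] = per_instance * instances
    return metrics


# Made-up app: 4 instances, each with 0.5 CPUs and 256 units of memory.
app = {"cpus": 0.5, "mem": 256, "disk": 0, "gpus": 0, "instances": 4}
result = allocation_metrics(app)
print(result["gauge.marathon.app.cpu.allocated"])
print(result["gauge.marathon.app.memory.allocated"])
```

For this sample app, the CPU total is 0.5 × 4 = 2.0 and the memory total is 256 × 4 = 1024.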

gauge.marathon.app.cpu.allocated.per.instance

gauge

Configured number of CPUs allocated to each application instance

gauge.marathon.app.delayed

gauge

Indicates whether the application is delayed: 1 if it is delayed, 0 if it is not

gauge.marathon.app.deployments.total

gauge

Number of deployments for an application

gauge.marathon.app.disk.allocated

gauge

Storage allocated to a Marathon application

Represents the configured instance allocation multiplied by the number of
instances.

gauge.marathon.app.disk.allocated.per.instance

gauge

Configured storage allocated to each application instance

gauge.marathon.app.gpu.allocated

gauge

GPUs allocated to a Marathon application

Represents the configured instance allocation multiplied by the number of
instances.

gauge.marathon.app.gpu.allocated.per.instance

gauge

Configured number of GPUs allocated to each application instance

gauge.marathon.app.instances.total

gauge

Number of application instances

gauge.marathon.app.memory.allocated

gauge

Memory allocated to a Marathon application

Represents the configured instance allocation multiplied by the number of
instances.

gauge.marathon.app.memory.allocated.per.instance

gauge

Configured amount of memory allocated to each application instance

gauge.marathon.app.tasks.running

gauge

Number of tasks running for an application

gauge.marathon.app.tasks.staged

gauge

Number of tasks staged for an application

gauge.marathon.app.tasks.unhealthy

gauge

Number of unhealthy tasks for an application

gauge.marathon.task.healthchecks.failing.total

gauge

The number of failing health checks for a task

gauge.marathon.task.healthchecks.passing.total

gauge

The number of passing health checks for a task

gauge.marathon.task.staged.time.elapsed

gauge

The amount of time the task spent in staging

gauge.marathon.task.start.time.elapsed

gauge

Time elapsed since the task started