Marathon

DESCRIPTION

This integration primarily consists of the Smart Agent monitor collectd/marathon. Below is an overview of that monitor.

Smart Agent Monitor

Monitors a Mesos Marathon instance using the collectd Marathon Python plugin.

See the integrations doc for more information on configuration.

Sample YAML configuration:

monitors:
  - type: collectd/marathon
    host: 127.0.0.1
    port: 8080
    scheme: http

Sample YAML configuration for DC/OS:

monitors:
  - type: collectd/marathon
    host: 127.0.0.1
    port: 8080
    scheme: https
    dcosAuthURL: https://leader.mesos/acs/api/v1/auth/login

REQUIREMENTS AND DEPENDENCIES

Version information

| Software | Version |
| --- | --- |
| collectd | 5.0 or later |
| Python | 2.6 or later |
| Marathon | 1.1.1 or later |
| Python plugin for collectd | (included with SignalFx collectd agent) |

INSTALLATION

This integration is part of the SignalFx Smart Agent as the collectd/marathon monitor. You should first deploy the Smart Agent to the same host as the service you want to monitor, and then continue with the configuration instructions below.

CONFIGURATION

To activate this monitor in the Smart Agent, add the following to your agent config:

monitors:  # All monitor config goes under this key
 - type: collectd/marathon
   ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.

| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| pythonBinary | no | string | Path to a Python binary that should be used to execute the Python code. If not set, a built-in runtime is used. Can include arguments to the binary as well. |
| host | yes | string | |
| port | yes | integer | |
| username | no | string | Username used to authenticate with Marathon. |
| password | no | string | Password used to authenticate with Marathon. |
| scheme | no | string | Set to either http or https. (default: http) |
| dcosAuthURL | no | string | The DC/OS authentication URL from which the plugin obtains authentication tokens. If DC/OS is operating in strict mode, set scheme to "https" and dcosAuthURL to "https://leader.mesos/acs/api/v1/auth/login" (which is the default DNS entry provided by DC/OS). |
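
To illustrate how the optional settings combine with the required ones, here is a minimal sketch of a monitor entry for a Marathon instance that requires a username and password. The host name, credentials, and Python path are placeholder assumptions, not values from this guide; substitute your own.

```yaml
monitors:
  - type: collectd/marathon
    host: marathon.example.internal   # placeholder host (assumption)
    port: 8080
    scheme: https
    username: marathon-user           # placeholder credential (assumption)
    password: changeme                # placeholder credential (assumption)
    # Optional: run the plugin with a specific interpreter instead of the
    # built-in runtime. This path is hypothetical.
    pythonBinary: /usr/bin/python
```

Leaving out pythonBinary falls back to the built-in runtime, as described in the option list above.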

USAGE

All metrics reported by the Marathon collectd plugin will contain the following dimensions:

  • host will contain the hostname (as known by collectd) of the machine reporting the metrics.
  • plugin is always set to marathon.
  • plugin_instance is always marathon concatenated with . and the Mesos agent ID, for example marathon.<mesos agent id>.

Sample of the built-in dashboard in SignalFx: [image: dashboard_marathon_overview.png]

METRICS

| Metric Name | Description | Type |
| --- | --- | --- |
| gauge.service.mesosphere.marathon.app.cpu.allocated | Number of CPUs allocated to an application | gauge |
| gauge.service.mesosphere.marathon.app.cpu.allocated.per.instance | Configured number of CPUs allocated to each application instance | gauge |
| gauge.service.mesosphere.marathon.app.delayed | Indicates whether the application is delayed | gauge |
| gauge.service.mesosphere.marathon.app.deployments.total | Number of application deployments | gauge |
| gauge.service.mesosphere.marathon.app.disk.allocated | Storage allocated to a Marathon application | gauge |
| gauge.service.mesosphere.marathon.app.disk.allocated.per.instance | Configured storage allocated to each application instance | gauge |
| gauge.service.mesosphere.marathon.app.gpu.allocated | GPUs allocated to a Marathon application | gauge |
| gauge.service.mesosphere.marathon.app.gpu.allocated.per.instance | Configured number of GPUs allocated to each application instance | gauge |
| gauge.service.mesosphere.marathon.app.instances.total | Number of application instances | gauge |
| gauge.service.mesosphere.marathon.app.memory.allocated | Memory allocated to a Marathon application | gauge |
| gauge.service.mesosphere.marathon.app.memory.allocated.per.instance | Configured amount of memory allocated to each application instance | gauge |
| gauge.service.mesosphere.marathon.app.tasks.running | Number of tasks running for an application | gauge |
| gauge.service.mesosphere.marathon.app.tasks.staged | Number of tasks staged for an application | gauge |
| gauge.service.mesosphere.marathon.app.tasks.unhealthy | Number of unhealthy tasks for an application | gauge |
| gauge.service.mesosphere.marathon.task.healthchecks.failing.total | Number of failing health checks for a task | gauge |
| gauge.service.mesosphere.marathon.task.healthchecks.passing.total | Number of passing health checks for a task | gauge |
| gauge.service.mesosphere.marathon.task.staged.time.elapsed | Amount of time the task spent in staging | gauge |
| gauge.service.mesosphere.marathon.task.start.time.elapsed | Time elapsed since the task started | gauge |

All metrics of this integration are emitted by default; however, none are categorized as container/host metrics. They are all custom metrics.

The agent does not do any built-in filtering of metrics coming out of this monitor.
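
Because nothing is filtered by default, any trimming has to be configured explicitly. The following is a minimal sketch, assuming your Smart Agent version supports the monitor-level datapointsToExclude option with metricNames glob patterns (this option is not described in this guide, so check the agent's filtering documentation). The choice of metrics to drop is purely illustrative.

```yaml
monitors:
  - type: collectd/marathon
    host: 127.0.0.1
    port: 8080
    scheme: http
    # Assumption: datapointsToExclude is available as a common monitor
    # option. This drops the per-task metrics and keeps the app metrics.
    datapointsToExclude:
      - metricNames:
          - gauge.service.mesosphere.marathon.task.*
```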