collectd/hadoop

Monitor Type: collectd/hadoop (Source)

Accepts Endpoints: Yes

Multiple Instances Allowed: Yes

Overview

Collects metrics about a Hadoop 2.0+ cluster using the collectd Hadoop Python plugin. If a remote JMX port is exposed in the Hadoop cluster, you can also configure the collectd/hadoopjmx monitor to collect additional metrics about the cluster.

The collectd/hadoop monitor collects metrics from the Resource Manager REST API for the following:

  • Cluster Metrics
  • Cluster Scheduler
  • Cluster Applications
  • Cluster Nodes
  • MapReduce Jobs

Sample Config

Sample YAML configuration:

monitors:
- type: collectd/hadoop
  host: 127.0.0.1
  port: 8088
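
Because multiple instances of this monitor are allowed, one agent can watch more than one Resource Manager. The following is a minimal sketch, assuming two Resource Managers that both listen on port 8088; the hostnames are hypothetical placeholders, not values from this guide:

```yaml
monitors:
  # First instance; rm-a.example.com is a hypothetical hostname.
  - type: collectd/hadoop
    host: rm-a.example.com
    port: 8088
  # Second instance; rm-b.example.com is a hypothetical hostname.
  - type: collectd/hadoop
    host: rm-b.example.com
    port: 8088
```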

Configuration

To activate this monitor in the Smart Agent, add the following to your agent config:

monitors:  # All monitor config goes under this key
 - type: collectd/hadoop
   ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.
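
Because this monitor accepts endpoints, host and port can be supplied by service discovery instead of being set by hand. The following is a sketch only, assuming that an observer is already configured in the agent and that the Resource Manager listens on port 8088 as in the sample config. discoveryRule is one of the common monitor options, and the rule expression shown here is illustrative:

```yaml
monitors:
  - type: collectd/hadoop
    # Assumed rule: match any discovered endpoint on port 8088.
    # host and port are then taken from the matched endpoint.
    discoveryRule: port == 8088
```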

| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| pythonBinary | no | string | Path to a Python binary that should be used to execute the Python code. If not set, a built-in runtime will be used. Can include arguments to the binary as well. |
| host | yes | string | Resource Manager hostname |
| port | yes | integer | Resource Manager port |
| verbose | no | bool | Log verbose information about the plugin (default: false) |
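
As an illustration of the optional settings, the sketch below sets every option from the table. The interpreter path is a hypothetical placeholder; omit pythonBinary to use the built-in runtime.

```yaml
monitors:
  - type: collectd/hadoop
    host: 127.0.0.1
    port: 8088
    # Hypothetical path; replace with an interpreter that exists on your host.
    pythonBinary: /usr/local/bin/python
    # Enable verbose plugin logging (defaults to false).
    verbose: true
```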

Metrics

These are the metrics available for this monitor. Metrics that are categorized as container/host (default) are in bold and italics in the list below.

  • counter.hadoop.cluster.metrics.total_mb (cumulative)
  • counter.hadoop.cluster.metrics.total_nodes (cumulative)
  • counter.hadoop.cluster.metrics.total_virtual_cores (cumulative)
  • gauge.hadoop.cluster.metrics.active_nodes (gauge)
  • gauge.hadoop.cluster.metrics.allocated_mb (gauge)
  • gauge.hadoop.cluster.metrics.allocated_virtual_cores (gauge)
  • gauge.hadoop.cluster.metrics.apps_completed (gauge)
  • gauge.hadoop.cluster.metrics.apps_failed (gauge)
  • gauge.hadoop.cluster.metrics.apps_killed (gauge)
  • gauge.hadoop.cluster.metrics.apps_pending (gauge)
  • gauge.hadoop.cluster.metrics.apps_running (gauge)
  • gauge.hadoop.cluster.metrics.apps_submitted (gauge)
  • gauge.hadoop.cluster.metrics.available_mb (gauge)
  • gauge.hadoop.cluster.metrics.available_virtual_cores (gauge)
  • gauge.hadoop.cluster.metrics.containers_allocated (gauge)
  • gauge.hadoop.cluster.metrics.containers_pending (gauge)
  • gauge.hadoop.cluster.metrics.containers_reserved (gauge)
  • gauge.hadoop.cluster.metrics.decommissioned_nodes (gauge)
  • gauge.hadoop.cluster.metrics.lost_nodes (gauge)
  • gauge.hadoop.cluster.metrics.rebooted_nodes (gauge)
  • gauge.hadoop.cluster.metrics.reserved_mb (gauge)
  • gauge.hadoop.cluster.metrics.reserved_virtual_cores (gauge)
  • gauge.hadoop.cluster.metrics.total_mb (gauge)
  • gauge.hadoop.cluster.metrics.total_virtual_cores (gauge)
  • gauge.hadoop.cluster.metrics.unhealthy_nodes (gauge)
  • gauge.hadoop.mapreduce.job.elapsedTime (gauge)
  • gauge.hadoop.mapreduce.job.failedMapAttempts (gauge)
  • gauge.hadoop.mapreduce.job.failedReduceAttempts (gauge)
  • gauge.hadoop.mapreduce.job.mapsTotal (gauge)
  • gauge.hadoop.mapreduce.job.successfulMapAttempts (gauge)
  • gauge.hadoop.mapreduce.job.successfulReduceAttempts (gauge)
  • gauge.hadoop.resource.manager.apps.allocatedMB (gauge)
  • gauge.hadoop.resource.manager.apps.allocatedVCores (gauge)
  • gauge.hadoop.resource.manager.apps.clusterUsagePercentage (gauge)
  • gauge.hadoop.resource.manager.apps.memorySeconds (gauge)
  • gauge.hadoop.resource.manager.apps.priority (gauge)
  • gauge.hadoop.resource.manager.apps.progress (gauge)
  • gauge.hadoop.resource.manager.apps.queueUsagePercentage (gauge)
  • gauge.hadoop.resource.manager.apps.runningContainers (gauge)
  • gauge.hadoop.resource.manager.apps.vcoreSeconds (gauge)
  • gauge.hadoop.resource.manager.nodes.availMemoryMB (gauge)
  • gauge.hadoop.resource.manager.nodes.availableVirtualCores (gauge)
  • gauge.hadoop.resource.manager.nodes.numContainers (gauge)
  • gauge.hadoop.resource.manager.nodes.usedMemoryMB (gauge)
  • gauge.hadoop.resource.manager.nodes.usedVirtualCores (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.absoluteCapacity (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.absoluteMaxCapacity (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.absoluteUsedCapacity (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.allocatedContainers (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.capacity (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.maxApplications (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.maxApplicationsPerUser (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.maxCapacity (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.numActiveApplications (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.numApplications (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.numContainers (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.numPendingApplications (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.pendingContainers (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.reservedContainers (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.usedCapacity (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.userLimit (gauge)
  • gauge.hadoop.resource.manager.scheduler.leaf.queue.userLimitFactor (gauge)
  • gauge.hadoop.resource.manager.scheduler.root.queue.capacity (gauge)
  • gauge.hadoop.resource.manager.scheduler.root.queue.maxCapacity (gauge)
  • gauge.hadoop.resource.manager.scheduler.root.queue.usedCapacity (gauge)

Group applications

All of the following metrics are part of the applications metric group. All of the non-default metrics below can be turned on by adding applications to the monitor config option extraGroups:

  • hadoop.resource.manager.apps.allocatedMB (gauge)
  • hadoop.resource.manager.apps.allocatedVCores (gauge)
  • hadoop.resource.manager.apps.clusterUsagePercentage (gauge)
  • hadoop.resource.manager.apps.memorySeconds (gauge)
  • hadoop.resource.manager.apps.numAMContainerPreempted (gauge)
  • hadoop.resource.manager.apps.numNonAMContainerPreempted (gauge)
  • hadoop.resource.manager.apps.preemptedResourceMB (gauge)
  • hadoop.resource.manager.apps.preemptedResourceVCores (gauge)
  • hadoop.resource.manager.apps.priority (gauge)
  • hadoop.resource.manager.apps.progress (gauge)
  • hadoop.resource.manager.apps.queueUsagePercentage (gauge)
  • hadoop.resource.manager.apps.runningContainers (gauge)
  • hadoop.resource.manager.apps.vcoreSeconds (gauge)
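
For example, a monitor config that turns on the whole applications group might look like the sketch below, which reuses the host and port from the sample config. Other group names from this page can be added to the same list in the same way.

```yaml
monitors:
  - type: collectd/hadoop
    host: 127.0.0.1
    port: 8088
    # Enable every non-default metric in the applications group.
    extraGroups:
      - applications
```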

Group cluster

All of the following metrics are part of the cluster metric group. All of the non-default metrics below can be turned on by adding cluster to the monitor config option extraGroups:

  • hadoop.cluster.metrics.active_nodes (gauge)
  • hadoop.cluster.metrics.allocated_mb (gauge)
  • hadoop.cluster.metrics.allocated_virtual_cores (gauge)
  • hadoop.cluster.metrics.apps_completed (gauge)
  • hadoop.cluster.metrics.apps_failed (gauge)
  • hadoop.cluster.metrics.apps_killed (gauge)
  • hadoop.cluster.metrics.apps_pending (gauge)
  • hadoop.cluster.metrics.apps_running (gauge)
  • hadoop.cluster.metrics.apps_submitted (gauge)
  • hadoop.cluster.metrics.available_mb (gauge)
  • hadoop.cluster.metrics.available_virtual_cores (gauge)
  • hadoop.cluster.metrics.containers_allocated (gauge)
  • hadoop.cluster.metrics.containers_pending (gauge)
  • hadoop.cluster.metrics.containers_reserved (gauge)
  • hadoop.cluster.metrics.decommissioned_nodes (gauge)
  • hadoop.cluster.metrics.lost_nodes (gauge)
  • hadoop.cluster.metrics.rebooted_nodes (gauge)
  • hadoop.cluster.metrics.reserved_mb (gauge)
  • hadoop.cluster.metrics.reserved_virtual_cores (gauge)
  • hadoop.cluster.metrics.total_mb (counter)
  • hadoop.cluster.metrics.total_nodes (counter)
  • hadoop.cluster.metrics.total_virtual_cores (counter)
  • hadoop.cluster.metrics.unhealthy_nodes (gauge)

Group fifo-scheduler

All of the following metrics are part of the fifo-scheduler metric group. All of the non-default metrics below can be turned on by adding fifo-scheduler to the monitor config option extraGroups:

  • hadoop.resource.manager.scheduler.fifo.availNodeCapacity (gauge)
  • hadoop.resource.manager.scheduler.fifo.capacity (gauge)
  • hadoop.resource.manager.scheduler.fifo.maxQueueMemoryCapacity (gauge)
  • hadoop.resource.manager.scheduler.fifo.minQueueMemoryCapacity (gauge)
  • hadoop.resource.manager.scheduler.fifo.numContainers (gauge)
  • hadoop.resource.manager.scheduler.fifo.numNodes (gauge)
  • hadoop.resource.manager.scheduler.fifo.totalNodeCapacity (gauge)
  • hadoop.resource.manager.scheduler.fifo.usedCapacity (gauge)
  • hadoop.resource.manager.scheduler.fifo.usedNodeCapacity (gauge)

Group leaf-queue

All of the following metrics are part of the leaf-queue metric group. All of the non-default metrics below can be turned on by adding leaf-queue to the monitor config option extraGroups:

  • hadoop.resource.manager.scheduler.leaf.queue.absoluteCapacity (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.absoluteMaxCapacity (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.absoluteUsedCapacity (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.allocatedContainers (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.capacity (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.maxActiveApplications (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.maxActiveApplicationsPerUser (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.maxApplications (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.maxApplicationsPerUser (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.maxCapacity (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.numActiveApplications (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.numApplications (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.numContainers (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.numPendingApplications (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.pendingContainers (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.reservedContainers (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.usedCapacity (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.userLimit (gauge)
  • hadoop.resource.manager.scheduler.leaf.queue.userLimitFactor (gauge)

Group mapreduce-jobs

All of the following metrics are part of the mapreduce-jobs metric group. All of the non-default metrics below can be turned on by adding mapreduce-jobs to the monitor config option extraGroups:

  • hadoop.mapreduce.job.elapsedTime (gauge)
  • hadoop.mapreduce.job.failedMapAttempts (gauge)
  • hadoop.mapreduce.job.failedReduceAttempts (gauge)
  • hadoop.mapreduce.job.killedMapAttempts (gauge)
  • hadoop.mapreduce.job.killedReduceAttempts (gauge)
  • hadoop.mapreduce.job.mapsCompleted (gauge)
  • hadoop.mapreduce.job.mapsPending (gauge)
  • hadoop.mapreduce.job.mapsRunning (gauge)
  • hadoop.mapreduce.job.mapsTotal (gauge)
  • hadoop.mapreduce.job.newMapAttempts (gauge)
  • hadoop.mapreduce.job.newReduceAttempts (gauge)
  • hadoop.mapreduce.job.reducesCompleted (gauge)
  • hadoop.mapreduce.job.reducesPending (gauge)
  • hadoop.mapreduce.job.reducesTotal (gauge)
  • hadoop.mapreduce.job.runningMapAttempts (gauge)
  • hadoop.mapreduce.job.runningReduceAttempts (gauge)
  • hadoop.mapreduce.job.successfulMapAttempts (gauge)
  • hadoop.mapreduce.job.successfulReduceAttempts (gauge)

Group node-resources

All of the following metrics are part of the node-resources metric group. All of the non-default metrics below can be turned on by adding node-resources to the monitor config option extraGroups:

  • hadoop.resource.manager.node.nodeCPUUsage (gauge)
  • hadoop.resource.manager.node.nodePhysicalMemoryMB (gauge)
  • hadoop.resource.manager.node.nodeVirtualMemoryMB (gauge)

Group nodes

All of the following metrics are part of the nodes metric group. All of the non-default metrics below can be turned on by adding nodes to the monitor config option extraGroups:

  • hadoop.resource.manager.nodes.availMemoryMB (gauge)
  • hadoop.resource.manager.nodes.availableVirtualCores (gauge)
  • hadoop.resource.manager.nodes.numContainers (gauge)
  • hadoop.resource.manager.nodes.usedMemoryMB (gauge)
  • hadoop.resource.manager.nodes.usedVirtualCores (gauge)

Group queue-users

All of the following metrics are part of the queue-users metric group. All of the non-default metrics below can be turned on by adding queue-users to the monitor config option extraGroups:

  • hadoop.resource.manager.scheduler.queue.users.numActiveApplications (gauge)
  • hadoop.resource.manager.scheduler.queue.users.numPendingApplications (gauge)

Group resource-objects

All of the following metrics are part of the resource-objects metric group. All of the non-default metrics below can be turned on by adding resource-objects to the monitor config option extraGroups:

  • hadoop.resource.manager.scheduler.queue.resource.memory (gauge)
  • hadoop.resource.manager.scheduler.queue.resource.vCores (gauge)

Group root-queue

All of the following metrics are part of the root-queue metric group. All of the non-default metrics below can be turned on by adding root-queue to the monitor config option extraGroups:

  • hadoop.resource.manager.scheduler.root.queue.capacity (gauge)
  • hadoop.resource.manager.scheduler.root.queue.maxCapacity (gauge)
  • hadoop.resource.manager.scheduler.root.queue.usedCapacity (gauge)

Non-default metrics (version 4.7.0+)

To emit non-default metrics, add them to the generic monitor-level extraMetrics config option. Metrics that are derived from specific configuration options and do not appear in the list above do not need to be added to extraMetrics.
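
A minimal sketch of enabling individual metrics this way, reusing the host and port from the sample config. The two metric names are taken from the group lists on this page, on the assumption that they are not already emitted by default in your agent version:

```yaml
monitors:
  - type: collectd/hadoop
    host: 127.0.0.1
    port: 8088
    # Individual metrics to emit in addition to the defaults.
    extraMetrics:
      - hadoop.mapreduce.job.mapsRunning
      - hadoop.resource.manager.scheduler.fifo.numNodes
```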

To see the list of metrics that will be emitted, run agent-status monitors after configuring this monitor in a running agent instance.