
Couchbase

Metadata associated with SignalFx’s Couchbase integration can be found here. The relevant code for the plugin can be found here.

DESCRIPTION

collectd-couchbase is a collectd plugin that collects statistics from Couchbase.

FEATURES

Built-in dashboards

  • Couchbase Clusters: Overview of data from all Couchbase clusters reporting.

image1

  • Couchbase Nodes: Overview of all data from Couchbase nodes.

image2

  • Couchbase Node: Focus on a single Couchbase node.

image3

  • Couchbase Buckets: Performance and activity of Couchbase buckets.

image4

  • Couchbase Bucket: Focus on a single Couchbase bucket.

image5

REQUIREMENTS AND DEPENDENCIES

Version information

Software | Version
collectd | 4.9 or later
python | 2.7 or later
couchbase | 3.0 or later
Python plugin for collectd | (included with SignalFx collectd agent)

INSTALLATION

  1. Download the collectd-couchbase Python module.
  2. Download SignalFx’s sample configuration file for this plugin to /etc/collectd/managed_config.
  3. Modify the sample configuration file as described in Configuration, below.
  4. Restart collectd.

CONFIGURATION

Using the example configuration file 10-couchbase.conf as a guide, provide values for the configuration options listed below that make sense for your environment and allow you to connect to the Couchbase nodes and buckets to be monitored.

Configuration option | Definition | Example value
ModulePath | Path on disk where collectd can find this module. | “/opt/collectd-couchbase”
CollectTarget | Define what this Module block will monitor: “NODE”, for a Couchbase node, or “BUCKET” for a Couchbase bucket. | “BUCKET”
CollectBucket | If CollectTarget is “BUCKET”, the name of the bucket that this Module block will monitor. | “custom_bucket”
Host | Hostname or IP address of the Couchbase server. | “localhost”
Port | Port at which the Couchbase server can be reached. | “8091”
ClusterName | Name of this Couchbase cluster. | “default”
CollectMode | Change to “detailed” to collect all available metrics from Couchbase stats API. Defaults to “default”, collecting a curated set that works well with SignalFx. See metric_info.py for more information. | “default”
Interval | Number of seconds between calls to Couchbase API. | 10
Username | If CollectTarget is “BUCKET” and this bucket requires authentication, username to authenticate to this bucket. If this bucket does not require authentication, do not include this option in the Module block. | “USERNAME”
Password | If CollectTarget is “BUCKET” and this bucket requires authentication, password to authenticate to this bucket. If this bucket does not require authentication, do not include this option in the Module block. | “PASSWORD”
FieldLength | The number of characters used to encode dimension data. CAUTION: Modify this value only if you specifically compiled collectd with a non-default value for DATA_MAX_NAME_LEN in plugin.h. | “1024”
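The rules in the option table can be checked before collectd is restarted. The following is a minimal sketch in Python, not part of the plugin: validate_module_options is a hypothetical helper, it assumes the options of one Module block have already been read into a dictionary, and the rule that Username and Password come as a pair is an assumption rather than something the plugin documents.

```python
def validate_module_options(options):
    """Return a list of problems found in one Module block's options.

    `options` is a plain dict keyed by the documented option names,
    e.g. {"CollectTarget": "BUCKET", "CollectBucket": "custom_bucket"}.
    Hypothetical helper: it mirrors the documented rules only.
    """
    problems = []

    target = options.get("CollectTarget")
    if target not in ("NODE", "BUCKET"):
        problems.append("CollectTarget must be NODE or BUCKET")

    # A BUCKET block needs to name the bucket it monitors.
    if target == "BUCKET" and not options.get("CollectBucket"):
        problems.append("CollectBucket is required when CollectTarget is BUCKET")

    # CollectMode defaults to "default" when it is not given.
    mode = options.get("CollectMode", "default")
    if mode not in ("default", "detailed"):
        problems.append("CollectMode must be default or detailed")

    # Assumption: credentials only make sense as a pair.
    if ("Username" in options) != ("Password" in options):
        problems.append("Username and Password should be set together")

    return problems


if __name__ == "__main__":
    node_block = {"CollectTarget": "NODE", "Host": "localhost", "Port": "8091"}
    print(validate_module_options(node_block))
```

An empty list means the block satisfies the checks above; it does not guarantee that the Couchbase server is reachable.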

USAGE

Below are screen captures of dashboards created for this plugin by SignalFx, illustrating the metrics emitted by this plugin.

For general reference on how to monitor Couchbase, see Couchbase Monitoring and Monitor using the REST API.

Note on bucket metrics

This plugin emits some metrics about the bucket’s performance across the cluster, and some metrics about the bucket’s performance per node.

Metrics beginning with gauge.bucket.basic.* and gauge.bucket.quota.* are reported once per cluster. All other bucket metrics (gauge.bucket.*) are reported by every node that hosts that bucket. To analyze performance for an entire bucket, apply functions like Sum or Mean to group the node-level metrics together by bucket.
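As an illustration of grouping node-level metrics by bucket, here is a minimal sketch in Python. It is not how SignalFx computes Sum or Mean; the (bucket, node, value) tuple layout and the function names are assumptions made for this example.

```python
from collections import defaultdict


def sum_by_bucket(datapoints):
    """Add up the values reported by every node, per bucket.

    `datapoints` is an iterable of (bucket, node, value) tuples for a
    single node-level metric such as gauge.bucket.op.ops.
    """
    totals = defaultdict(float)
    for bucket, _node, value in datapoints:
        totals[bucket] += value
    return dict(totals)


def mean_by_bucket(datapoints):
    """Average the values reported by every node, per bucket."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for bucket, _node, value in datapoints:
        sums[bucket] += value
        counts[bucket] += 1
    return {bucket: sums[bucket] / counts[bucket] for bucket in sums}


if __name__ == "__main__":
    # Made-up sample values: one metric reported by three nodes.
    ops = [
        ("default", "node1", 100.0),
        ("default", "node2", 120.0),
        ("default", "node3", 80.0),
        ("custom_bucket", "node1", 10.0),
    ]
    print(sum_by_bucket(ops))
    print(mean_by_bucket(ops))
```

Sum suits counts and rates such as operations per second, while Mean is usually the better fit for ratios such as gauge.bucket.op.vb_active_resident_items_ratio.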

Monitoring a Couchbase cluster

On the Couchbase Nodes overview dashboard, you can see at a glance the status of the nodes and buckets in a given cluster. Nodes in the cluster should be seeing balanced activity. Buckets in the cluster should each have adequate memory remaining.

image6

This cluster’s three nodes have roughly the same number of gets per second, and its two buckets have plenty of headroom.

This dashboard also includes a percentile distribution of CPU utilization per node, allowing quick identification of unusually hot nodes. This chart shows minimum, 10th percentile, median (50th percentile), 90th percentile, and maximum CPU utilization for each node in the cluster.

image7

This cluster’s CPU utilization distribution shows only a small amount of variation, suggesting that each of the nodes is using about the same amount of CPU.
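To make the percentile distribution concrete, the sketch below computes the same five summary values from one CPU utilization reading per node. It is a minimal Python example that uses linear interpolation between ranks; the percentile method used by the dashboard may differ, and the function names and sample values are assumptions.

```python
def percentile(sorted_values, pct):
    """Percentile of an ascending, non-empty list, linearly interpolated."""
    rank = (len(sorted_values) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    low_value = sorted_values[lower]
    return low_value + (sorted_values[upper] - low_value) * fraction


def cpu_distribution(cpu_by_node):
    """Summarize gauge.nodes.system.cpu_utilization_rate across nodes.

    `cpu_by_node` maps a node name to its latest CPU utilization (%).
    """
    values = sorted(cpu_by_node.values())
    if not values:
        raise ValueError("need at least one node")
    return {
        "min": values[0],
        "p10": percentile(values, 10),
        "median": percentile(values, 50),
        "p90": percentile(values, 90),
        "max": values[-1],
    }


if __name__ == "__main__":
    # Made-up readings for a five-node cluster.
    sample = {"n1": 10.0, "n2": 20.0, "n3": 30.0, "n4": 40.0, "n5": 50.0}
    print(cpu_distribution(sample))
```

A wide gap between the median and the maximum points to one or more unusually hot nodes; a narrow band means the load is evenly spread.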

Monitoring a Couchbase node

Zooming in to an individual node shows that node’s activity, cache performance, and compute resource usage.

image8

This node is lightly loaded. To compare its activity to other nodes in this cluster, we’d use the Couchbase Nodes dashboard above.

We can check the node’s cache performance using a graph that shows the number of gets per second in yellow, overlaid on the number of cache hits in blue. The ratio of cache hits to gets is computed as “hit ratio” and is shown as a dotted line. When every get request results in a cache hit, the graph is green and the dotted line remains high. When there are fewer cache hits than gets, the graph shows yellow areas and the dotted line drops.

image9

This lightly-loaded node has a 100% cache hit ratio: it can serve from memory every get request that it receives.
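The hit ratio can be worked out by hand from two node metrics. The sketch below assumes the ratio is cache hits divided by gets, using gauge.nodes.get_hits and gauge.nodes.cmd_get as inputs; whether the dashboard uses exactly these two metrics is an assumption, and cache_hit_ratio is a hypothetical helper.

```python
def cache_hit_ratio(gets, hits):
    """Return cache hits as a percentage of gets.

    `gets` and `hits` are readings of gauge.nodes.cmd_get and
    gauge.nodes.get_hits over the same interval. Returns None when
    there were no gets, because the ratio is undefined in that case.
    """
    if gets <= 0:
        return None
    return 100.0 * hits / gets


if __name__ == "__main__":
    print(cache_hit_ratio(200, 200))  # every get was a cache hit
    print(cache_hit_ratio(200, 150))  # a quarter of gets missed the cache
```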

Monitoring Couchbase buckets

The Couchbase Buckets overview shows activity for all buckets being monitored.

image10

The buckets in this cluster happen to have about the same number of items and are serving about the same number of operations per second.

Monitoring a single Couchbase bucket

Selecting a particular bucket to show on the Couchbase Bucket dashboard lets us go deep on that bucket’s performance.

Resident items ratio and cache miss rate are inversely related: as the ratio of items in this bucket that are resident in memory drops, the number of get requests that require a fetch from disk will increase.

image11

This bucket has a 100% resident items ratio: all of the items that it contains can be served from memory, instead of disk.

The performance of Couchbase buckets is bound by memory. When memory is exhausted, new items can be stored only by ejecting old items. An attempt to store a new item in a bucket with insufficient memory headroom produces an out-of-memory error: either a “temp” error (an old item will be ejected, try again) or a “non-temp” error (this item cannot be stored at all). Any out-of-memory error is cause for concern.
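The two kinds of out-of-memory error map onto two bucket metrics, gauge.bucket.op.ep_tmp_oom_errors and gauge.bucket.op.ep_oom_errors. A minimal Python sketch of a check over those two readings follows; oom_status and its return labels are hypothetical names chosen for this example.

```python
def oom_status(ep_tmp_oom_errors, ep_oom_errors):
    """Classify a bucket from its two out-of-memory error readings.

    Returns "non-temp" if any request was rejected outright, "temp" if
    requests were only asked to retry, and "ok" if neither occurred.
    """
    if ep_oom_errors > 0:
        return "non-temp"
    if ep_tmp_oom_errors > 0:
        return "temp"
    return "ok"


if __name__ == "__main__":
    print(oom_status(ep_tmp_oom_errors=0, ep_oom_errors=0))
    print(oom_status(ep_tmp_oom_errors=3, ep_oom_errors=0))
```

Because any out-of-memory error is cause for concern, both "temp" and "non-temp" results would normally warrant an alert; the distinction only indicates how severe the shortage is.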

image12

This bucket has available memory, and shows no out-of-memory errors.

Couchbase persists in-memory items to disk. This graph shows the number of items that have been added to the disk write queue in yellow, and the number of items that have been successfully written in blue. When Couchbase is able to keep up with disk writes, these metrics are equal and the graph is green. When the disk queue is filling faster than it can be drained, this graph shows yellow areas.

image13

This bucket is keeping up with disk writes: the number of items added to the queue is about equal to the number of items successfully written to disk.
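The comparison behind this graph can be sketched as a check on two series, gauge.bucket.op.ep_diskqueue_fill and gauge.bucket.op.ep_diskqueue_drain. This is a minimal Python example; the function names and the 5% tolerance are assumptions chosen for illustration, not thresholds used by the dashboard.

```python
def queue_growth(fill, drain):
    """Per-interval difference between items enqueued and items written."""
    return [f - d for f, d in zip(fill, drain)]


def is_keeping_up(fill, drain, tolerance=0.05):
    """Return True when drains cover fills to within `tolerance`.

    `fill` and `drain` are equally spaced readings of
    gauge.bucket.op.ep_diskqueue_fill and
    gauge.bucket.op.ep_diskqueue_drain over the same window.
    """
    total_fill = sum(fill)
    if total_fill == 0:
        return True  # nothing was queued, so nothing can fall behind
    return (total_fill - sum(drain)) <= tolerance * total_fill


if __name__ == "__main__":
    # Made-up readings for three intervals.
    fill = [100, 120, 110]
    drain = [100, 118, 112]
    print(queue_growth(fill, drain))
    print(is_keeping_up(fill, drain))
```

Positive values from queue_growth correspond to the yellow areas on the graph: intervals in which more items were queued than written.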

METRICS

Below is a list of all metrics.

Metric Name | Brief | Type
gauge.bucket.basic.dataUsed | Size of user data within buckets of the specified state that are resident in RAM (%) | gauge
gauge.bucket.basic.diskFetches | Number of disk fetches | gauge
gauge.bucket.basic.diskUsed | Amount of disk used (bytes) | gauge
gauge.bucket.basic.itemCount | Number of items associated with the bucket | gauge
gauge.bucket.basic.memUsed | Amount of memory used by the bucket (bytes) | gauge
gauge.bucket.basic.opsPerSec | Number of operations per second | gauge
gauge.bucket.basic.quotaPercentUsed | Percentage of RAM used (for active objects) against the configured bucket size (%) | gauge
gauge.bucket.op.cmd_get | Requested objects | gauge
gauge.bucket.op.couch_docs_fragmentation | Percent fragmentation of documents in this bucket | gauge
gauge.bucket.op.couch_views_ops | View operations per second | gauge
gauge.bucket.op.curr_connections | Open connections per bucket | gauge
gauge.bucket.op.curr_items | Total number of stored items per bucket | gauge
gauge.bucket.op.disk_write_queue | Number of items waiting to be written to disk | gauge
gauge.bucket.op.ep_bg_fetched | Number of items fetched from disk | gauge
gauge.bucket.op.ep_cache_miss_rate | Ratio of requested objects found in cache vs retrieved from disk | gauge
gauge.bucket.op.ep_diskqueue_drain | Items removed from disk queue | gauge
gauge.bucket.op.ep_diskqueue_fill | Enqueued items on disk queue | gauge
gauge.bucket.op.ep_mem_high_wat | Memory high water mark: the point at which active objects begin to be ejected from the bucket | gauge
gauge.bucket.op.ep_mem_low_wat | Memory low water mark | gauge
gauge.bucket.op.ep_num_value_ejects | Number of objects ejected out of the bucket | gauge
gauge.bucket.op.ep_oom_errors | Request rejected: bucket is at quota, panic | gauge
gauge.bucket.op.ep_queue_size | Number of items queued for storage | gauge
gauge.bucket.op.ep_tmp_oom_errors | Request rejected: Couchbase is making room by ejecting objects, try again later | gauge
gauge.bucket.op.mem_used | Memory used | gauge
gauge.bucket.op.ops | Total of gets, sets, increment and decrement | gauge
gauge.bucket.op.vb_active_resident_items_ratio | Ratio of items kept in memory vs stored on disk | gauge
gauge.bucket.quota.ram | Amount of RAM used by the bucket (bytes) | gauge
gauge.bucket.quota.rawRAM | Amount of raw RAM used by the bucket (bytes) | gauge
gauge.nodes.cmd_get | Number of get commands | gauge
gauge.nodes.couch_docs_actual_disk_size | Amount of disk space used by Couch docs (bytes) | gauge
gauge.nodes.couch_docs_data_size | Data size of Couch documents associated with a node (bytes) | gauge
gauge.nodes.couch_spatial_data_size | Size of object data for spatial views (bytes) | gauge
gauge.nodes.couch_spatial_disk_size | Amount of disk space occupied by spatial views (bytes) | gauge
gauge.nodes.couch_views_actual_disk_size | Amount of disk space occupied by Couch views (bytes) | gauge
gauge.nodes.couch_views_data_size | Size of object data for Couch views (bytes) | gauge
gauge.nodes.curr_items | Number of current items | gauge
gauge.nodes.curr_items_tot | Total number of items associated with node | gauge
gauge.nodes.ep_bg_fetched | Number of disk fetches performed since server was started | gauge
gauge.nodes.get_hits | Number of get hits | gauge
gauge.nodes.mcdMemoryAllocated | Amount of memcached memory allocated (bytes) | gauge
gauge.nodes.mcdMemoryReserved | Amount of memcached memory reserved (bytes) | gauge
gauge.nodes.mem_used | Memory used by the node (bytes) | gauge
gauge.nodes.memoryFree | Amount of memory free for the node (bytes) | gauge
gauge.nodes.memoryTotal | Total memory available to the node (bytes) | gauge
gauge.nodes.ops | Number of operations performed on Couchbase | gauge
gauge.nodes.system.cpu_utilization_rate | The CPU utilization rate (%) | gauge
gauge.nodes.system.mem_free | Free memory available to the node (bytes) | gauge
gauge.nodes.system.mem_total | Total memory available to the node (bytes) | gauge
gauge.nodes.system.swap_total | Total swap size allocated (bytes) | gauge
gauge.nodes.system.swap_used | Amount of swap space used (bytes) | gauge
gauge.nodes.vb_replica_curr_items | Number of items/documents that are replicas | gauge
gauge.storage.hdd.free | Free hard drive space in the cluster (bytes) | gauge
gauge.storage.hdd.quotaTotal | Hard drive quota total for the cluster (bytes) | gauge
gauge.storage.hdd.total | Total hard drive space available to the cluster (bytes) | gauge
gauge.storage.hdd.used | Hard drive space used by the cluster (bytes) | gauge
gauge.storage.hdd.usedByData | Hard drive space used by the data in the cluster (bytes) | gauge
gauge.storage.ram.quotaTotal | RAM quota total for the cluster (bytes) | gauge
gauge.storage.ram.quotaTotalPerNode | RAM quota total per node (bytes) | gauge
gauge.storage.ram.quotaUsed | RAM quota used by the cluster (bytes) | gauge
gauge.storage.ram.quotaUsedPerNode | RAM quota used per node (bytes) | gauge
gauge.storage.ram.total | Total RAM available to the cluster (bytes) | gauge
gauge.storage.ram.used | RAM used by the cluster (bytes) | gauge
gauge.storage.ram.usedByData | RAM used by the data in the cluster (bytes) | gauge

gauge.bucket.basic.dataUsed

gauge

Size of user data within buckets of the specified state that are resident in RAM (%).

gauge.bucket.basic.diskFetches

gauge

Number of disk fetches.

gauge.bucket.basic.diskUsed

gauge

Amount of disk used (bytes).

gauge.bucket.basic.itemCount

gauge

Number of items associated with the bucket.

gauge.bucket.basic.memUsed

gauge

Amount of memory used by the bucket (bytes).

gauge.bucket.basic.opsPerSec

gauge

Number of operations per second.

gauge.bucket.basic.quotaPercentUsed

gauge

Percentage of RAM used (for active objects) against the configured bucket size (%).

gauge.bucket.op.cmd_get

gauge

Requested objects.

gauge.bucket.op.couch_docs_fragmentation

gauge

Percent fragmentation of documents in this bucket.

gauge.bucket.op.couch_views_ops

gauge

View operations per second.

gauge.bucket.op.curr_connections

gauge

Open connections per bucket.

gauge.bucket.op.curr_items

gauge

Total number of stored items per bucket.

gauge.bucket.op.disk_write_queue

gauge

Number of items waiting to be written to disk.

gauge.bucket.op.ep_bg_fetched

gauge

Number of items fetched from disk.

gauge.bucket.op.ep_cache_miss_rate

gauge

Ratio of requested objects found in cache vs retrieved from disk.

gauge.bucket.op.ep_diskqueue_drain

gauge

Items removed from disk queue.

gauge.bucket.op.ep_diskqueue_fill

gauge

Enqueued items on disk queue.

gauge.bucket.op.ep_mem_high_wat

gauge

Memory high water mark: the point at which active objects begin to be ejected from the bucket.

gauge.bucket.op.ep_mem_low_wat

gauge

Memory low water mark.

gauge.bucket.op.ep_num_value_ejects

gauge

Number of objects ejected out of the bucket.

gauge.bucket.op.ep_oom_errors

gauge

Request rejected: bucket is at quota, panic.

gauge.bucket.op.ep_queue_size

gauge

Number of items queued for storage.

gauge.bucket.op.ep_tmp_oom_errors

gauge

Request rejected: Couchbase is making room by ejecting objects, try again later.

gauge.bucket.op.mem_used

gauge

Memory used.

gauge.bucket.op.ops

gauge

Total of gets, sets, increment and decrement.

gauge.bucket.op.vb_active_resident_items_ratio

gauge

Ratio of items kept in memory vs stored on disk.

gauge.bucket.quota.ram

gauge

Amount of RAM used by the bucket (bytes).

gauge.bucket.quota.rawRAM

gauge

Amount of raw RAM used by the bucket (bytes).

gauge.nodes.cmd_get

gauge

Number of get commands.

gauge.nodes.couch_docs_actual_disk_size

gauge

Amount of disk space used by Couch docs (bytes).

gauge.nodes.couch_docs_data_size

gauge

Data size of couch documents associated with a node (bytes).

gauge.nodes.couch_spatial_data_size

gauge

Size of object data for spatial views, in bytes.

gauge.nodes.couch_spatial_disk_size

gauge

Amount of disk space occupied by spatial views, in bytes.

gauge.nodes.couch_views_actual_disk_size

gauge

Amount of disk space occupied by Couch views (bytes).

gauge.nodes.couch_views_data_size

gauge

Size of object data for Couch views (bytes).

gauge.nodes.curr_items

gauge

Number of current items.

gauge.nodes.curr_items_tot

gauge

Total number of items associated with node.

gauge.nodes.ep_bg_fetched

gauge

Number of disk fetches performed since server was started.

gauge.nodes.get_hits

gauge

Number of get hits.

gauge.nodes.mcdMemoryAllocated

gauge

Amount of memcached memory allocated (bytes).

gauge.nodes.mcdMemoryReserved

gauge

Amount of memcached memory reserved (bytes).

gauge.nodes.mem_used

gauge

Memory used by the node (bytes).

gauge.nodes.memoryFree

gauge

Amount of memory free for the node (bytes).

gauge.nodes.memoryTotal

gauge

Total memory available to the node (bytes).

gauge.nodes.ops

gauge

Number of operations performed on Couchbase.

gauge.nodes.system.cpu_utilization_rate

gauge

The CPU utilization rate (%).

gauge.nodes.system.mem_free

gauge

Free memory available to the node (bytes).

gauge.nodes.system.mem_total

gauge

Total memory available to the node (bytes).

gauge.nodes.system.swap_total

gauge

Total swap size allocated (bytes).

gauge.nodes.system.swap_used

gauge

Amount of swap space used (bytes).

gauge.nodes.vb_replica_curr_items

gauge

Number of items/documents that are replicas.

gauge.storage.hdd.free

gauge

This is a storage-related metric. Free hard drive space in the cluster (bytes).

gauge.storage.hdd.quotaTotal

gauge

This is a storage-related metric. Hard drive quota total for the cluster (bytes).

gauge.storage.hdd.total

gauge

This is a storage-related metric. Total hard drive space available to the cluster (bytes).

gauge.storage.hdd.used

gauge

This is a storage-related metric. Hard drive space used by the cluster (bytes).

gauge.storage.hdd.usedByData

gauge

This is a storage-related metric. Hard drive space used by the data in the cluster (bytes).

gauge.storage.ram.quotaTotal

gauge

This is a storage-related metric. RAM quota total for the cluster (bytes).

gauge.storage.ram.quotaTotalPerNode

gauge

This is a storage-related metric. RAM quota total per node (bytes).

gauge.storage.ram.quotaUsed

gauge

This is a storage-related metric. RAM quota used by the cluster (bytes).

gauge.storage.ram.quotaUsedPerNode

gauge

This is a storage-related metric. RAM quota used per node (bytes).

gauge.storage.ram.total

gauge

This is a storage-related metric. Total RAM available to the cluster (bytes).

gauge.storage.ram.used

gauge

This is a storage-related metric. RAM used by the cluster (bytes).

gauge.storage.ram.usedByData

gauge

This is a storage-related metric. RAM used by the data in the cluster (bytes).