Managing multiple application environments with SignalFx µAPM

A common deployment pattern is to run multiple, distinct application environments that don’t interact directly with each other but are all monitored by SignalFx: for example, QA and production environments, or separate deployments in different datacenters, regions, or cloud providers. Because those environments may have very different performance characteristics, it is important to monitor them independently. The Smart Gateway must also establish a distinct baseline for the traces and spans coming from each environment so that SignalFx’s NoSample tail-based sampling algorithms remain effective.

SignalFx’s Microservices APM refers to those distinct environments as “clusters”. Each cluster is served by its own Smart Gateway deployment, configured with a distinct cluster name. In the SignalFx UI, those clusters are presented separately, with all service maps, trace and span metrics, and historical performance segregated by cluster.

This cluster name must be specified in each Smart Gateway’s configuration and in each Smart Agent’s configuration. All Smart Agents and Smart Gateways that are part of the same application environment should use the same cluster name (for example, qa or prod-us).

Deploying multiple Smart Gateway clusters

Note about realms

A realm is a self-contained deployment of SignalFx in which your organization is hosted. Different realms have different API endpoints (e.g. the endpoint for sending data is ingest.us1.signalfx.com for the us1 realm, and ingest.eu0.signalfx.com for the eu0 realm).

Various statements in the instructions below include a YOUR_SIGNALFX_REALM placeholder that you should replace with the actual name of your realm. This realm name is shown on your profile page in SignalFx. If you do not include the realm name when specifying an endpoint, SignalFx will interpret it as pointing to the us0 realm.
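
As an illustration of the placeholder substitution, the following Python sketch builds the three ingest URLs used in the Smart Gateway configuration from a realm name. The function name ingest_endpoints is hypothetical, and the URL paths are copied from the configuration examples in this guide. The sketch requires an explicit realm so that an omitted value does not silently point at us0.

```python
from typing import Dict


def ingest_endpoints(realm: str) -> Dict[str, str]:
    """Return the forwarder URLs for a realm (hypothetical helper)."""
    if not realm:
        # An endpoint without a realm is interpreted as us0, which may not
        # be the realm hosting your organization, so refuse to guess.
        raise ValueError("realm must be set, for example 'us1' or 'eu0'")
    base = f"https://ingest.{realm}.signalfx.com"
    return {
        "URL": f"{base}/v2/datapoint",
        "EventURL": f"{base}/v2/event",
        "TraceURL": f"{base}/v1/trace",
    }


if __name__ == "__main__":
    for key, url in ingest_endpoints("us1").items():
        print(key, url)
```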

Cluster name configuration

As mentioned in the SignalFx µAPM architecture, all traces and spans from your environment must flow through the Smart Gateway. When operating multiple application environments, the applications from each environment must send their traces and spans through their respective Smart Gateway deployment. Regardless of the size of those deployments, from a single-instance Smart Gateway to a highly available Smart Gateway cluster, each deployed Smart Gateway must be configured with the cluster name that identifies its application environment.

The cluster name is configured in the Smart Gateway’s JSON configuration file using the top-level ClusterName setting:

{
  "ServerName": "YOUR_GATEWAY_HOST",
  "ClusterName": "YOUR_CLUSTER_NAME",
  "ListenFrom": [
    {
      "Type": "signalfx",
      "ListenAddr": "0.0.0.0:8080"
    }
  ],
  "ForwardTo": [
    {
      "Type": "signalfx",
      "URL": "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com/v2/datapoint",
      "EventURL": "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com/v2/event",
      "TraceURL": "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com/v1/trace",
      "DefaultAuthToken": "YOUR_SIGNALFX_API_TOKEN",
      "Name": "smart-gateway-forwarder",
      "TraceSample": {
        "BackupLocation": "/var/lib/gateway/data"
      }
    }
  ]
}
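
When several environments are involved, generating each gateway's configuration from one template can help keep everything except the cluster name consistent. The following Python sketch is one way to do that, under stated assumptions: the function gateway_config, the host names gw-qa-1 and gw-prod-1, and the us1 realm are all hypothetical, while the keys and values mirror the example above.

```python
import json
from typing import Any, Dict


def gateway_config(host: str, cluster: str, realm: str, token: str) -> Dict[str, Any]:
    """Build a Smart Gateway configuration dict for one environment."""
    base = f"https://ingest.{realm}.signalfx.com"
    return {
        "ServerName": host,
        "ClusterName": cluster,  # identifies the application environment
        "ListenFrom": [{"Type": "signalfx", "ListenAddr": "0.0.0.0:8080"}],
        "ForwardTo": [
            {
                "Type": "signalfx",
                "URL": f"{base}/v2/datapoint",
                "EventURL": f"{base}/v2/event",
                "TraceURL": f"{base}/v1/trace",
                "DefaultAuthToken": token,
                "Name": "smart-gateway-forwarder",
                "TraceSample": {"BackupLocation": "/var/lib/gateway/data"},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical hosts, one gateway per environment.
    for host, cluster in [("gw-qa-1", "qa"), ("gw-prod-1", "prod-us")]:
        config = gateway_config(host, cluster, "us1", "YOUR_SIGNALFX_API_TOKEN")
        print(json.dumps(config, indent=2))
```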

Trace retention budget allocation

SignalFx’s µAPM NoSample architecture automatically selects the most interesting traces to retain based on several factors, such as latency outliers, errors, or rareness of spans or traces as a whole. The number of traces selected by the Smart Gateway each minute – also known as retained traces per minute (TPM) – is determined by your SignalFx subscription.

This quota must be split among all your clusters, based on the proportion of this volume you want to allocate to each of your application environments. To allocate a percentage of your quota to a particular cluster, set the ClusterPercent property in the TraceSample section of your Smart Gateway configuration to the desired percentage (between 0 and 100; defaults to 100). All members of the same Smart Gateway cluster must be configured with the same ClusterPercent value.

{
  "ServerName": "YOUR_GATEWAY_HOST",
  "ClusterName": "YOUR_CLUSTER_NAME",
  "ListenFrom": [
    {
      "Type": "signalfx",
      "ListenAddr": "0.0.0.0:8080"
    }
  ],
  "ForwardTo": [
    {
      "Type": "signalfx",
      "URL": "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com/v2/datapoint",
      "EventURL": "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com/v2/event",
      "TraceURL": "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com/v1/trace",
      "DefaultAuthToken": "YOUR_SIGNALFX_API_TOKEN",
      "Name": "smart-gateway-forwarder",
      "TraceSample": {
        "BackupLocation": "/var/lib/gateway/data",
        "ClusterPercent": 50
      }
    }
  ]
}
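
To make the arithmetic of the split concrete, the following Python sketch checks a planned allocation and computes the retained TPM each cluster would receive. It is an illustration only: the function names, the 1000 TPM quota, and the rule that the percentages should not add up to more than 100 are assumptions, and the sketch treats the allocation as strictly proportional.

```python
from typing import Dict


def check_cluster_percents(percents: Dict[str, float]) -> None:
    """Validate planned ClusterPercent values, keyed by cluster name."""
    for cluster, pct in percents.items():
        if not 0 <= pct <= 100:
            raise ValueError(f"{cluster}: ClusterPercent must be between 0 and 100")
    total = sum(percents.values())
    if total > 100:
        # Assumption: the clusters share a single quota, so together they
        # should not claim more than all of it.
        raise ValueError(f"allocations add up to {total}%, which exceeds 100%")


def retained_tpm(quota_tpm: float, percents: Dict[str, float]) -> Dict[str, float]:
    """Split a retained-traces-per-minute quota proportionally."""
    check_cluster_percents(percents)
    return {cluster: quota_tpm * pct / 100 for cluster, pct in percents.items()}


if __name__ == "__main__":
    # Hypothetical quota of 1000 retained TPM split across two clusters.
    print(retained_tpm(1000, {"qa": 20, "prod-us": 80}))
```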

Smart Agent configuration for host correlation

Note

The Smart Agent configuration options used to specify which application cluster a host participates in require Smart Agent version 4.7.2 or above. For more information on how to upgrade the Smart Agent, read the Smart Agent Quick Start and the Smart Agent Release Notes.

To enable service-to-host correlation between your µAPM-instrumented services and the hosts that power them for each distinct application environment, you must configure the Smart Agent on these hosts with the corresponding cluster name. Set the top-level cluster configuration property in the Smart Agent’s configuration file, usually found at /etc/signalfx/agent.yaml, to match the Smart Gateway’s ClusterName used in that environment:

---
signalFxAccessToken: YOUR_SIGNALFX_TOKEN
signalFxRealm: YOUR_SIGNALFX_REALM
cluster: YOUR_CLUSTER_NAME
...

Setting or changing the value of the cluster configuration property requires restarting the Smart Agent. If it is installed directly on the host (as opposed to being run as a Docker container), restart the agent with the following command:

$ sudo service signalfx-agent restart

This configuration property instructs the Smart Agent to add the cluster property to your host-identifying dimensions, allowing you to filter or aggregate the metrics reported by the Smart Agent through the available values of this cluster property.
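
Because host correlation depends on the agent and the gateway using the same name, a quick consistency check can catch mismatches before they reach production. The following Python sketch compares the cluster value in an agent configuration with the ClusterName in a gateway configuration. The function names are hypothetical, and the YAML handling is a minimal line scan rather than a full parser: it only recognizes an unindented cluster key and does not handle trailing comments.

```python
import json
import re
from typing import Optional


def agent_cluster(agent_yaml_text: str) -> Optional[str]:
    """Return the top-level `cluster` value from agent.yaml text, if any."""
    for line in agent_yaml_text.splitlines():
        match = re.match(r"^cluster:\s*(.+?)\s*$", line)
        if match:
            # Drop optional surrounding quotes around the value.
            return match.group(1).strip("'\"")
    return None


def clusters_match(agent_yaml_text: str, gateway_json_text: str) -> bool:
    """True when the agent's cluster equals the gateway's ClusterName."""
    gateway_cluster = json.loads(gateway_json_text).get("ClusterName")
    return gateway_cluster is not None and agent_cluster(agent_yaml_text) == gateway_cluster


if __name__ == "__main__":
    agent_text = "---\nsignalFxRealm: us1\ncluster: qa\n"
    gateway_text = '{"ServerName": "gw-qa-1", "ClusterName": "qa"}'
    print(clusters_match(agent_text, gateway_text))
```

In practice the two strings would be read from the agent's configuration file and from the gateway's JSON configuration file on their respective hosts.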

Managing multiple clusters in the SignalFx UI

Cluster selector on the Services and Traces pages

On the µAPM Services and µAPM Traces pages, a dropdown selector lets you choose the cluster you want to look at. All data on the page, including service maps, traces, and metrics, is then filtered to show only information about the selected cluster.

Monitoring multiple Smart Gateway clusters

Regardless of how many application environments you are monitoring with SignalFx µAPM, you’ll want to keep an eye on the performance and resource consumption of your deployed Smart Gateways through the corresponding built-in dashboards. Those dashboards offer a cluster dashboard variable that lets you filter your view to a specified Smart Gateway cluster.