Advanced configuration for Kubernetes
See the following advanced configuration options for the Collector for Kubernetes.
For basic Helm chart configuration, see Configure Helm for Kubernetes. For log configuration, refer to Configure logs for Kubernetes.
Note
The values.yaml file lists all supported configurable parameters for the Helm chart, along with a detailed explanation of each parameter. Review it to understand how to configure this chart.
The Helm chart can also be configured to support different use cases, such as trace sampling and sending data through a proxy server. See Examples of chart configuration for more information.
Override the default configuration
You can override the default OpenTelemetry agent configuration with your own. To do this, include a custom configuration using the agent.config parameter in the values.yaml file. For example:
agent:
  enabled: true
  # Metric collection from k8s control plane components.
  controlPlaneMetrics:
    apiserver:
      enabled: true
    controllerManager:
      enabled: true
    coredns:
      enabled: false
    proxy:
      enabled: true
    scheduler:
      enabled: false
This custom configuration is merged into the default agent configuration.
Caution
After merging the files, you need to fully redefine parts of the configuration, for example service, pipelines, logs, and processors.
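For instance, adding a processor to the agent's metrics pipeline means restating the entire service.pipelines.metrics block, not only the new entry. The following sketch shows the shape of such an override; the filter/drop-debug-metrics processor is hypothetical, and the receiver, processor, and exporter lists are assumptions about the chart defaults, so check the rendered agent configuration for the actual components:
agent:
  config:
    processors:
      # Hypothetical processor added on top of the defaults.
      filter/drop-debug-metrics:
        metrics:
          exclude:
            match_type: regexp
            metric_names:
              - debug\..*
    service:
      pipelines:
        metrics:
          # The whole pipeline is redefined, including the default components.
          receivers: [hostmetrics, kubeletstats, otlp, receiver_creator, signalfx]
          processors: [memory_limiter, batch, resourcedetection, resource, filter/drop-debug-metrics]
          exporters: [signalfx]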
Override a control plane configuration
If any of the control plane metric receivers are activated under the agent.controlPlaneMetrics configuration section, the Helm chart configures the Collector to use the activated receivers to collect metrics from the control plane.
To collect control plane metrics, the Helm chart configures the Collector on each node to use the receiver creator, which creates the control plane receivers at runtime. The receiver creator has a set of discovery rules that determine which control plane receivers to create. The default discovery rules can vary depending on the Kubernetes distribution and version. See Receiver creator receiver for more information.
If your control plane is using non-standard specifications, then you can provide a custom configuration to allow the Collector to successfully connect to it.
The Collector relies on pod-level network access to collect metrics from the control plane pods. Since most cloud Kubernetes as a service distributions don't expose the control plane pods to the end user, collecting metrics from these distributions is not supported.
Availability and configuration instructions
The following distributions are supported:
Kubernetes 1.22 (kops created)
OpenShift version 4.9
The following distributions are not supported:
AKS
EKS
EKS/Fargate
GKE
GKE/Autopilot
See the agent template for the default configurations for the control plane receivers.
Refer to the following documentation for information on the configuration options and supported metrics for each control plane receiver:
etcd server. To retrieve etcd metrics, see Setting up etcd metrics.
Known issue
There is a known limitation for the Kubernetes proxy control plane receiver. When using a Kubernetes cluster created via kops, a network connectivity issue prevents proxy metrics from being collected. The limitation can be addressed by updating the kubeProxy metric bind address in the kops cluster specification:
Set kubeProxy.metricsBindAddress: 0.0.0.0 in the kops cluster specification (see the sketch after these steps).
Run kops update cluster {cluster_name} and kops rolling-update cluster {cluster_name} to deploy the change.
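A minimal sketch of the relevant part of the kops cluster specification, edited with kops edit cluster {cluster_name}; the placement of the field follows kops conventions and the surrounding fields are omitted:
spec:
  kubeProxy:
    metricsBindAddress: 0.0.0.0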
Using custom configurations for non-standard control plane components
You can override the default configuration values used to connect to the control plane. If your control plane uses nonstandard ports or custom TLS settings, you need to override the default configurations. The following example shows how to connect to a nonstandard API server that uses port 3443 for metrics and custom TLS certs stored in the /etc/myapiserver/ directory.
agent:
  config:
    receivers:
      receiver_creator:
        receivers:
          # Template for overriding the discovery rule and configuration.
          # smartagent/{control_plane_receiver}:
          #   rule: {rule_value}
          #   config:
          #     {config_value}
          smartagent/kubernetes-apiserver:
            rule: type == "port" && port == 3443 && pod.labels["k8s-app"] == "kube-apiserver"
            config:
              clientCertPath: /etc/myapiserver/clients-ca.crt
              clientKeyPath: /etc/myapiserver/clients-ca.key
              skipVerify: true
              useHTTPS: true
              useServiceAccount: false
Run the container in non-root user mode
Collecting logs often requires reading log files that are owned by the root user. By default, the container runs with securityContext.runAsUser = 0, which gives the root user permission to read those files. To run the container in non-root user mode, set agent.securityContext. The log data permissions are adjusted to match the securityContext configuration. For instance:
agent:
  securityContext:
    runAsUser: 20000
    runAsGroup: 20000
Note
Running the Collector agent for log collection in non-root mode is not currently supported in CRI-O and OpenShift environments. For more details, see the related GitHub feature request issue.
Use the Network Explorer to collect telemetry
Network Explorer allows you to collect network telemetry and send it to the OpenTelemetry Collector gateway.
To enable the Network Explorer, set the enabled flag to true:
networkExplorer:
  enabled: true
Caution
Activating the network explorer automatically activates the OpenTelemetry Collector gateway.
Prerequisites
Network Explorer is only supported in the following Kubernetes-based environments on Linux hosts:
RedHat Linux 7.6+
Ubuntu 16.04+
Debian Stretch+
Amazon Linux 2
Google COS
Modify the reducer footprint
The reducer is a single pod per Kubernetes cluster. If your cluster contains a large number of pods, nodes, and services, you can increase the resources allocated to it.
The reducer processes telemetry in multiple stages, with each stage partitioned into one or more shards, where each shard is a separate thread. Increasing the number of shards in each stage expands the capacity of the reducer. There are three stages: ingest, matching, and aggregation. You can set between 1 and 32 shards for each stage. By default, there is one shard per reducer stage.
The following example sets the reducer to use 4 shards per stage.
networkExplorer:
  reducer:
    ingestShards: 4
    matchingShards: 4
    aggregationShards: 4
Customize network telemetry generated by the Network Explorer
Metrics can be deactivated, either individually or by entire categories. See the values.yaml for a complete list of categories and metrics.
To disable an entire category, give the category name followed by .all:
networkExplorer:
  reducer:
    disableMetrics:
      - tcp.all
Disable individual metrics by their names:
networkExplorer:
  reducer:
    disableMetrics:
      - tcp.bytes
You can mix categories and names. For example, to disable all http metrics and the udp.bytes metric, use:
networkExplorer:
  reducer:
    disableMetrics:
      - http.all
      - udp.bytes
Reactivate metrics
To activate metrics you have deactivated, use enableMetrics.
The disableMetrics flag is evaluated before enableMetrics, so you can deactivate an entire category and then reactivate the individual metrics in that category that you are interested in.
For example, to deactivate all internal and http metrics but keep ebpf_net.collector_health, use:
networkExplorer:
  reducer:
    disableMetrics:
      - http.all
      - ebpf_net.all
    enableMetrics:
      - ebpf_net.collector_health
Configure features using gates
Use the agent.featureGates, clusterReceiver.featureGates, and gateway.featureGates configs to activate or deactivate features of the otel-collector agent, clusterReceiver, and gateway, respectively. These configs are used to populate the otelcol binary startup argument --feature-gates.
For example, to activate feature1 in the agent, activate feature2 in the clusterReceiver, and deactivate feature2 in the gateway, run:
helm install {name} --set agent.featureGates=+feature1 --set clusterReceiver.featureGates=feature2 --set gateway.featureGates=-feature2 {other_flags}
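If you prefer to keep the gates in values.yaml rather than passing them with --set, the same hypothetical gates can be expressed as in the following sketch, assuming each featureGates value accepts the same comma-separated string as the --feature-gates argument:
agent:
  featureGates: +feature1
clusterReceiver:
  featureGates: feature2
gateway:
  featureGates: -feature2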
Set the pod security policy manually
Support for Pod Security Policies (PSP) was removed in Kubernetes 1.25. If you still rely on PSPs in an older cluster, you can add the PSP manually:
Run the following command to install the PSP. Don't forget to add the --namespace kubectl argument if needed:
cat <<EOF | kubectl apply -f -
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: splunk-otel-collector-psp
  labels:
    app: splunk-otel-collector-psp
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'runtime/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
spec:
  privileged: false
  allowPrivilegeEscalation: false
  hostNetwork: true
  hostIPC: false
  hostPID: false
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'hostPath'
    - 'secret'
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
EOF
Add the following custom ClusterRole rule in your values.yaml file along with all other required fields like clusterName, splunkObservability, or splunkPlatform:
rbac:
  customRules:
    - apiGroups: [extensions]
      resources: [podsecuritypolicies]
      verbs: [use]
      resourceNames: [splunk-otel-collector-psp]
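Putting the pieces together, a my_values.yaml that sends data to Splunk Observability Cloud might look like the following sketch; the cluster name, realm, and access token are placeholders, and the splunkPlatform fields would replace splunkObservability if you send data to Splunk Platform instead:
clusterName: my-cluster
splunkObservability:
  realm: us0
  accessToken: my-access-token
rbac:
  customRules:
    - apiGroups: [extensions]
      resources: [podsecuritypolicies]
      verbs: [use]
      resourceNames: [splunk-otel-collector-psp]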
Install the Helm chart:
helm install my-splunk-otel-collector -f my_values.yaml splunk-otel-collector-chart/splunk-otel-collector