
Deploy OTEL as a Daemonset on Kubernetes to log Groundplex metrics and alerts


You can monitor a Groundplex deployed in Kubernetes (K8s) with Datadog, or with any observability tool that supports OpenTelemetry (OTEL). This requires deploying an OTEL collector to harvest the metrics and logs that Groundplex nodes generate. The collector should run on the same nodes as the Groundplex pods, which you can achieve by deploying it as a K8s DaemonSet. For more details, refer to the OTEL documentation: Important Components for Kubernetes.

The GitHub open-telemetry-collector-contrib repository provides resources for deploying an OTEL collector in Kubernetes. This page shows how to deploy the collector as a DaemonSet and view log output in Datadog, using resources from that repository.

Prerequisites:

  • A working K8s cluster

  • The configmap.yaml, service.yaml, roles.yaml, serviceaccount.yaml, and daemonset.yaml files from the open-telemetry-collector-contrib repository (available in an attached zip file for your convenience)

  • A prepared SnapLogic Groundplex Kubernetes configuration

  • Your Datadog API key

  • A few pipelines running on the Groundplex to produce the events to log

Get started

We’ve provided a ZIP file containing the pre-configured YAML files required by Kubernetes. To add your Datadog API key, you only need to edit one of the files.

  1. Download and extract the file.

  2. Open configmap.yaml.

  3. On line 60, replace <DD api key> with your Datadog API key. For example: 60f0**************************1c

  4. View the comments and configuration:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: otel-agent-conf
      labels:
        app: opentelemetry
        component: otel-agent-conf
    data:
      otel-agent-config: |
        receivers:
          otlp:
            protocols:
              grpc:
              http:
          # The hostmetrics receiver is required to get correct infrastructure metrics in Datadog.
          hostmetrics:
            collection_interval: 10s
            scrapers:
              paging:
                metrics:
                  system.paging.utilization:
                    enabled: true
              cpu:
                metrics:
                  system.cpu.utilization:
                    enabled: true
              disk:
              filesystem:
                metrics:
                  system.filesystem.utilization:
                    enabled: true
              load:
              memory:
              network:
              processes:
    
          filelog:
            include_file_path: true
            poll_interval: 500ms
            include:
              # This will ensure that logs from the following path are collected.
              - /var/log/**/*otel-collector*/*.log
          # # Uncomment this block below to get access to system metrics regarding
          # # the OpenTelemetry Collector and its environment, such as spans or metrics
          # # processed, running and sent, queue sizes, uptime, k8s information
          # # and much more.
          #
          # # The prometheus receiver scrapes essential metrics regarding the OpenTelemetry Collector.
          # prometheus:
          #   config:
          #     scrape_configs:
          #     - job_name: 'otelcol'
          #       scrape_interval: 10s
          #       static_configs:
          #       - targets: ['0.0.0.0:8888']
        exporters:
          logging:
          datadog:
            api:
              key: <DD api key>
        processors:
          resourcedetection:
            # ensures host.name and other important resource tags
            # get picked up
            detectors: [system, env, docker]
            timeout: 5s
            override: false
          # adds various tags related to k8s
          k8sattributes:
            passthrough: false
            auth_type: "serviceAccount"
    
            pod_association:
            - sources:
              - from: resource_attribute
                name: k8s.pod.ip
            - sources:
              - from: resource_attribute
                name: k8s.pod.uid
            - sources: # If neither of those work, use the request's connection to get the pod IP.
              - from: connection
    
            extract:
              metadata:
                - k8s.pod.name
                - k8s.pod.uid
                - k8s.deployment.name
                - k8s.node.name
                - k8s.namespace.name
                - k8s.pod.start_time
                - k8s.replicaset.name
                - k8s.replicaset.uid
                - k8s.daemonset.name
                - k8s.daemonset.uid
                - k8s.job.name
                - k8s.job.uid
                - k8s.cronjob.name
                - k8s.statefulset.name
                - k8s.statefulset.uid
                - container.image.name
                - container.image.tag
                - container.id
                - k8s.container.name
    
              labels:
              - tag_name: kube_app_name
                key: app.kubernetes.io/name
                from: pod
              - tag_name: kube_app_instance
                key: app.kubernetes.io/instance
                from: pod
              - tag_name: kube_app_version
                key: app.kubernetes.io/version
                from: pod
              - tag_name: kube_app_component
                key: app.kubernetes.io/component
                from: pod
              - tag_name: kube_app_part_of
                key: app.kubernetes.io/part-of
                from: pod
              - tag_name: kube_app_managed_by
                key: app.kubernetes.io/managed-by
                from: pod
    
          batch:
            # Datadog APM Intake limit is 3.2MB. Let's make sure the batches do not
            # go over that.
            send_batch_max_size: 1000
            send_batch_size: 100
            timeout: 10s
        service:
          # This will make the collector output logs in JSON format
          telemetry:
            logs:
              encoding: "json"
              initial_fields:
                # Add the service field to every log line. It can be used for filtering in Datadog.
                - service: "otel-collector"
          pipelines:
            metrics:
              receivers: [hostmetrics, otlp]
              processors: [resourcedetection, k8sattributes, batch]
              exporters: [logging, datadog]
            traces:
              receivers: [otlp]
              processors: [resourcedetection, k8sattributes, batch]
              exporters: [logging, datadog]
            logs:
              receivers: [filelog, otlp]
              processors: [batch]
              exporters: [logging, datadog]        
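
If you prefer not to edit the file by hand, you can substitute the API-key placeholder from the command line. This is a sketch only: the key value is hypothetical, the sample file stands in for your real configmap.yaml, and GNU sed syntax is assumed (BSD/macOS sed requires `sed -i ''`).

```shell
# Hypothetical API key; use the value from your Datadog organization.
DD_API_KEY="60f0exampleexampleexampleexample1c"

# Stand-in for configmap.yaml containing the placeholder line.
printf 'datadog:\n  api:\n    key: <DD api key>\n' > configmap-sample.yaml

# Replace the placeholder in place (GNU sed).
sed -i "s|<DD api key>|${DD_API_KEY}|" configmap-sample.yaml
cat configmap-sample.yaml
```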

Kubernetes requires a service definition, role, and service account for the otel-collector. These are configured in the service.yaml, roles.yaml, and serviceaccount.yaml manifests. Refer to the Kubernetes documentation on services.

The service.yaml file

Open the service.yaml file extracted from the zip file. The following entries configure the otel-collector service. The example defines both gRPC and HTTP ports for your reference.

Datadog does not use the HTTP port. If you are not using Datadog, check your tool documentation for its requirements.

apiVersion: v1
kind: Service
metadata:
  name: otel-collector
spec:
  ports:
  - name: grpc-otlp
    port: 4317
    protocol: TCP
    targetPort: 4317
  - name: http-otlp
    port: 4318
    protocol: TCP
    targetPort: 4318
  selector:
    app.kubernetes.io/name: otel-collector
  type: ClusterIP
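
The service name and ports above determine the endpoints that in-cluster clients use. The sketch below only assembles the URLs: otel-collector is the service name from the manifest, and the gRPC endpoint is the value used later for OTEL_GRPC_URL in the Groundplex deployment.

```shell
# In-cluster DNS name of the Service defined above, plus its two OTLP ports.
SERVICE="otel-collector"
GRPC_ENDPOINT="http://${SERVICE}:4317"   # OTLP/gRPC; used by the Datadog setup
HTTP_ENDPOINT="http://${SERVICE}:4318"   # OTLP/HTTP; for tools that require it
echo "${GRPC_ENDPOINT}"   # http://otel-collector:4317
echo "${HTTP_ENDPOINT}"   # http://otel-collector:4318
```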

The roles.yaml file

Open roles.yaml and review the otel-collector-role, its rules, and its binding, as shown below.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-role
rules:
  - apiGroups:
      - ''
    resources:
      - 'pods'
      - 'namespaces'
    verbs:
      - 'get'
      - 'watch'
      - 'list'
  - apiGroups:
      - 'apps'
    resources:
      - 'replicasets'
    verbs:
      - 'get'
      - 'list'
      - 'watch'
  - apiGroups:
      - 'extensions'
    resources:
      - 'replicasets'
    verbs:
      - 'get'
      - 'list'
      - 'watch'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
subjects:
- kind: ServiceAccount
  name: otel-collector-account
  namespace: default
roleRef:
  kind: ClusterRole
  name: otel-collector-role
  apiGroup: rbac.authorization.k8s.io
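
Once the role and binding are applied, you can verify that the service account has the permissions the k8sattributes processor needs with kubectl auth can-i. The sketch below only prints the commands so it can run without a cluster; remove the leading echo to execute them.

```shell
# Impersonate the otel-collector-account service account and check the verbs
# granted by the ClusterRole above.
SA="system:serviceaccount:default:otel-collector-account"
for res in pods namespaces; do
  echo kubectl auth can-i list "$res" --as="$SA"
done
```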

The serviceaccount.yaml file

Open the serviceaccount.yaml file and view the service account definition:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector-account
  namespace: default

The daemonset.yaml file

View the daemonset.yaml file and note the otel/opentelemetry-collector-contrib image version on line 26:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-agent
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  template:
    metadata:
      labels:
        app.kubernetes.io/name: otel-collector
        app: opentelemetry
        component: otel-collector
    spec:
      serviceAccountName: otel-collector-account
      containers:
        - name: collector
          command:
            - "/otelcol-contrib"
            - "--config=/conf/otel-agent-config.yaml"
          image: otel/opentelemetry-collector-contrib:0.101.0
          resources:
            limits:
              cpu: 1
              memory: 2Gi
            requests:
              cpu: 200m
              memory: 400Mi
          ports:
            - containerPort: 4318 # default port for OpenTelemetry HTTP receiver.
              hostPort: 4318
            - containerPort: 4317 # default port for OpenTelemetry gRPC receiver.
              hostPort: 4317
            - containerPort: 8888 # Default endpoint for querying metrics.
          volumeMounts:
            - name: otel-agent-config-vol
              mountPath: /conf
            - name: varlogpods
              mountPath: /var/log/pods
              readOnly: true
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
          env:
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          # The k8s.pod.ip is used to associate pods with k8sattributes.
          # It is useful to have in the Collector pod because receiver metrics can also
          # benefit from the tags.
          - name: OTEL_GRPC_URL
            value: "k8s.pod.ip=$(POD_IP):4317"
      volumes:
        - name: otlpgen
          hostPath:
            path: /otlpgen
        - name: otel-agent-config-vol
          configMap:
            name: otel-agent-conf
            items:
              - key: otel-agent-config
                path: otel-agent-config.yaml
        # Mount nodes log file location.
        - name: varlogpods
          hostPath:
            path: /var/log/pods
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
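
Note how the ConfigMap reaches the collector: the otel-agent-config item is projected into the container at the /conf mount, and the resulting path must match the --config flag in the container command. A small sketch of that wiring:

```shell
# volumeMounts.mountPath for the ConfigMap volume, plus the item's projected
# file name, together form the path passed to --config.
MOUNT_PATH="/conf"
ITEM_PATH="otel-agent-config.yaml"
CONFIG_FLAG="--config=${MOUNT_PATH}/${ITEM_PATH}"
echo "${CONFIG_FLAG}"   # --config=/conf/otel-agent-config.yaml
```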

Deploy the OTEL collector

  1. Save the pre-configured Kubernetes files in the /etc/kubernetes/manifests/ folder of your K8s installation.

  2. Execute the following commands in the /etc/kubernetes/manifests/ folder:

kubectl apply -f configmap.yaml
kubectl apply -f serviceaccount.yaml
kubectl apply -f roles.yaml
kubectl apply -f service.yaml
kubectl apply -f daemonset.yaml
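
The five commands above can also be scripted. The order matters: the ConfigMap, service account, and roles should exist before the DaemonSet that consumes them. In this sketch, APPLY is prefixed with echo so it runs without a cluster; drop the echo to apply for real.

```shell
# Apply in dependency order: config and RBAC first, DaemonSet last.
MANIFESTS="configmap.yaml serviceaccount.yaml roles.yaml service.yaml daemonset.yaml"
APPLY="echo kubectl"   # remove "echo" to actually apply
for f in $MANIFESTS; do
  $APPLY apply -f "$f"
done
```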

After Kubernetes applies the files, the OTEL collector starts up. Use the kubectl get all command to confirm that the otel-agent DaemonSet and its pods are running.

Deploy SnapLogic Groundplex nodes to Kubernetes

After deploying a Groundplex as described in Install a Groundplex on Kubernetes, add the information that connects the Groundplex nodes to the otel-collector. You can use either the name of the K8s service or its CLUSTER-IP:

  • The service name does not change on re-deployment, but the IP address can. Our example uses the name for this reason.

  • The CLUSTER-IP is preferable only if you plan to change the name of the service.

Find these values by executing the kubectl get services command:

  1. Open deployment.yaml and deployment-feed.yaml from the Groundplex installation helm_chart/templates directory. Add the following entries to the container's env: list in the spec: containers: section.

    - name: OTEL_GRPC_URL
      value: http://otel-collector:4317
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
    - name: OTEL_RESOURCE_ATTRIBUTES
      value: "k8s.pod.ip=$(POD_IP)"
    - name: HOST_IP
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP
  2. Save the files and make sure that all other installation steps are finished.

  3. Execute the command to start the Groundplex nodes:

helm install <snaplogic_name> <name of helm chart folder>
  4. Wait until the nodes spin up and are visible in SnapLogic Monitor.

  5. Execute some pipelines on the Groundplex.

  6. In Datadog, check the logs. You should see log entries related to asset executions. Log entries contain nodeLabel information, which helps you identify the node on which an asset was executed. Check the values on the Metrics page, and use widgets to create a dashboard that captures the most important metrics.
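
The environment entries added to the Groundplex deployment manifests rely on Kubernetes dependent-variable expansion: $(POD_IP) is resolved from the pod's own status.podIP before the container starts. A local sketch of the resulting value (the IP address is hypothetical):

```shell
# In the pod, POD_IP comes from the downward API (status.podIP); the value
# here is a made-up example.
POD_IP="10.42.0.17"
OTEL_RESOURCE_ATTRIBUTES="k8s.pod.ip=${POD_IP}"
echo "${OTEL_RESOURCE_ATTRIBUTES}"   # k8s.pod.ip=10.42.0.17
```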



