Third-party Observability Tools Integration (Beta)


Overview

IT organizations often leverage centralized tools to monitor their systems. Typically, these centralized tools enable IT groups to do the following:

  • Avoid relying on a system to monitor itself.

  • Reduce Total Cost of Ownership (TCO) by managing monitoring, alerts, and notifications in one place.

  • Retain historical data for audits.

The SnapLogic® Third-party Observability feature offers the capability to integrate pipeline runtime logs with your third-party monitoring tools. This feature enables your IT organization to track, troubleshoot, analyze, and optimize your integrations in production environments.

The SnapLogic platform uses OpenTelemetry to support telemetry data integration with third-party Observability tools. This beta release implements the service that enables you to monitor pipeline execution runtime logs in Datadog and New Relic.

This feature is a private beta. Contact your CSM to set up your organization with SnapLogic third-party Observability Tools Integration.


Certified Third-Party Observability Tools

  • Datadog

  • New Relic

This integration solution is designed to be compatible with any vendor or open-source Observability tool that supports the OpenTelemetry Collector-based approach for the collection, processing, and export of telemetry data. Due to the extensive variety of tools available in the observability space, we currently do not plan to certify every tool on the market.

Prerequisites

  • Must have Groundplexes

  • Must enable the OpenTelemetry feature flag for your Org

Support for Cloudplexes is on the product roadmap.

Implement OTEL Services

OpenTelemetry is an open-source, vendor-agnostic observability framework and toolkit designed to create and manage telemetry data. For the SnapLogic application, you can capture metrics and logs. OpenTelemetry reporting sends messages of Error severity to your connected third-party monitoring tools. Messages of all severity levels are retained in the logs.

How OpenTelemetry Works

The OpenTelemetry Collector receives, processes, and exports telemetry data. When you deploy the OpenTelemetry service, the Collector retrieves the data from the JCC node logs and routes it to your third-party monitoring tool using the OpenTelemetry Protocol (OTLP).

The OpenTelemetry Collector comprises the following three components:

  • Receiver - Defines how the logs are sent to the OpenTelemetry Collector.

  • Processor - Defines how the logs are processed between receipt and export. In most cases, batch mode is preferred.

  • Exporter - Specifies the third-party tool.
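
These three components map directly to the top-level sections of the Collector's YAML configuration. The following minimal sketch illustrates the mapping; it is not the SnapLogic template, and the logging (console) exporter stands in for a real destination:

receivers:        # Receiver: how logs reach the Collector
  otlp:
    protocols:
      grpc:
processors:       # Processor: batch mode is preferred in most cases
  batch:
exporters:        # Exporter: the third-party tool (console output here)
  logging:
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [logging]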

The diagram below shows the architecture of the OpenTelemetry service implemented in the SnapLogic platform.

SnapLogic OpenTelemetry architecture

This document describes only the retrieval of logs sent in the OpenTelemetry format, not logs in the original format from the Groundplex JCC node. Support for capturing log data directly from the Groundplex JCC node is on the product roadmap.

Workflow

  1. Enable the feature flag for your Org.

  2. Set up the OpenTelemetry Service.

  3. Run your pipelines.

  4. Observe pipeline runtime data in the monitoring tool.

Enable Feature Flag for the OpenTelemetry Service

Work with your CSM to enable the following feature flag for your target Orgs/Environments:

"com.snaplogic.cc.log.RuntimeLogWriter.OPEN_TELEMETRY_LOGGER_ENABLED": true

Datadog Monitoring Tools Support Workflow

Step 1: Install the OpenTelemetry package.

Step 2: Configure the OpenTelemetry service.

Step 3: Deploy the OpenTelemetry service.

Step 1: Install the OpenTelemetry Package

  1. Download the OpenTelemetry Collector Contrib package.

  2. Save the package on the machine that hosts the Groundplex JCC node.
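
For example, on a Linux node you might download and install a release build from the opentelemetry-collector-releases GitHub page. The version number and architecture below are assumptions; check the releases page for the current build:

# Hypothetical version; substitute the current release for your platform.
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.90.1/otelcol-contrib_0.90.1_linux_amd64.deb
sudo dpkg -i otelcol-contrib_0.90.1_linux_amd64.deb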

Step 2: Configure the OpenTelemetry Service

  1. Create the YAML configuration file. You can use the example shown after these steps as a template.

  2. Specify the gRPC URL in the YAML file. This value is set in an environment variable on your host machine.

  3. For batch-mode processing of the logs, define the following values for the processors:

    • send_batch_max_size: 100

    • send_batch_size: 10

    • timeout: 10s

  4. For exporters, add the values for your Datadog API.

The following is an example of a YAML configuration file:
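
The template itself is not reproduced on this version of the page. The sketch below is a plausible minimal configuration, assuming the otlp receiver and the datadog exporter from the OpenTelemetry Collector Contrib distribution; the endpoint value, the Datadog site, and the DD_API_KEY environment variable name are assumptions:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317    # gRPC URL; assumed default OTLP port
processors:
  batch:
    send_batch_max_size: 100
    send_batch_size: 10
    timeout: 10s
exporters:
  datadog:
    api:
      key: ${DD_API_KEY}          # Datadog API key environment variable
      site: datadoghq.com         # adjust for your Datadog site
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [datadog]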

Step 3: Deploy the OpenTelemetry Service

  1. From the JCC node with the OpenTelemetry package, you can deploy the service on any Groundplex running one of the following:

    • Linux

    • Microsoft Windows

    • Docker container

  2. (Optional) To run the service in a separate Docker container, run the following Docker commands (where $DD_API_KEY is the Datadog API key environment variable).
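
The original commands are not reproduced on this version of the page. The following is a sketch assuming the otel/opentelemetry-collector-contrib image; the otel-config.yaml file name and the 4317 port mapping are illustrative choices:

# Mount a local config file (hypothetical name otel-config.yaml) and pass the API key through.
docker run -d \
  -e DD_API_KEY \
  -v $(pwd)/otel-config.yaml:/etc/otelcol-contrib/config.yaml \
  -p 4317:4317 \
  otel/opentelemetry-collector-contrib:latest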

  3. Test the OpenTelemetry service by running some pipelines on the target Groundplexes, then check your monitoring tool for the runtime data.

We recommend you use the values in the YAML sample. Consult your CSM if you plan to change these values.

Access Additional Metrics

As of the November release, metrics collection is available in the OpenTelemetry service. These metrics can be transmitted to your monitoring tool by uncommenting the following lines in the YAML template:

  # Data sources: metrics
  filter:
    metrics:
      include:
        match_type:
        metric_names:

To choose which metrics to report, filter by regex, as shown in the YAML setting below:
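
A sketch, assuming the filter processor from the Collector Contrib distribution; the metric name patterns are illustrative. Remember to also add filter to the processors list of the metrics pipeline:

  filter:
    metrics:
      include:
        match_type: regexp
        metric_names:
          - plexnode\.cpu\..*
          - plexnode\.mem\..*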

Refer to the Metrics Reference for definitions of each metric.

Monitor Logs in Datadog

You should start seeing logs after the OpenTelemetry Collector starts. When you run a pipeline, the details are captured in the pipeline runtimes.

Notice that some log files are generated with the runtime.

In the Datadog UI, you can observe the log files immediately.

Click a log file to open a details pane with real-time information.

New Relic Platform Observability Support Workflow

The SnapLogic application supports integrations with the New Relic platform. You can stream pipeline runtime logs to a New Relic endpoint to track execution status and details. After implementing the OTEL Collector, you can use a YAML file to implement the service.

Pipeline runtime logs are observable in New Relic with a replication lag of 100 ms or less. All execution logs have a status of INFO.

Step 1: Install the OpenTelemetry Package

  1. Download the OpenTelemetry Collector Contrib package.

  2. Save the package on the machine that hosts the Groundplex JCC node.

Step 2: Prepare the YAML File

Download the YAML file, shown below.

extensions:
  health_check:
  pprof:
    endpoint: 0.0.0.0:1777
  zpages:
    endpoint: 0.0.0.0:55679
receivers:
  otlp:
    protocols:
      grpc:
      http:
  opencensus:
  jaeger:
    protocols:
      grpc:
      thrift_binary:
      thrift_compact:
      thrift_http:
  zipkin:
processors:
  batch:
    send_batch_max_size: 100
    send_batch_size: 10
    timeout: 10s
exporters:
  otlp:
    endpoint: https://otlp.nr-data.net:4317
    headers:  
      "api-key": <NEW RELIC LICENSE API KEY>
  logging:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp, opencensus, jaeger, zipkin]
      processors: [batch]
      exporters: [logging]
    metrics:
      receivers: [otlp, opencensus]
      processors: [batch]
      exporters: [otlp, logging]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp, logging]
  extensions: [health_check, pprof, zpages]

Set up the Connection to New Relic

  1. Set up the OTEL Collector, using the steps in Step 1: Install the OpenTelemetry package.
    If you already have the OTEL Collector set up, go to the next step.

  2. Create an account in the New Relic application or, if you already have one, continue with the next step.

  3. Copy the API key from the Administration > API keys page in the New Relic UI.

  4. Save the YAML file to the following location: /etc/otelcol

  5. Add the API key to the YAML file under headers:

  6. Apply the changes to the YAML file and restart the OpenTelemetry service.
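
On a Linux Groundplex where the Collector was installed from the Contrib package, the restart typically looks like the following (the otelcol-contrib service name assumes the systemd unit installed by that package):

sudo systemctl restart otelcol-contrib
sudo systemctl status otelcol-contrib    # verify the service restarted cleanly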

Monitor Runtime Logs in New Relic

Run some tasks or pipelines on the target Snaplex. You can then observe the summary of pipeline execution logs in the New Relic UI, as shown in the example image:

Click each log to open a view of the Log Details, as shown below:

Metrics Reference

Java Heap Metrics

  • plexnode.java.heap.used.bytes - Number of bytes used of the heap.
  • plexnode.java.heap.total.bytes - Number of bytes allocated for the heap.
  • plexnode.java.heap.used.pct - Percentage of the heap being used (used / total * 100%).
  • plexnode.java.heap.max.bytes - Maximum heap size in bytes.
  • plexnode.java.heap.used.of.max.pct - Percentage of the max heap being used (used / max * 100%). This matches the value shown on the product dashboard.

Java Non-Heap Metrics

  • plexnode.java.nonheap.used.bytes - Number of bytes used off-heap.
  • plexnode.java.nonheap.total.bytes - Number of bytes allocated off-heap.
  • plexnode.java.nonheap.used.pct - Percentage of the off-heap memory being used (used / total * 100%).
  • plexnode.java.nonheap.max.bytes - Number of bytes the JVM is allowed to allocate off-heap.
  • plexnode.java.nonheap.used.of.max.pct - Percentage of the max off-heap memory being used (used / max * 100%).

CPU Usage Metrics

  • plexnode.cpu.vcpus.count - Number of vCPUs available on the machine.
  • plexnode.cpu.load.1min.average - Last-minute load average on the machine.
  • plexnode.cpu.load.pct - System CPU utilization as a percentage of total.
  • plexnode.cpu.process.load.pct - JCC process CPU utilization as a percentage of total.

Disk Usage Metrics

  • plexnode.disk.total.bytes - Overall number of bytes available on the disk.
  • plexnode.disk.used.bytes - Number of bytes used on the disk.
  • plexnode.disk.used.pct - Percentage of bytes used on the disk.
  • plexnode.disk.usable.bytes - Number of bytes available on the disk.
  • plexnode.disk.usable.pct - Percentage of bytes available on the disk.

Memory Usage Metrics

  • plexnode.mem.physical.total.bytes - Overall amount of memory in bytes available on the machine.
  • plexnode.mem.physical.free.bytes - Amount of memory in bytes available for allocation on the machine.
  • plexnode.mem.physical.used.bytes - Amount of memory in bytes allocated on the machine.
  • plexnode.mem.physical.free.pct - Percentage of memory available for allocation on the machine.
  • plexnode.mem.physical.used.pct - Percentage of memory allocated on the machine.
  • plexnode.mem.swap.total.bytes - Number of bytes allocated for swap memory on the machine.
  • plexnode.mem.swap.free.bytes - Number of bytes available in swap memory on the machine.
  • plexnode.mem.swap.used.bytes - Number of bytes allocated in swap memory on the machine, or -1 if swap is not enabled.
  • plexnode.mem.swap.free.pct - Percentage of total swap memory available on the machine, or -1 if swap is not enabled.
  • plexnode.mem.swap.used.pct - Percentage of total swap memory allocated on the machine, or -1 if swap is not enabled.
  • plexnode.mem.virtual.committed.bytes - Amount of virtual memory in bytes that is guaranteed to be available to the running process.

File Utilization Metrics

  • plexnode.file.descriptor.used.count - Number of used file descriptors in the system.
  • plexnode.file.descriptor.free.count - Number of free file descriptors in the system.
  • plexnode.file.descriptor.max - Number of available (configured) file descriptors in the system.
  • plexnode.file.descriptor.used.pct - Percentage of used file descriptors in the system.
  • plexnode.file.descriptor.free.pct - Percentage of free file descriptors in the system.

Thread Utilization Metrics

  • plexnode.thread.jvm.count - Current number of live threads within the JVM, including both daemon and non-daemon threads.

Slots Utilization Metrics

  • plexnode.slots.leased - Number of slots currently leased. NOTE: If slots are leased and released between scrapes, the change is not reflected in the values.
  • plexnode.slots.max - Number of slots available (configured) on the node.
  • plexnode.slots.leased.meter - Number of slots currently leased for the pipeline. NOTE: If slots are leased and released between scrapes, the change is not reflected in the values.

Network IO Metrics

  • plexnode.net.received.bytes - Overall number of bytes received through the interface.
  • plexnode.net.sent.bytes - Overall number of bytes sent through the interface.
  • plexnode.net.received.packets - Overall number of packets received through the interface.
  • plexnode.net.sent.packets - Overall number of packets sent through the interface.
  • plexnode.net.in.errors - Overall number of input errors on the interface.
  • plexnode.net.out.error - Overall number of output errors on the interface.
  • plexnode.net.in.drops - Incoming/received dropped packets per interface. On Microsoft Windows, returns discarded incoming packets.

Pipeline Activity Metrics

  • plexnode.pipelines.initiated.meter - The pipeline is being initiated and prepared for execution.
  • plexnode.pipelines.finished.meter - The pipeline finished its execution.
  • plexnode.pipelines.requested.meter - The pipeline has been requested to get ready but has not actually started. This happens in some UI/Designer flows, for example, when opening a configuration dialog for a Snap.
  • plexnode.pipelines.active.total - The number of currently active pipelines.

Feedmaster Broker Metrics

  • plexnode.feedmaster.destination.enqueued - Number of messages the producer has written into the queue (applies to Ultra Pipelines).
  • plexnode.feedmaster.destination.dequeued - Number of messages the consumer has read out of the queue (applies to Ultra Pipelines).

Snaplex State Metrics

  • plexnode.state.leader - 1 if the node considers itself the leader, otherwise 0.
  • plexnode.state.neighbors.active - The number of neighbors visible to the node, including itself. A node is considered visible (active) from the standpoint of another node if its heartbeat is successful and its state is either RUNNING or COOLDOWN.
