...

Info

Although Groundplex and Snaplex refer to the same thing, this article generally uses Snaplex when referring to the entity in SnapLogic Manager, and Groundplex when referring to the computing resources underlying it.

Name

Every Snaplex requires a name, for example, ground-dev or ground-prod. In the SnapLogic Designer, you can choose the Snaplex on which Pipelines are executed. The Snaplex configuration also has an Environment value associated with it, for example, dev or prod. When you configure the nodes for the Snaplex, you must set the jcc.environment property to the Environment value that you have configured for the Snaplex.

The hostname of the system used by a Snaplex cannot contain an underscore (_), as per DNS standards. Avoid other special characters as well.
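As a quick pre-installation check, the hostname rule above can be verified with a short script. This is a minimal sketch; the pattern encodes the standard DNS label rules (letters, digits, and hyphens only, with no leading or trailing hyphen):

```python
import re

# DNS label: letters, digits, and hyphens; must not start or end with a hyphen.
LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def is_valid_snaplex_hostname(hostname: str) -> bool:
    """Return True if every DNS label is valid (no underscores or special characters)."""
    if not hostname or len(hostname) > 253:
        return False
    return all(LABEL.match(label) for label in hostname.rstrip(".").split("."))
```

For example, `is_valid_snaplex_hostname("ground-dev-01.example.com")` passes, while `is_valid_snaplex_hostname("ground_dev")` fails because of the underscore.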

After the Snaplex service is started on a node, the service connects to the SnapLogic Cloud service. Runtime logs from the Snaplex are written to the /opt/snaplogic/run/log (or c:\opt\snaplogic\run\log) directory. The Dashboard shows the nodes that are currently connected for each Snaplex.

JCC Node Communication within a Groundplex

We recommend that you set up JCC nodes in a Snaplex within the same network and data center. Communication between JCC nodes in the same Snaplex is required for the following reasons:

  • The Pipeline Execute Snap communicates directly with neighboring JCC nodes in a Snaplex to start child Pipeline executions and send documents between parent and child Pipelines.

  • The data displayed in Data Preview is written to and read from neighboring JCC nodes in the Snaplex.

  • The requests and responses made in Ultra Pipelines are exchanged between a FeedMaster node and all of the JCC nodes in a Snaplex.

  • A Ground Triggered Task (invoked from a Groundplex) can be executed on a neighboring JCC node due to load-balancing—in which case, the prepare request and the bodies of the request and response are transferred between nodes.

Therefore, any extra latency or network hops between neighboring JCC nodes can introduce performance and reliability problems.

Snaplex Node Configuration

Snaplex nodes are typically configured using a slpropz configuration file, located in the $SL_ROOT/etc folder. 

If you use the slpropz file as your Snaplex configuration, then:

  • After a Snaplex node is started with the slpropz configuration, subsequent configuration updates are applied automatically.

  • Changing the Snaplex properties in Manager causes each Snaplex node to download the updated slpropz file and initiate a rolling restart, with no downtime on Snaplex instances that have more than one node.

  • Some configuration changes, such as updates to the logging properties, do not require a restart and are applied immediately.

...

If you have an older Snaplex installation and its configuration is defined in the global.properties file, then the Environment value must match the jcc.environment value in the JCC global.properties file. To migrate your Snaplex configuration to the slpropz mechanism, see Migrating Older Snaplex Nodes.

...

We recommend that you always configure your Snaplex instances using the slpropz file: you do not have to edit the file manually, changes made to the Snaplex through Manager are applied automatically to all nodes in that Snaplex, and configuration issues that might prevent the Snaplex from starting are automatically reverted.

Computing Requirements

The Groundplex must conform to the following minimum specifications:

...

Memory (RAM) is used by Pipelines as they execute. Some Snaps, such as the Sort Snap, accumulate many documents and consume more memory; the amount of memory used is influenced by the volume and size of the documents being processed. For an optimal sizing analysis based on your requirements, contact your SnapLogic Sales Engineer.

Supported Operating Systems

The SnapLogic on-premises Snaplex is supported on the following operating systems:

...

Large clock skew can also affect communication between the FeedMaster and the JCC nodes. The Date.now() expression language function might return different values on different Snaplex nodes, and internal log messages might have skewed timestamps, making it more difficult to debug issues.
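Clock skew is best prevented by running NTP (for example, ntpd or chrony) on every node. As an illustration of how an offset is measured, the following sketch parses the transmit timestamp out of an SNTP response packet (packet layout per RFC 4330; the actual UDP query to port 123 is omitted):

```python
import struct

NTP_UNIX_OFFSET = 2_208_988_800  # seconds between the NTP epoch (1900) and the Unix epoch (1970)

def sntp_transmit_time(packet: bytes) -> float:
    """Extract the server Transmit Timestamp (as Unix seconds) from a 48-byte SNTP response."""
    if len(packet) < 48:
        raise ValueError("truncated SNTP packet")
    secs, frac = struct.unpack("!II", packet[40:48])  # 32-bit seconds + 32-bit fraction
    return secs - NTP_UNIX_OFFSET + frac / 2**32

def clock_skew(packet: bytes, local_receive_time: float) -> float:
    """Rough skew estimate: positive means the local clock is behind the server."""
    return sntp_transmit_time(packet) - local_receive_time
```

In practice, you would not hand-roll this: keeping ntpd or chrony synchronized on every Snaplex node is the supported way to avoid skew.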

Network Requirements

Network Throughput Guidelines

A running Groundplex requires connectivity to the SnapLogic Integration Cloud, as well as to the Cloud applications that may be used in the processes/Pipelines created and run in your solution. To optimize performance, we recommend the following network throughput guidelines:

...

Network Firewall Requirements

On-premises Snaplex (Groundplex)

To communicate with the SnapLogic Integration Cloud, an on-premises Snaplex uses a combination of HTTPS requests and WebSocket communication over a TLS (SSL) tunnel. For this combination to operate effectively, you must configure the firewall to meet the following network communication requirements:

Feature | Required | Consequence
HTTP outbound port 443 | Yes | Does not function
HTTP HEAD | Desired | Without HEAD support, a full GET requires more time and bandwidth
Compression | Desired | Slower data transfer
WebSockets (WSS protocol) | Yes | Does not function
Snaps using an HTTP client without proxy support | Desired | Need to use Snaps that support proxy settings
Port 8081 (must be available for communication between the nodes in a Snaplex) | Yes | Unable to reach Snaplex neighbor - https://hostname:8081


  • The nodes of a Snaplex need to communicate among themselves, so each node must be able to resolve the other nodes' hostnames. This is required when you make local calls into the Snaplex nodes to execute Pipelines rather than going through the SnapLogic Platform. The Pipelines are load balanced by SnapLogic, with Tasks passed to the target node.

  • Communication between the customer-managed Groundplex and the SnapLogic-managed S3 bucket is over HTTPS with TLS enforced by default. The AWS-provided S3 URL also uses an HTTPS connection with TLS enforced by default. If direct access from the Groundplex to the SnapLogic AWS S3 bucket is blocked, then the S3 communication falls back to a connection via the SnapLogic Control Plane, which still uses TLS 1.2.
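The outbound port 443 and inter-node port 8081 requirements above can be sanity-checked from a node with a short connectivity probe. This is a sketch; the hostnames in the comments are placeholders for your environment:

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example checks (replace with hosts from your environment):
# can_reach("elastic.snaplogic.com", 443)   # outbound HTTPS to the control plane
# can_reach("jcc-node2.example.com", 8081)  # neighbor-node communication on port 8081
```

Note that a TCP probe only confirms reachability; it does not validate TLS certificates or WebSocket upgrades, which the Snaplex also requires.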

Network Guidelines for Snap Usage

In the SnapLogic Platform, the Snaps are what actually communicate with the applications. The protocols and ports required for application communication are mostly determined by the endpoint applications themselves, not by SnapLogic. It is common for Cloud/SaaS applications to communicate using HTTPS, although older applications and non-cloud/SaaS applications might have their own requirements.

...

Each of these application connections may allow the use of a proxy for the network connection, but that is a configuration option of the application's connection, not one applied by SnapLogic.

FeedMaster Ports

The FeedMaster listens on the following two ports:

...

The machine hosting the FeedMaster needs to have those ports open on the local firewall, and the other Groundplex nodes need to allow outbound requests to the FeedMaster on those ports.

Snaplex Network Binding

By default, a Snaplex starts and binds to the localhost network interface on port 8090. A client can connect to the JCC node only if the client is also running on the same machine. This default is chosen because the Snaplex does not normally receive inbound requests; instead, it uses an outbound WebSocket connection to receive its requests from the SnapLogic Cloud services. If requests need to be sent to the Snaplex from the customer network, you should configure the Snaplex to listen on all of its network interfaces. This is required, for example, when a feed URL Pipeline execution request points directly at the Snaplex host instead of at the Cloud URL. To do this, set (default is 127.0.0.1):

Code Block
jcc.jetty_host = 0.0.0.0


in the etc/global.properties by adding it to the Update Snaplex dialog, Node Properties tab, Global properties table.
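The effect of jcc.jetty_host is ordinary socket binding: a listener bound to 127.0.0.1 is reachable only over loopback, while one bound to 0.0.0.0 accepts connections on every interface. A minimal illustration in Python:

```python
import socket

def bind_listener(host: str) -> socket.socket:
    """Bind a TCP listener the way a server binds to a configured host (port 0 = ephemeral)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))
    srv.listen(1)
    return srv

# Bound to 127.0.0.1: only clients on the same machine can connect.
loopback_only = bind_listener("127.0.0.1")
# Bound to 0.0.0.0: clients on other hosts can also connect (firewall permitting).
all_interfaces = bind_listener("0.0.0.0")
print(loopback_only.getsockname()[0], all_interfaces.getsockname()[0])
```

This is why the default 127.0.0.1 binding is safe for the normal outbound-only operation, and why 0.0.0.0 is needed only when other machines must call into the node.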

If you need to configure the hostname used by the Snaplex to be a different value than the machine name (for example newname.mydomain.com), add:

Code Block
jcc.jvm_options = -DSL_INTERNAL_HOST_NAME=newname.mydomain.com -DSL_EXTERNAL_HOST_NAME=newname.mydomain.com

in the etc/global.properties by adding it to the Update Snaplex dialog, Node Properties tab, Global properties table.

Understanding Distribution of Data Processing across Snaplex Nodes

...

  • Snaplex – Data processing on Groundplex, Cloudplex, and eXtremeplex nodes occurs principally in memory as streaming data, which is unencrypted.

  • Larger datasets – When datasets exceed the available memory, some Snaps, like Sort and Join, which accumulate multiple documents, write Pipeline data to the local disk unencrypted during Pipeline execution to help optimize performance.

These temporary files are deleted when the Snap/Pipeline execution completes. You can update your Snaplex to point to a different temporary location in the Global properties table of the Node Properties tab in the Update Snaplex dialog:

Code Block
jcc.jvm_options = -Djava.io.tmpdir=/new/tmp/folder
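The spill-to-disk behavior described above follows the classic external merge sort pattern. The sketch below is illustrative Python, not SnapLogic code: it writes sorted runs to temporary files (created under the configured temp directory) and then streams a lazy k-way merge:

```python
import heapq
import itertools
import tempfile

def external_sort(values, chunk_size=1000):
    """Sort an iterable larger than memory: sort fixed-size chunks in RAM,
    spill each chunk to a temp file, then stream a k-way merge."""
    it = iter(values)
    spills = []
    while True:
        chunk = sorted(itertools.islice(it, chunk_size))
        if not chunk:
            break
        f = tempfile.TemporaryFile(mode="w+")  # created under the configured temp dir
        f.writelines(f"{v}\n" for v in chunk)
        f.seek(0)
        spills.append(f)
    # Merge the sorted runs lazily; the temp files are deleted when closed.
    merged = heapq.merge(*[(int(line) for line in f) for f in spills])
    return list(merged)
```

As with the Snaplex temp files, each spill file exists only for the duration of the operation and is removed when the sort completes.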

...

Load Balancer Guidelines

A load balancer facilitates the efficient distribution of network or application traffic between client devices and backend servers. In the SnapLogic environment, a load balancer handles incoming requests to the Snaplex from client applications. This purpose differs from that of an HTTP proxy, which might be required for outbound requests from the Snaplex to the Control Plane or other endpoints. Typically, the HTTP proxy is required when Groundplex nodes are on customer servers with a restricted network configuration.

Use Cases for a Load Balancer

You should provision a load balancer for a Snaplex when external client API calls are sent directly to the Snaplex nodes. We recommend a load balancer in the following use cases:

  • Snaplex-triggered Pipeline executions—Since the Control Plane triggering mechanism imposes additional Org-level API limits, we recommend using the Snaplex triggering mechanism for high-volume API usage.

  • REST requests to Ultra Task Pipelines—For direct API calls to the Snaplex, the requests must pass through a load balancer. Without a load balancer, request failures occur, and the Snaplex eventually goes offline during Snaplex maintenance or upgrades.

  • API Policies—To apply API policies to APIs or Triggered and Ultra Tasks on a Cloudplex, you must have a load balancer, which SnapLogic provisions. 

You can configure the load balancer to run health checks on the nodes, which ensures that a node going offline for maintenance does not receive any new requests.
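A health check of this kind is a simple HTTP probe that treats only a 200 response as healthy. The following sketch demonstrates the idea against a local stand-in endpoint; a real load balancer would probe each node's health check URL over HTTPS instead:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Load-balancer-style probe: healthy only on HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        return False

# Local stand-in for a node's health endpoint (illustrative only).
class _Healthz(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200 if self.path == "/healthz" else 404)
        self.end_headers()
    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), _Healthz)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
print(is_healthy(f"http://127.0.0.1:{port}/healthz"))  # True
print(is_healthy(f"http://127.0.0.1:{port}/other"))    # False
```

The key design point is that timeouts, connection errors, and non-200 responses all count as failures, so a node that is down or mid-upgrade is taken out of rotation automatically.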

After configuring the load balancer, you must add the load balancer URL to the Snaplex properties.

...

Only the Ultra Task load balancer field needs to be configured, since that enables load balancing for Triggered Task requests as well. Use the following guidelines:

  • If a load balancer points to a FeedMaster node, then you only need to configure the Ultra Task load balancer.

  • If the load balancer points to worker (JCC) nodes, then you should configure only the load balancer field in the Snaplex properties.

Use Cases where a Load Balancer is Not Required

Load balancers are not required for the following types of activities:

  • Pipeline executions triggered through the Control Plane.

  • Scheduled Pipelines, Pipeline/account validation, and Pipeline development.

  • Headless Ultra (since the Ultra Task processing is not driven by REST API calls).

  • Child Pipeline executions triggered through the Pipeline Execute Snap.

Cloudplex Load Balancer

On a Cloudplex, SnapLogic provisions a load balancer, typically only when the Ultra Task subscription feature is enabled. The Cloudplex load balancer has a snaplogic.io domain endpoint that points to the FeedMaster nodes. You can provision a load balancer for both Ultra and Snaplex-triggered executions.

As an Org admin, you must update the Snaplex settings with the load balancer URL after the load balancer is provisioned.

Groundplex Load Balancer Best Practices

  • For Snaplex instances with FeedMaster nodes, the load balancer should point to the FeedMaster nodes, like https://fm-node1.example.com:8084 and https://fm-node2.example.com:8084.
    A FeedMaster node can process both Triggered and Ultra Task requests, while a JCC node can process only Triggered Task requests, so it is easier to use the FeedMaster nodes as the load balancer endpoint. The Ultra Task load balancer field in the Snaplex settings needs to be updated with the load balancer URL.

  • If there are no FeedMaster nodes, the load balancer can point to the JCC nodes, like https://jcc-node1.example.com:8081 and https://jcc-node2.example.com:8081. A JCC node can process only Triggered Task requests. Make sure you update the load balancer field value in the Snaplex settings with the load balancer URL.

  • You should configure the load balancer to run health checks on the Snaplex node at the /healthz URL. Any response code other than 200 indicates a health check failure.

  • The load balancer should perform SSL offloading/termination so that certificate and cipher management can be done on the load balancer without updating the Snaplex nodes. The connection between the client and the load balancer is over HTTPS with your signed certificate. The connection between the load balancer and the Snaplex nodes is also over HTTPS, with the default SnapLogic-generated certificate.

  • You must set the HTTP request timeout to 900 seconds or higher to allow for long-running requests. This timeout setting is different from the keep-alive timeouts that are used for connection management, like the following examples:

    • The proxy_read_timeout for Nginx.

    • The ProxyTimeout for Apache.

    • The idle timeout for AWS ELB.

The following image from the AWS UI shows a sample health check configuration for the AWS ELB.

...

...