Because you install the Snaplex software on nodes in your own data center or in a private cloud, you should plan your Groundplex deployment in advance.

Every Snaplex has a name, for example ground-dev or ground-prod. In the SnapLogic Designer, you can choose the Snaplex on which Pipelines are executed. The Snaplex configuration has an environment associated with it, for example dev or prod. When you configure the nodes for the Snaplex, set jcc.environment to the environment value configured for the Snaplex.

The hostname of the system used by a Groundplex cannot contain an underscore (_), as per DNS standards. Avoid other special characters as well.
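The hostname rule can be checked before installation with a short script. The following is a minimal sketch, not a SnapLogic tool: the function name is hypothetical, and it assumes the conventional DNS hostname rules (labels of letters, digits, and hyphens, each up to 63 characters, with no leading or trailing hyphen).

```python
import re
import socket

# One DNS label: letters, digits, and hyphens only, 1-63 characters,
# not starting or ending with a hyphen (assumed conventional hostname rules).
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_groundplex_hostname(hostname: str) -> bool:
    """Return True if every dot-separated label is a valid DNS label."""
    if not hostname or len(hostname) > 253:
        return False
    labels = hostname.rstrip(".").split(".")
    return all(_LABEL.fullmatch(label) is not None for label in labels)


if __name__ == "__main__":
    # Check the hostname of the machine this script runs on.
    name = socket.gethostname()
    verdict = "looks fine" if is_valid_groundplex_hostname(name) else "should be renamed"
    print(f"{name}: {verdict}")
```

Running it on each prospective node flags names such as ground_dev before the Snaplex service is installed.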

After the Snaplex service is started on a node, the service connects to the SnapLogic cloud service. Runtime logs from the Snaplex are written to the /opt/snaplogic/run/log (or c:\opt\snaplogic\run\log) directory. The Dashboard shows the nodes that are currently connected for each Snaplex.

Best Practices for Setting Up JCC Node Communication within a Groundplex

We recommend that JCC nodes in a Snaplex are set up within the same network and data center. Communication between JCC nodes in the same Snaplex is required for the following reasons:

  • The Pipeline Execute Snap communicates directly with neighboring JCC nodes in a Snaplex to start child Pipeline executions and send documents between parent and child Pipelines.

  • The data displayed in Data Preview is written to and read from neighboring JCC nodes in the Snaplex.

  • The requests and responses made in Ultra Pipelines are exchanged between a FeedMaster node and all of the JCC nodes in a Snaplex.

  • A Ground Triggered Task (invoked from a Groundplex) can be executed on a neighboring JCC node because of load balancing, in which case the prepare request and the bodies of the request and response are transferred between nodes.

Therefore, any extra latency or network hops between neighboring JCC nodes can introduce performance and reliability problems.

Hardware Requirements

The Groundplex (also known as the on-premises Snaplex) is a local server that runs on hardware provisioned by the customer. The hardware can be virtual and must conform to the following minimum specifications:

...

Memory (RAM) is used by the Pipelines to execute. Some Snaps, like Sort Snaps, which accumulate many documents, consume more memory; the amount of memory used is influenced by the volume and size of the documents being processed. For an optimum sizing analysis based on your requirements, contact your SnapLogic Sales Engineer.

Operating Systems

The SnapLogic on-premises Snaplex is supported on the following operating systems:

...

Large clock skew can also affect communication between the FeedMaster and the JCC nodes. The value returned by the Date.now() expression language function might differ between Snaplex nodes, and internal log messages might have skewed timestamps, making issues more difficult to debug.

Network Throughput Guidelines

When running, a Groundplex requires connectivity to the SnapLogic Integration Cloud, as well as to the cloud applications used in the Pipelines that it runs. To optimize performance, we recommend the following network throughput guidelines:

  • Network In: minimum 10 MB/sec, recommended 15 MB/sec+ (depends on usage)

  • Network Out: minimum 5 MB/sec, recommended 10 MB/sec+ (depends on usage)

Network Firewall Requirements

On-premises Snaplex (Groundplex)

To communicate with the SnapLogic Integration Cloud, the on-premises Snaplex uses a combination of HTTPS requests and WebSocket communication over a TLS (SSL) tunnel. For this combination to operate effectively, the firewall must be configured to allow the following network communication:

...

  • The nodes of a Snaplex need to communicate among themselves, so it is important that each node can resolve the hostnames of the others. This is required when you make local calls into the Snaplex nodes to execute Pipelines rather than going through the SnapLogic Platform. SnapLogic load-balances the Pipelines and passes the Tasks to the target node.

  • Communication between the customer-managed Groundplex and the SnapLogic-managed S3 bucket is over HTTPS with TLS enforced by default. The AWS-provided S3 URL also uses an HTTPS connection with TLS enforced by default. If direct access from the Groundplex to the SnapLogic AWS S3 bucket is blocked, the communication falls back to a connection through the SnapLogic Control Plane, which still uses TLS 1.2.

Snaps

In the SnapLogic Platform, the Snaps actually communicate to and from the applications. The protocols and ports required for application communication are mostly determined by the endpoint applications themselves, and not by SnapLogic. It is common for cloud/SaaS applications to communicate using HTTPS, although older applications and non-cloud/SaaS applications might have their own requirements.

...

Each of these application connections may allow the use of a proxy for the network connection, but that is a configuration option of the application's connection, not one applied by SnapLogic.

FeedMaster

The FeedMaster listens on the following two ports:

...

The machine hosting the FeedMaster needs to have those ports open on the local firewall, and the other Groundplex nodes need to allow outbound requests to the FeedMaster on those ports.

Understanding Distribution of Data Processing across Snaplex Nodes

When a Pipeline or Task is executed, the work is assigned to one of the JCC nodes in the Snaplex. The distribution of work across JCC nodes is determined by several variables: the number of threads in use, the amount of memory available, and the average system load.

To ensure that requests are being shared across JCC nodes, we recommend that you set up a load balancer to distribute the work across JCC nodes in the Snaplex.

Node Cluster

Starting multiple nodes with the JCC service pointing to the same Snaplex configuration automatically forms a cluster of nodes, as long as you follow these requirements for nodes in a Snaplex:

  • The nodes need to communicate with each other on the following ports: 8081, 8084, and 8090.

  • The nodes should have a reliable, low-latency network connection between them.

  • The nodes should be homogeneous in that they should have similar CPU and memory configurations, as well as access the same network endpoints.
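The port requirement in the list above can be checked from each node with a simple TCP connection test. This is a minimal sketch under stated assumptions: the function name is hypothetical, the peer hostname in the usage note is a placeholder, and a successful TCP connection only shows that the port is reachable, not that the Snaplex service behind it is healthy.

```python
import socket

# Ports listed in the node cluster requirements.
CLUSTER_PORTS = (8081, 8084, 8090)


def check_ports(host, ports=CLUSTER_PORTS, timeout=2.0):
    """Return {port: bool}, True where a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the hostname and opens a TCP socket.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS resolution failures.
            results[port] = False
    return results
```

For example, calling check_ports("node2.example.com") from another node in the same Snaplex (the hostname is a placeholder) returns a dictionary showing which of the three ports accepted a connection. A False entry may point to a firewall rule, a service that is not running, or a hostname that does not resolve.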

Temporary Folder

This section explains which data traversing the SnapLogic platform is encrypted and which is unencrypted. The temporary folder stores unencrypted data.

Encrypted data:

  • Transit data – Is always encrypted, assuming the endpoint supports encryption.

  • Preview and Account data – Is always encrypted.

Unencrypted data:

  • Snaplex – Data processing on Groundplex, Cloudplex, and eXtremeplex nodes occurs principally in memory as streaming, which is unencrypted.

  • Larger dataset – When larger datasets are processed that exceed the available compute memory, some Snaps that process multiple documents, such as Sort and Join, write Pipeline data to the local disk unencrypted during Pipeline execution to help optimize performance. These temporary files are deleted when the Snap/Pipeline execution completes. You can update your Snaplex to point to a different temporary location in the Global properties table of the Node Properties tab in the Update Snaplex dialog:

Code Block
jcc.jvm_options = -Djava.io.tmpdir=/new/tmp/folder
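Because Snaps may spill Pipeline data to this folder, it can help to confirm that the chosen location has enough free disk space. The following is a minimal sketch using only the Python standard library; the function name is hypothetical, and the script checks the operating system's default temporary directory as a stand-in for whatever folder you configure through java.io.tmpdir.

```python
import shutil
import tempfile


def free_space_gib(path):
    """Return the free disk space, in GiB, on the filesystem containing path."""
    return shutil.disk_usage(path).free / 2**30


# Stand-in location: replace with the folder configured via java.io.tmpdir.
tmp_dir = tempfile.gettempdir()
print(f"{tmp_dir}: {free_space_gib(tmp_dir):.1f} GiB free")
```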

The following Snaps write to temporary files on your local disk:

Snaplex Network Binding

By default, the Snaplex starts and binds to the localhost network interface on port 8090. Clients can connect to the JCC only if they are running on the same machine. This default is chosen because the Snaplex does not normally receive any inbound requests. Instead, it uses an outbound WebSocket connection to receive its requests from the SnapLogic cloud services. If requests need to be sent to the Snaplex from the customer network, configure the Snaplex to listen on its network interfaces. This is required when a feed URL Pipeline execution request points directly at the Snaplex host instead of at the cloud URL. To do this, set (default is 127.0.0.1):

Code Block
jcc.jetty_host = 0.0.0.0


in the etc/global.properties by adding it to the Update Snaplex dialog, Node Properties tab, Global properties table.
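To see how this setting determines whether a node accepts connections from other machines, the properties can be read and inspected with a short script. This is a minimal sketch, not a SnapLogic utility: both function names are hypothetical, and it assumes the file uses simple "key = value" lines and that jcc.jetty_host holds an IP address literal.

```python
import ipaddress


def parse_properties(text):
    """Parse simple 'key = value' lines; blank lines and '#' comments are skipped."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values such as JVM options keep theirs.
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def accepts_remote_clients(props):
    """Return True if jcc.jetty_host is a non-loopback address (default 127.0.0.1)."""
    host = props.get("jcc.jetty_host", "127.0.0.1")
    return not ipaddress.ip_address(host).is_loopback


sample = "jcc.jetty_host = 0.0.0.0"
print(accepts_remote_clients(parse_properties(sample)))
```

With the property absent, the function falls back to the documented default of 127.0.0.1 and reports that only local clients can connect.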

If the hostname used by the Snaplex needs to be configured to be a different value than the machine name (for example newname.mydomain.com), add:

Code Block
jcc.jvm_options = -DSL_INTERNAL_HOST_NAME=newname.mydomain.com -DSL_EXTERNAL_HOST_NAME=newname.mydomain.com

in the etc/global.properties by adding it to the Update Snaplex dialog, Node Properties tab, Global properties table.