Planning a Groundplex Deployment

Overview

Planning a Groundplex deployment involves several considerations. Most stem from the IT requirements of your computing environment, but some also depend on Pipeline development and on the types of Pipelines you plan to run in production.

While Groundplex and Snaplex refer to the same thing, this article uses Snaplex in general when referring to the SnapLogic Manager, and Groundplex in the context of the computing resources underlying it.

Computing Requirements

The Groundplex must conform to the following minimum specifications:

Nodes (Min/Rec)

Minimum: 1

Recommended: 2 or more nodes

SnapLogic Project and Enterprise platform package nodes can be configured in the following sizes:

  • 2 vCPU and 8GB RAM 

  • 4 vCPU and 16GB RAM 

  • 8 vCPU and 32GB RAM

  • 16 vCPU and 64GB RAM

We recommend two nodes for high availability. For requirements about clustering nodes, see Node Cluster.

All nodes within a Snaplex must be of the same size.

RAM (Min)

Minimum: 8GB

Depending on the size, number, and nature of Pipelines, more RAM is required to maintain an acceptable level of performance.

CPU (Min)

Minimum: 2 cores

All Snaps execute in parallel in their own threads: the more cores that are available to the Snaplex, the more performant the system.

Disk (Min/Rec)

Minimum: 40GB

Recommended: 100GB

Local disk space is required for logging and for any Snap that uses the local disk for temporary storage (for example, Sort and Join Snaps). For details, see Temporary Folder.

SnapLogic does not in any way restrict the disk size of your Groundplex nodes.

Pipelines use memory (RAM) while they execute. Snaps that accumulate many documents, such as Sort Snaps, consume more memory; the amount used depends on the volume and size of the documents being processed. For an optimum sizing analysis based on your requirements, contact your SnapLogic Sales Engineer.
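
The sizing rules above can be checked before provisioning. The following is a minimal sketch, not a SnapLogic tool: the NodeSpec class and the check_snaplex function are hypothetical names, and the values are taken only from the node sizes and minimums listed in this section.

```python
from dataclasses import dataclass

# Node sizes listed above for Project and Enterprise packages: (vCPU, RAM in GB).
SUPPORTED_SIZES = {(2, 8), (4, 16), (8, 32), (16, 64)}
MIN_DISK_GB = 40  # minimum local disk stated in this section


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical description of one planned Groundplex node."""
    name: str
    vcpus: int
    ram_gb: int
    disk_gb: int


def check_snaplex(nodes):
    """Return a list of problems found; an empty list means none were found."""
    if not nodes:
        return ["a Snaplex needs at least one node"]
    problems = []
    for node in nodes:
        if (node.vcpus, node.ram_gb) not in SUPPORTED_SIZES:
            problems.append(
                f"{node.name}: {node.vcpus} vCPU / {node.ram_gb}GB RAM is not a listed size"
            )
        if node.disk_gb < MIN_DISK_GB:
            problems.append(
                f"{node.name}: {node.disk_gb}GB disk is below the {MIN_DISK_GB}GB minimum"
            )
    # All nodes within a Snaplex must be of the same size.
    if len({(n.vcpus, n.ram_gb) for n in nodes}) > 1:
        problems.append("nodes are not all the same size")
    # Two or more nodes are recommended for high availability.
    if len(nodes) < 2:
        problems.append("only one node: two or more are recommended for high availability")
    return problems
```

For example, check_snaplex([NodeSpec("node-a", 4, 16, 100), NodeSpec("node-b", 4, 16, 100)]) returns an empty list, while mixing a 2 vCPU node with a 4 vCPU node reports a size mismatch.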

Supported Operating Systems

The SnapLogic on-premises Snaplex is supported on the following operating systems:

  • CentOS (or Red Hat) Linux 6.3 or newer.

  • Debian and Ubuntu Linux.

  • Windows Server 2008 64-bit, Windows Server 2012, and Windows Server 2016 with a minimum of 8GB RAM.

For improved security, the Groundplex machine timestamp is verified to check if it is in sync with the timestamp on the SnapLogic Cloud. Running a time service on the Groundplex node ensures that the timestamp is always kept in sync.

Large clock skew can also affect communication between the FeedMaster and the JCC nodes. The Date.now() expression language function might return different values on different Snaplex nodes, and internal log messages might have skewed timestamps, making issues more difficult to debug.
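
As a rough local diagnostic, clock skew can be estimated by comparing the node's clock with the Date header of any HTTPS response. The sketch below is an assumption-laden illustration, not the check that SnapLogic itself performs: clock_skew_seconds is a hypothetical helper, and the header value is whatever your own HTTP client returned.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(http_date_header, local_now=None):
    """Return local time minus the time in an HTTP Date header, in seconds.

    A positive result means the local clock is ahead of the remote clock.
    """
    remote = parsedate_to_datetime(http_date_header)
    if remote.tzinfo is None:
        # Treat a header without zone information as UTC.
        remote = remote.replace(tzinfo=timezone.utc)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (local_now - remote).total_seconds()
```

The HTTP Date header has one-second resolution and the measurement includes network delay, so this only detects skew of several seconds or more. A time service remains the actual remedy.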

Network Requirements

Network Throughput Guidelines

A running Groundplex requires connectivity to the SnapLogic Integration Cloud, as well as to any Cloud applications used by the Pipelines in your solution. To optimize performance, we recommend the following network throughput guidelines:

Network In (Min/Rec)

Min: 10MB/sec, Recommended: 15MB/sec+

Network Out (Min/Rec)

Min: 5MB/sec, Recommended: 10MB/sec+
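
To relate these figures to a workload, the time to move a payload is simply its size divided by the sustained throughput. The sketch below is illustrative arithmetic only; transfer_seconds is a hypothetical helper and it ignores protocol overhead, compression, and latency.

```python
def transfer_seconds(payload_mb, throughput_mb_per_sec):
    """Estimate seconds needed to move payload_mb at a sustained throughput."""
    if throughput_mb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return payload_mb / throughput_mb_per_sec


# A 3000MB inbound payload at the minimum and at the recommended inbound rate.
at_minimum = transfer_seconds(3000, 10)      # 300.0 seconds
at_recommended = transfer_seconds(3000, 15)  # 200.0 seconds
```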

Network Firewall Requirements

To communicate with the SnapLogic Integration Cloud, the SnapLogic on-premises Snaplex uses a combination of HTTPS requests and WebSockets communication over a TLS (SSL) tunnel. For this combination to operate effectively, you must configure the firewall to allow the following network communication:

Feature | Required | Consequence
------- | -------- | -----------
HTTPS outbound port 443 | Yes | Does not function
HTTP HEAD | Desired | Without HEAD support, a full GET requires more time and bandwidth
Compression | Desired | Slower data transfer
WebSockets (WSS protocol) | Yes | Does not function
Snaps using HTTP client without proxy support | Desired | Needs to use Snaps that support proxy settings
Port 8081 | Yes | Unable to reach Snaplex neighbor - https://hostname:8081. Needs to be available for communication between the nodes in a Snaplex.

  • The nodes of a Snaplex need to communicate among themselves, so it is important that they can each resolve each other's hostnames. This is required when you are making local calls into the Snaplex nodes for the execution of Pipelines rather than going through the SnapLogic Platform. The Pipelines are load balanced by SnapLogic with Tasks passed to the target node.

  • Communication between the customer-managed Groundplex and the SnapLogic-managed S3 bucket is over HTTPS with TLS enforced by default. The AWS-provided S3 URL also uses an HTTPS connection with TLS enforced by default. If direct access from the Groundplex to the SnapLogic AWS S3 bucket is blocked, communication with the bucket falls back to a connection through the SnapLogic Control Plane, which still uses TLS 1.2.
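
A quick way to confirm that a firewall permits the required ports is to attempt a TCP connection from the Groundplex node. The following is a minimal sketch under stated assumptions: can_connect and check_endpoints are hypothetical helpers, and the host names you pass in are your own. It only proves that a TCP connection opens; it does not verify TLS, WebSockets, HTTP HEAD, compression, or proxy behavior.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to whether it was reachable."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}
```

A typical call would pass the outbound cloud endpoint on port 443 and each neighboring node on port 8081, for example check_endpoints([("neighbor-node.example.internal", 8081)]), where the host name is a placeholder. Because name resolution failures also return False, the same check surfaces nodes that cannot resolve each other's hostnames.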

Network Guidelines for Snap Usage

In the SnapLogic Platform, it is the Snaps that actually communicate with the applications. The protocols and ports required for application communication are mostly determined by the endpoint applications themselves, not by SnapLogic. Cloud/SaaS applications commonly communicate using HTTPS, although older applications and non-cloud/SaaS applications might have their own requirements.

For example:

Application | Protocol | Default Port
----------- | -------- | ------------
Redshift | TCP | 5439
Netezza | TCP | 5480
Salesforce | HTTPS | 443
Oracle | TCP | 1521

Each of these application connections may allow the use of a proxy for the network connection, but that is a configuration option of the application's connection, not one applied by SnapLogic.

FeedMaster Ports

The FeedMaster listens on the following two ports:

  • 8084: The FeedMaster's HTTPS port. Requests for the Pipelines are sent here, as well as some internal requests from the other Groundplex nodes.

  • 8089: The FeedMaster's embedded ActiveMQ broker SSL port. The other Groundplex nodes connect to this port to send and receive messages.

The machine hosting the FeedMaster needs to have those ports open on the local firewall, and the other Groundplex nodes need to allow outbound requests to the FeedMaster on those ports.

Name

Every Snaplex requires a name, for example, ground-dev or ground-prod. In the SnapLogic Designer, you can choose the Snaplex on which Pipelines are executed. The Snaplex configuration also has an Environment variable associated with it, for example, dev or prod. When you configure the nodes for the Snaplex, you must set jcc.environment to the Environment value that you have configured for the Snaplex.

The hostname of a system used by a Snaplex cannot contain an underscore (_), as per DNS standards. Avoid other special characters as well.
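
The hostname rule can be checked before a node is registered. The sketch below assumes the usual DNS host name convention (labels of letters, digits, and hyphens, 1 to 63 characters each, not starting or ending with a hyphen, 253 characters in total); is_valid_hostname is a hypothetical helper, not part of SnapLogic.

```python
import re

# One DNS label: letters, digits, and hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(hostname):
    """Return True if every dot-separated label follows the host name rules."""
    if not hostname or len(hostname) > 253:
        return False
    if hostname.endswith("."):
        hostname = hostname[:-1]  # allow a fully qualified name with a trailing dot
    return all(_LABEL.fullmatch(label) for label in hostname.split("."))
```

With this check, "ground-dev-01.example.com" is accepted and "ground_dev_01" is rejected because of the underscores.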

After the Snaplex service is started on a node, the service connects to the SnapLogic Cloud service. Runtime logs from the Snaplex are written to the /opt/snaplogic/run/log (or c:\opt\snaplogic\run\log) directory. The Dashboard shows the nodes that are currently connected for each Snaplex.

Understanding Distribution of Data Processing across Snaplex Nodes

When a Pipeline or Task is executed, the work is assigned to one of the JCC nodes in the Snaplex. The distribution of work across JCC nodes is determined by several variables, including the number of threads in use, the amount of memory available, and the average system load.

To ensure that requests are being shared across JCC nodes, we recommend that you set up a load balancer to distribute the work across JCC nodes in the Snaplex.
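
To make the three variables concrete, the following sketch shows one way a least-loaded choice could be modeled. It is purely illustrative: the actual SnapLogic distribution logic is not described in this article, and the NodeLoad fields, the equal weighting, and the function names are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeLoad:
    """Hypothetical snapshot of one JCC node's load."""
    name: str
    threads_in_use: int
    max_threads: int
    free_memory_fraction: float  # 0.0 (none free) to 1.0 (all free)
    load_average: float          # average system load
    cpu_count: int


def load_score(node):
    """Combine the three factors into one score; lower means less loaded.

    Equal weights are an arbitrary choice made for this illustration.
    """
    thread_use = node.threads_in_use / node.max_threads
    memory_use = 1.0 - node.free_memory_fraction
    cpu_load = node.load_average / node.cpu_count
    return (thread_use + memory_use + cpu_load) / 3.0


def pick_node(nodes):
    """Return the node with the lowest combined load score."""
    return min(nodes, key=load_score)
```

A model like this also shows why homogeneous nodes matter: the per-node ratios are only comparable when the nodes have similar CPU and memory configurations.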

Node Cluster

Starting multiple nodes with the JCC service pointing to the same Snaplex configuration automatically forms a cluster of nodes, as long as you follow these requirements for nodes in a Snaplex:

  • The nodes need to communicate with each other on the following ports: 8081, 8084, and 8090.

  • The nodes should have a reliable, low-latency network connection between them.

  • The nodes should be homogeneous in that they should have similar CPU and memory configurations, as well as access the same network endpoints.

JCC Node Communication within a Groundplex

We recommend that you set up JCC nodes in a Snaplex within the same network and data center. Communication between JCC nodes in the same Snaplex is required for the following reasons:

  • The Pipeline Execute Snap communicates directly with neighboring JCC nodes in a Snaplex to start child Pipeline executions and send documents between parent and child Pipelines.

  • The data displayed in Data Preview is written to and read from neighboring JCC nodes in the Snaplex.

  • The requests and responses made in Ultra Pipelines are exchanged between a FeedMaster node and all of the JCC nodes in a Snaplex.

  • A Ground Triggered Task (invoked from a Groundplex) can be executed on a neighboring JCC node due to load-balancing—in which case, the prepare request and the bodies of the request and response are transferred between nodes.

Therefore, any extra latency or network hops between neighboring JCC nodes can introduce performance and reliability problems.
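
Latency between neighboring nodes can be sampled by timing TCP connections to a port the neighbor already listens on, such as 8081. This is a rough sketch rather than a benchmark: connect_latency_ms is a hypothetical helper, and the measurement covers only the TCP handshake, not TLS negotiation or document transfer.

```python
import socket
import statistics
import time


def connect_latency_ms(host, port, samples=5, timeout=3.0):
    """Return the median time, in milliseconds, to open a TCP connection."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # Raises OSError if the neighbor cannot be reached at all.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

Running this from each node toward its neighbors, and comparing the results with the same measurement inside one data center, gives a first indication of whether extra network hops are involved.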
