Snaplex instances support horizontal and vertical scaling, using additional or larger nodes to scale Pipeline workloads. This article explains the vertical scaling scenario: using larger nodes and configuring them appropriately to use the available memory resources.
Configuring Default Memory
You can configure the Maximum heap size for the JCC (Java Component Container) in the Update Snaplex dialog. For a Cloudplex, the value defaults to auto; for a Groundplex, Org administrators can set the value explicitly. With auto, the JCC automatically detects the available memory on the Snaplex node and sets a maximum heap size of about 75% of that memory. For a Groundplex, you can also configure an absolute value using standard JRE memory-size notation, such as 10g or 8000m.
When using absolute values for maximum heap size, ensure that the value specified is lower than the physical memory available on the node. All nodes in a Snaplex should have the same physical configuration for proper Pipeline load balancing across nodes.
The default configuration, using auto or using an absolute value for maximum heap size, is good for most Snaplex configuration scenarios where the Pipeline workload on the Snaplex is well-defined.
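As an illustration of the auto behavior described above, the following sketch computes roughly 75% of a node's memory. The 16384 MB node size is a hypothetical example value, not a SnapLogic default:

```shell
# Sketch: approximate the heap size that the "auto" setting would choose,
# i.e. about 75% of the memory available on the node.
node_mem_mb=16384                          # hypothetical node with 16 GB RAM
heap_mb=$(( node_mem_mb * 75 / 100 ))      # ~75% of node memory
echo "auto would select roughly ${heap_mb}m"
```

On a 16 GB node this works out to about 12g, expressed in the same m/g notation accepted by the Maximum heap size setting.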
Configuring Snaplex Memory for Dynamic Workloads
If you expect the Snaplex workload to be ad-hoc or are unable to plan the memory requirements, then the default heap memory settings on the Snaplex may not be suitable. For example, these scenarios might require customizing the settings:
The Pipeline data volume to process varies significantly based on input data.
The Pipeline development or execution is performed by multiple teams, thus making capacity planning difficult.
The resources available in your test or development environment are lower than those of the production Snaplex.
In all these scenarios, setting the maximum heap size lower than the available memory can result in memory issues. Because the workload is dynamic, running Pipelines can consume additional memory and push the JCC to its configured maximum. When this limit is reached, the JCC throws an OutOfMemory exception and restarts, causing all currently running Pipelines to fail.
To allow the JCC to optimally manage memory limits, you can use swap configuration on the Snaplex nodes. Doing so allows your JCC node to operate within the configured memory under normal conditions; if memory is inadequate, the JCC can use the swap memory with minimal degradation in Pipeline performance.
Configuring swap Memory
To utilize swap for the JCC process, perform the following steps:
Enable swap on your host machine. The steps depend on your Operating System (OS). For example, for Ubuntu Linux you can use the steps in this tutorial. We recommend using an SSD-based (solid state drive) local storage for the swap volume.
Update your Maximum heap size Snaplex setting to a larger absolute value, which is around 90% of the sum of memory available on the node and the additional swap that you configure.
Update your Maximum memory Snaplex setting to a lower percentage value, such that the absolute value is lower than the available memory. The load balancer uses this value when allocating Pipelines. The default is 85%, which means that if the node memory usage exceeds 85% of the maximum heap size, additional Pipelines cannot start on the node.
Add the following two properties in the Global properties section of the Node Properties tab in the Update Snaplex dialog:
jcc.jcc_poll_timeout_seconds is the timeout (default: 10 seconds) for each health check poll request from the Monitor process.
jcc.status_timeout_seconds is the period (default: 300 seconds) that the Monitor process waits while health check requests continuously fail before restarting the JCC.
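For step 1, the Ubuntu Linux case can be sketched as follows. This is an illustrative sketch, assuming root access, a 16 GB swap file at /swapfile, and an SSD-backed filesystem; consult your OS documentation for the authoritative steps:

```shell
# Create and enable a 16 GB swap file (Ubuntu Linux; requires root).
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile          # restrict access to root
sudo mkswap /swapfile             # format the file as swap space
sudo swapon /swapfile             # enable it immediately
# Persist the swap across reboots:
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
# Verify that the swap is active:
swapon --show
```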
Snaplex Memory Configuration Example
In our example, we need to add 16GB swap memory to a Snaplex that currently has 8GB of memory.
Configure 16GB of OS swap space on each Snaplex node.
At this step, the total available memory is 8GB existing memory + 16GB swap memory = 24GB. Configure the Maximum heap size to about 90% of 24GB, which rounds to 22g.
To ensure that the Snaplex uses only the existing 8GB of memory for running the usual Pipelines, you can target 7GB of the available 8GB for the normal workload. Update the Maximum memory setting to 30%, since 7GB is roughly 30% of the 22GB heap ((7/22) × 100 ≈ 32%, rounded down).
The intent of the above calculation is to ensure that the JCC utilizes 7GB of the available 8GB memory for normal workloads. Beyond that, the load balancer can queue up additional Pipelines or send them to other nodes for processing.
If Pipelines that are running collectively start using over 7GB of memory, then the JCC can utilize up to 22GB of the total heap memory by using the OS swap space per the above configuration.
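The arithmetic in this example can be checked with a short script; the rounding of (7/22) × 100 down to a 30% setting follows the text above:

```shell
# Recompute the example sizing: 8 GB RAM plus 16 GB swap.
ram_gb=8
swap_gb=16
total_gb=$(( ram_gb + swap_gb ))                                   # 24 GB
# Maximum heap size: ~90% of the total, rounded to the nearest GB.
heap_gb=$(awk -v t="$total_gb" 'BEGIN { printf "%d", t * 0.9 + 0.5 }')
# Maximum memory: the 7 GB normal workload as a percentage of the heap.
max_mem_pct=$(awk -v h="$heap_gb" 'BEGIN { printf "%d", 7 / h * 100 }')
echo "Maximum heap size: ${heap_gb}g"
echo "Maximum memory: ${max_mem_pct}%, set to 30% in this example"
```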
swap Performance Implications
When you enable swap, the IO performance of the volume you use is critical to achieving acceptable performance while the swap is utilized.
On AWS, we recommend that you use an Instance Store rather than an EBS volume for mounting the swap data. See Instance store swap volumes for details. Note that not all instance types include an SSD instance store; for example, you need an m5d.large instance instead of an m5.large to get an SSD-backed instance store for swap data.
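On an instance type with an NVMe instance store (such as m5d.large), mounting the instance-store volume as swap can be sketched as below. The device name /dev/nvme1n1 is an assumption; confirm the actual device with lsblk before running anything:

```shell
# List block devices to identify the instance-store volume.
lsblk
# Format the instance-store device as swap and enable it
# (assumes the device is /dev/nvme1n1; requires root).
sudo mkswap /dev/nvme1n1
sudo swapon /dev/nvme1n1
swapon --show
```

Because instance-store volumes are ephemeral, this setup must be reapplied after the instance stops and starts; the Instance store swap volumes documentation referenced above covers the details.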
When your workload exceeds the available physical memory and swap is utilized, the JCC can become slower due to the additional IO overhead of swapping. Hence, configure higher timeouts for the jcc.status_timeout_seconds and jcc.jcc_poll_timeout_seconds JCC health check properties.
Even after configuring swap, the JCC process can still run out of resources if all the available memory is exhausted. This scenario triggers the JCC process to restart, and all running Pipelines are terminated. We recommend that you use larger nodes with additional memory for the workload to successfully complete.
We recommend limiting the maximum swap used by the JCC to 16 GB. A larger swap configuration causes performance degradation during JRE garbage collection operations.
Memory swapping can result in performance degradation because of disk IO, especially if the Pipeline workload also utilizes local disk IO. We recommend that you use high-performance SSD volumes for the swap space.
For normal workloads where accurate capacity planning is possible, we recommend that you do not use swap. Rather, you can use the default configuration.
When Pipeline workload is dynamic and capacity planning is difficult, you can use the swap-enabled configuration.