In this article

1 Overview
- 1.1 Snap Type
- 1.2 Prerequisites
- 1.3 Support for Ultra Pipelines
- 1.4 Limitations and Known Issues
2 Snap Views
3 Snap Settings
4 Troubleshooting
5 Deleting multiple JSON files from Azure Data Lake Storage
- 5.1 Downloads
6 Snap Pack History

Overview

You can use this Snap to delete a specified file, group of files, or directory from the supplied path and protocol in the Hadoop Distributed File System (HDFS), Azure Blob File System (ABFS), Windows Azure Storage Blob (WASB), and Azure Data Lake (ADL).

Snap Type

The Hadoop Distributed File System (HDFS) Delete Snap is a write-type Snap.

Prerequisites

None.

Support for Ultra Pipelines

Supports Ultra Pipelines.

Limitations and Known Issues

None.

Snap Views

Type	Format	Number of Views	Examples of Upstream and Downstream Snaps	Description
Input	Document	Min: 0, Max: 1	HDFS Reader, HDFS Writer	The file filter, file, and directory details of the file to be deleted.
Output	Document	Min: 1, Max: 1	ORC Writer, Snowflake Insert	The deleted file or a group of files.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the Pipeline by choosing one of the following options from the When errors occur list under the Views tab:

Stop Pipeline Execution: Stops the current Pipeline execution if the Snap encounters an error.
Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.
Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.
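
The three options can be pictured as three policies applied to a record that fails. The following Java sketch is illustrative only: the `ErrorViewSketch` class, the `ErrorPolicy` enum, and the `process` method are hypothetical names chosen for this example and are not part of any SnapLogic API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical model of the three "When errors occur" options; not SnapLogic code.
class ErrorViewSketch {

    enum ErrorPolicy { STOP_PIPELINE, DISCARD_AND_CONTINUE, ROUTE_TO_ERROR_VIEW }

    // Collects successful results and the records routed to the error view.
    static final class Result {
        final List<String> output = new ArrayList<>();
        final List<String> errorView = new ArrayList<>();
    }

    static Result process(List<String> records,
                          Function<String, String> step,
                          ErrorPolicy policy) {
        Result result = new Result();
        for (String record : records) {
            try {
                result.output.add(step.apply(record));
            } catch (RuntimeException e) {
                if (policy == ErrorPolicy.STOP_PIPELINE) {
                    throw e;                       // stop the whole run
                } else if (policy == ErrorPolicy.ROUTE_TO_ERROR_VIEW) {
                    result.errorView.add(record);  // keep the failed record
                }
                // DISCARD_AND_CONTINUE: drop the record and move on
            }
        }
        return result;
    }
}
```

With Route Error Data to Error View, a failed record is preserved for inspection, whereas Discard Error Data and Continue drops it silently. In both cases the remaining records are still processed.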

Snap Settings

Asterisk ( * ): Indicates a mandatory field.
Suggestion icon: Indicates a list that is dynamically populated based on the configuration.
Expression icon: Indicates the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.
Add icon: Indicates that you can add fields in the fieldset.
Remove icon: Indicates that you can remove fields from the fieldset.
Upload icon: Indicates that you can upload files.

Field Name	Field Type	Description
Label* Default Value: HDFS delete Example: Hadoop delete	String	The name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snap in your pipeline.
Directory Default Value: hdfs://<hostname>:<port>/ Example: hdfs://ec2-54-198-212-134.compute-1.amazonaws.com:8020/user/john/input/	String/Expression/Suggestion	Specify the URL for the HDFS directory. It should start with the file protocol, in one of the following formats: hdfs://<hostname>:<port>/<path to directory>/; wasb:///<container name>/<path to directory>/; wasbs:///<container name>/<path to directory>/; adl://<container name>/<path to directory>/; abfs(s):///filesystem/<path>/; abfs(s)://filesystem@accountname.endpoint/<path>. The Directory property is used only in the Suggest operation. When you click the Suggestion icon, the Snap displays a list of subdirectories under the specified directory. It generates the list by applying the value specified in the File filter property.
File filter Default Value: * Example: ?	String/Expression	Specify the glob filter pattern. A file filter is a criterion to include or exclude specific files when processing data in HDFS. Use glob patterns to display a list of directories or files when you click the Suggestion icon in the Directory or File property. A complete glob pattern is formed by combining the value of the Directory property with the File filter property. If the value of the Directory property does not end with "/", the Snap appends one so that the value of the File filter property is applied to the directory specified by the Directory property. The following rules are used to interpret glob patterns: The * character matches zero or more characters of a name component without crossing directory boundaries. For example, the *.csv pattern matches a path representing a file name ending in .csv, and *.* matches all file names containing a period. The ** characters match zero or more characters across directories. Therefore, the pattern matches all files or directories in the current directory and its subdirectories. For example, /home/** matches all files and directories in the /home/ directory. The ? character matches exactly one character of a name component. For example, 'foo.?' matches file names that start with 'foo.' and are followed by a single-character extension. The \ character is used to escape characters that would otherwise be interpreted as special characters. For example, the expression \\ matches a single backslash, and \{ matches a left brace. The ! character is used to exclude matching files from the output. The [ ] characters form a bracket expression that matches a single character of a name component out of a set of characters. For example, '[abc]' matches 'a', 'b', or 'c'. The hyphen (-) may be used to specify a range, so '[a-z]' specifies a range that matches from 'a' to 'z' (inclusive). These forms can be mixed, so '[abce-g]' matches 'a', 'b', 'c', 'e', 'f', or 'g'.
If the character after the [ is a !, it is used for negation, so '[!a-c]' matches any character except 'a', 'b', or 'c'. The '*', '?', and '\' characters match themselves within a bracket expression. The '-' character matches itself if it is the first character within the brackets, or the first character after the !, if negating. The '{ }' characters are a group of sub-patterns where the group returns a match if any sub-pattern in the group matches the contents of a target directory. The ',' character is used to separate sub-patterns. Groups cannot be nested. For example, the pattern '*.{csv,json}' matches file names ending with '.csv' or '.json'. Leading dot characters in a file name are treated as regular characters in match operations. For example, the '*' glob pattern matches the file name ".login". All other characters match themselves. Examples: '*.csv' matches all files with a CSV extension in the current directory only. '**.csv' matches all files with a CSV extension in the current directory and all its subdirectories. [!{.pdf,.tmp}] excludes all files with the extension PDF or TMP.
File Default Value: N/A Example: sample.csv tmp/another.csv $filename	String/Expression/Suggestion	Specify the file name or a relative path to a file under the directory specified in the Directory property. It should not start with a URL separator "/". The value of the File property depends on the name of the directory specified in the Directory property and the criterion specified in the File filter property.
User Impersonation Default Value: Deselected	Checkbox	Select this checkbox to enable user impersonation. Hadoop allows you to configure proxy users to access HDFS on behalf of other users; this is called impersonation. When user impersonation is enabled on the Hadoop cluster, any jobs submitted using a proxy are executed with the impersonated user's existing privilege levels rather than those of the superuser associated with the cluster. For more information on user impersonation in this Snap, refer to the section on User Impersonation below.
Delete Directory Default Value: Deselected	Checkbox/Expression	Select this checkbox to delete all the paths in the specified directory.
Number Of Retries Default Value: 0 Example: 12	Integer/Expression	Specify the maximum number of attempts to be made to receive a response. The request is terminated if the attempts do not result in a response. The Snap retries only when it loses the connection with the server.
Retry Interval (seconds) Default Value: 1 Example: 30	Integer/Expression	Specify the time interval between two successive retry requests. A retry happens only when the previous attempt resulted in an exception.
Snap Execution Default Value: Execute Only Example: Validate & Execute	Dropdown list	Select one of the following three modes in which the Snap executes: Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, it performs full execution of the Snap (unlimited records) during pipeline runtime. Execute only: Performs complete execution of the Snap during pipeline execution without generating preview data. Disabled: Disables the Snap and all Snaps that are downstream from it.
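
The glob rules listed for the File filter field closely resemble the glob syntax of Java NIO. As a rough way to try out a pattern before using it, the sketch below evaluates globs with `java.nio.file.PathMatcher`. It assumes that the Snap's matching behaves like the Java matcher, which this page does not confirm, and the `GlobSketch` class and its `matches` helper are hypothetical names used only for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Illustrative only: evaluates a glob with Java NIO, assumed to be close to
// (but not confirmed to be identical to) the Snap's File filter behavior.
class GlobSketch {

    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    public static void main(String[] args) {
        System.out.println(matches("*.csv", "data.csv"));        // true
        System.out.println(matches("*.csv", "sub/data.csv"));    // false: * stays in one directory
        System.out.println(matches("**.csv", "sub/data.csv"));   // true: ** crosses directories
        System.out.println(matches("foo.?", "foo.c"));           // true: exactly one character
        System.out.println(matches("*.{csv,json}", "a.json"));   // true: either sub-pattern
        System.out.println(matches("[!a-c]*", "a.txt"));         // false: negated bracket
        System.out.println(matches("*", ".login"));              // true: leading dot is ordinary
    }
}
```

Note that a space inside a brace group, as in `{csv, json}`, is treated by the Java matcher as part of the sub-pattern, so the sub-patterns here are written without spaces.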

Troubleshooting

Error	Reason	Resolution
Remote filesystem access failed.	The user credentials or URL might be incorrect, or the remote server might be inaccessible. This error can also indicate a problem with the communication between the nodes in your Hadoop cluster or an issue with the underlying HDFS.	Check the user credentials and URL and retry. Check the permissions and access rights of the Hadoop files and directories. Ensure that you have the required permissions to access and modify the data.
A directory is not a valid string.	The expression or value specified in the Directory property either does not exist in HDFS or is not accessible.	Check that a valid expression is entered in the Directory property and that the correct document data is available at the input view.
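
When diagnosing the second error, it can help to check the Directory value before it reaches the Snap. The sketch below is a minimal pre-check, not the Snap's own validation: the `DirectoryCheck` class and its methods are hypothetical. The scheme list is taken from the formats listed for the Directory field, and the trailing-slash handling mirrors the behavior described for the File filter field.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical pre-check for a Directory value; not the Snap's validation logic.
class DirectoryCheck {

    // Schemes listed in the Directory field description (abfs(s) = abfs or abfss).
    private static final List<String> SCHEMES =
            Arrays.asList("hdfs", "wasb", "wasbs", "adl", "abfs", "abfss");

    static boolean hasSupportedScheme(String directory) {
        if (directory == null) {
            return false;
        }
        int idx = directory.indexOf("://");
        return idx > 0
                && SCHEMES.contains(directory.substring(0, idx).toLowerCase(Locale.ROOT));
    }

    // Joins Directory and File filter, appending "/" when the directory lacks one.
    static String toGlob(String directory, String fileFilter) {
        if (!hasSupportedScheme(directory)) {
            throw new IllegalArgumentException("Unsupported or missing protocol: " + directory);
        }
        String base = directory.endsWith("/") ? directory : directory + "/";
        return base + fileFilter;
    }
}
```

For example, `DirectoryCheck.toGlob("hdfs://host:8020/user/john/input", "*.json")` yields `hdfs://host:8020/user/john/input/*.json`, where `host` is a placeholder hostname.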

Deleting multiple JSON files from Azure Data Lake Storage

In this example, multiple JSON files with file names containing special characters are created for upload to Azure Data Lake Storage.

Configure the HDFS Writer Snap with the required details, such as the destination directory in Azure Data Lake Storage where the files should be added. The output preview shows that the file is written to Azure Data Lake Storage.

Snap configuration	Output preview

You can then delete the same file from Azure Data Lake Storage with the HDFS Delete Snap.

Snap Configuration	Output preview
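
To make the effect of a file filter on a delete operation concrete, the sketch below reproduces the idea on a local temporary directory with Java NIO. It is a stand-in for illustration only: it does not connect to Azure Data Lake Storage or use the Snap, and the `LocalGlobDelete` class, its methods, and the sample file names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local stand-in for "delete every file that matches a filter"; not the Snap itself.
class LocalGlobDelete {

    // Deletes regular files directly under dir whose names match the glob.
    static List<String> deleteMatching(Path dir, String glob) throws IOException {
        List<String> deleted = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path file : stream) {
                if (Files.isRegularFile(file)) {
                    Files.delete(file);
                    deleted.add(file.getFileName().toString());
                }
            }
        }
        Collections.sort(deleted); // directory order is unspecified, so sort for stable output
        return deleted;
    }

    // Creates sample files with special characters, deletes the JSON ones, cleans up.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("adls-standin");
            byte[] body = "{}".getBytes(StandardCharsets.UTF_8);
            Files.write(dir.resolve("a$1.json"), body);
            Files.write(dir.resolve("b#2.json"), body);
            Files.write(dir.resolve("keep.txt"), body);
            List<String> deleted = deleteMatching(dir, "*.json");
            Files.delete(dir.resolve("keep.txt")); // still present: it did not match
            Files.delete(dir);                     // succeeds only if the directory is empty
            return deleted;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `LocalGlobDelete.demo()` returns the names of the two JSON files, while `keep.txt` is left untouched by the filter.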

Downloads

Download and import the Pipeline into SnapLogic.
Configure Snap accounts, as applicable.
Provide Pipeline parameters, as applicable.

File	Modified
HDFS_MultiFile_Delete(abfs).slp	Aug 08, 2023 by Kalpana Malladi
