...


Error: Unable to connect to the Hive Metastore.

Reason: This error occurs when the Parquet Writer Snap cannot fetch the schema from a Kerberos-enabled Hive Metastore.

Resolution: Pass the Hive Metastore's schema directly to the Parquet Writer Snap (see the example after these steps). To do so:

  1. Enable the Schema View in the Parquet Writer Snap by adding a second Input View.

  2. Connect a Hive Execute Snap to the Schema View. Configure the Hive Execute Snap to execute the DESCRIBE TABLE command so that it reads the table metadata and feeds it to the Schema View.
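For illustration only, assuming the target Hive table is named web_logs (a hypothetical name), the SQL statement configured in the Hive Execute Snap could be as simple as:

    DESCRIBE web_logs;

The column metadata returned by this statement is what the Parquet Writer Snap reads on its second (schema) Input View.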

Error: Parquet Snaps may not work as expected in the Windows environment.

Reason: Because of limitations in the Hadoop library on Windows, Parquet Snaps do not function as expected.

Resolution: To use the Parquet Writer Snap on a Windows Snaplex, follow these steps:

  1. Create a temporary directory. For example: C:\test\hadoop\.

  2. Place two files, hadoop.dll and winutils.exe, in the newly created temporary directory. Download them from https://github.com/cdarlint/winutils/tree/master/hadoop-3.2.2/bin (SnapLogic’s Hadoop version is 3.2.2). See the example after these steps.

  3. Add the JVM options in the Windows Snaplex as shown below:
    jcc.jvm_options = -Djava.library.path=C:\test\hadoop

  4. If you already have existing jvm_options, append -Djava.library.path=C:\test\hadoop after a space. For example:
    jcc.jvm_options = -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000 -Djava.library.path=C:\test\hadoop

  5. Restart the JCC for configurations to take effect.
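For example, assuming the two files were downloaded to the current user's Downloads folder (an illustrative assumption), steps 1 and 2 could be carried out from a Command Prompt on the Snaplex node:

    :: Create the directory and copy the downloaded native libraries into it.
    mkdir C:\test\hadoop
    copy "%USERPROFILE%\Downloads\hadoop.dll" C:\test\hadoop\
    copy "%USERPROFILE%\Downloads\winutils.exe" C:\test\hadoop\

Any directory works, as long as the same path is used in the jcc.jvm_options setting.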

Error: Failure: 'boolean org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(java.lang.String, int)'

Reason: Because of limitations in the Hadoop library on Windows, Parquet Snaps do not function as expected.

Resolution: To resolve this issue, follow these steps:

  1. Download hadoop.dll and winutils.exe from https://github.com/cdarlint/winutils/tree/master/hadoop-3.2.2/bin (SnapLogic’s Hadoop version is 3.2.2).

  2. Create a temporary directory.

  3. Place the hadoop.dll and winutils.exe files in this path: C:\hadoop\bin

  4. Set the environment variable HADOOP_HOME to point to C:\hadoop (see the example after these steps).

  5. Add C:\hadoop\bin to the environment variable PATH as shown below:
    Variable name: PATH
    Variable value: VEN_HOME%\bin;%HADOOP_HOME%\bin

  6. Add the JVM options in the Windows Snaplex:
    jcc.jvm_options = -Djava.library.path=C:\hadoop\bin

    If you already have an existing jvm_options, append -Djava.library.path=C:\hadoop\bin after a space.
    For example:
    jcc.jvm_options = -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000 -Djava.library.path=C:\hadoop\bin

  7. Restart the JCC for configurations to take effect.
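As a sketch, the HADOOP_HOME variable from step 4 can be set either through the System Properties dialog or from a Command Prompt, for example:

    :: Sets HADOOP_HOME for the current user; use the System Properties dialog
    :: (or setx /M from an elevated prompt) for a machine-wide setting.
    setx HADOOP_HOME "C:\hadoop"

Open a new Command Prompt (and restart the JCC) for the variable to become visible to the Snaplex process.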

...

Understanding the Pipeline

[Pipeline screenshots omitted]

Downloads

  1. Download and import the Pipeline into SnapLogic.

  2. Configure Snap accounts, as applicable.

  3. Provide Pipeline parameters, as applicable.

...