PostgreSQL - Vector Search

Overview

postgresql-vector-search-overview.png

Snap Type

The PostgreSQL - Vector Search Snap is a Read-type Snap.

Prerequisites

A valid account with the required permissions.

Support for Ultra Pipelines

Works in Ultra Pipelines. 

Limitations and Known Issues

None.

Snap Views

Type

Format

Number of Views

Examples of Upstream and Downstream Snaps

Description

Input 

Document

 

 

  • Min: 1

  • Max: 1

  • Mapper

  • Copy

  • Requires an input vector with the same dimension as the selected vector column.

  • Requires a vector input with an array of float/int data types.

Output

Document

 

  • Min: 1

  • Max: 1

  • Mapper

  • Copy

For each input document, all results are grouped in a single output document.

Error

Error handling is a generic way to handle errors without losing data or failing the Snap execution. You can handle the errors that the Snap might encounter when running the pipeline by choosing one of the following options from the When errors occur list under the Views tab:

  • Stop Pipeline Execution: Stops the current pipeline execution if the Snap encounters an error.

  • Discard Error Data and Continue: Ignores the error, discards that record, and continues with the remaining records.

  • Route Error Data to Error View: Routes the error data to an error view without stopping the Snap execution.

Learn more about Error handling in Pipelines.
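The two input view requirements listed above (a numeric array whose dimension matches the selected vector column) can be checked before a document reaches the Snap. The following is a minimal Python sketch of such a check, not part of the Snap itself. The function name and the `column_dimension` parameter are hypothetical; you would supply the dimension from your own table definition.

```python
from typing import Sequence


def validate_query_vector(vector: Sequence[float], column_dimension: int) -> list:
    """Check a query vector against the two input requirements.

    `column_dimension` is a hypothetical parameter: the dimension of the
    selected vector column, taken from your table definition.
    """
    if not isinstance(vector, (list, tuple)):
        raise TypeError("vector must be an array (list or tuple) of numbers")
    for i, value in enumerate(vector):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"element {i} is not a float or int: {value!r}")
    if len(vector) != column_dimension:
        raise ValueError(
            f"vector has {len(vector)} dimensions, column expects {column_dimension}"
        )
    # Normalize ints to floats so downstream code sees one numeric type.
    return [float(v) for v in vector]
```

A check like this could run in a step before the Snap so that malformed vectors are caught early instead of surfacing as database errors.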

Snap Settings

  • Asterisk ( * ): Indicates a mandatory field.

  • Suggestion icon (): Indicates a list that is dynamically populated based on the configuration.

  • Expression icon ( ): Indicates the value is an expression (if enabled) or a static value (if disabled). Learn more about Using Expressions in SnapLogic.

  • Add icon ( ): Indicates that you can add fields in the field set.

  • Remove icon ( ): Indicates that you can remove fields from the field set.

  • Upload icon ( ): Indicates that you can upload files.

Field Name

Field Type

Description

Label*

 

Default Value: PostgreSQL Vector Search
Example: PostgreSQL VS

String

Specify a name for the Snap. You can modify this to be more specific, especially if you have more than one of the same Snaps in your pipeline.

 

Schema name

 

Default Value: N/A
Example: VECTOR_DEMO

String/Expression/Suggestion

Specify the name of the schema that contains the table to search.

Table Name*

 

Default Value: N/A 
Example: VECTOR_DEMO.BOOKS

String/Expression/Suggestion

Specify the name of the table that contains the vectors to search.

Vector Column*

 

Default Value: N/A
Example: INT_VEC

String/Expression/Suggestion

Specify the vector column name to search.

Where Clause

 

Default Value: N/A
Example: ID > '001i0000007FVjpAAG'

String/Expression/Suggestion

Specify the where clause to use in the vector search query statement.

Because of a limitation of the SQL standard, you cannot use the _SL_DISTANCE column in the where clause.
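To see why _SL_DISTANCE is not available in the where clause, it helps to picture the kind of query involved. The Python sketch below builds an illustrative SQL string, assuming the table uses the pgvector extension and its distance operators. The SQL that the Snap actually generates is not shown in this article, so treat the statement shape, the function name `build_search_sql`, and the `%s` placeholder (a driver-style parameter for the input vector) as assumptions. In PostgreSQL, a select-list alias such as _SL_DISTANCE is visible to ORDER BY but not to WHERE, which is evaluated before the select list.

```python
# Assumption: pgvector distance operators. "<+>" (L1) requires a newer
# pgvector release. The Snap's real SQL may differ from this sketch.
PGVECTOR_OPERATORS = {
    "L2": "<->",
    "L1": "<+>",
    "COSINE": "<=>",
    "Inner Product": "<#>",  # pgvector returns the negative inner product
}


def build_search_sql(table, vector_column, distance_function,
                     where_clause=None, limit_rows=4):
    """Build an illustrative vector search statement (hypothetical helper).

    Identifiers are interpolated directly for readability; this is not
    safe against SQL injection and is meant only as an illustration.
    """
    operator = PGVECTOR_OPERATORS[distance_function]
    distance_expr = f"{vector_column} {operator} %s::vector"
    sql = f"SELECT *, {distance_expr} AS _SL_DISTANCE FROM {table}"
    if where_clause:
        # The where clause can filter on table columns, but not on the
        # _SL_DISTANCE alias defined in the select list.
        sql += f" WHERE {where_clause}"
    # ORDER BY may reference the alias, so the nearest rows come first.
    sql += f" ORDER BY _SL_DISTANCE LIMIT {int(limit_rows)}"
    return sql


print(build_search_sql("VECTOR_DEMO.BOOKS", "INT_VEC", "COSINE", "ID > 10", 3))
```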

Limit Rows

 

Default Value: 4
Example: 3
Min Value: 1

Integer/Expression

Specify the number of rows the query must return.

 

Distance Function*

 

Default Value: L2
Example: COSINE

Dropdown List

Choose the similarity function to compare vectors. The available options are:

  • L2: (Euclidean Distance) Measures the straight-line distance between two points. It’s useful when you want to calculate the as-the-crow-flies distance.

  • L1: (Manhattan Distance) Measures the distance between two points along axes at right angles. It’s useful in grid-based systems, such as city streets.

  • COSINE: Measures the cosine of the angle between two vectors. It's commonly used in high-dimensional positive spaces to assess similarity regardless of magnitude.

  • Inner Product: (Dot Product) Measures the similarity between two vectors. It’s useful in various applications, such as calculating the angle between vectors or finding projections.

Learn more about the Vector Similarity Functions.
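The four functions above can be written out in a few lines. The following Python sketch uses the standard mathematical definitions and is meant only to make the differences concrete; it is not the Snap's or the database's implementation. The function names are illustrative, and cosine distance is taken here as 1 minus cosine similarity.

```python
import math


def l2_distance(a, b):
    """Euclidean (straight-line) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l1_distance(a, b):
    """Manhattan distance: sum of absolute differences per axis."""
    return sum(abs(x - y) for x, y in zip(a, b))


def inner_product(a, b):
    """Dot product: larger values mean more similar (for comparable norms)."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; ignores vector magnitude."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


a, b = [1.0, 2.0], [2.0, 4.0]  # same direction, different magnitude
print(l2_distance(a, b), l1_distance(a, b), inner_product(a, b), cosine_distance(a, b))
```

In the sample at the bottom, the two vectors point in the same direction, so the cosine distance is approximately 0 even though the L2 and L1 distances are not.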

Include vector values

 

Default Value: Deselected

Checkbox/Expression

Select this checkbox to include vector values in the response.

This field does not support input schema from the upstream Snaps.

Include scores

 

Default Value: Selected

Checkbox/Expression

Select this checkbox to include similarity scores in the response.

Ignore empty result

Default Value: Deselected

Checkbox

Select this checkbox to skip empty results, so that the Snap does not write a document to the output view when a search operation returns no results.

Number of retries

 

Default Value: 0
Example: 3

Integer/Expression

Specify the maximum number of retry attempts the Snap must make if a network failure occurs.

Retry interval (seconds)

 

Default Value: 0
Example: 3

Integer/Expression

Specify the time, in seconds, to wait between two successive retry attempts.
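The interaction between Number of retries and Retry interval (seconds) follows the usual retry-with-fixed-delay pattern. The Python sketch below illustrates that general pattern under stated assumptions; it is not the Snap's internal code. The function name, the use of `ConnectionError` to stand in for a network failure, and the injectable `sleep` parameter are all hypothetical.

```python
import time


def call_with_retries(operation, number_of_retries=0,
                      retry_interval_seconds=0, sleep=time.sleep):
    """Run `operation`, retrying on ConnectionError with a fixed delay.

    With number_of_retries=0 (the default), the operation is attempted
    once and any failure is raised immediately.
    """
    attempts = 0
    while True:
        try:
            return operation()
        except ConnectionError:
            if attempts >= number_of_retries:
                raise  # retries exhausted: surface the last failure
            attempts += 1
            sleep(retry_interval_seconds)
```

With Number of retries set to 3 and Retry interval set to 3, a pattern like this makes at most four attempts in total, waiting 3 seconds between them.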

Snap execution

Default Value: Validate & Execute
Example: Execute only

Dropdown list

Select one of the following three modes in which the Snap executes:

  • Validate & Execute: Performs limited execution of the Snap and generates a data preview during pipeline validation. Subsequently, performs full execution of the Snap (unlimited records) during pipeline runtime.

  • Execute only: Performs full execution of the Snap during pipeline execution without generating preview data.

  • Disabled: Disables the Snap and all Snaps that are downstream from it.

Example

Use Cosine Distance to Find Similar Vectors

The example pipeline below demonstrates how to use the PostgreSQL - Vector Search Snap to find similar vectors using the Cosine distance function.

postgresql-example-image.png

Step 1: Configure the Mapper Snap with a vector to find similar vectors in the PostgreSQL database.

Step 2: Configure the PostgreSQL - Vector Search Snap as shown below:

Step 3: Validate the pipeline. On validation, the Snap fetches similar vectors based on the following criteria:

  • Each matching vector is returned with its cosine distance from the input vector.

  • The cosine distance measures how close a matching vector is to the input vector; values closer to 0 indicate higher similarity.

  • The first match has the highest similarity (lowest distance), followed by the second match.
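The ordering described above can be reproduced outside the pipeline with a small Python sketch. The vectors and row identifiers below are made up for illustration and are not taken from the example pipeline; `top_matches` is a hypothetical helper, and cosine distance is computed as 1 minus cosine similarity.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_matches(query, rows, limit_rows=4):
    """Return (distance, row_id) pairs, nearest first, capped at limit_rows."""
    scored = [(cosine_distance(query, vec), row_id) for row_id, vec in rows.items()]
    scored.sort()  # lowest distance (highest similarity) first
    return scored[:limit_rows]


# Illustrative data only.
rows = {"a": [1.0, 0.0], "b": [1.0, 1.0], "c": [0.0, 1.0]}
matches = top_matches([1.0, 0.1], rows, limit_rows=2)
for distance, row_id in matches:
    print(row_id, round(distance, 3))
```

Row "a" points almost in the same direction as the query, so it is listed first, followed by "b"; row "c" is dropped because the limit is 2.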
