In this article
This article lays emphasis on the different scenarios that are possible while performing an SCD2 operation and the behavior of the ELT SCD2 Snap, in each of these scenarios.
To follow these scenarios and configure the ELT SCD2 Snap properly in your SnapLogic environment, you will need to understand the following terms:
Term | Definition | |
---|---|---|
Target Table Natural Key(s) | A unique identifier based on which the SCD2 operation inserts a new record with different values for the same key in the target table. You can define more than one natural key for your SCD2 operation. | |
Target Table Cause-Historization Field(s) | One or more fields in the target table which can cause new versions of data in the row. For example, changes in Address or Employee Designation of an individual over a period of time. Each change in the address can cause a new version of the row (SCD Type2 entry) and so does each change in an employee’s designation. | |
Target Table Temporal Field(s) | A set of fields in the target table that store information specific to the validity of the cause-historization values in each of the row versions created. They can be a combination of the following fields:
| |
Invalid Row | A row in the source data is considered invalid if its:
|
Target Table (Before)
Customer ID | Address | Start Date | End Date | Current Row |
1 | Chicago | 1/1/18 | 12/1/18 | false |
1 | New York | 1/1/19 | MAX_DATE | true |
Source/Update Table (Incoming SCD2 Data)
Customer ID (Natural Key) | Address (Cause Historization Values) | Start Date (Temporal Field 1) |
1 | Chicago | 1/1/20 |
1 | New York | 1/1/21 |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Remarks / Key Take-aways |
1 | Chicago | 1/1/18 | 12/1/18 | false | No change. |
1 | New York | 1/1/19 | 1/1/20 | false | Not current row anymore. |
1 | Chicago | 1/1/20 | 1/1/21 | false | New entry but not current row. Start date for the newer row is the End date for this entry. |
1 | New York | 1/1/21 | MAX_DATE | true | Current row with default End Date. |
The SCD2 operation uses a row-by-row approach. The final current row is arrived at, only after processing all rows for that natural key. |
Target Table (Before)
Customer ID | Address | Start Date | End Date | Current Row |
1 | Chicago | 1/1/18 | 12/1/18 | false |
1 | New York | 1/1/19 | MAX_DATE | true |
Source/Update Table (Incoming SCD2 Data)
Customer ID (Natural Key) | Address (Cause Historization Values) | Start Date (Temporal Field 1) |
1 | New York | 1/1/20 (or 1/1/17) |
1 | Boston | 1/1/21 |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Remarks / Key Take-aways |
1 | Chicago | 1/1/18 | 12/1/18 | false | No change. |
1 | New York | 1/1/19 | 1/1/21 | false | Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row. |
1 | Boston | 1/1/21 | MAX_DATE | true | Current row with default End Date. |
Target Table (Before)
Customer ID | Address | Start Date | End Date | Current Row |
(Empty Table) |
Source/Update Table (Incoming SCD2 Data)
Customer ID (Natural Key) | Address (Cause Historization Values) | Start Date (Temporal Field 1) |
1 | Chicago | 1/1/15 |
1 | Boston | 1/1/16 |
1 | Chicago | 1/1/18 |
1 | Chicago | 1/1/19 (or 1/1/14) |
1 | Boston | 1/1/20 |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Remarks / Key Take-aways |
1 | Chicago | 1/1/15 (or 1/1/14) | 1/1/16 | false | End date arrived based on the start date of the next entry (in ascending order). |
1 | Boston | 1/1/16 | 1/1/18 | false | End date arrived based on the start date of the next entry (in ascending order). |
1 | Chicago | 1/1/18 | 1/1/20 | false | Any start date for Chicago outside the range 1/1/18 - 1/1/20 will be ignored. |
1 | Boston | 1/1/20 | MAX_DATE | true | Current row with default End Date. |
The SCD2 operation sorts the source data records in the ascending order of their Start Dates when there is no row numbering (sequence) specified in the source data. |
Target Table (Before)
Customer ID | Address | Start Date | End Date | Current Row |
1 | Chicago | 1/1/18 | 12/1/18 | false |
1 | New York | 1/1/19 | MAX_DATE | true |
Source/Update Table (Incoming SCD2 Data)
Customer ID (Natural Key) | Address (Cause Historization Values) | Start Date (Temporal Field 1) |
1 | Orlando | 1/1/17 |
1 | Boston | 1/1/21 |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Remarks / Key Take-aways |
1 | Chicago | 1/1/18 | 12/1/18 | false | No change. |
1 | New York | 1/1/19 | 1/1/21 | false | Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row. |
1 | Boston | 1/1/21 | MAX_DATE | true | Current row with default End Date. |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Invalid | Remarks / Key Take-aways |
1 | Chicago | 1/1/18 | 12/1/18 | false | No change. | |
1 | New York | 1/1/19 | 1/1/21 | false | Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row. | |
1 | Orlando | 1/1/17 | MIN_DATE | false | true | Invalid row logged with End Date defaulted to MIN_DATE. |
1 | Boston | 1/1/21 | MAX_DATE | true | Current row with default End Date. |
Target Table (Before)
Customer ID | Address | Start Date | End Date | Current Row |
1 | Chicago | 1/1/18 | 12/1/18 | false |
1 | New York | 1/1/19 | MAX_DATE | true |
Source/Update Table (Incoming SCD2 Data)
Customer ID (Natural Key) | Address (Cause Historization Values) | Start Date (Temporal Field 1) |
1 | Orlando | NULL |
1 | Boston | 1/1/21 |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Remarks / Key Take-aways |
1 | Chicago | 1/1/18 | 12/1/18 | false | No change. |
1 | New York | 1/1/19 | 1/1/21 | false | Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row. |
1 | Boston | 1/1/21 | MAX_DATE | true | Current row with default End Date. |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Invalid | Remarks / Key Take-aways |
1 | Chicago | 1/1/18 | 12/1/18 | false | No change. | |
1 | New York | 1/1/19 | 1/1/21 | false | Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row. | |
1 | Orlando | NULL | MIN_DATE | false | true | NULL remains as is. Invalid row logged with End Date defaulted to MIN_DATE. |
1 | Boston | 1/1/21 | MAX_DATE | true | Current row with default End Date. |
No updates to the target table. SCD2 operation is aborted.
Target Table (Before)
Customer ID | Address | Start Date | End Date | Current Row |
1 | Chicago | 1/1/18 | 12/1/18 | false |
1 | New York | 1/1/19 | MAX_DATE | true |
Source/Update Table (Incoming SCD2 Data)
Customer ID (Natural Key) | Address (Cause Historization Values) | Start Date (Temporal Field 1) |
1 | NULL | 1/1/20 |
1 | Boston | 1/1/21 |
NULL | Miami | 1/1/22 |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Remarks / Key Take-aways |
1 | Chicago | 1/1/18 | 12/1/18 | false | No change. |
1 | New York | 1/1/19 | 1/1/20 | false | Not current row anymore. Any end date for New York before 1/1/20 (start date for the next row) results in New York becoming a historical row. |
1 | NULL | 1/1/20 | 1/1/21 | false | A NULL entry made in the ascending order of the start dates. End Date added This row cannot be the current row. |
1 | Boston | 1/1/21 | MAX_DATE | true | Current row with default End Date. |
NULL | Miami | 1/1/22 | MAX_DATE | true | This entry, though included in the target table, does not correspond to the same Natural Key as the other entries, in this example. |
Customer ID (Natural Key) | Address (New Cause Historization Entries) | Start Date | End Date | Current Row | Invalid | Remarks / Key Take-aways |
1 | Chicago | 1/1/18 | 12/1/18 | false | No change. | |
1 | New York | 1/1/19 | 1/1/21 | false | Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row. Any entry (Start Date 1/1/20, in this case) containing a NULL is ignored. | |
1 | Boston | 1/1/21 | MAX_DATE | true | Current row with default End Date. |