Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

In this article

Table of Contents
minLevel2
maxLevel3
absoluteUrltrue

This article emphasizes lays emphasis on the different scenarios that are possible while performing an SCD2 operation and the behavior of the ELT SCD2 Snap, in each of these scenarios.

...

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Remarks / Key Take-aways

1

Chicago

1/1/18

12/1/18

false

No change.

1

New York

1/1/19

1/1/20

false

Not current row anymore.

1

Chicago

1/1/20

1/1/21

false

New entry but not current row. Start date for the newer row is the End date for this entry.

1

New York

1/1/21

MAX_DATE

true

Current row with default End Date.

Info

The SCD2 operation uses a row-by-row approach. The final current row is arrived at, only after processing all rows for that natural key.

...

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Remarks / Key Take-aways

1

Chicago

1/1/18

12/1/18

false

No change.

1

New York

1/1/19

1/1/21

false

Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row.

1

Boston

1/1/21

MAX_DATE

true

Current row with default End Date.

Scenario 3: Target Table is Empty and Source Table Contains Duplicate Values for Cause-Historization field

...

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Remarks / Key Take-aways

1

Chicago

1/1/15 (or 1/1/14)

1/1/16

false

End date arrived based on the start date of the next entry (in ascending order).

1

Boston

1/1/16

1/1/18

false

End date arrived based on the start date of the next entry (in ascending order).

1

Chicago

1/1/18

1/1/20

false

Any start date for Chicago outside the range 1/1/18 - 1/1/20 will be ignored.

1

Boston

1/1/20

MAX_DATE

true

Current row with default End Date.

Info

The SCD2 operation sorts the source data records in the ascending order of their Start Dates when there is no row numbering (sequence) specified in the source data.

...

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Remarks / Key Take-aways

1

Chicago

1/1/18

12/1/18

false

No change.

1

New York

1/1/19

1/1/21

false

Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row.

1

Boston

1/1/21

MAX_DATE

true

Current row with default End Date.

Output 4.2: Target Table (After SCD2 operation) when Invalid rows are Logged

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Invalid
(Temporal Field 4)

Remarks / Key Take-aways

1

Chicago

1/1/18

12/1/18

false

No change.

1

New York

1/1/19

1/1/21

false

Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row.

1

Orlando

1/1/17

MIN_DATE

false

true

Invalid row logged with End Date defaulted to MIN_DATE.

1

Boston

1/1/21

MAX_DATE

true

Current row with default End Date.

Scenario 5: One or more Start Dates in Source Rows contain NULLs

...

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Remarks / Key Take-aways

1

Chicago

1/1/18

12/1/18

false

No change.

1

New York

1/1/19

1/1/21

false

Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row.

1

Boston

1/1/21

MAX_DATE

true

Current row with default End Date.

Output 5.2: Target Table (After SCD2 operation) when Invalid rows are Logged and Nulls Honored

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Invalid
(Temporal Field 4)

Remarks / Key Take-aways

1

Chicago

1/1/18

12/1/18

false

No change.

1

New York

1/1/19

1/1/21

false

Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row.

1

Orlando

NULL

MIN_DATE

false

true

NULL remains as is. Invalid row logged with End Date defaulted to MIN_DATE.

1

Boston

1/1/21

MAX_DATE

true

Current row with default End Date.

Output 5.3: Target Table (After SCD2 operation) when the Pipeline is set to Return the Error upon Detecting Invalid rows

...

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Remarks / Key Take-aways

1

Chicago

1/1/18

12/1/18

false

No change.

1

New York

1/1/19

1/1/20

false

Not current row anymore. Any end date for New York before 1/1/20 (start date for the next row) results in New York becoming a historical row.

1

NULL

1/1/20

1/1/21

false

A NULL entry made in the ascending order of the start dates. End Date added This row cannot be the current row.

1

Boston

1/1/21

MAX_DATE

true

Current row with default End Date.

NULL

Miami

1/1/22

MAX_DATE

true

This entry, though included in the target table, does not correspond to the same Natural Key as the other entries, in this example.

Output 6.2: Target Table (After SCD2 operation) when NULLs in Source and Target Tables are Ignored

Customer ID (Natural Key)

Address (New Cause Historization Entries)

Start Date
(Temporal Field 1)

End Date
(Temporal Field 2)

Current Row
(Temporal Field 3)

Invalid
(Temporal Field 4)

Remarks / Key Take-aways

1

Chicago

1/1/18

12/1/18

false

No change.

1

New York

1/1/19

1/1/21

false

Not current row anymore. Any end date for New York before 1/1/21 (start date for Boston) results in New York becoming a historical row. Any entry (Start Date 1/1/20, in this case) containing a NULL is ignored.

1

Boston

1/1/21

MAX_DATE

true

Current row with default End Date.