Skip to main content

MATE incremental tables

  • 14 August 2024
  • 0 replies
  • 22 views

The implementation of incremental tables within dbt has some side effects that are important to understand. There are three cases to consider:

  • Initial creation of the table

  • Incrementally updating the table

  • Performing a full refresh of the table

Initial creation

dbt creates an incremental table in the same was as a non-incremental table. dbt uses a CREATE OR REPLACE TABLE .. AS SELECT to create the table. This will create a table owned by the role executing dbt, and with no other grants (unless future grants have been configured). Grants can be subsequently added either by using MATE governance or by the grants: configuration in dbt.

Incremental update

dbt performs an incremental update by creating a temporary view to select the data to be updated, and then using a MERGE INTO statement to move the data into the destination table. The merge is configured to insert new data and update existing data, using the configured primary key to identify existing data.

The destination table is not recreated in this case, and so any existing grants will be kept. MATE governance will add any new grants that are configured, but will not revoke existing grants. dbt grant management can be configured to either revoke any existing grants not mentioned in the configuration, or to add new grants only.

Full refresh

dbt performs a full refresh in the same way as initially creating the table. A CREATE OR REPLACE TABLE .. AS SELECT is executed. This creates a new table in Snowflake, and so existing grants are not copied across. The table will be owned by the role executing dbt. This means that additional roles will have a short period where they are unable to access the table, from when the table is rebuilt until the appropriate grants are reapplied (either by dbt or MATE governance).

dbt has a feature to avoid this. The copy_grants configuration key can be set to true. This causes dbt to add COPY GRANTS to the CREATE OR REPLACE TABLE .. AS SELECT, which causes Snowflake to copy the original grants over. The same caveats apply as for incremental updates - MATE governance will not revoke any grants not listed. Using this configuration removes the short time period where the table is unavailable.

Be the first to reply!

Reply