Table Update

Dark
Light

Table Update

Dark
Light

Article Summary

Share feedback

Thanks for sharing your feedback!

This article is specific to the following platforms - Snowflake - Redshift - BigQuery - Synapse.

Table Update Component

Update a target table with a set of input rows. The rows to update are based on matching keys. It is very important that the keys uniquely identify the rows, and that the keys are not NULL.

Note: Successful validation of this component ensures the target table exists, and the target columns have been found. However, data is only written to the table when the job containing the table update is actually run. Most potential problems are avoided by a successful validation; however, run-time errors can still occur during execution. For example your Cluster may run out of disk space.

Properties

Snowflake Properties
Property	Setting	Description
Name	String	A human-readable name for the component. This is automatically determined when the Target Table Name property is first set.
Warehouse	Select	Choose a Snowflake warehouse that will run the update.
Database	Select	Select the database that the newly created table will belong to.
Schema	Select	Select the table schema. The special value, [Environment Default] will use the schema defined in the environment. For more information on using multiple schemas, see this article.
Target Table Name	Select	The name of the output table. The tables found in the currently selected environment is provided to choose from. You can change the currently selected environment in the Environments section.
Target Alias	Text	The alias for the target table. An alias allows a table to be referred to usually a name that is typically shorter and simpler than its actual name.
Source Alias	Text	The alias for the source table. An alias allows a table to be referred to usually a name that is typically shorter and simpler than its actual name.
Join Expressions	List of Expressions	A list of expressions specifying how each join is performed. This is the same expression editor used by the Calculator component. Each expression must be valid SQL and can use the all the built in Snowflake Functions There is exactly one expression for each Join, and the result of the expression is evaluated as True or False which indicates whether the two records being compared 'match'. Often this will be a simple 'equality' condition, but it could be more complex, e.g. where a date falls within a start/end date range.
When Matched	Case	Select a case as previously defined in the 'Join Expression' property. Add as many rows to the editor as you need, one per case.
When Matched	Operation	Decide an action for when the corresponding case occurs according to the join expression. Delete: Completely remove the row from the output table if the case is matched. Update: Output the data as expected if a match is found.
Include Not Matched	Select	If TRUE, allow non-matched data to continue through to the output. The column/s this data is written to is defined in a new property, 'Insert Mapping', described below.
Update Mapping	Input Column	(Visible only when a matched case results in an update). The column name from the matched input flow. Add as many rows to the editor as you need, one per input column.
Update Mapping	Output Column	The name of the input column that the corresponding matched input is written to. Note, this can be the same name as the input column if desired.
Insert Mapping	Input Column	(Visible only when an unmatched case is included). The column name from the unmatched input flow. Add as many rows to the editor as you need, one per input column.
Insert Mapping	Output Column	The name of the output column that the corresponding unmatched input is written to. Note, this can be the same name as the input column if desired.

Redshift Properties
Property	Setting	Description
Name	String	A human-readable name for the component. This is automatically determined when the Target Table Name property is first set.
Schema	Select	Select the table schema. The special value, [Environment Default] will use the schema defined in the environment. For more information on using multiple schemas, see this article.
Target Table Name	Select	The name of the output table. The tables found in the currently selected environment is provided to choose from. You can change the currently selected environment in the Environments section.
Fix Data Type Mismatches	Select	Yes: If the source column type does not match the target table type, attempt to CAST the value to the required target type. No: Do not cast types. Amazon Redshift may still attempt to coerce the types in this case.
Column Mapping	Input Column	The source column from the input flow.
Column Mapping	Output Column	The target table output column to update.
Unique Keys	Select	Selected column(s) from the input table used as unique keys. The associated target columns are found using the Column Mapping.
Update Strategy	Select	Delete/Insert: Removes overlapping rows (matching on Unique Keys) and then inserts all incoming rows. This is effectively an update, and is very fast. However, if you don't have incoming values for all target columns, replaced rows will be NULL for those missing columns. Note: Deleting rows in this way ideally requires a vacuum afterwards to recover space. This component does not arrange that vacuum for you, but there is a vacuum tables component available in an Orchestration job. For more information on vacuuming tables see here. Update/Insert: a traditional update statement, and an insert of incoming rows that don't match the target table (matching on Unique Keys). This is sometimes referred to as an upsert. Note: The update counts reported will differ between the two strategies, even for the same datasets. This is because the database reports the number of affected rows. The first strategy will count all the deletes, plus all the inserts, which may overlap. The second strategy will count the number of updates, plus the number rows added that weren't already updated.

BigQuery Properties
Property	Setting	Description
Name	String	A human-readable name for the component. This is automatically determined when the Target Table Name property is first set.
Project	Select	Select the table Project. The special value, [Environment Default] will use the project defined in the environment.
Dataset	Select	Select the table dataset. The special value, [Environment Default] will use the dataset defined in the environment.
Target Table Name	Select	The name of the output table. The tables found in the currently selected environment is provided to choose from. You can change the currently selected environment in the Environments section.
Fix Data Type Mismatches	Select	Yes: If the source column type does not match the target table type, attempt to CAST the value to the required target type. No: Do not cast types. Google BigQuery may still attempt to coerce the types in this case.
Column Mapping	Input Column	The source column from the input flow.
Column Mapping	Output Column	The target table output column to update.
Unique Keys	Select	Selected column(s) from the input table used as unique keys. The associated target columns are found using the Column Mapping.
Update Strategy	Select	Merge (Rate Limited): Combines Insert and Update operations into a single statement, and performs the operations in isolation (performs the operations separately from other operations that may be going on). Note: Uses the BigQuery Data Manipulation Language. The number of requests using the data BigQuery Data Manipulation Language is severely limited. Exceeding these limits will cause jobs to fail. Merge comes with the BigQuery DML limitations and quotas. Delete/Insert (Rate Limited): Removes overlapping rows (matching on Unique Keys) and then inserts all incoming rows. This is effectively an update, and is very fast. However, if you don't have incoming values for all target columns, replaced rows will be NULL for those missing columns. Note: Uses the BigQuery Data Manipulation Language. The number of requests using the data BigQuery Data Manipulation Language is severely limited. Exceeding these limits will cause jobs to fail. Update/Insert (Rate Limited): a traditional update statement, and an insert of incoming rows that don't match the target table (matching on Unique Keys). This is sometimes referred to as an upsert. Note: Uses the BigQuery Data Manipulation Language. The number of requests using the data BigQuery Data Manipulation Language is severely limited. Exceeding these limits will cause jobs to fail. The update counts reported will differ between the two strategies, even for the same datasets. This is because the database reports the number of affected rows. The first strategy will count all the deletes, plus all the inserts, which may overlap. The second strategy will count the number of updates, plus the number rows added that were not already updated. Rewrite and Append: Rewrites all rows not requiring updates and then appends all rows that are updated. Note: For large tables or small relative numbers of rows to be changed this may be resource heavy.

Synapse Properties
Property	Setting	Description
Name	String	A human-readable name for the component. This is automatically determined when the Target Table Name property is first set.
Schema	Select	Select the table schema. The special value, [Environment Default], will use the schema defined in the environment. For more information on schemas, please see the Azure Synapse documentation.
Target Table	Select	Select the target table. The available tables are determined by the chosen environment. Users can change the currently selected environment in the Environments panel in the bottom-left of the Matillion ETL user interface (UI).
Column Mapping	Input Column	The source column from the input flow.
Column Mapping	Output Column	The target table output column to update.
Unique Keys	Select	Select any columns from the table to use as unique keys. The associated target columns are found using the column mapping.
Update Strategy	Select	Update Only: Updates existing data in a table. Update/Insert: Updates existing data in a table and adds one or more rows to a table that do not match the target table.
Fix Data Type Mismatches	Select	No: Matillion ETL will not CAST data types. This is the default setting. Yes: Where a source column does not match the target table data type, Matillion ETL will attempt to CAST the value to the required type. For more information, please read this article.

Delta Lake Properties
Property	Setting	Description
Name	String	A human-readable name for the component.
Catalog	Select	Select a Databricks Unity Catalog. The special value, [Environment Default], will use the catalog specified in the Matillion ETL environment setup. Selecting a catalog will determine which databases are available in the next parameter.
Database	Select	Select the Delta Lake database. The special value, [Environment Default], will use the database specified in the Matillion ETL environment setup.
Target Table	Select	Select the target table to update. The tables available depends on the select Environment.
Target Alias	String	The alias for the target table. An alias allows a table to be referred to and is usually a name that is typically shorter and simpler than its actual name.
Source Alias	String	The alias of the source table from which data will be used to update the target table.
Join Expression	List of Expressions	A list of expressions specifying how each join is performed. This is the same expression editor used by the Calculator component. Each expression must be valid SQL. There is exactly one expression for each Join, and the result of the expression is evaluated as True or False, which indicates whether the two records being compared "match". Often this will be a simple "equality" condition, but it could be more complex, e.g., where a date falls within a start/end date range.
When Matched	Case	Select a case as previously defined in the "Join Expression" property. Add as many rows to the editor as you need, one per case.
When Matched	Operation	Decide an action for when the corresponding case occurs according to the join expression. Delete: Completely remove the row from the output table if the case is matched. Update: Output the data as expected if a match is found.
Update Mapping	Input Column	(Visible only when a matched case results in an update). The column name from the matched input flow. Add as many rows to the editor as you need, one per input column.
Update Mapping	Output Column	The name of the input column that the corresponding matched input is written to. Note: this can be the same name as the input column if desired.
Include Not Matched	Select	When "Yes", allow non-matched data to continue through to the output. The column(s) this data is written to is(are) defined in a new property, 'Insert Mapping', described below. The default setting is "No".
Insert Mapping	Input Column	(Visible only when an unmatched case is included). The column name from the unmatched input flow. Add as many rows to the editor as you need, one per input column.
Insert Mapping	Output Column	The name of the output column that the corresponding unmatched input is written to. Note: this can be the same name as the input column if desired.

Exports

A full list of common component exports can be found here.

Source	Description
Rows Deleted	When the Update Strategy is "Delete/Insert", the number of rows deleted.
Rows Updated	When the Update Strategy is "Update/Insert", the number of rows updated.
Rows Inserted	For either update strategy, the number of rows inserted.

Strategy

Depends upon the chosen Update Strategy.

Example

In this example we have a table filled with plane data, some of which is a little outdated. We want to update the outdated fields using a set of new data which is also held in a table. To accomplish this, we will be using the Table Update component that will find matching rows between the old and new data. The old data will then have rows replaced by their new data counterparts. The job used is shown below.

t_plane_info is a Table Input component that pulls in the new plane data, which is then linked to the Table Update component. In the Table Update properties (shown below), the table to be updated (our old data) is selected in the Target Table property.

The input columns are mapped to output columns using the Column Mapping property (shown below). Notice that although in this case the names match, that isn't necessary.

To select unique key(s), drag and drop from the left to the right or use the buttons in the center. The unique keys are fields in each row that should not change and do not require updating, thus will be identical in both the old and new data. These keys, at least together, should be able to uniquely identify each row and so should be chosen wisely. In this example, tailnum uniquely identifies a plane in the data.

Since Table Update is used on an existing (target) table, we'll need a Join Expression to match up records from the incoming data and the target table. In many cases, it's most appropriate to match a column of unique values that will ensure records are updated with the correct data. To this end, we simply match records based on TAILNUM since this uniquely identifies a record.

In the "When Matched" property we can add an identical expression and set it to "Update". "When Matched" allows us to tailor the response of the Table Update to either Update or Delete matched records depending on the expression entered here. The most common case is to Update a record based on its match criteria and that's what we configure here.

If we're updating records in the "When Matched" property, the new "Update Mapping" property is made available. This allows us to remap input columns into the output table. In this example, we're doing a simple update that overwrites the existing data and so the input columns are mapped directly onto the counterparts in the existing table as shown below:

Finally, we choose "No" for "Include Not Matched" since we only care about planes found in our existing data.

Running this job will then update the old data with any new rows while leaving the new data table untouched.

What's Next

API v1 - API Extract Profile

Table of contents

Table Update Component
Properties
Snowflake Properties
Redshift Properties
BigQuery Properties
Synapse Properties
Delta Lake Properties
Exports
Strategy
Example

Table Update

Table Update Component

Properties

Snowflake Properties

Redshift Properties

BigQuery Properties

Synapse Properties

Delta Lake Properties

Exports

Strategy

Example

What's Next