Azure Unload Snowflake

    This article is specific to the following platform: Snowflake.

    Azure Blob Storage Unload

    This component creates files on a specified Azure Blob Storage account and loads them with data from a table or view.

    By default, your data will be unloaded in parallel.
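
    Under the hood, unloads of this kind are performed by a Snowflake COPY INTO <location> statement. The sketch below is illustrative only, not the component's exact generated SQL; the storage account, container, table, and integration names are hypothetical:

        -- Illustrative only: roughly the statement a configured unload performs.
        COPY INTO 'azure://mystorageaccount.blob.core.windows.net/mycontainer/unload/myprefix_'
        FROM mydb.myschema.my_table                    -- Target Table
        STORAGE_INTEGRATION = my_azure_integration     -- Authentication: Storage Integration
        FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP)  -- File Type and Compression
        HEADER = TRUE                                  -- Include Headers
        MAX_FILE_SIZE = 16000000                       -- Max File Size (bytes)
        OVERWRITE = TRUE;                              -- Overwrite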


    Properties

    Snowflake Properties

    Property | Setting | Description
    Name | String | A human-readable name for the component.
    Stage | Select | Choose a predefined stage for your data. Stages must be created from your Snowflake account console. Alternatively, select [Custom] to base the staging on this component's properties.
    Azure Storage Location | Storage Location | Select, from the tree structure, the Azure Blob Storage container that files will be unloaded into. Alternatively, provide the full URL manually in the field, following the template azure://<account>/<container>/<path>. Entering a forward slash character (/) after a folder name triggers validation of the file path, in the same manner as the Go button.
    File Prefix | String | Specify a file prefix for data unloaded to the blob container. Each file is named as the prefix followed by a number denoting which node it was unloaded from. All unloads are parallel and use the maximum number of nodes available at the time.
    Authentication | Select | Select the authentication method. Users can choose either Credentials, which uses Azure security credentials configured in the Matillion ETL environment, or Storage Integration, which uses a Snowflake storage integration. A storage integration is a Snowflake object that stores a generated identity and access management (IAM) entity for your external cloud storage, along with an optional set of permitted or blocked storage locations (Amazon S3, Google Cloud Storage, or Microsoft Azure). For more information, see CREATE STORAGE INTEGRATION and the sketch following this table.
    Warehouse | Select | Choose a Snowflake warehouse that will run the unload.
    Database | Select | Choose the Snowflake database containing the table or view to be unloaded.
    Schema | Select | Select the table schema. The special value [Environment Default] uses the schema defined in the environment. For more information on using multiple schemas, see this article.
    Target Table | Select | Select the table or view to unload to blob storage.
    Format | Select | Choose from the preset file formats available in your Snowflake database. Additional file formats can be created using the Create File Format component. Selecting the [Custom] file format uses this component's properties to define the file format.
    File Type | Select | Choose whether Matillion ETL unloads the data as CSV, JSON, or PARQUET files.
    Compression | Select | Select the compression format. Available CSV and JSON formats include:
    • AUTO
    • BROTLI
    • BZ2
    • DEFLATE
    • GZIP
    • NONE (no compression)
    • RAW_DEFLATE
    • ZSTD
    Available PARQUET formats include:
    • AUTO
    • LZO
    • NONE (no compression)
    • SNAPPY
    Record Delimiter | String | (CSV only) Specify a delimiter character to separate records (rows) in the file. Defaults to newline. \n can also signify a newline; \r can signify a carriage return.
    Field Delimiter | String | (CSV only) Specify a delimiter character to separate columns. The default character is a comma (,). A [TAB] character can be specified as \t.
    Date Format | String | (CSV only) Defaults to auto. Use this property to manually specify a date format.
    Time Format | String | (CSV only) Defaults to auto. Use this property to manually specify a time format.
    Timestamp Format | String | (CSV only) Defaults to auto. Use this property to manually specify a timestamp format.
    Escape | Select | (CSV only) Specify a single character to be used as the escape character for field values that are enclosed. Default is NONE.
    Escape Unenclosed Field | String | (CSV only) Specify a single character to be used as the escape character for unenclosed field values only. Accepts common escape sequences, octal values, or hex values, as well as NONE. Default is \\ (backslash). If a character is specified in the "Escape" property, it overrides this property. If a value is set in Field Optionally Enclosed, all fields become enclosed, rendering this property redundant, in which case it is ignored.
    Field Optionally Enclosed | String | (CSV only) Specify a single character used to enclose strings. Can be a single quote ('), a double quote ("), or NONE (default). Note that the chosen character can be escaped by that same character.
    Nest Columns | Boolean | (JSON only) When "True", the table columns are nested into a single JSON object so that the file can be configured correctly. A table with a single variant column does not require this setting to be "True". Default is "False".
    Null If | String | Specify a string used to represent SQL NULL values in the unloaded file. Whenever a SQL NULL is found, it is replaced with the first string listed in this property; for unloads, only the first string specified is used.
    Trim Space | Boolean | When "True", removes whitespace from fields. Default is "False".
    Overwrite | Boolean | When "True", overwrites existing data (if the target file already exists) instead of generating an error. Default is "False".
    Single File | Boolean | When "True", the unload operation runs in serial rather than parallel, producing a slower unload but a single, complete file. Default is "False". When "True", no file extension is used in the output filename (regardless of the file type, and regardless of whether the file is compressed). When "False", a filename prefix must be included in the path.
    Max File Size | Numeric | The maximum size (in bytes) of each file generated. Default is 16000000 (16 MB); the maximum allowed is 5000000000 (5 GB). For more information, see the Snowflake documentation.
    Include Headers | Boolean | When "True", writes column names as headers at the top of the unloaded files. Default is "False".
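
    The Storage Integration authentication option requires an integration that already exists in Snowflake. The following is a minimal sketch of creating one, with hypothetical names and a placeholder tenant ID; see CREATE STORAGE INTEGRATION for the full syntax:

        -- Sketch only: names and tenant ID are placeholders.
        CREATE STORAGE INTEGRATION my_azure_integration
          TYPE = EXTERNAL_STAGE
          STORAGE_PROVIDER = 'AZURE'
          ENABLED = TRUE
          AZURE_TENANT_ID = '<tenant-id>'
          STORAGE_ALLOWED_LOCATIONS = ('azure://mystorageaccount.blob.core.windows.net/mycontainer/');

    Running DESC STORAGE INTEGRATION my_azure_integration afterwards returns the consent URL used to grant Snowflake access to the storage account.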

    Copying Files to an Azure Premium Storage blob

    When copying files to an Azure Premium Storage blob, Matillion ETL may report the following error:

    Self-suppression not permitted.

    This is because, unlike standard Azure Storage, Azure Premium Storage does not support block blobs, append blobs, files, tables, or queues. Premium Storage supports only page blobs, which are sized in 512-byte increments.

    A page blob is a collection of 512-byte pages optimised for random read and write operations. All writes must therefore be 512-byte aligned, so any file whose size is not a multiple of 512 bytes will fail to write. For example, a 1,024-byte file (2 × 512) can be written, whereas a 1,000-byte file cannot.

    For additional information about Azure Storage blobs, we recommend consulting the Microsoft Azure documentation.


    Example

    In this example, we use a Jira Query component to load some data into a table named "doc_tbl". After some transformation, we wish to unload the data to an Azure Blob Storage container for long-term storage. To this end, we use the Azure Blob Storage Unload component in a job as shown below.

    The Azure Blob Storage Unload component is configured to take data from "doc_tbl" (as specified in the Target Table property) and unload it into the blob container specified in the "Azure Storage Location" property. Each file it creates will have the prefix "Jira_TABLE" in its name. A sketch of broadly equivalent SQL follows.
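
    For illustration, this configuration corresponds roughly to the following statement (not the component's exact SQL; the storage account, container, and integration names are hypothetical):

        COPY INTO 'azure://mystorageaccount.blob.core.windows.net/mycontainer/Jira_TABLE'
        FROM doc_tbl
        STORAGE_INTEGRATION = my_azure_integration
        FILE_FORMAT = (TYPE = CSV)
        HEADER = TRUE;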

    By exploring the Azure Portal, we can check the contents of the Blob Storage container and confirm that our data has landed there.
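
    Alternatively, the unloaded files can be verified from Snowflake itself by pointing an external stage at the same container and listing its contents. This is a sketch with hypothetical names:

        -- Sketch only: stage and integration names are placeholders.
        CREATE STAGE doc_unload_stage
          URL = 'azure://mystorageaccount.blob.core.windows.net/mycontainer/'
          STORAGE_INTEGRATION = my_azure_integration;

        LIST @doc_unload_stage PATTERN = '.*Jira_TABLE.*';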