Launching a Matillion ETL HA Cluster via AWS

    Overview

    This page is a tutorial for creating a new Matillion ETL clustered instance using a CloudFormation Template.

    This tutorial uses Amazon Web Services (AWS) as the cloud provider and Snowflake as the cloud data warehouse. The same steps apply if Amazon Redshift is chosen as the cloud data warehouse.


    Selecting a Clustered Enterprise CloudFormation Template

    1. Log in to the Matillion Hub and choose the organization you want to work in.

    2. The Select your service page will be displayed. Click Add Matillion ETL instance.

    3. Select your cloud provider. In this tutorial, AWS is selected.

    4. Select your cloud data warehouse. In this tutorial, Snowflake is selected.

    5. On the How do you want to deliver Matillion ETL for <chosen cloud data warehouse> page, choose CloudFormation Template.

    6. On the In which region do you want to run Matillion [ETL]? page, choose your preferred AWS region.

    7. On the How do you want to deploy a Virtual Private Cloud (VPC)? page, choose your preferred option. For this tutorial, Deploy to an existing VPC in my AWS environment is selected. Choose Set up a new VPC if you want to create a new VPC and new subnets.

    8. On the Choose a Matillion [ETL] CloudFormation template page, choose Clustered Enterprise.

    9. On the Thank you. We will redirect you to AWS page, you can invite team members to assist with the configuration, or choose Continue in AWS to begin creating the AWS stack.

    You will then be redirected to the AWS console to finish the configuration of your HA cluster.


    Configuring your HA cluster in the AWS console

    This section of the tutorial focuses on the Quick create stack page in the AWS console, which opens after you have been redirected to AWS from the Matillion Hub.

    Template

    AWS automatically assigns a template URL and stack description to the new stack, based on the choices you made earlier in the Matillion Hub.

    Stack name

    AWS will supply a name for the stack, but you can edit this. Stack names can include letters (A-Z and a-z), numbers (0-9), and dashes (-). The stack name must be unique.
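
    The naming rules above can be checked before you type a name into the console. The following is a minimal sketch, not part of Matillion or AWS tooling; the function name is hypothetical. In addition to the character set listed above, it assumes the AWS CloudFormation rules that a stack name must start with a letter and be at most 128 characters long. Uniqueness within your account and region cannot be checked locally.

```python
import re

# Letters, digits, and hyphens, as listed above. The leading-letter and
# 128-character limits are additional AWS CloudFormation naming rules.
STACK_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,127}")


def is_valid_stack_name(name: str) -> bool:
    """Return True if the name satisfies the character and length rules."""
    return STACK_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical candidate names, for illustration only.
    for candidate in ("matillion-ha-cluster", "1-bad-start", "bad_underscore"):
        print(candidate, is_valid_stack_name(candidate))
```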

    Parameters

    Parameters are defined in your template and allow you to input custom values when you create or update a stack.

    Instance Configuration

    | Parameter | Description |
    | --- | --- |
    | Instance Type | Matillion instance size. Larger sizes allow more tasks to run concurrently. See https://www.matillion.com/pricing/ for more information. |

    Networking and Security Configuration

    | Parameter | Description |
    | --- | --- |
    | Keypair Name | The selected key pair will be added to the set of keys authorized for this instance. |
    | VPC Id | The VPC in which to create security groups. This must be the VPC containing the subnet(s). |
    | Primary Subnet | An existing public subnet to launch the Matillion EC2 instance(s) into. |
    | Secondary Subnet | A second existing public subnet to launch the Matillion EC2 instance(s) into. |
    | Private Subnets | Select two or more private subnets across multiple availability zones (AZs) for use by secondary resources, for example Postgres failover. |
    | Security Group | The security group to associate with the Matillion ETL instance(s). It should allow at least port 80 or 443, plus port 22 for SSH and port 5701 for clustering. |

    ALB Configuration

    | Parameter | Description |
    | --- | --- |
    | DNS Prefix | Load balancer DNS name prefix. Example: [matillion]-1731869672.eu-west-1.elb.amazonaws.com |
    | Security Group IPv4 CIDR | Inbound IPv4 CIDR range for the application load balancer. |

    RDS Repository Configuration

    | Parameter | Description |
    | --- | --- |
    | Master Username | Initial Postgres username. This user will have the admin role. |
    | Master Password | Postgres password. Must contain at least one uppercase letter, one lowercase letter, and one digit (0-9). Spaces, quotes, @, and slash characters cannot be used. |
    | Port | The TCP/IP port that the DB instance will use for application connections. |
    | Database Name | The Postgres database in which Matillion ETL will store its metadata repository. |
    | Instance Class | Database instance class size. |
    | Storage Size | The size of the database in gigabytes. |

    Matillion ETL Realm Configuration

    For help with setting up LDAP realm configuration, read [LDAP Integration](/docs/2819250).

    | Parameter | Description |
    | --- | --- |
    | Username | Connection username. Example: administrator@INTERNAL.DOMAIN.COM |
    | Connection Password | The password for the connection username used for the initial bind. |
    | URL | The URL of your directory server. Example: ldap://10.10.10.254:389 |
    | User Base | The subtree below which users are stored in the directory tree. Example: cn=Users,dc=INTERNAL,dc=domain,dc=com |
    | User Search | The LDAP attribute to use for identifying users. Example: sAMAccountName={0} |
    | Role Base | The subtree below which groups are stored in the directory tree. Example: cn=Groups,dc=INTERNAL,dc=domain,dc=com |
    | Role Name | The LDAP attribute used to identify a group or role. Example: cn |
    | Role Search | The LDAP attribute to use to identify groups or roles. Example: member={0} |
    | User Subtree | Sets the scope of the search. Select true to search the entire subtree rooted at the "User Base" entry. Selecting false (default) requests a single-level search of the top level only. |
    | Login Role | The name of an existing group in the directory server whose users will be allowed to log in. Role names are case-sensitive. |
    | Admin Role | The name of an existing group in the directory server whose users will be allowed to administer Matillion. Role names are case-sensitive. |
    | Project Admin Role | The name of an existing group in the directory server whose users will be allowed to administer Matillion Projects. Role names are case-sensitive. |
    | API Role | The name of an existing group in the directory server whose users will be allowed to use the Matillion ETL API. Role names are case-sensitive. |

    Other Parameters

    | Parameter | Description |
    | --- | --- |
    | MatillionProduct | Confirm the target data warehouse. This is auto-populated with the data warehouse you selected in the Matillion Hub. |
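
    The {0} in the User Search and Role Search examples is a placeholder. The sketch below illustrates how such patterns are typically expanded, assuming Matillion ETL follows the Apache Tomcat JNDIRealm convention, where {0} in the user search is replaced by the login name and {0} in the role search by the user's distinguished name. That convention is an assumption here, not something this page states. The function names and the user jsmith are hypothetical, and Python's str.format is used only because it happens to share the {0} syntax.

```python
# Example values copied from the parameter table above.
USER_SEARCH = "sAMAccountName={0}"
ROLE_SEARCH = "member={0}"
USER_BASE = "cn=Users,dc=INTERNAL,dc=domain,dc=com"

# Characters that must be escaped inside an LDAP filter value (RFC 4515).
_FILTER_ESCAPES = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}


def escape_filter_value(value: str) -> str:
    """Escape characters that have a special meaning in LDAP filters."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def expand_search(pattern: str, value: str) -> str:
    """Substitute the escaped value for the {0} placeholder in the pattern."""
    return pattern.format(escape_filter_value(value))


if __name__ == "__main__":
    login = "jsmith"  # hypothetical login name
    # Assumes, for illustration, that the user's cn equals the login name.
    user_dn = f"cn={login},{USER_BASE}"
    print(expand_search(USER_SEARCH, login))
    print(expand_search(ROLE_SEARCH, user_dn))
```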

    Once you have added all parameters, tick the I acknowledge that AWS CloudFormation might create IAM resources box.
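
    Two of the parameter rules can be checked locally before you create the stack: the Master Password character rules and the ports required on the Security Group. The sketch below is illustrative only. The function names are hypothetical, "quotes" is taken to mean both single and double quotes, and "slash" is taken to mean the forward slash. The set of open ports is something you would supply yourself after inspecting the security group.

```python
import string
from typing import List, Set

# Characters the Master Password description rules out (assumed reading:
# space, single quote, double quote, at sign, forward slash).
FORBIDDEN_PASSWORD_CHARS = set(" '\"@/")

# Ports named in the Security Group description.
REQUIRED_PORTS = {22: "SSH", 5701: "clustering"}
WEB_PORTS = {80, 443}  # at least one of these must be open


def password_problems(password: str) -> List[str]:
    """Return a list of rule violations; an empty list means the password passes."""
    problems = []
    if not any(ch in string.ascii_uppercase for ch in password):
        problems.append("needs an uppercase letter")
    if not any(ch in string.ascii_lowercase for ch in password):
        problems.append("needs a lowercase letter")
    if not any(ch in string.digits for ch in password):
        problems.append("needs a digit")
    forbidden = sorted(set(password) & FORBIDDEN_PASSWORD_CHARS)
    if forbidden:
        problems.append(f"contains forbidden characters: {forbidden}")
    return problems


def port_problems(open_ports: Set[int]) -> List[str]:
    """Return a list of missing-port messages for the given set of open ports."""
    problems = [
        f"port {port} ({use}) is not open"
        for port, use in REQUIRED_PORTS.items()
        if port not in open_ports
    ]
    if not (WEB_PORTS & open_ports):
        problems.append("neither port 80 nor port 443 is open")
    return problems


if __name__ == "__main__":
    print(password_problems("matillion1"))
    print(port_problems({22, 443}))
```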

    Click Create stack. If any fields fail validation, the console shows the details at the top of the Quick create stack page. Once the form is valid, the AWS console redirects you to your newly created stack and opens its Events tab.
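
    While the stack is being created, its overall status changes over time. The sketch below shows one way to interpret the standard CloudFormation stack status codes for a newly created stack. The function name and the three outcome labels are hypothetical. A rollback in progress is treated as a failure here because the creation will not succeed once a rollback has started.

```python
# Standard AWS CloudFormation stack status codes relevant to stack creation.
IN_PROGRESS_STATUSES = {"CREATE_IN_PROGRESS"}
SUCCESS_STATUSES = {"CREATE_COMPLETE"}
FAILURE_STATUSES = {
    "CREATE_FAILED",
    "ROLLBACK_IN_PROGRESS",
    "ROLLBACK_COMPLETE",
    "ROLLBACK_FAILED",
}


def creation_outcome(stack_status: str) -> str:
    """Classify a stack status as 'in progress', 'succeeded', 'failed', or 'unknown'."""
    if stack_status in SUCCESS_STATUSES:
        return "succeeded"
    if stack_status in FAILURE_STATUSES:
        return "failed"
    if stack_status in IN_PROGRESS_STATUSES:
        return "in progress"
    # Update and delete statuses are not expected right after creation.
    return "unknown"


if __name__ == "__main__":
    for status in ("CREATE_IN_PROGRESS", "CREATE_COMPLETE", "ROLLBACK_COMPLETE"):
        print(status, "->", creation_outcome(status))
```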


    Launching your new Matillion ETL HA cluster

    1. Click the Resources tab of your stack. Two EC2 instances should be visible.

    2. Click the Outputs tab of your stack and click the ALB URL. This URL will launch Matillion ETL in a new browser tab.

    3. Log in to Matillion ETL using your credentials.

    4. If you are a Matillion Hub customer, you are required to associate your Matillion ETL instance with the Matillion Hub. For more information, read Associating a Matillion ETL Instance.

    Note

    You only need to associate one instance in a cluster because both instances connect to a shared database.

    5. Create a new project. For more information, read Create Project.

    6. You can confirm that you are logged in to a clustered Matillion ETL instance by navigating to the lower-right panel and clicking Cluster Info. This tab is only available in clustered instances. It displays the address of each EC2 instance, its current state, how many jobs it has, and its creation timestamp.

    Note
    • User Configuration is not available in the Admin menu on a clustered instance.
    • Clicking Server Log in the Admin menu will not open the standard server log, but instead open AWS CloudWatch.
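
    The checks on the Resources and Outputs tabs described above can also be scripted. The sketch below works on plain dictionaries shaped like the responses of the boto3 CloudFormation calls describe_stacks and list_stack_resources; fetching them requires boto3 and AWS credentials and is not shown. The function names, the output key ALBURL, and the instance IDs are hypothetical, so check the Outputs tab of your own stack for the real key.

```python
from typing import Dict, List

# Hypothetical sample shaped like a describe_stacks response.
SAMPLE_DESCRIBE_STACKS = {
    "Stacks": [
        {
            "StackName": "matillion-ha-cluster",
            "StackStatus": "CREATE_COMPLETE",
            "Outputs": [
                {
                    "OutputKey": "ALBURL",  # hypothetical key name
                    "OutputValue": "http://matillion-1731869672.eu-west-1.elb.amazonaws.com",
                }
            ],
        }
    ]
}

# Hypothetical sample shaped like a list_stack_resources response.
SAMPLE_LIST_RESOURCES = {
    "StackResourceSummaries": [
        {"ResourceType": "AWS::EC2::Instance", "PhysicalResourceId": "i-0aaa1111"},
        {"ResourceType": "AWS::EC2::Instance", "PhysicalResourceId": "i-0bbb2222"},
        {"ResourceType": "AWS::RDS::DBInstance", "PhysicalResourceId": "matillion-db"},
    ]
}


def stack_outputs(describe_stacks_response: dict) -> Dict[str, str]:
    """Map each output key of the first stack in the response to its value."""
    stack = describe_stacks_response["Stacks"][0]
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


def ec2_instance_ids(list_resources_response: dict) -> List[str]:
    """Return the physical IDs of all EC2 instances listed in the response."""
    return [
        r["PhysicalResourceId"]
        for r in list_resources_response["StackResourceSummaries"]
        if r["ResourceType"] == "AWS::EC2::Instance"
    ]


if __name__ == "__main__":
    print(stack_outputs(SAMPLE_DESCRIBE_STACKS))
    print(ec2_instance_ids(SAMPLE_LIST_RESOURCES))
```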