Enable and Install Alation Analytics V2

Available from version 2020.3

The Alation Analytics V2 feature switch is present by default in Admin Settings > Labs or Feature Configuration (renamed in release 2021.1) on instances of Alation where the Alation Analytics V1 database has never been enabled.

Alation Analytics V2 components require installation.

Note

If Alation Analytics V1 has been previously enabled on your instance, the Alation Analytics V2 feature is not available by default. To simultaneously use V1 and V2, refer to Transition from Alation Analytics V1 to Alation Analytics V2 for guidelines on how to enable and maintain both versions.

The Alation Analytics V2 feature is disabled by default. After you enable it in Labs/Feature Configuration, proceed to install the Alation Analytics V2 components.

Note

Alation recommends installing Alation Analytics V2 on a Friday so that the initial ETL can run over the weekend. The ETL time depends on the amount of data in the internal application database. The initial ETL extracts the first 6 months of the Catalog data and may take longer than subsequent ETL jobs.

There are 2 installation options for Alation Analytics V2:

  • Installation on a separate server: recommended by Alation

    For instances using the HA Pair configuration, it is strongly recommended to install on a separate server. Installation on the same server is not recommended for the HA Pair. After the installation, enable, configure, and initialize the Alation Analytics V2 database on the Primary server of the HA Pair. The Secondary will inherit the configuration from the Primary.

    Note

    For an HA Pair, installation on a separate host ensures that the Alation Analytics V2 components remain functional after a failover to the Secondary. The alation_conf values that store the Alation Analytics V2 configuration match between the Primary and the Secondary, so Alation Analytics continues to work with the Secondary after failover without manual intervention.

  • Installation on the same server with the Alation Catalog

    You can choose this option if your host server can accommodate both the Alation application and Alation Analytics V2, and server resources and performance are not a concern. Note that both the internal server database and the Alation Analytics database grow over time. The host server should be able to support this growth.

Follow the steps below to install the Alation Analytics V2 components and add the Alation Analytics V2 data source to the Catalog.

Prerequisites

Complete the following prerequisites before installing Alation Analytics V2.

Step 1: Prepare the Host

Use these requirements to either prepare a separate machine or evaluate and prepare an existing Alation Catalog host.

Requirement

Description

OS

  • Linux with kernel version 3.10 or higher

  • OS distributions supported by Alation:

    OS Requirements

CPU

4 cores

Memory

  • 16 GB RAM minimum if installing on a separate server

  • 32 GB RAM minimum if installing Alation Analytics V2 on the same host as the Alation Catalog

Ports

If installing on a separate server, open ports 25432 and 5672.

See Step 3: Open Ports.

Software

Install additionally:

  • Docker Engine version 18.09.1 or higher

  • Docker Compose version 1.23 or higher

See Step 2: Install Docker.

Docker daemon

Make sure the Docker daemon is running.

Space for /var/lib/docker

Allocate a minimum of 32 GB disk space to the directory /var/lib/docker

Space for the installation directory

Release 2020.3

Installation directory /opt/alation-analytics has to be created for the Alation Analytics V2 installer.

Release 2020.4 and later

Alation Analytics V2 can be installed into a custom directory specified during installation.

Allocate enough disk space for the installation directory. The recommended size is 1.5-2 times the size of Rosemeta, because the Alation Analytics V2 database is almost a replica of it. The minimum is 32 GB.

See Check the Alation Application Database Size for how to find the size of Rosemeta.
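The OS, CPU, and memory rows of the table can be checked with a short script. The following is a minimal sketch, not an Alation tool: `check_host` is a hypothetical helper name, and the thresholds are the ones listed in the table above.

```shell
#!/usr/bin/env bash
# Sketch only: check_host is a hypothetical helper, not part of Alation.
# Thresholds (kernel 3.10, 4 cores, 16 or 32 GB RAM) come from the table above.

# check_host KERNEL CORES MEM_KB MIN_MEM_GB
# Prints one line per requirement; returns non-zero if any check fails.
check_host() {
  local kernel="$1" cores="$2" mem_kb="$3" min_mem_gb="$4"
  local major rest minor mem_gb status=0

  major="${kernel%%.*}"   # "3.10.0-1160.el7" -> "3"
  rest="${kernel#*.}"     # -> "10.0-1160.el7"
  minor="${rest%%.*}"     # -> "10"
  if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }; then
    echo "kernel: ok"
  else
    echo "kernel: older than 3.10"; status=1
  fi

  if [ "$cores" -ge 4 ]; then
    echo "cpu: ok"
  else
    echo "cpu: fewer than 4 cores"; status=1
  fi

  # Round up to whole GB: MemTotal is slightly below the installed RAM
  # because the kernel reserves some memory.
  mem_gb=$(( (mem_kb + 1048575) / 1048576 ))
  if [ "$mem_gb" -ge "$min_mem_gb" ]; then
    echo "memory: ok"
  else
    echo "memory: below ${min_mem_gb} GB"; status=1
  fi
  return "$status"
}

# Example call on the host (use 32 instead of 16 for a same-host installation):
# check_host "$(uname -r)" "$(nproc)" "$(awk '/MemTotal/ {print $2}' /proc/meminfo)" 16
```

The function takes its inputs as arguments so that the values can also be collected on one machine and evaluated on another.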

Check the Alation Application Database Size

To calculate the disk usage of your Rosemeta database, run one of the following commands on the Alation host server from inside the Alation shell (requires sudo permissions), depending on your PostgreSQL version:

du -sh /var/lib/pgsql/9.3/

or

du -sh /var/lib/pgsql/9.6/

Before you enable Alation Analytics V2, make sure that the free disk space is 1.5-2 times the size reported by this command.
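The sizing rule can be turned into a quick calculation. This is a rough sketch under stated assumptions: `space_needed_kb` and `has_enough_space` are hypothetical helper names, the estimate uses the upper bound of the recommendation (2 times the database size), and it never goes below the 32 GB minimum for the installation directory.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers that apply the sizing rule described above.

# space_needed_kb DB_KB
# Prints 2x the database size in KB, but never less than 32 GB.
space_needed_kb() {
  local db_kb="$1" min_kb=$((32 * 1024 * 1024))
  local need_kb=$((db_kb * 2))
  if [ "$need_kb" -lt "$min_kb" ]; then need_kb="$min_kb"; fi
  echo "$need_kb"
}

# has_enough_space DB_KB FREE_KB
# Succeeds when the free space covers the estimate.
has_enough_space() {
  [ "$2" -ge "$(space_needed_kb "$1")" ]
}

# Example (the du command runs inside the Alation shell; the df command runs
# on the host that will hold the installation directory):
# db_kb="$(sudo du -sk /var/lib/pgsql/9.6/ | cut -f1)"
# free_kb="$(df -Pk /opt/alation-analytics | awk 'NR==2 {print $4}')"
# has_enough_space "$db_kb" "$free_kb" && echo "enough space" || echo "not enough space"
```

When Alation Analytics V2 is installed on a separate server, the two measurements come from different machines, which is why the helpers accept plain numbers.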

Step 2: Install Docker

Note

Other containerization software, such as Kubernetes, OpenShift, or Podman, is not currently supported. Alation Analytics V2 can only be installed on Docker (version 18.09.1 or later).

Docker Engine

Releases 2021.2 and Newer

Use Alation Container Service to install Docker: Install Docker Using Alation Container Service

Releases 2020.3.x - 2021.1.x

Docker Engine is available for a variety of Linux platforms. Install Docker in your preferred way.

General recommendations: Install Docker for Alation Versions 2020.3.x - 2021.1.x

Examples of instructions per OS:

Docker Compose

Docker Compose relies on Docker Engine to work, so install it after you have installed Docker Engine.

Use either of the following methods to install Docker Compose:

Method 1

sudo curl -L "https://github.com/docker/compose/releases/download/1.26.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose

sudo chmod +x /usr/local/bin/docker-compose

# backwards compatibility
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose

Method 2 (requires pip to be installed on your server)

pip install docker-compose

Docker Daemon

After Docker Engine has been installed, the Docker daemon should start automatically. To check on the state of the Docker daemon:

sudo docker version

or

sudo systemctl status docker
# press q to exit
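Beyond confirming that the daemon is running, it can be useful to compare the installed versions with the minimums from the prerequisites (Docker Engine 18.09.1, Docker Compose 1.23). The sketch below assumes GNU `sort` with the `-V` option is available; `version_ge` and `meets_minimums` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers for comparing version strings.

# version_ge A B
# Succeeds when version A is greater than or equal to version B.
# Relies on version-aware sorting (GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# meets_minimums ENGINE_VERSION COMPOSE_VERSION
# Succeeds when both versions satisfy the documented minimums.
meets_minimums() {
  version_ge "$1" "18.09.1" && version_ge "$2" "1.23"
}

# Example on the host:
# engine="$(sudo docker version --format '{{.Server.Version}}')"
# compose="$(docker-compose version --short)"
# meets_minimums "$engine" "$compose" && echo "Docker versions OK"
```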

Step 3: Open Ports

This applies only if installing on a separate server.

On the Alation Analytics V2 host, open ports 25432 (PostgreSQL) and 5672 (message broker) in your firewall and allow the IP address of the main Alation server that runs the Catalog. Alation recommends allowing access only from the IP address of the machine that hosts the Alation Catalog and blocking all other IP addresses.
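How the ports are opened depends on the firewall in use. As one illustration, assuming the host uses firewalld, the helper below prints (but does not run) rich rules that admit a single source address on both ports. `print_firewall_rules` is a hypothetical name and `10.0.0.5` is a placeholder for the IP address of the Alation Catalog server.

```shell
#!/usr/bin/env bash
# Sketch only: assumes firewalld; other firewalls need different commands.

# print_firewall_rules ALATION_IP
# Prints firewalld commands that allow only ALATION_IP to reach
# ports 25432 and 5672. Nothing is executed.
print_firewall_rules() {
  local ip="$1" port
  for port in 25432 5672; do
    printf '%s\n' "sudo firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"${ip}\" port port=\"${port}\" protocol=\"tcp\" accept'"
  done
  printf '%s\n' "sudo firewall-cmd --reload"
}

# Example: review the output, then run the printed commands on the
# Alation Analytics V2 host.
# print_firewall_rules 10.0.0.5
```

Ports published by Docker containers can be governed by Docker's own iptables rules rather than by firewalld, so it is worth confirming the result from another machine after applying the rules.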

Step 4: Create an Installation Directory

Note

Please make sure that at least 32 GB of free space is available on the host for the Alation Analytics installation directory.

Release 2020.3

On the server where you are going to install Alation Analytics V2, create a new directory at /opt:

sudo mkdir -p /opt/alation-analytics

Release 2020.4 and Later

From release 2020.4, installation is not tied to the directory /opt/alation-analytics and can be anywhere on the host. You can skip this step as the installation directory can be created automatically during installation.

Alternatively, you can create a custom installation directory on the host beforehand. Note down the path as you will need to specify it during installation.

If no custom installation directory is specified during installation, Alation will attempt to locate the directory /opt/alation-analytics on the host and install into it. If this directory does not exist, installation cannot proceed until a path is specified.
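If you create a custom directory beforehand, you can check its free space against the 32 GB minimum at the same time. A minimal sketch: `free_gb_for` and `dir_has_space` are hypothetical helper names, and `/data/alation-analytics` is only an example path.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers; the example path below is a placeholder.

# free_gb_for PATH
# Prints the whole gigabytes available on the filesystem that holds PATH.
free_gb_for() {
  df -Pk "$1" | awk 'NR==2 {print int($4 / 1024 / 1024)}'
}

# dir_has_space PATH MIN_GB
# Succeeds when PATH is an existing directory with at least MIN_GB free.
dir_has_space() {
  [ -d "$1" ] && [ "$(free_gb_for "$1")" -ge "$2" ]
}

# Example on the Alation Analytics V2 host:
# sudo mkdir -p /data/alation-analytics
# dir_has_space /data/alation-analytics 32 && echo "ready" || echo "not enough free space"
```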

Installation

This requires the role of a Server Admin.

  1. Open a Terminal window and SSH to the host where you are going to install Alation Analytics V2.

  2. In your browser, sign in to the Alation Catalog and go to Admin Settings > Labs or Feature Configuration (renamed in 2021.1).

  3. Locate the switch for Alation Analytics V2, toggle it on, and save the changes.

    ../../_images/AAV2_01.png

    Note

    If Alation Analytics V1 has previously been enabled on your instance, the Alation Analytics V2 switch will not be available by default. See Transition from Alation Analytics V1 to Alation Analytics V2 for more details.

  4. After you save the changes, the Alation Analytics Settings page will open. This page contains instructions on how to proceed with the installation:

    ../../_images/AAV2_02.png
  5. First, download the Alation Analytics V2 package.

    • In 2020.3.x, you can only download this package locally. Click Download AA to download it to the current machine (373 MB).

    • Starting from 2020.4.x, you can either download the package locally or use a curl command to download it from the Terminal opened on the host. Copy the curl command and run it in the Terminal window of your SSH session to download the package directly to the host.

  6. Copy or move the downloaded package to the Alation Analytics V2 host that you have prepared:

    • In 2020.3, after downloading locally, move the Alation Analytics V2 package to the directory /opt/alation-analytics that you previously created on the host server.

    • In 2020.4 and newer releases, move it to any custom directory on the host, for example /tmp.

  7. Untar the downloaded Alation Analytics file and navigate into the un-tarred folder. One of the content items is a README file. Open this file and follow the instructions to install Alation Analytics V2.

    ../../_images/AAV2_22.png

    Note

    The installation logs can be found at /var/log/alation-analytics/installer.log (path inside the Alation shell).

  8. After the installation, return to the Alation Analytics Settings page to configure and initialize the Alation Analytics V2 database.

  9. To configure, click the link "click here to view conf values and change postgres password" (step 5 of the Instructions on this page). This opens the Alation Analytics Conf Values dialog:

    ../../_images/AAV2_03.png
  10. Specify the values and click Save:

    • RabbitMQ Host: IP address of the Alation Analytics server. The default value is 127.0.0.1, which stands for localhost. If you have installed on the same server as the Alation application, leave the default value. If you installed on a separate server, enter the IP address of the remote host.

    • Pgsql DB Host: IP address of the Alation Analytics server (the same IP address as for RabbitMQ Host). The default value is 127.0.0.1, which stands for localhost. If you have installed on the same server as the Alation application, leave the default value. If you installed on a separate server, enter the IP address of the remote host.

    • Pgsql DB Password: type the password you created during the installation for the PostgreSQL database.

  11. SSH to the Alation server, enter the Alation shell and restart the Alation web services:

    #to enter the shell:
    sudo /etc/init.d/alation shell
    #restart Web and Celery:
    alation_supervisor restart web:* celery:*
    
  12. Return to the Settings > Alation Analytics Settings page and initiate the Alation Analytics database by clicking the Initiate Analytics Database button. This action creates the database schema. After the Initiate Database job has run, the Alation Analytics V2 database becomes available as a data source in Alation. While the Initiate job is running, you can click the Refresh button to refresh its status.

    ../../_images/AAV2_23.png
  13. After the database has been initiated, the very first ETL is triggered automatically, as soon as the Initiate job completes, to load the data from the internal database. After the initial ETL, the next one will run automatically at night (default configuration) or can be triggered manually using a one-off script. Note that MDE for Alation Analytics V2 runs automatically after every ETL.

    Important

    For users to be able to view, query, or manage the Alation Analytics V2 data source, they must be granted access. A data source admin for the Alation Analytics V2 source can provide Viewer, Querier, or Admin access to other users on the Access tab of the data source settings. See User Access to Alation Analytics V2.

  14. The Alation Analytics V2 data source that becomes available in Alation comes with a prepackaged data dictionary with table and column descriptions, a number of sample queries linked to the Description field of the schema Public and a number of out-of-the-box articles referenced under the Relevant Articles for schema Public (grouped under the parent article Introduction to Alation Analytics). Please review these materials to familiarize yourself with the structure of the Alation Analytics V2 data source:

    ../../_images/AAV2_16.png
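For step 10 above, when Alation Analytics V2 runs on a separate host, a quick reachability test from the Alation server can confirm that the values entered for RabbitMQ Host and Pgsql DB Host point to open ports. The sketch below assumes bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command; `port_open` is a hypothetical helper name and `10.0.0.9` is a placeholder for the IP address of the remote host.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helper; requires bash and the timeout command.

# port_open HOST PORT
# Succeeds when a TCP connection to HOST:PORT opens within 3 seconds.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, run on the Alation server:
# for port in 25432 5672; do
#   port_open 10.0.0.9 "$port" && echo "port $port reachable" || echo "port $port NOT reachable"
# done
```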