Monitor Component Health

Applies to: Alation Cloud Service instances and customer-managed instances of Alation

Alation monitors the health of its components to determine whether they are running and working correctly. The status of these functional checks is available to Server Admins in the Monitor section of the Admin Settings page.

Enable Health Checks

Note

Alation Cloud Service customers can request server configuration changes through Alation Support.

To enable health checks, you need sudo access to the host server.

  1. Use SSH to connect to the Alation server.

  2. Enter the Alation shell using the following command:

    sudo /etc/init.d/alation shell
    
  3. Set the feature flag by running the following command:

    /opt/alation/ops/actions/alationadmin/enable_datadog
    

Setting this flag makes the Health Checks tab visible in Alation.

View Alation Health Status

To check on the health of your Alation instance:

  1. Sign in to Alation as a Server Admin, and in the upper-right corner of the main toolbar, click the Admin Settings icon. The Admin Settings page will open.

  2. In the Monitor section, click Health Checks. The Health Checks tab will open.

Components Monitored

Alation monitors the performance of the following components:

Component | Health Check | Description
--- | --- | ---
Alation Analytics V2 (if enabled) | Postgres Connection* | Checks if a connection to Alation Analytics’ internal Postgres database can be established.
Alation Analytics V2 (if enabled) | RabbitMQ Connection | Checks if a connection to Alation Analytics’ RabbitMQ component can be established.
Connector | Response | Checks if Compose Connector is responding to requests.
Elasticsearch | Shards | Checks if all shards are active in Elasticsearch.
Elasticsearch | Connection | Checks if the agent can connect to Elasticsearch to collect metrics.
Postgres | Query Period | Checks the running time of queries (threshold: 60 min). A query that runs longer than 60 min may indicate a problem. If the threshold is exceeded, the check raises a warning.
Postgres | Connection | Checks if the connection to Postgres is successful.
TaskServer | Alive | Checks if the connection to TaskServer is alive.
Redis | Connection | Checks if a connection to Redis can be established.

* Alation Cloud Service instances on our cloud-native architecture don’t include the Alation Analytics Postgres health status, because in these instances Postgres is a fully managed relational database service.

For more details on Alation components, see Alation Architecture.
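
To make the checks above more concrete, here is a minimal Python sketch of two of them: a connection check and the Postgres Query Period threshold. It is not Alation's implementation. The function names, the status strings returned, and the use of a plain TCP connection are assumptions made for illustration; only the 60-minute threshold comes from the table above.

```python
import socket
from datetime import timedelta

# Threshold stated in the table above for the Postgres Query Period check.
QUERY_PERIOD_THRESHOLD = timedelta(minutes=60)


def query_period_status(longest_runtime: timedelta) -> str:
    """Hypothetical helper: warn when a query has run longer than the threshold."""
    if longest_runtime > QUERY_PERIOD_THRESHOLD:
        return "Warning"
    return "Success"


def connection_status(host: str, port: int, timeout: float = 3.0) -> str:
    """Hypothetical helper: report whether a TCP connection can be opened.

    Assumption: this only tests that the port accepts connections. A real
    health check would likely also authenticate and issue a trivial request.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "Success"
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return "Failure"
```

For example, `query_period_status(timedelta(minutes=75))` returns `"Warning"`, while a runtime of exactly 60 minutes does not exceed the threshold and returns `"Success"`. In a standard Postgres deployment, query runtimes could be read from the `pg_stat_activity` view (for instance, `now() - query_start`); whether Alation's check works this way is not stated here.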

Statuses

The Health Checks tab displays one of three statuses for each check, along with a clarifying status message:

  • Success: The component is performing correctly.

  • Warning: The component is running with issues. Refer to the warning message for details.

  • Failure: The component is reporting errors. Refer to the error message for details and troubleshooting clues.
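
The three statuses can be treated as an ordered severity scale. The sketch below shows one hypothetical way to summarize a set of check results, for example when scanning the tab for anything that needs follow-up. The `Status` enum, the function names, and the idea of an overall "worst" status are assumptions for illustration and are not described by Alation.

```python
from enum import IntEnum
from typing import List, Mapping


class Status(IntEnum):
    """The three statuses shown on the Health Checks tab, ordered by severity."""
    SUCCESS = 0
    WARNING = 1
    FAILURE = 2


def worst_status(results: Mapping[str, Status]) -> Status:
    """Return the most severe status among all checks (SUCCESS if there are none)."""
    return max(results.values(), default=Status.SUCCESS)


def needs_attention(results: Mapping[str, Status]) -> List[str]:
    """Return the names of checks that are not in the Success state, sorted by name."""
    return sorted(name for name, status in results.items() if status != Status.SUCCESS)
```

Given results such as `{"Postgres: Query Period": Status.WARNING, "Redis: Connection": Status.SUCCESS}`, `worst_status` returns `Status.WARNING` and `needs_attention` returns `["Postgres: Query Period"]`.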