View Data Health

Applies to Alation Cloud Service instances of Alation

Applies to customer-managed instances of Alation

Applies from version 2022.2

The Open Data Quality Initiative is an Alation innovation that provides a framework for tracking rule-based measures of data health, data reliability, and overall data quality within the Alation data catalog. Rules can be created manually with the Data Health API or generated automatically by any of several data quality software vendors, including Soda and BigEye. If Data Health has been enabled in your Alation data catalog, any table with an associated Data Health rule shows an active Health tab.

Enable Data Health

Data Health is not active by default and can be enabled by a Server Admin. We recommend enabling Data Health on all Alation instances as it provides a useful framework for automating consistent data health observations across your data.

To enable Data Health, toggle Enable Health Data on the Feature Configuration page of Administrator Settings.

View a Table with Health Information

Once Data Health has been enabled and at least one rule has been defined using the API, you can view health information for any table to which a rule applies. For example, we defined a rule specifying that the season_number field in the Episodes table of the IMDb schema contains only numeric data, along with similar rules checking that the episode title and parent TV show title contain string data. When we open the Episodes catalog page, the Health tab is active and the Health column shows the status of the rules applied to each column:

../../_images/DataHealth_TableView_HealthTab.png

These rules illustrate the three statuses a health rule can report: Good, Warning, and Alert. The Health tab shows the most severe status among the rules that apply to the table.
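
The "most severe status" behavior can be pictured with a short Python sketch. This is only an illustration, not Alation's implementation: the numeric ranking, the most_severe helper, and the sample column results are all assumptions made for the example.

```python
# Assumed ordering, least to most severe. The statuses come from this page;
# the numeric ranks are an assumption made for illustration.
SEVERITY = {"Good": 0, "Warning": 1, "Alert": 2}


def most_severe(statuses):
    """Return the most severe status in an iterable, or None if it is empty."""
    return max(statuses, key=SEVERITY.__getitem__, default=None)


# Hypothetical per-column rule results for the Episodes table.
column_status = {
    "season_number": "Alert",
    "episode_title": "Warning",
    "parent_title": "Good",
}

print(most_severe(column_status.values()))  # Alert
```

With these sample results the tab-level status is Alert, because one column rule reports Alert even though the others report Warning and Good.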

The Health column is also visible on the Columns tab:

../../_images/DataHealth_Episodes_ColumnsTab.png

Click on the Health tab to view the Data Health information for all active rules:

../../_images/DataHealth_HealthTab_Rules.png
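
As a rough picture of what a rule like the season_number check might compute before its result is reported, here is a minimal Python sketch. It does not call the Data Health API and does not reflect how Alation or any vendor evaluates rules: the check_numeric function, its thresholds, and the mapping from failure ratio to status are hypothetical.

```python
def check_numeric(values, warn_ratio=0.0, alert_ratio=0.05):
    """Classify a column by the share of values that are not digit strings.

    The thresholds are illustrative assumptions, not Alation defaults.
    "Numeric" here means a non-negative integer written in digits, which
    is enough for something like a season number.
    Returns a (status, failure_ratio) tuple.
    """
    values = list(values)
    if not values:
        return "Good", 0.0

    # Count values that are not made up entirely of digits.
    bad = sum(1 for v in values if not str(v).strip().isdigit())
    ratio = bad / len(values)

    if ratio > alert_ratio:
        return "Alert", ratio
    if ratio > warn_ratio:
        return "Warning", ratio
    return "Good", ratio


# Hypothetical sample of season_number values.
status, ratio = check_numeric(["1", "2", "3", "unknown"])
print(status, ratio)  # Alert 0.25
```

A real rule would be defined and reported through the Data Health API or a data quality vendor; the sketch only shows how a single check can produce one of the three statuses.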

View Data Health Propagated Via Lineage

Beginning with Alation version 2022.4, you can see Data Health information propagated through Lineage.

Downstream tables, BI data sources, and BI reports show Data Health information when any upstream object has Data Health issues. In that case, the Health tab is active and shows the status of the most severe Data Health rule affecting the object:

../../_images/DataHealth_DownstreamTable.png

Click the Health tab to view the Data Health information. Click the number beside Upstream Issues to view the Upstream Issues tab:

../../_images/DataHealth_UpstreamIssues2.png

This tab lists each upstream source that has Data Health issues, along with a summary of those issues, which may include upstream object deletion. Click the down arrow to the right of a summary to expand it and see the rules that are in place and the objects they apply to:

../../_images/DataHealth_UpstreamIssues_Expanded2.png
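
The propagation described in this section can be sketched as a walk over the lineage graph. The Python below is a simplified model under stated assumptions, not Alation's algorithm: the upstream_of and own_status dictionaries, the function names, and the sample objects are hypothetical, and upstream object deletion is not modeled.

```python
from collections import deque

# Assumed ordering, least to most severe.
SEVERITY = {"Good": 0, "Warning": 1, "Alert": 2}


def upstream_issues(obj, upstream_of, own_status):
    """Collect every upstream object of `obj` whose own status is not Good.

    upstream_of: maps each object to the objects it reads from directly.
    own_status:  maps each object to the worst status of its own rules;
                 objects missing from the map are treated as Good.
    """
    issues = {}
    seen = {obj}
    queue = deque(upstream_of.get(obj, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.add(current)
        status = own_status.get(current, "Good")
        if status != "Good":
            issues[current] = status
        queue.extend(upstream_of.get(current, []))
    return issues


def propagated_status(obj, upstream_of, own_status):
    """Worst status among the object's own rules and its upstream issues."""
    candidates = [own_status.get(obj, "Good")]
    candidates += upstream_issues(obj, upstream_of, own_status).values()
    return max(candidates, key=SEVERITY.__getitem__)


# Hypothetical lineage: report <- sales_summary <- orders, customers
upstream_of = {
    "report": ["sales_summary"],
    "sales_summary": ["orders", "customers"],
}
own_status = {"orders": "Alert", "customers": "Warning"}

print(upstream_issues("report", upstream_of, own_status))
# {'orders': 'Alert', 'customers': 'Warning'}
print(propagated_status("report", upstream_of, own_status))  # Alert
```

In this model the report has no rules of its own, yet it still surfaces Alert because one of its indirect upstream tables does. The issues dictionary plays the role of the Upstream Issues list.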

View Data Health in Search Results

From release 2023.1, you can see Data Health alerts and warnings in search results.

For example, consider the spi_matches table with the following Data Health rules defined:

../../_images/DataHealth_RulesDefined.png

If we search for “spi_matches”, we obtain the following results:

../../_images/DataHealth_SearchPropagation.png

The spi_matches table appears in the search results flagged with an Alert icon, because Alert is currently the most severe status among the table's rules. If the most severe status were Warning, the Warning icon would be displayed instead.