View Data Health
Applies from version 2022.2
The Open Data Quality Initiative is an Alation innovation that provides a framework for tracking rule-based measures of data health, data reliability, and overall data quality within the Alation data catalog. Rules can be manually generated using the Data Health API or automatically generated by any of several data quality software vendors, including Soda and BigEye. If Data Health information has been enabled in your Alation data catalog, any table with an associated Data Health rule will show an active Health tab.
Enable Data Health
Data Health is disabled by default; a Server Admin must enable it. We recommend enabling Data Health on all Alation instances because it provides a framework for automating consistent health observations across your data.
To enable Data Health, toggle Enable Health Data on the Feature Configuration page of Administrator Settings.
View a Table with Health Information
Once Data Health has been enabled and at least one rule has been defined using the API, you can view health information for any table to which a rule applies. For example, we defined a rule specifying that the season_number field in the Episodes table of the IMDb schema consists of only numeric data, along with similar rules checking that the episode title and parent TV show title consist of string data. When we open the Episodes catalog page, we see that the Health tab is active and the Health column shows the status of any rules applied to particular columns:
Our rules show the three statuses a health rule can report: Good, Warning, and Alert. The Health tab shows the most severe status among the rules that apply to the table.
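The "most severe status" behavior can be illustrated with a minimal Python sketch. This is not Alation's implementation: the `HealthStatus` enum, the `most_severe` helper, the column names `title` and `parent_title`, and the status assigned to each column are all assumptions made for illustration. Only the three status names and their relative severity come from this page.

```python
from enum import IntEnum


class HealthStatus(IntEnum):
    """The three rule statuses, ordered from least to most severe."""
    GOOD = 0
    WARNING = 1
    ALERT = 2


def most_severe(statuses):
    """Return the most severe status, or None when no rules apply."""
    return max(statuses, default=None)


# Hypothetical rule results for columns of the Episodes table.
episode_rules = {
    "season_number": HealthStatus.ALERT,
    "title": HealthStatus.GOOD,
    "parent_title": HealthStatus.WARNING,
}

# The status a table-level indicator would show under these assumptions.
tab_status = most_severe(episode_rules.values())
print(tab_status.name)
```

Ordering the statuses as integers keeps the aggregation to a single `max` call, and returning `None` for an empty input mirrors a table that has no rules and therefore no active Health tab.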
The Health column is also visible on the Columns tab:
Click on the Health tab to view the Data Health information for all active rules:
View Data Health Propagated Via Lineage
Beginning with Alation version 2022.4, you can see Data Health information propagated through Lineage.
If any upstream object has Data Health issues, that information also appears on downstream tables, BI data sources, and BI reports. On such an object, the Health tab is active and shows the status of the most severe Data Health rule impacting the object:
Click the Health tab to view the Data Health information. Click the number beside Upstream Issues to view the Upstream Issues tab:
Here you see the upstream source that has Data Health issues and a summary of those issues, which may include the deletion of an upstream object. Click the down arrow to the right of this summary to see an expanded view of the information, including the rules that are in place and the objects they apply to:
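The propagation described in this section can be pictured as a walk over the lineage graph. The Python sketch below is only a model of that idea, not how Alation computes it: the function names, the dictionary-based lineage representation, and the object names (`raw_episodes`, `episodes_clean`, `episodes_dashboard`) are hypothetical. It also assumes that an object's own rules and its upstream issues are combined by taking the most severe status, and it does not model upstream object deletion.

```python
from enum import IntEnum


class HealthStatus(IntEnum):
    """The three rule statuses, ordered from least to most severe."""
    GOOD = 0
    WARNING = 1
    ALERT = 2


def upstream_issues(obj, upstream_of, own_status):
    """Collect upstream objects whose own status is worse than GOOD."""
    issues = {}
    seen = {obj}
    stack = list(upstream_of.get(obj, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guard against revisiting shared or cyclic lineage
        seen.add(node)
        status = own_status.get(node)
        if status is not None and status > HealthStatus.GOOD:
            issues[node] = status
        stack.extend(upstream_of.get(node, ()))
    return issues


def propagated_status(obj, upstream_of, own_status):
    """Most severe status among the object's own rules and upstream issues."""
    candidates = list(upstream_issues(obj, upstream_of, own_status).values())
    if obj in own_status:
        candidates.append(own_status[obj])
    return max(candidates, default=None)


# Hypothetical lineage: each key lists its direct upstream objects.
upstream_of = {
    "episodes_clean": ["raw_episodes"],
    "episodes_dashboard": ["episodes_clean"],
}
# Hypothetical most-severe rule status recorded on each object itself.
own_status = {
    "raw_episodes": HealthStatus.ALERT,
    "episodes_clean": HealthStatus.GOOD,
}

issues = upstream_issues("episodes_dashboard", upstream_of, own_status)
print(len(issues), propagated_status("episodes_dashboard", upstream_of, own_status).name)
```

In this model, the length of `issues` plays the role of the number shown beside Upstream Issues, and an object with no rules and no unhealthy ancestors gets `None`, which corresponds to an inactive Health tab.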
View Data Health in Search Results
From release 2023.1, you can see Data Health alerts and warnings in search results.
For example, consider the spi_matches table with the following Data Health rules defined:
If we search for “spi_matches”, we obtain the following results:
The spi_matches table appears in the search results flagged with an Alert icon because Alert is currently the most severe status among the table's rules. If the most severe status were Warning, the Warning icon would be displayed instead.