Alation/Tableau Best Practices¶
Summary Alation integrates with and compliments Tableau’s existing data governance capabilities so that organizations can truly balance the demands of agility and compliance; empowering self-service analytics in any organization.
Key Use Cases¶
Discovery Alation also enables Tableau users to search for their data across highly distributed big data environments through the Alation Data Catalog and generate workbooks directly from Alation. Along with samples of every data set available across systems, this single point of reference provides detailed descriptions of the data and its uses stewarded by both expert curators and the users of the data. This increases the ease of access for self-service analysis of numerous database and file storage systems.
Governance Alation natively supports data governance processes in Tableau through Tableau Projects. Data stewards and data governance teams can use Alation to publish and automatically update the association of Tableau Data Sources with data governance approved Tableau Projects. This ensures a managed and consistent workflow for approving and updating data assets for use in Tableau.
Deep Impact Analysis Alation makes Tableau deeply data-pipeline aware by notifying Tableau data consumers by email when underlying changes in the source database or file structure has potentially affected the accuracy of their analysis. Consider a profit field in a Tableau Workbook. If a defining field in the database table or Hive on Hadoop structure is changed and the profit field is no longer valid, it will be flagged in Alation as a deprecated column. This notification is automatically pushed to all Tableau users who have used that column in a Data Source, Project or Workbook.
Documentation Alation can provide a full documentation engine that analysts, Tableau power users, data scientists, data stewards, etc can rely for up-to-date knowledge on the data across all data sources.
Discovery Use Case¶
This use case empowers an Alation user to find and search Tableau assets through Alation’s data catalog. For this example, the user is interested in finding a Tableau workbook to be used in a presentation about hospital revenue.
Need to quickly find and discover trusted and credible assets in Tableau.
May have multiple BI tools, such as Tableau and MicroStrategy, and need a way to surface which workbooks have relevant and trusted information.
Tableau Power Users
Two important points about Alation’s search:¶
Alation’s search is heterogeneous, which means that you can search across all assets cataloged in Alation (BI workbooks, articles, charts, queries, tables, columns, etc).
Alation uses natural language processing (NLP) so you can search across metadata as well as the information curated by a human.
Selecting one of the search results highlights a Tableau workbook that is cataloged in Alation. The workbook is endorsed which promotes trust to use this workbook. Here, the user can view usage of specific worksheets, relevant tags, and a thumbnail sketch of the Tableau workbook.
Additionally, the user can quickly view important and relevant information that can give him/her quick insight on whether the workbook is relevant for the presentation:
Fields, such as measures and dimensions, can be viewed to show how fields were calculated.
Lineage can also be viewed to provide quick impact analysis and determine if any issues from the data sources are propagating into the worksheet.
Additionally, Alation can be used to search and compare reports from multiple BI tools. Instead of the second search result, the user can select the first result that shows it is a MicroStrategy workbook. From analysis of fields, lineage, usage information, and the thumbnail sketch the user can determine that both workbooks (Tableau and MicroStrategy) are based on the same data. Because the usage in the MSTR workbook is slightly higher than in the Tableau workbook, the user may decide that the MSTR workbook is more applicable to his/her analysis:
Governance Use Case¶
This use case empowers an Alation user to govern and trust data in Tableau. With Alation, you no longer need to choose between compliance and or usage. Instead, you can promote best practices from Alation’s data catalog to your Tableau environment.
Target Audience: Data Stewards, BI/BA Analysts¶
In a common self-service analytics scenario a natural question arises: Which workbooks have accurate and up to date information? This question might be manageable when you have a few workbooks tied to a few projects, but this can quickly be a difficult problem when you have to govern thousands of workbooks tied to hundreds of projects. Alation has a deep integration with Tableau that empowers users harnessing Alation to provide governance for your Tableau environment.
Alation provides the ability for a data steward, BI/BA Analyst, Tableau power user to endorse Tableau workbooks that they know are credible. In this example, the user searches for a workbook called Hospital Revenue Analysis he/she would like to endorse:
Here is a screenshot of Tableau Server. The Hospital Revenue Analysis workbook is in the Medical project folder:
After the user finds the workbook then he/she can endorse it by clicking on the green stoplight in the upper left corner:
Because of the deep integration between Alation and Tableau, the workbook, originally in the Medical project folder, is now moved to the Medical-Alation Certified folder. Users in the organization can be trained to understand that the assets in Alation-Certified suffix represent trusted and credible workbooks that are validated by someone else in the organization:
Similarly, a user can also deprecate a workbook if he/she finds an issue with it. For example, if a user finds an issue with an underlying table that sources information into the Tableau workbook then he/she could issue a deprecation.
Because of the deprecation in Alation, the workbook, which was in the Medical-Alation Certified project folder is now moved back to the Medical project folder in Tableau Server.
With Alation and Tableau, users have access to data governance policies and best practices directly in the flow of their analysis, resulting in consistent compliance and broadly adopted best practices.
Impact Analysis Use Case¶
Issues in a data pipeline can pollute many data assets downstream, thus it is important to immediately identify which assets are affected and quickly rectify. Alation provides deep impact analysis that surfaces data quality issues that can impact your Tableau environment.
Target Audience: Data Stewards, BA/BI Analysts, Data Engineers¶
Imagine that I am a data engineer and I discovered an ETL or data quality issue with the hospital_gen_info table. Maybe the data in it is computed incorrectly or is out-of-date:
As a data engineer, I might not know anything about Tableau, but thanks to Alation, I don’t have to. I have to know about the underlying data and Alation can do the rest. I can search for hospital, and visit that table page in Alation Catalog. This time, I’ll click the red circle to Deprecate the table.
Back at the Workbook Lineage in this browser tab, when I reload, I can see the impact analysis of how this data pipeline issue affects other tables and ultimately, even workbooks in Tableau. I can follow the now-red line downstream and click the icon for the dashboard in the Hospital Revenue workbook to see this warning was automatically propagated from the upstream deprecation.
If the data steward determines that the upstream issue will affect the trustworthiness of the workbook, he/she can deprecate the workbook.
Because of the tight integration between Tableau and Alation, the deprecation automatically moves the Hospital Revenue workbook from Medical - Alation Certified project to the Medical project folder:
Best Practices to Setup Tableau Governance¶
Create a Tableau Project. Create and name a Tableau Project - Alation Certified. This will be a Project to which only the Alation application’s user-id can add items. Similarly create Projects called Alation Warned, Alation Deprecated and Alation Un-evaluated.
Set-up custom groups and permissions so that any table’s flags in Alation, which indicate Endorsements, Deprecations, and Warnings, can be edited only by a given table’s Data Stewards. Ensure that the Data Steward field in Alation can be edited only by the central Data Governance group
Create trustworthy sources using Alation Compose or a
CREATE VIEWstatement in any other SQL tool.
If using Alation Compose, write a SQL query joining the appropriate source tables and selecting the appropriate columns (combined with the appropriate functions) or
Use the Alation Data Catalog to find the relevant source tables and then construct and run a CREATE VIEW statement using another query tool. The Alation Data Catalog will automatically detect this new table during its next scheduled extraction
Publish the view as a Tableau Data Source. You can publish the view as an Alation Certified Project through the Alation UI.
Extract data. (Alation’s automatic extraction will produce a page in Catalog for the Tableau Data Source and parse its lineage all the way back through the underlying tables). Alation can be configured to automatically re-categorize Tableau Data Sources into different projects if issues are flagged with any underlying data.
Tableau authors can create workbooks using certified sources, a policy that can be enforced by IT. When a new workbook is created, Alation will extract it and its full lineage (including the Tableau Data Sources to which it’s connected). Stewards can verify the lineage and also make additional semantic checks (for instance, ensuring that the title and axis labels in each view are appropriate given its measures and dimensions). If everything checks out, they can use the Alation UI to transfer the workbook into the Alation Certified Project, where it will remain unless something is flagged upstream (as described above).
Documentation Use Case¶
Data curation is a new approach that gets organizations to where data documentation efforts alone could not. Data curation promotes reuse of the data knowledge that already exists & results in higher analyst productivity.
Users can harness Alation articles to provide a quick and insightful view of data.
Articles act as a living document where users can link to queries, Tableau dashboards, conversations, datasets, etc that provides a central location to all your data assets.
Best Practices for Tableau Adoption¶
Establish a where to get started portal, such as Confluence or Sharepoint, with applicable documentation.
Here are some suggested documentation items to include:
Growth Approach Statement
Alation is meant to be a social collaboration environment for metadata that is primarily driven by users. When users have the ability to create their own ways of working they are empowered and feel ownership, making them energized and part of a greater community. At the same time, no guidance and direction can negatively impact users, as they may feel lost about what they should do or how they should approach working with the environment. With that in mind, the growth approach behind Alation is a balanced approach between user-driven grass-roots growth and architecture driven standards, governance, and training. The goal should be to give users enough guidance so that they don’t get lost or cause inadvertent damage but at the same time feel empowered to create their own impact on the environment. The below references standards, governance, and training, that will be useful to users, but keep in mind that feedback from users should be used to further change these standards by either adding to them or removing from them.
Use Cases Document
Use documentation, meaning building articles, as a way to establish best practices. Build five Articles linking to Tableau dashboards.
Identify three individuals in other organizations and have them build out articles that link to a Tableau dashboard.
Have those three people identify three other individuals to build out five articles that link to a Tableau dashboard.
Data curation is a new approach that gets organizations to where data documentation efforts alone could not. Data curation promotes reuse of the data knowledge that already exists & results in higher analyst productivity. Collaboration comes more naturally. And the coverage and accuracy of data knowledge in the organization also increases.