Explore Lineage

Applies to: Alation Cloud Service instances and customer-managed instances of Alation

Lineage is data about the origin of data and its movement through an organization’s data ecosystem. Lineage documents how target data objects are created from source data objects. Lineage is visually represented as a chart on the Lineage tab of a data source, BI source, or file system. Lineage charts frequently include dataflow objects, which can be used to document:

  • ETL and ELT processes

  • Stored procedures

  • SQL queries

  • Scripts that transform source data into target data

The lineage chart brings together a target data object, its upstream sources, and the dataflow objects that track its movement, to fully represent the data ecosystem.
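One way to picture the chart described above is as a directed graph in which data objects and dataflow objects are nodes and each edge points from an upstream object to the object it feeds. The following is a minimal sketch under that assumption, not Alation's implementation; the object names and the `dataflow:` prefix are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical objects, for illustration only.
# Each edge points from an upstream object to the object it feeds.
edges = [
    ("raw.orders", "dataflow:nightly_etl"),
    ("raw.customers", "dataflow:nightly_etl"),
    ("dataflow:nightly_etl", "mart.customer_orders"),
    ("mart.customer_orders", "dataflow:report_query"),
    ("dataflow:report_query", "bi.sales_dashboard"),
]


def build_parents(edge_list):
    """Map each object to the set of objects directly upstream of it."""
    parents = defaultdict(set)
    for source, target in edge_list:
        parents[target].add(source)
    return parents


def upstream(target, edge_list):
    """Return every object reachable by walking edges backwards from target."""
    parents = build_parents(edge_list)
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def upstream_sources(target, edge_list):
    """Upstream data objects only, with dataflow objects filtered out."""
    return sorted(
        name for name in upstream(target, edge_list)
        if not name.startswith("dataflow:")
    )
```

Calling `upstream_sources("mart.customer_orders", edges)` returns the two raw tables, while `upstream("bi.sales_dashboard", edges)` also includes the dataflow objects that sit between the sources and the dashboard.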

Starting with version 2023.3, lineage can be displayed in either of two views: the classic view or the compound layout view. For more information, see Analyze the Lineage Chart.

Lineage Architecture

The lineage framework in Alation is built on Lineage V3, also called the lineage service, which was introduced in version 2021.4. The lineage service is a microservice that runs inside the Alation server and is responsible for creating, storing, and retrieving lineage data for the catalog.

The Alation server creates lineage data from multiple sources, such as metadata extraction (MDE), query log ingestion (QLI), Compose query history, and public APIs. Lineage events generated from these sources are sent to the lineage service via the Event Bus. In the lineage service:

  • The lineage write service consumes lineage events from the Event Bus and stores the lineage data in the lineage database.

  • The lineage read service retrieves the stored lineage data and powers the lineage diagrams in the Alation user interface.
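The write and read paths described above can be sketched as a small producer and consumer pipeline. This is an illustrative model only: the in-memory queue stands in for the Event Bus, the list stands in for the lineage database, and the function names and event fields are hypothetical rather than part of any Alation API.

```python
import queue

event_bus = queue.Queue()  # stand-in for the Event Bus
lineage_db = []            # stand-in for the lineage database


def publish(origin, source, target):
    """Emit a lineage event, as MDE, QLI, Compose, or an API call might."""
    event_bus.put({"origin": origin, "source": source, "target": target})


def write_service():
    """Consume all pending events and store them; return how many were stored."""
    stored = 0
    while True:
        try:
            event = event_bus.get_nowait()
        except queue.Empty:
            return stored
        lineage_db.append(event)
        stored += 1


def read_service(target):
    """Return the stored lineage events that feed the given target object."""
    return [event for event in lineage_db if event["target"] == target]
```

In this sketch, events published with `publish` are invisible to `read_service` until `write_service` has drained the queue, which mirrors the separation between the write side and the read side.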

The image below illustrates the lineage architecture for a customer-managed Alation instance.

../../_images/lineageV3_01.png

Types of Lineage

There are two main types of lineage: table-level and column-level. Table-level lineage is the more common of the two because all types of lineage extraction can produce it. Column-level lineage depends on both the data source and the data source connector, and it is calculated only for sources whose connectors support it. For a complete list of data sources that support column-level lineage, see the Support Matrix for your Alation version.
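To make the difference in granularity concrete, the sketch below represents column-level lineage as edges between individual columns and collapses them into table-level edges. It is a conceptual illustration with hypothetical table and column names and does not describe how Alation calculates either type.

```python
# Hypothetical column-level edges, written as (source, target) pairs
# of "table.column" names.
column_edges = [
    ("orders.amount", "sales_summary.total_amount"),
    ("orders.customer_id", "sales_summary.customer_id"),
    ("customers.id", "sales_summary.customer_id"),
]


def table_of(column_name):
    """Drop the trailing column part of a "table.column" name."""
    return column_name.rsplit(".", 1)[0]


def to_table_level(edges):
    """Collapse column-level edges into unique, sorted table-level edges."""
    return sorted({(table_of(source), table_of(target)) for source, target in edges})
```

Here three column-level edges collapse into two table-level edges, one from `orders` and one from `customers`, both pointing to `sales_summary`. The reverse is not possible: table-level edges alone do not say which columns are involved, which is why column-level lineage needs support from the connector.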

Both table-level and column-level lineage can be created automatically, manually, or through the API, as described in the following sections.

Automatic Lineage

Alation automatically calculates lineage using metadata sourced from metadata extraction (MDE), query log ingestion (QLI), and Compose queries. For most data sources, automatic lineage calculation requires query history data extracted and ingested with QLI. Lineage from Compose exposes only the data transformations performed through Compose. Some data sources, such as SAP HANA and Databricks Unity Catalog, support direct lineage extraction, in which lineage data is extracted from system tables during MDE.

Manual Lineage

Users can create and edit lineage charts manually in the Alation interface using the capabilities of the Manual Lineage feature. Learn more in Create Lineage Data Manually.

Creating Lineage via the API

Alation provides a public API to create and update lineage data in the data catalog. The Lineage API documentation can be found on the Developer Portal: Lineage APIs.

Enabling Column-Level Lineage

For most connectors that support column-level lineage, it is not calculated by default. You must first enable automatic extraction by setting a feature flag similar to the following on the Feature Configuration tab of Alation’s Admin Settings page:

../../_images/CLL_AutomaticExtract_FeatureFlags.png

If you still do not see column-level lineage, check with your Alation account manager to ensure that column-level lineage for the specified connector is part of your Alation license entitlement.