Azure Blob Storage OCF Connector: Overview

Alation Cloud Service Applies to Alation Cloud Service instances of Alation

Customer Managed Applies to customer-managed instances of Alation

The OCF connector for Azure Blob Storage was developed by Alation and is available as a Zip file that can be uploaded and installed in the Alation application.

Note

The first released version of the connector (1.0.0.2428) is compatible with Alation release 2022.4. Newer connector versions may require newer Alation releases:

  • Version 2.0.0.2863 (compatible with Alation version 2023.1)—Supports schema extraction for the CSV, TSV, PSV, and Parquet files. Supports dynamic sampling with basic or OAuth authentication.

  • Version 2.2.0.4373 (compatible with Alation version 2023.1)—Improves schema extraction. Individual files that are not covered by a schema path pattern will be excluded from schema extraction.

To download the Azure Blob Storage OCF connector package, go to the Alation Connector Hub available from the Customer Portal. Go to Customer Portal > Connectors > Alation Connector Hub. Only Alation users with access to the Customer Portal can access the Alation Connector Hub. If you don’t have access to the Customer Portal, contact Alation Support.

Azure Blob Storage connector supports these storage account types:

  • Azure Blob Storage

  • Azure Data Lake Storage Gen 2

The connector should be used to catalog Azure Blob Storage as a file system source in Alation. The connector extracts metadata and logical schemas from data files stored on the account:

  • Metadata extraction—The connector will catalog Azure Blob Storage objects, such as containers, and the content of containers, such as directories and files. It enables users to discover, search, browse, and curate extracted Azure Blob Storage metadata objects in the Alation user interface.

  • Schema extraction—The connector will extract and catalog column headers from semi-structured file formats. Schema extraction is currently supported for CSV, TSV, PSV, and Parquet file formats.

Team

The following administrators are required to install this connector:

  • Alation Server Admin:

    • Ensures that Alation Connector Manager is installed and running or installs it.

    • Installs the connector.

    • Creates and configures the Azure Blob Storage file system source in the catalog.

    • Performs initial extraction and prepares the data source for Alation users.

  • Azure Blob Storage user with the administrator privileges:

    • Configures Azure Blob inventory.

    • Provides the access key or a shared access signature token.

    • Assists in configuring OAuth authentication for sampling.

Scope

The table below describes which metadata objects are extracted by this connector and which operations are supported.

Feature

Availability

Core Capabilities

Automated metadata extraction (MDE)

Yes

Custom query-based MDE

No

Column Extraction

Yes

(CSV, PSV, TSV, and Parquet file formats)

Search

Yes

Catalog page curation

Yes

Catalog sets

No

Propagation of trust flags

No

Popularity

No

Authentication

Access Key

Yes

Shared Access Signature

Yes

Azure Service Principal

Yes*

Azure AD (OAuth 2.0)

Supported for dynamic file sampling

SSL

No

LDAP

No

Technical Metadata

Files

Yes

Columns

Yes

Catalog Features

Dynamic file sampling

Yes

Query log ingestion

Not applicable

Compose

Not applicable

Lineage

Not applicable

* Azure service principal authentication is not supported for Dynamic File Sampling.