Overview

Applies to: Alation Cloud Service instances and customer-managed instances of Alation

Available from release 2022.3.5

The OCF connector for Amazon S3 is developed by Alation and is available for download from Alation’s Customer Portal (Customer Portal homepage > Connectors).

Use this connector to catalog S3 buckets as a file system source in Alation.

The connector offers these capabilities:

  • Metadata extraction: Extracts and catalogs S3 buckets and their contents (folders and files). Users can then discover, search, browse, and curate these S3 objects as Alation objects in the Alation user interface.

  • Column or schema extraction: Extracts and catalogs the column headers found in semi-structured files. Currently supported formats are Parquet, CSV, PSV, and TSV. Users can search and curate the column headers cataloged from each file as column objects. Because it reads each file individually, column extraction can be time-consuming.

  • File sampling: Catalog users can initiate S3 file sampling on demand. Sampling retrieves randomly selected rows from the file, giving deeper insight into the file’s structure and data.
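As a rough illustration of what column extraction and file sampling involve for delimited files, the sketch below reads the header row of a CSV, PSV, or TSV file and draws a uniform random sample of its data rows. It is not the connector's implementation: the function names, the extension-to-delimiter mapping, and the use of reservoir sampling are all assumptions made for this example, and Parquet is left out because it needs a third-party reader.

```python
import csv
import io
import os
import random

# Assumed mapping from file extension to delimiter (hypothetical, for illustration).
DELIMITERS = {".csv": ",", ".psv": "|", ".tsv": "\t"}


def _reader(file_name, text):
    """Build a csv reader using the delimiter implied by the file extension."""
    suffix = os.path.splitext(file_name)[1].lower()
    if suffix not in DELIMITERS:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return csv.reader(io.StringIO(text), delimiter=DELIMITERS[suffix])


def extract_headers(file_name, text):
    """Return the first row of a delimited file as a list of column names."""
    return next(_reader(file_name, text), [])


def sample_rows(file_name, text, k, seed=None):
    """Return up to k data rows chosen uniformly at random (reservoir sampling)."""
    reader = _reader(file_name, text)
    next(reader, None)  # skip the header row
    rng = random.Random(seed)
    reservoir = []
    for index, row in enumerate(reader):
        if index < k:
            reservoir.append(row)
        else:
            # Replace an existing entry with probability k / (index + 1).
            j = rng.randint(0, index)
            if j < k:
                reservoir[j] = row
    return reservoir
```

Reservoir sampling is used here because it needs only one pass over the file and keeps at most k rows in memory, which matters when files are large.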

To retrieve the S3 bucket metadata, the connector reads from an Amazon S3 inventory. For more information about inventories, refer to Amazon S3 Inventory in the AWS documentation.

Note

When you run metadata extraction (MDE), the Amazon S3 OCF connector obtains the list of inventory files from the latest manifest.json file for the respective bucket. Manifest files are stored at the following location in the destination bucket:

destination-prefix/source-bucket/config-ID/YYYY-MM-DDTHH-MMZ/manifest.json

For more information about the manifest.json file, refer to Inventory manifest in the AWS documentation.
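To make the idea of the "latest" manifest concrete, the sketch below picks the most recent manifest.json from a list of object keys that follow the path pattern above. It only illustrates the selection logic and is not the connector's code: the function name and the example keys are hypothetical, and listing the keys from the destination bucket is assumed to happen elsewhere.

```python
import re
from datetime import datetime

# Matches the trailing ".../YYYY-MM-DDTHH-MMZ/manifest.json" part of a key.
MANIFEST_KEY = re.compile(r"/(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}Z)/manifest\.json$")


def latest_manifest_key(keys):
    """Return the manifest.json key with the newest timestamp folder, or None."""
    best_key, best_time = None, None
    for key in keys:
        match = MANIFEST_KEY.search(key)
        if match is None:
            continue  # not a manifest.json under a timestamp folder
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H-%MZ")
        if best_time is None or stamp > best_time:
            best_key, best_time = key, stamp
    return best_key


if __name__ == "__main__":
    # Hypothetical keys, shaped like the documented path pattern.
    example_keys = [
        "inv/data-bucket/cfg/2023-01-01T01-00Z/manifest.json",
        "inv/data-bucket/cfg/2023-01-02T01-00Z/manifest.json",
        "inv/data-bucket/cfg/2023-01-02T01-00Z/manifest.checksum",
    ]
    print(latest_manifest_key(example_keys))
```

Parsing the timestamp folder, rather than relying on the order in which keys are listed, keeps the selection correct even if the keys arrive unsorted.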

Team

The following administrators are required to install this connector:

  • Amazon S3 administrator:

    • Performs the required configuration in S3, such as configuring the inventory and providing access for Alation.

    • Creates an IAM user and roles to authorize read-only access to the data and inventory buckets.

    • Assists in collecting the necessary configuration information from AWS.

  • Alation Server Admin:

    • Installs the Alation Connector Manager if required.

    • Installs the connector.

    • Adds and configures the S3 file system source in Alation.
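For the S3 administrator's task of authorizing read-only access, the sketch below builds a generic read-only IAM policy document covering a data bucket and an inventory bucket. It is an assumption-laden example, not the connector's documented permission set: the function name and bucket names are hypothetical, and the exact actions the connector requires may differ, so check the connector's configuration instructions before applying any policy.

```python
import json


def read_only_policy(data_bucket, inventory_bucket):
    """Build a generic read-only S3 policy for two buckets (illustrative only)."""
    buckets = [data_bucket, inventory_bucket]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{b}" for b in buckets],
            },
            {
                # Object-level reads apply to the objects inside each bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{b}/*" for b in buckets],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket names.
    policy = read_only_policy("example-data-bucket", "example-inventory-bucket")
    print(json.dumps(policy, indent=2))
```

The policy contains no write or delete actions, which matches the read-only access described for the IAM user and roles.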

Scope

The table below shows the catalog features supported by this connector. For version support information, refer to Support Matrices.

Feature

Availability

Core Capabilities

  • Metadata extraction (MDE) via basic authentication (AWS Access Key and Secret Key)

  • Metadata extraction (MDE) via AWS IAM role and IAM user-based STS authentication

  • Column extraction

  • Search

  • Catalog page curation

  • Catalog sets

  • Propagation of trust flags

  • Popularity

Sampling

  • File sampling via basic authentication or SSO

  • File sampling via STS authentication

Extracted Metadata

  • Folders

  • Files

  • Columns