Databricks Unity Catalog OCF Connector: Install and Configure

Prerequisites

The metastore must be assigned to a workspace with a running cluster or SQL warehouse.

Complex Data Type Extraction

The Databricks Unity Catalog OCF connector supports extraction of complex data types, such as arrays and structs. To enable their representation as a tree structure in the Alation user interface, make sure that the alation_conf parameter alation.feature_flags.enable_generic_nosql_support is set to True on your instance.

Additionally, you can use the parameter alation.feature_flags.docstore_tree_table_depth to define the depth of the display (default is three levels). For details about using alation_conf, refer to Using alation_conf.
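For example, a minimal sketch of the corresponding commands, assuming the standard alation_conf syntax in which the -s flag sets a parameter value (the depth value of 3 shown here mirrors the default):

alation_conf alation.feature_flags.enable_generic_nosql_support -s True
alation_conf alation.feature_flags.docstore_tree_table_depth -s 3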

Important

After changing values of these parameters, restart Alation Supervisor from the Alation shell.

alation_supervisor restart all

Network Connectivity

Open outbound TCP port 443 from the Alation server to the Databricks Unity Catalog server.
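To confirm reachability independently of Alation, you can run a quick check from the Alation host. The hostname below is the example workspace hostname used in the JDBC URI section of this document, so substitute your own:

nc -vz adb-58175503737864.5.azuredatabricks.net 443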

Create Service Account

  • Create a Databricks account-level user to be used as a service account for metadata extraction.

  • Assign the service account the USAGE and SELECT permissions on all Catalog, Schema, and Table objects that you want to catalog in Alation.

  • Assign the service account to the workspace using the information in Manage users, service principals, and groups. The service account must be assigned to the same workspace as the cluster or SQL warehouse.

  • Assign the service account the Can Use permission on the cluster or SQL warehouse.

Authentication

The connector supports token-based authentication.

Generate a personal access token as described in Generate a personal access token.

JDBC URI

If you are using a Databricks cluster, get the JDBC URI as documented in Get connection details for a cluster.

If you are using a Databricks SQL warehouse (formerly SQL endpoint), get the JDBC URI as documented in Get connection details for a SQL warehouse.

Note

Remove the jdbc: prefix from the JDBC URI.

Format

spark://<hostname>:443/default;transportMode=http;ssl=1;httpPath=<databricks_http_path_prefix>/<databricks_cluster_id>;AuthMech=3;

Example

spark://adb-58175503737864.5.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=/sql/1.0/endpoints/0f38f55be5cbd786;AuthMech=3;
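If you copy the full URI out of Databricks, a minimal shell sketch of dropping the jdbc: prefix (reusing the example URI above; the FULL_URI variable name is purely illustrative):

FULL_URI='jdbc:spark://adb-58175503737864.5.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=/sql/1.0/endpoints/0f38f55be5cbd786;AuthMech=3;'
echo "${FULL_URI#jdbc:}"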

Configuration in Alation

STEP 1: Install the Connector

Alation On-Prem

Important

Installation of OCF connectors requires Alation Connector Manager to be installed as a prerequisite.

To install an OCF connector:

  1. If this has not been done on your instance, install the Alation Connector Manager: Install Alation Connector Manager.

  2. Ensure that the OCF connector Zip file that you received from Alation is available on your local machine.

  3. Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.

Alation Cloud Service

Note

On Alation Cloud Service instances, Alation Connector Manager is available by default.

  1. Ensure that the OCF connector Zip file that you received from Alation is available on your local machine.

  2. Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.

STEP 2: Create and Configure a New Data Source

In Alation, add a new data source:

  1. Log in to Alation as a Server Admin.

  2. Expand the Apps menu on the right of the main toolbar and select Sources.

  3. On the Sources page, click +Add on the top right of the page and in the list that opens, click Data Source. This will open the Add a Data Source wizard.

  4. On the first screen of the wizard, specify a name for your data source, assign additional Data Source Admins, if necessary, and click the Continue Setup button on the bottom. The Add a Data Source screen will open.

  5. On the Add a Data Source screen, the only field you need to populate is Database Type. From the Database Type dropdown, select the connector name. You will then be navigated to the Settings page of your new data source.

The name of this connector is Databricks Unity Catalog OCF Connector.

Access

On the Access tab, set the data source visibility using these options:

  • Public Data Source—The data source will be visible to all users of the catalog.

  • Private Data Source—The data source will be visible only to the users who are granted access to it by Data Source Admins.

You can add new Data Source Admin users in the Data Source Admins section.

General Settings

Perform the configuration on the General Settings tab.

Application Settings

This section does not apply to Databricks Unity Catalog OCF Connector.

Connector Settings

Populate the data source connection information and save the values by clicking Save in this section.

Data Source Connection

  • JDBC URI: Specify the JDBC URI in the required format.

  • Username: For token-based authentication, use the value token.

  • Password: Paste the personal access token for the service account.

Logging Configuration

Select the logging level for the connector logs and save the values by clicking Save in this section. The available log levels are based on the Log4j framework.

  • Log level: Select the severity level at which the connector generates logs. The available options are INFO, DEBUG, WARN, TRACE, ERROR, FATAL, and ALL.

Obfuscate Literals

Skip this section as it’s not applicable to Databricks Unity Catalog data sources.

Test Connection

Under Test Connection, click Test to validate network connectivity.

If the connection test fails, verify that the JDBC URI and the service account credentials are correct and that the Alation server can reach the Databricks host on port 443.
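Outside Alation, one way to narrow down whether a failure is network related or credential related is to call a Databricks REST endpoint directly with the service account's personal access token. This is a sketch only, assuming the workspace exposes the Unity Catalog REST API at /api/2.1/unity-catalog/catalogs, reusing the example hostname from the JDBC URI section, and assuming DATABRICKS_TOKEN holds the personal access token:

curl -s -H "Authorization: Bearer $DATABRICKS_TOKEN" https://adb-58175503737864.5.azuredatabricks.net/api/2.1/unity-catalog/catalogs

A JSON list of catalogs indicates that both the network path and the token are good; an HTTP 401 or 403 response points to the token or its permissions instead.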

Metadata Extraction

You can configure metadata extraction (MDE) for an OCF data source on the Metadata Extraction tab of the Settings page. For Databricks Unity Catalog data sources, Alation supports full and selective default MDE. Custom query-based MDE is not supported.

Refer to Configure Metadata Extraction for OCF Data Sources for information about the available configuration options.

Sampling and Profiling

Sampling and profiling, including dynamic profiling, are supported from connector version 1.0.2.3423, compatible with Alation version 2022.4 or newer.

For details, see Configure Sampling and Profiling for OCF Data Sources.

Query Log Ingestion

Not supported.

Compose

Compose is supported from connector version 1.0.2.3423, compatible with Alation version 2022.4 or newer.

For details about configuring the Compose tab of the Settings, refer to Configure Compose for OCF Data Sources.

Note

To establish a connection between Compose and Unity Catalog, Compose users need either their own personal access token or knowledge of the service account's token.

Incremental MDE from Compose

When users create tables and views in Compose, Alation triggers a background extraction job to make the new table and view objects available in the catalog. As a result, users will see the corresponding table or view metadata in the data catalog without re-running MDE on the Settings page of the data source.