Databricks Unity Catalog OCF Connector: Install and Configure¶
Prerequisites¶
The metastore must be assigned to a workspace with a running cluster or SQL warehouse.
Complex Data Type Extraction¶
The Databricks Unity Catalog OCF connector supports extraction of complex data types, such as arrays and structs. To enable their representation as a tree structure in the Alation user interface, make sure that the alation_conf parameter alation.feature_flags.enable_generic_nosql_support is set to True on your instance.
Additionally, you can use the parameter alation.feature_flags.docstore_tree_table_depth to define the depth of the display (default is three levels).
For details about using alation_conf, refer to Using alation_conf.
Important
After changing the values of these parameters, restart Alation Supervisor from the Alation shell:
alation_supervisor restart all
Network Connectivity¶
Open inbound TCP port 443 to the Databricks Unity Catalog server.
Create Service Account¶
Create a Databricks account-level user to be used as a service account for metadata extraction.
Assign the service account the USAGE and SELECT permissions on all Catalog, Schema, and Table objects that you want to catalog in Alation.
Assign the service account to the workspace using the information in Manage users, service principals, and groups. The service account must be assigned to the same workspace as the cluster or SQL warehouse.
Assign the service account the can use permissions on the cluster or SQL warehouse.
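The permission step above can be scripted by generating the GRANT statements and then running them in a Databricks SQL editor or notebook. The following is a minimal sketch, not an official procedure: the helper name `grant_statements`, the principal, and the object names are hypothetical, and it interprets the requirement as USAGE on catalogs and schemas plus SELECT on tables. The privilege names follow the text above; some Unity Catalog releases may name the usage privileges differently, so check the Databricks documentation for your workspace before running the output.

```python
def grant_statements(principal, catalog, schemas_to_tables):
    """Build GRANT statements for one catalog.

    principal         -- service account name (hypothetical example below)
    catalog           -- catalog name, assumed to be a simple identifier
    schemas_to_tables -- mapping of schema name to a list of table names
    """
    who = f"`{principal}`"  # backticks quote principals containing @ or -
    statements = [f"GRANT USAGE ON CATALOG {catalog} TO {who};"]
    for schema, tables in schemas_to_tables.items():
        statements.append(f"GRANT USAGE ON SCHEMA {catalog}.{schema} TO {who};")
        for table in tables:
            statements.append(
                f"GRANT SELECT ON TABLE {catalog}.{schema}.{table} TO {who};"
            )
    return statements


if __name__ == "__main__":
    # Hypothetical service account and objects, for illustration only.
    for stmt in grant_statements(
        "alation-svc@example.com", "main", {"sales": ["orders", "customers"]}
    ):
        print(stmt)
```

Generating the statements as text keeps the script free of any Databricks dependency and lets an administrator review them before they are executed.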
Authentication¶
The connector supports token-based authentication.
Generate a personal access token as described in Generate a personal access token.
JDBC URI¶
If you are using a Databricks cluster, get the JDBC URI as documented in Get connection details for a cluster.
If you are using a Databricks SQL warehouse (SQL endpoints), get the JDBC URI as documented in Get connection details for a SQL warehouse.
Note
Remove the jdbc: prefix from the JDBC URI.
Format¶
spark://<hostname>:443/default;transportMode=http;ssl=1;httpPath=<databricks_http_path_prefix>/<databricks_cluster_id>;AuthMech=3;
Example¶
spark://adb-58175503737864.5.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=/sql/1.0/endpoints/0f38f55be5cbd786;AuthMech=3;
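As an illustration of the format above, the following sketch removes the jdbc: prefix from a URI copied from Databricks and assembles a URI from a hostname and an HTTP path. The function names `strip_jdbc_prefix` and `build_uri` are hypothetical helpers, not part of Alation or Databricks, and the fixed properties are taken from the Format line above.

```python
JDBC_PREFIX = "jdbc:"


def strip_jdbc_prefix(uri: str) -> str:
    """Return the URI without a leading 'jdbc:' prefix, if one is present."""
    uri = uri.strip()
    if uri.lower().startswith(JDBC_PREFIX):
        return uri[len(JDBC_PREFIX):]
    return uri


def build_uri(hostname: str, http_path: str) -> str:
    """Assemble a URI in the format shown above.

    http_path is the HTTP path prefix plus the cluster or warehouse ID,
    for example /sql/1.0/endpoints/0f38f55be5cbd786.
    """
    return (
        f"spark://{hostname}:443/default;transportMode=http;ssl=1;"
        f"httpPath={http_path};AuthMech=3;"
    )


if __name__ == "__main__":
    print(strip_jdbc_prefix("jdbc:spark://example-host:443/default;AuthMech=3;"))
    print(build_uri("adb-58175503737864.5.azuredatabricks.net",
                    "/sql/1.0/endpoints/0f38f55be5cbd786"))
```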
Configuration in Alation¶
STEP 1: Install the Connector¶
Alation On-Prem¶
Important
Installation of OCF connectors requires Alation Connector Manager to be installed as a prerequisite.
To install an OCF connector:
If this has not been done on your instance, install the Alation Connector Manager: Install Alation Connector Manager.
Ensure that the OCF connector Zip file that you received from Alation is available on your local machine.
Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.
Alation Cloud Service¶
Note
On Alation Cloud Service instances, Alation Connector Manager is available by default.
Ensure that the OCF connector Zip file that you received from Alation is available on your local machine.
Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.
STEP 2: Create and Configure a New Data Source¶
In Alation, add a new data source:
Log in to Alation as a Server Admin.
Expand the Apps menu on the right of the main toolbar and select Sources.
On the Sources page, click +Add on the top right of the page and in the list that opens, click Data Source. This will open the Add a Data Source wizard.
On the first screen of the wizard, specify a name for your data source, assign additional Data Source Admins, if necessary, and click the Continue Setup button on the bottom. The Add a Data Source screen will open.
On the Add a Data Source screen, the only field you should populate is Database Type. From the Database Type dropdown, select the connector name. You will then be taken to the Settings page of your new data source.
The name of this connector is Databricks Unity Catalog OCF Connector.
Access¶
On the Access tab, set the data source visibility using these options:
Public Data Source—The data source will be visible to all users of the catalog.
Private Data Source—The data source will be visible to the users allowed access to the data source by Data Source Admins.
You can add new Data Source Admin users in the Data Source Admins section.
General Settings¶
Perform the configuration on the General Settings tab.
Application Settings¶
This section does not apply to Databricks Unity Catalog OCF Connector.
Connector Settings¶
Populate the data source connection information and save the values by clicking Save in this section.
Data Source Connection¶
Parameter | Description
---|---
JDBC URI | Specify the JDBC URI in the required format.
Username | For token-based authentication, use the value
Password | Paste the personal access token for the service account.
Logging Configuration¶
Select the logging level for the connector logs and save the values by clicking Save in this section. The available log levels are based on the Log4j framework.
Parameter | Description
---|---
Log level | Select the log level to generate logs. The available options are INFO, DEBUG, WARN, TRACE, ERROR, FATAL, ALL.
Obfuscate Literals¶
Skip this section as it’s not applicable to Databricks Unity Catalog data sources.
Test Connection¶
Under Test Connection, click Test to validate network connectivity.
If the connection test fails, make sure the JDBC URI and service account credentials are correct.
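When troubleshooting a failed test, it can help to check the URI and the network path separately before re-entering credentials. The sketch below is one possible pre-check, not part of the connector: `parse_uri`, `find_problems`, `port_is_open`, and the `REQUIRED_PROPERTIES` list are hypothetical and are derived only from the URI format documented above. The port check runs from the machine where the script is executed, so run it from a host with the same network path as the Alation server for the result to be meaningful.

```python
import socket
from urllib.parse import urlsplit

# Assumption: these are the properties that appear in the documented format.
REQUIRED_PROPERTIES = ("transportMode", "ssl", "httpPath", "AuthMech")


def parse_uri(uri):
    """Split the URI into scheme, host, port, and ';'-separated properties."""
    head, *rest = uri.strip().split(";")
    parts = urlsplit(head)
    props = dict(item.split("=", 1) for item in rest if "=" in item)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "properties": props,
    }


def find_problems(uri):
    """Return a list of human-readable problems; an empty list means none found."""
    if uri.strip().lower().startswith("jdbc:"):
        return ["remove the jdbc: prefix"]
    parsed = parse_uri(uri)
    problems = []
    if not parsed["host"]:
        problems.append("hostname is missing")
    if parsed["port"] != 443:
        problems.append("port is not 443")
    for key in REQUIRED_PROPERTIES:
        if key not in parsed["properties"]:
            problems.append(f"property {key} is missing")
    return problems


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    uri = ("spark://adb-58175503737864.5.azuredatabricks.net:443/default;"
           "transportMode=http;ssl=1;httpPath=/sql/1.0/endpoints/0f38f55be5cbd786;"
           "AuthMech=3;")
    print(find_problems(uri))
    parsed = parse_uri(uri)
    print(port_is_open(parsed["host"], parsed["port"]))
```

A clean result from both checks suggests that the remaining cause is more likely the credentials or the permissions of the service account than the URI or the network.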
Metadata Extraction¶
You can configure metadata extraction (MDE) for an OCF data source on the Metadata Extraction tab of the Settings page. For Databricks Unity Catalog data sources, Alation supports full and selective default MDE. Custom query-based MDE is not supported.
Refer to Configure Metadata Extraction for OCF Data Sources for information about the available configuration options.
Sampling and Profiling¶
Sampling and profiling, including dynamic profiling, are supported from connector version 1.0.2.3423, compatible with Alation version 2022.4 or newer.
For details, see Configure Sampling and Profiling for OCF Data Sources.
Query Log Ingestion¶
Not supported.
Compose¶
Compose is supported from connector version 1.0.2.3423, compatible with Alation version 2022.4 or newer.
For details about configuring the Compose tab of the Settings, refer to Configure Compose for OCF Data Sources.
Note
To establish a connection between Compose and Unity Catalog, Compose users need either their own personal access token or the token of the service account.
Incremental MDE from Compose¶
When users create tables and views in Compose, Alation triggers a background extraction job to make the new table and view objects available in the catalog. As a result, users will see the corresponding table or view metadata in the data catalog without re-running MDE on the Settings page of the data source.