Databricks on AWS OCF Connector: Overview¶
Alation Cloud Service Applies to Alation Cloud Service instances of Alation
Customer Managed Applies to customer-managed instances of Alation
The Databricks on AWS OCF Connector is developed by Alation.
To download the Databricks on AWS OCF connector package, go to the Alation Connector Hub available from the Customer Portal. Go to Customer Portal > Connectors > Alation Connector Hub. Only Alation users with access to the Customer Portal can access the Alation Connector Hub. If you don’t have access to the Customer Portal, contact Alation Support.
This connector should be used to catalog Databricks on AWS as a data source in Alation. The connector catalogs Databricks on AWS objects such as schemas, tables, columns and views. It enables end users to discover, search, browse and curate Databricks on AWS objects from the Alation user interface.
Team¶
The following administrators are required to install this connector:
Alation administrator
Installs the connector
Creates and configures the Databricks on AWS data source in the catalog.
Databricks on AWS administrator
Creates a service account for Alation
Provides the JDBC URI to access metadata
Provides access to public schemas to extract metadata
Assists in configuring QLI on Databricks
Provides AWS access key ID and secret to use in QLI configuration in Alation
Metastore Support¶
We have certified the AWS Databricks connector with Hive as the metastore. Please note that we do not certify external metastores, such as AWS Glue, Derby.
Scope¶
The table below shows what features are covered by the connector. For version support information, refer to Support Matrix.
Feature |
Scope |
Availability |
---|---|---|
Authentication |
||
Token-based authentication |
Authentication using Databricks personal access tokens |
Yes |
SSL authentication |
SSL Authentication |
No |
Kerberos |
Authentication with Kerberos |
No |
Keytab |
Authentication with Keytab |
No |
LDAP |
Authentication with the LDAP protocol |
No |
SSO |
Authentication with SSO |
No |
Metadata Extraction (MDE) |
||
Default MDE |
Extracts supported metadata objects based on Databricks JDBC driver methods in the connector code |
Yes |
Custom query-based MDE |
Extracts supported metadata objects based on extraction queries provided by user |
No |
Popularity |
Indicator of the popularity (intensity of use) of a data object, such as a table or a column |
Yes |
Extracted metadata objects |
||
Data Source |
Data source object in Alation that is parent to the extracted metadata |
Yes |
Schemas |
List of schemas |
Yes |
Tables |
List of tables |
Yes |
Columns |
List of columns |
Yes |
Column data types |
Column data types |
Yes |
Views |
List of views |
Yes |
Column source comments |
Coulumn source comments |
Yes |
Table source comments |
Table source comments |
No |
Primary keys |
Primary key information for extracted tables |
No |
Foreign keys |
Foreign key information for extracted tables |
No |
Functions |
Extract function metadata |
No |
Function definitions |
Extract function definition metadata |
No |
External tables |
Extract external tables |
Yes |
Delta tables |
Extract delta tables |
Yes |
Sampling and Profiling |
||
Table sampling |
Extracts data samples from all extracted tables |
Yes |
Column sampling |
Extracts data samples from all extracted columns |
Yes |
Deep column profiling |
On-demand profiling of specific columns with the calculation of value distribution stats |
Yes |
Dynamic profiling |
On-demand table and column profiling by individual users who use their own database accounts to retrieve the profiles |
Yes |
Custom query-based table sampling |
Ability to use custom queries for sampling specific tables |
Yes |
Custom query-based column sampling |
Ability to use custom queries for profiling specific columns |
Yes |
Query Log Ingestion (QLI) |
||
File-based QLI |
Ingestion of query history based on a file that contains query history data |
Yes |
Table-based QLI |
Ingestion of query history based on a table that contains query history data |
No |
Query-based QLI |
Ingestion of query history based on a custom query history extraction query |
No |
JOINs and filters |
Calculation of JOIN and filter information based on ingested query history |
Yes |
Predicates |
Ability to parse predicates in ingested queries |
Yes |
Lineage |
||
Automatic lineage generation |
Auto-calculation of lineage based on query history ingested from QLI, MDE, and Compose queries |
Yes |
Custom query-based column sampling |
Ability to use custom queries for profiling specific columns |
Yes |
Compose |
||
Customer-managed (on-prem) Alation instances |
Compose on on-prem Alation instances |
Yes |
Alation Cloud Service instances |
Depending on your network configuration, you may be using Alation Agent to connect to your data source. Compose via Agent is supported from connector version 2.1.0.4607. |
Yes |
Personal Access Token (PAT) authentication in Compose |
Authentication in Compose with username and password |
Yes |
SSO authentication in Compose |
Authentication in compose with SSO credentials |
Yes |