Azure Databricks: Configure SSO through OAuth for Compose

Applies from version 2020.4

Alation supports OAuth 2.0 for Compose connections to Azure Databricks data sources with authorization via Azure Active Directory (AAD).

Note

OAuth 2.0 provides a secure authorization mechanism that lets applications access specific resources. Authorization is managed with access tokens that the authorization server issues to an application. The application can access the resources until the access token expires. When an access token has expired, the application can use a refresh token to obtain a new access token.
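The token lifecycle described in this note can be sketched as a small decision function. This is an illustrative sketch only, not Alation's implementation: the `TokenSet` class, the `next_action` function, and the returned action names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenSet:
    """Hypothetical container for tokens issued by an authorization server."""
    access_token: str
    expires_at: float                    # Unix timestamp at which the access token expires
    refresh_token: Optional[str] = None  # present only if a refresh token was issued


def next_action(tokens: TokenSet, now: float) -> str:
    """Decide what a client should do before accessing a protected resource."""
    if now < tokens.expires_at:
        return "use_access_token"
    if tokens.refresh_token:
        # An expired access token can be replaced by presenting the refresh token.
        return "refresh"
    # No refresh token available: the user has to authorize again.
    return "reauthorize"
```

For example, `next_action(TokenSet("a", 100.0, "r"), 150.0)` returns `"refresh"`, while the same call without a refresh token returns `"reauthorize"`.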

When OAuth is configured for an Azure Databricks data source in Alation, Compose users’ login credentials for Databricks are not stored on the Alation server. Compose users who want to establish a connection to the data source will be redirected to the Microsoft Azure login screen in a new browser tab. After login, Alation establishes a connection to the Databricks resources.
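To illustrate the redirect, the sketch below builds a generic OAuth 2.0 authorization code request URL from the standard parameters `client_id`, `response_type`, `redirect_uri`, and `state`. It is an assumption-laden illustration: the function name and the example values are hypothetical, and the exact set of parameters that Alation sends is not described in this document.

```python
from urllib.parse import urlencode


def build_authorization_url(authorization_endpoint: str, client_id: str,
                            redirect_uri: str, state: str) -> str:
    """Build a generic OAuth 2.0 authorization code request URL (illustrative only)."""
    params = {
        "client_id": client_id,
        "response_type": "code",        # authorization code flow
        "redirect_uri": redirect_uri,   # must match the URI registered for the app
        "state": state,                 # opaque value echoed back to the client
    }
    return f"{authorization_endpoint}?{urlencode(params)}"


# Hypothetical example values.
url = build_authorization_url(
    "https://login.example.com/tenant/oauth2/authorize",
    "my-client",
    "https://datacatalog.alation-test.com/api/datasource_auth/oauth/callback",
    "xyz",
)
```

The browser tab that opens for the user points at a URL of this general shape. After the user signs in, the authorization server sends the browser back to the redirect URI.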

The service account that is provided in the Databricks data source settings and that is used for metadata extraction (MDE), sampling and profiling, and query log ingestion (QLI) does not use OAuth to connect to Databricks. When configured, OAuth will apply to the following functionality:

  • Query execution in Compose

    • Run Full Query

    • Run Current Statement

    • Run Full Query as Script

    • Run Full Query & Ignore Errors

  • Scheduled query execution

  • Query forms

  • Excel Live Reports

  • Data upload

  • Dynamic profiling of tables and columns

Note

Enabling OAuth for a Databricks data source does not prevent you from using basic connections at the same time. OAuth-enabled Compose connections can coexist with basic connections.

Workflow

Enabling OAuth for a Databricks data source requires provisioning a service principal in Azure Active Directory. You will need to collect the following information from the service principal:

  • Client ID

  • Client secret value

  • Authorization endpoint

  • Token endpoint

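Follow this workflow:

  1. Create Service Principal in Azure Active Directory

  2. Create Databricks Users

  3. Configure OAuth for Compose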

Create Service Principal in Azure Active Directory

To provision a service principal in Azure Active Directory:

  1. Sign in to Azure Portal using your Azure account.

  2. Select Azure Active Directory.

  3. Go to App registrations and click New registration.

  4. In the Name field, specify a name for the application.

  5. Under Supported account types, select Single tenant.

  6. Under Redirect URI (optional), select Web as the app type and provide the Redirect URI in the format https://<your_Alation_URL>/api/datasource_auth/oauth/callback. For example, https://datacatalog.alation-test.com/api/datasource_auth/oauth/callback. Make sure there is no forward slash / at the end of the URI.

    ../../_images/OCF_AzureDB_RegisterApp.png
  7. Click Register to complete the app registration. The Overview page is displayed.

  8. Copy and save the Application (client) ID from the Essentials section on the Overview page. You’ll need it during the configuration in Alation.

  9. Go to the Endpoints tab. Copy the values of OAuth 2.0 authorization endpoint (v1) and OAuth 2.0 token endpoint (v1), and save them. Make sure that you copy version 1 (v1) of the authorization and the token endpoints.

    ../../_images/OCF_AzureDB_Endpoints.png
  10. From the main menu on the left, select API permissions.

  11. Click Add a permission and select APIs my organization uses.

  12. Search with the keyword AzureDatabricks and click the AzureDatabricks API. Make sure there is no space in the search keyword.

  13. Select the User Impersonation permission and click Add permissions.

    ../../_images/OCF_AzureDB_APIPermissions.png
  14. On the API permissions page, select Grant admin consent for <Account> and in the pop-up dialog that opens, click Yes.

  15. In the menu on the left, select Certificates & secrets.

  16. Click New client secret to add a new client secret. Specify a description, select an expiration period, and click Add.

  17. Copy the client secret value and save it in a safe location. You only have one chance to copy the client secret value as it won’t be displayed again after you close this page.

    ../../_images/OCF_AzureDB_Secret.png
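Two details in the steps above are easy to get wrong: the redirect URI must not end with a forward slash, and the endpoints must be version 1. The following sketch checks both. The function names are hypothetical. The version check assumes that Azure AD v2 endpoint URLs contain a `/v2.0/` path segment and v1 URLs do not, and the tenant name in the usage example is a placeholder.

```python
from urllib.parse import urlparse

# Callback path taken from the redirect URI format in step 6.
CALLBACK_PATH = "/api/datasource_auth/oauth/callback"


def build_redirect_uri(alation_base_url: str) -> str:
    """Build the redirect URI for an Alation base URL, without a trailing slash."""
    return alation_base_url.rstrip("/") + CALLBACK_PATH


def is_v1_endpoint(url: str) -> bool:
    """Return True if the endpoint URL has no /v2.0/ path segment (assumed v1)."""
    return "/v2.0/" not in urlparse(url).path
```

For example, `build_redirect_uri("https://datacatalog.alation-test.com/")` yields `https://datacatalog.alation-test.com/api/datasource_auth/oauth/callback`, and `is_v1_endpoint("https://login.microsoftonline.com/contoso-tenant/oauth2/v2.0/token")` returns `False`.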

Create Databricks Users

To connect to Azure Databricks in Compose, users must exist in Azure Databricks. Perform the following steps to add the users:

  1. In Azure Databricks, go to Admin Settings.

  2. Click Add User to add a new user and provide the user email in the dialog box.

  3. Select the Admin checkbox for the new user to enable the user to run queries in Compose.

    ../../_images/OCF_AzureDB_AddUser.png

    Note

    If the Admin permission cannot be granted, the Can Attach To permission should also allow making connections with OAuth.

Configure OAuth for Compose

Where you configure OAuth for Compose depends on the data source type:

  • For an OCF data source, on the Compose tab of the data source Settings page.

  • For a Custom DB data source, on the General Settings tab of the data source Settings page.

To enable OAuth in Compose for an Azure Databricks data source:

  1. In Alation, open the data source Settings page:

    • If it’s an OCF data source, go to the Compose tab.

    • If it’s a Custom DB data source, go to the General Settings tab.

  2. Under Compose Connections, modify the default connection or create a new one. To enable OAuth, add parameters Authmech=11;Auth_flow=0. For example:

    spark://adb-900788168547414.14.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/900788168547414/0322-161807-ag2hb23i;AuthMech=11;Auth_Flow=0;

  3. Under the OAuth Connection section, select the checkbox Enable OAuth 2.0 in Compose. This reveals several parameters for the OAuth setup.

    ../../_images/OCF_AzureDB_NewOAuthSettings.png
  4. Enter the values into the fields and click Save.

    • Client ID: Provide the client ID.

    • Client Secret: Provide the client secret value.

    • Request Refresh Token: Select the Request Refresh Token checkbox to enable requests for refresh tokens.

    • Enable PKCE: Leave as is. This setting does not apply to this data source type.

    • Authorization Endpoint: Provide the Authorization Endpoint.

    • Token Endpoints: Provide the Token Endpoint.

    • Default Scope: Leave this field blank.

    • Username Field/Claim: Provide the value unique_name.

    • JWT: Select this checkbox (required).

    • Access Token Parameter name: Provide Auth_AccessToken.

    • OAuth Enablers: Provide the value Authmech=11&Auth_flow=0. Make sure there is an ampersand symbol (&) between the parameters.

The screenshot below shows an example of a Compose tab configuration:

../../_images/OCF_AzureDB_ComposeTab.png
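The connection URI and the OAuth Enablers field carry the same two parameters but use different separators (a semicolon in the URI, an ampersand in the field). The sketch below parses a URI and checks for the enabling parameters. The function names are hypothetical. Parameter names are compared case-insensitively only because this page spells them both as Authmech and AuthMech; whether Alation or the driver treats them as case-sensitive is not stated here.

```python
def parse_uri_parameters(uri: str) -> dict:
    """Return the ;key=value parameters of a connection URI, with lowercased keys."""
    params = {}
    # The first segment is scheme://host:port/database; the rest are parameters.
    for part in uri.split(";")[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip().lower()] = value.strip()
    return params


def has_oauth_enablers(uri: str) -> bool:
    """Check that the URI contains AuthMech=11 and Auth_Flow=0."""
    params = parse_uri_parameters(uri)
    return params.get("authmech") == "11" and params.get("auth_flow") == "0"


def oauth_enablers_field(pairs) -> str:
    """Join (name, value) pairs with an ampersand, as the OAuth Enablers field expects."""
    return "&".join(f"{name}={value}" for name, value in pairs)


uri = ("spark://adb-900788168547414.14.azuredatabricks.net:443/default;"
       "transportMode=http;ssl=1;"
       "httpPath=sql/protocolv1/o/900788168547414/0322-161807-ag2hb23i;"
       "AuthMech=11;Auth_Flow=0;")
```

With the example URI from step 2, `has_oauth_enablers(uri)` returns `True`, and `oauth_enablers_field([("Authmech", "11"), ("Auth_flow", "0")])` returns `Authmech=11&Auth_flow=0`.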

Connect in Compose

After following the steps above, Compose users can connect via OAuth-enabled connections and run queries.

To connect to the database in Compose:

  1. Click Connection Settings to open the connection settings dialog.

    ../../_images/OCF_AzureDB_ConnectionSettings.png
  2. In the Connection Settings dialog that opens, select the OAuth-enabled connection.

    ../../_images/OCF_AzureDB_SelectConnection.png
  3. From the Connect as (Select User) list, select your user or click Add New (SSO login).

    ../../_images/OCF_AzureDB_SelectUser.png
  4. The Microsoft login page should open in a new tab. Authenticate with your Azure Databricks credentials.

    ../../_images/OCF_AzureDatabricks_SSO_Login.png

More information about the Connection Settings dialog can be found in Working with Data Source Connections.

Troubleshooting

If the OAuth configuration in Alation is incorrect or does not match the configuration of the authorization server, you will get authorization errors in Alation. The error message usually contains details about possible causes and may include troubleshooting tips.

The following are examples of error messages and their possible causes.

Error: Authorization terminated unexpectedly

This message is shown:

  • If the redirect browser window or tab is manually closed before authorization completes.

  • If the Client ID specified in the OAuth configuration is incorrect.

  • If the authorization endpoint is incorrect. The error page redirects to the provided URL if it exists, or returns a 404 error if no redirect URL is provided.

Error: The authorization server reported a failed authorization attempt: <message>

If authorization fails, the authorization server may include error details in the error message. Examples:

  • Invalid scope defined as Default Scope in the data source OAuth configuration.

  • Invalid scope defined as Refresh Scope in the data source OAuth configuration.

  • The authorization step is canceled by a UI control on an authorization, authentication, or consent screen.

  • The authorization step is canceled due to an authentication failure.

Error: Token request failed following successful authorization

In some cases, the request for tokens can fail after a successful authorization. This message is followed by further details.

Error: There was a problem extracting username information following successful authorization and token retrieval. Please check the OAuth settings for the data source

The causes of this error may be:

  • The OAuth configuration specifies that the username should be extracted as a JWT claim but:

    • The access token cannot be decoded as a JWT.

    • No JWT claim is specified in the OAuth configuration.

    • The specified claim does not exist in the token’s claim set.

  • The OAuth configuration specifies that the username should not be extracted as a JWT claim but:

    • No username field is specified in the OAuth configuration.

    • The specified field does not exist in the response returned from the authorization server upon token request.
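When investigating username extraction errors, it can help to inspect an access token's claim set by hand. The sketch below decodes the payload segment of a JWT and looks up a claim, defaulting to `unique_name` as configured above. It is a debugging aid with hypothetical function names, not Alation's implementation, and it does not verify the token signature.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def extract_username(token: str, claim: str = "unique_name") -> str:
    """Return the value of the username claim, or raise KeyError if it is absent."""
    claims = decode_jwt_payload(token)
    if claim not in claims:
        raise KeyError(f"claim {claim!r} not found; available claims: {sorted(claims)}")
    return claims[claim]
```

A `ValueError` from `decode_jwt_payload` corresponds to the case where the access token cannot be decoded as a JWT, and a `KeyError` from `extract_username` corresponds to the case where the specified claim does not exist in the claim set.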

Log Location

The log entries for OAuth authorization can be found in /opt/alation/site/logs/uwsgi.log (path inside the Alation shell).
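A simple way to narrow down the relevant entries is to filter the log by keyword. The sketch below is a minimal example: the function name is hypothetical, and the default keyword `oauth` is an assumption, since the format of the log entries is not described here.

```python
def find_oauth_log_lines(log_path: str, keyword: str = "oauth", limit: int = 50) -> list:
    """Return up to `limit` of the most recent log lines containing the keyword."""
    matches = []
    with open(log_path, "r", encoding="utf-8", errors="replace") as log_file:
        for line in log_file:
            # Case-insensitive substring match on each line.
            if keyword.lower() in line.lower():
                matches.append(line.rstrip("\n"))
    return matches[-limit:]
```

From inside the Alation shell, it could be called as `find_oauth_log_lines("/opt/alation/site/logs/uwsgi.log")`.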