Databricks Azure OAuth for Compose

Applies from version 2020.4

Alation supports OAuth 2.0 for connections from Compose to Azure Databricks data sources with authorization in Azure Active Directory (AAD).

Note

OAuth 2.0 provides a secure mechanism for authorizing an application to access certain resources. This process is overseen by an Authorization Server. Authorization is managed with access tokens which are used by an application to access resources to which it is authorized according to the token until that token expires. When an access token has expired, a refresh token can be used to retrieve a new access token without re-authorization if the Authorization Server is configured to issue refresh tokens and a refresh token has been requested.
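
The token lifecycle described in this note can be sketched in a few lines. The example below is a minimal illustration, not Alation's implementation: it checks whether an access token has expired and builds the form-encoded body of a standard OAuth 2.0 refresh request (grant_type=refresh_token). The function names and all sample values are hypothetical.

```python
import time
from urllib.parse import parse_qs, urlencode


def is_expired(issued_at, expires_in, now=None):
    """Return True if a token issued at `issued_at` (epoch seconds)
    has outlived its `expires_in` lifetime in seconds."""
    current = time.time() if now is None else now
    return current >= issued_at + expires_in


def build_refresh_body(client_id, client_secret, refresh_token):
    """Build the form-encoded body of an OAuth 2.0 refresh token request."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })


# Placeholder values for illustration only.
body = build_refresh_body("example-client-id", "example-secret", "example-refresh-token")
fields = parse_qs(body)
```

A client would send this body to the Token Endpoint only when is_expired reports that the current access token is no longer valid.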

When OAuth is enabled for a Databricks data source in Alation, Compose users’ login credentials for Databricks are not stored on the Alation server. When Compose users establish a connection to the data source, they are redirected to the Databricks OAuth login screen in a new browser tab, where they authenticate with the authorization server. After that, the login screen tab closes, the tokens are stored, and Alation establishes the connection to the Databricks resources.
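
The redirect step above follows the general OAuth 2.0 authorization code pattern. As a rough sketch of what such a redirect URL looks like (this is generic OAuth 2.0, not a description of Alation's internal code), the example below assembles an authorization request from the Authorization Endpoint, the Client ID, and the Redirect URI. The endpoint host, the client ID, and the function name are placeholders.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(authorization_endpoint, client_id, redirect_uri, state):
    """Assemble an OAuth 2.0 authorization code request URL."""
    query = urlencode({
        "response_type": "code",   # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,            # opaque value echoed back to detect forged callbacks
    })
    # Append with "&" if the endpoint already carries a query string.
    separator = "&" if urlsplit(authorization_endpoint).query else "?"
    return f"{authorization_endpoint}{separator}{query}"


# Placeholder endpoint and client ID; real values come from the Azure portal.
example_url = build_authorization_url(
    "https://login.example.com/example-tenant/oauth2/authorize",
    "example-client-id",
    "http://localhost:8000/api/datasource_auth/oauth/callback",
    secrets.token_urlsafe(16),
)
example_query = parse_qs(urlsplit(example_url).query)
```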

The following functionality is supported with OAuth:

  • Query execution in Compose

  • Scheduled query execution

  • Query forms

  • Excel Live Reports

  • Data upload

  • Dynamic Profiling

Databricks Service Account

The service account that is set up in Alation during the Databricks data source configuration and that is used for MDE, Profiling, and QLI does not use OAuth to connect to Databricks and is not affected.

No Enforcement of OAuth

Enabling OAuth for a Databricks data source is optional and is not enforced. To enable connections with OAuth, first provide the OAuth configuration on the data source Settings > General Settings tab, then include the Authmech=11;Auth_flow=0 parameters in the Compose connection URI.
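
As an illustration of the URI change, the sketch below appends the two parameters to a semicolon-delimited connection URI and skips any that are already present. The helper name and the sample URI are hypothetical placeholders; use the URI format from your own data source settings.

```python
# The two parameters named in this section, in the order they are documented.
OAUTH_PARAMS = (("Authmech", "11"), ("Auth_flow", "0"))


def add_oauth_params(jdbc_uri):
    """Append the OAuth parameters to a semicolon-delimited URI, without duplicates."""
    parts = [part for part in jdbc_uri.split(";") if part]
    # Keys of parameters already present, compared case-insensitively.
    existing = {part.split("=", 1)[0].strip().lower() for part in parts[1:]}
    for key, value in OAUTH_PARAMS:
        if key.lower() not in existing:
            parts.append(f"{key}={value}")
    return ";".join(parts)


# Placeholder URI for illustration only.
sample_uri = add_oauth_params("spark://example-host:443/default;ssl=1")
```

Running the helper twice on the same URI leaves it unchanged, which makes it safe to apply to URIs that may already carry the parameters.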

Compose users can still connect by providing their Azure Databricks credentials directly. In that case, their credentials are stored within Alation for reuse.

Required Information

The following information is required from the Microsoft Azure portal to configure OAuth for Databricks in Alation:

  • Client ID

  • Client Secret

  • Authorization Endpoint

  • Token Endpoint

  • JDBC URI

  • Username/Claim

Configuration in Azure Portal

Note

Refer to Azure Databricks Documentation for the latest version of Service Principal configuration and User Creation information.

Create a Service Principal in Azure Portal

Perform the following steps to provision a Service Principal in Azure Portal:

  1. Sign in to Azure Portal using your Azure account.

  2. Select Azure Active Directory > App registrations > New registration.

  3. Provide a name for the app:

    ../../_images/DatabricksOAuth_01.png
  4. Select Single Tenant under Supported account types:

    ../../_images/DatabricksOAuth_02.png
  5. Under Redirect URI, select Web as the app type and provide the Redirect URI as http://localhost:8000/api/datasource_auth/oauth/callback. Make sure there is no trailing forward slash (/) at the end of the URI. If the Redirect URI host is not localhost, the URI must use HTTPS:

    ../../_images/DatabricksOAuth_03.png
  6. Click Register to complete the app registration. The Overview page is displayed.

  7. Select API permissions > Add a permission > APIs my organization uses:

    ../../_images/DatabricksOAuth_04.png
  8. Search for the keyword AzureDatabricks and select the AzureDatabricks app. Make sure there is no space in the search keyword:

    ../../_images/DatabricksOAuth_05.png
  9. Select the User Impersonation permission and click Add permissions:

    ../../_images/DatabricksOAuth_06.png
  10. Select Grant admin consent for <Account> (Default Directory) and click Yes:

    ../../_images/DatabricksOAuth_07.png
  11. Select Certificates & secrets > New client secret to add a new client secret. Provide a description, select Never for the expiry of the client secret, and click Add:

    ../../_images/DatabricksOAuth_08.png
  12. Copy the client secret and save it in another location. You have only one chance to copy the client secret.

    ../../_images/DatabricksOAuth_09.png
  13. Go to Overview to get the Client ID.

    ../../_images/DatabricksOAuth_10.png
  14. Go to the Endpoints tab. Copy the Authorization Endpoint and the Token Endpoint and save them in another location. Make sure that you copy version 1 of the Authorization Endpoint and the Token Endpoint:

    ../../_images/DatabricksOAuth_11.png
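
Two of the steps above carry rules that are easy to get wrong: the Redirect URI must not end with a slash and must use HTTPS unless the host is localhost, and the endpoints must be version 1. The sketch below checks both. It is only an illustration with hypothetical function names, and the version check assumes the common Azure URL pattern in which version 2 endpoints contain a /v2.0/ path segment.

```python
from urllib.parse import urlsplit


def redirect_uri_problems(uri):
    """Return a list of rule violations for a Redirect URI (empty if none)."""
    problems = []
    parts = urlsplit(uri)
    if uri.endswith("/"):
        problems.append("URI must not end with a forward slash")
    if parts.hostname != "localhost" and parts.scheme != "https":
        problems.append("non-localhost URI must use HTTPS")
    return problems


def looks_like_v1_endpoint(url):
    """Assumption: version 2 endpoints contain a '/v2.0/' path segment."""
    return "/v2.0/" not in urlsplit(url).path


# The Redirect URI from step 5 passes both rules.
ok = redirect_uri_problems("http://localhost:8000/api/datasource_auth/oauth/callback")
# A made-up host over plain HTTP with a trailing slash breaks both rules.
bad = redirect_uri_problems("http://alation.example.com/api/datasource_auth/oauth/callback/")
```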

User Creation

To execute queries in Compose, the user details must be added to the Admin Console in the Azure portal. Perform the following steps to add or modify the user details:

  1. On the Azure Databricks cluster page, go to Admin Console.

    ../../_images/DatabricksOAuth_12.png
  2. Click Add User to add a new user to the console and provide the user email in the dialog box.

    ../../_images/DatabricksOAuth_13.png
  3. Select the Admin checkbox next to the new user to enable the user to write queries in Compose.

    ../../_images/DatabricksOAuth_14.png

Set Up Databricks Azure Custom DB

Refer to Databricks Azure to set up Databricks Azure as a Custom DB in Alation.

Enable OAuth in Alation

Perform the following configuration on the Settings > General Settings page of a Databricks Azure Custom DB source.

../../_images/DatabricksOAuth_15.png

To enable OAuth for Databricks Azure:

  1. Open the data source Settings > General Settings tab, scroll down to the Compose Connections section, and locate the OAuth Connection section.

  2. Modify the JDBC URI. To enable OAuth, add Authmech=11;Auth_flow=0 to the end of the JDBC URI.

  3. Select the checkbox Enable OAuth for all Compose Connections. This reveals several parameters for the OAuth setup. Provide the values for the fields based on the description in the table below:

    Field

    Value

    Client ID

    Provide the Client ID. Refer to step 13 in section Create a Service Principal in Azure Portal.

    Client Secret

    Provide the Client Secret. Refer to steps 11 and 12 in section Create a Service Principal in Azure Portal.

    Enable Refresh Token

    Select the Enable Refresh Token checkbox.

    Enable PKCE

    Does not apply to this data source type.

    Authorization Endpoint

    Provide the Authorization Endpoint. Refer to step 14 in section Create a Service Principal in Azure Portal.

    Token Endpoints

    Provide the Token Endpoint. Refer to step 14 in section Create a Service Principal in Azure Portal.

    Default Scope

    Leave this field blank. The default value is applied automatically.

    Username/Claim

    Provide email and select the JWT checkbox. You can configure a different claim in JSON Web Token.

    Access Token Parameter name

    Provide Auth_AccessToken. You can also leave this field blank because this value is used by default.

    OAuth Enablers

    Provide Authmech=11&Auth_flow=0. Make sure there is an ampersand symbol (&) between the parameters.

  4. Click Save to enable OAuth for your Azure Databricks source.
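
Before saving, it can help to review the collected values in one place. The sketch below models the fields from the table as a plain dictionary and flags the two mistakes this section warns about: empty required fields and OAuth Enablers that are not separated by an ampersand. The dictionary keys and the function name are hypothetical and do not correspond to any Alation API.

```python
REQUIRED_FIELDS = (
    "client_id",
    "client_secret",
    "authorization_endpoint",
    "token_endpoint",
    "username_claim",
)


def check_oauth_settings(settings):
    """Return a list of problems found in a dictionary of OAuth field values."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not settings.get(name)]
    enablers = settings.get("oauth_enablers", "")
    # The OAuth Enablers field uses "&", unlike the JDBC URI, which uses ";".
    if ";" in enablers or "&" not in enablers:
        problems.append("OAuth Enablers must be ampersand-separated")
    return problems


# Placeholder values for illustration only.
example_settings = {
    "client_id": "example-client-id",
    "client_secret": "example-secret",
    "authorization_endpoint": "https://login.example.com/example-tenant/oauth2/authorize",
    "token_endpoint": "https://login.example.com/example-tenant/oauth2/token",
    "username_claim": "email",
    "oauth_enablers": "Authmech=11&Auth_flow=0",
}
example_problems = check_oauth_settings(example_settings)
```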

Connect in Compose

Compose users can connect using the OAuth-enabled connection and run queries. When you run a query, you are redirected to provide your user credentials for authorization so that the query run can finish.

In Compose:

  1. Select a connection for which OAuth is enabled:

    ../../_images/DatabricksOAuth_16.png
  2. Write the query and click Run.

    ../../_images/DatabricksOAuth_17.png
  3. Click the link Click here to authorize access before connecting.

    ../../_images/DatabricksOAuth_18.png
  4. Provide your user name and password to authenticate.

    ../../_images/DatabricksOAuth_19.png

The following query execution features are supported in Compose using OAuth connections:

  • Run Full Query

  • Run Current Statement

  • Run Full Query as Script

  • Run Full Query & Ignore Errors

  • Schedule Queries

  • Live Excel

  • Data Upload

  • Query Forms

  • Dynamic Profiling - Both table profiling and column profiling

Troubleshooting

If the OAuth configuration in Alation is incorrect and does not match the Authorization Server configuration, you may get authorization errors in Alation. The error usually contains details about possible causes and may include troubleshooting tips.

Please refer to the table below for message examples.

Error

Description

Authorization terminated unexpectedly

This message is shown in the following cases:

  • The redirect browser window or tab is manually closed before authorization completes.

  • The Client ID specified in the OAuth configuration may be incorrect. This initially causes an error on the SSO page, and this message is displayed when that page is closed.

  • The authorization endpoint may be incorrect. In this case, the browser is redirected to the URL provided if it exists, or a 404 error is returned if no Redirect URL is provided.

The authorization server reported a failed authorization attempt:

If authorization fails, the authorization server may respond with error details, which are listed after this error message. Examples:

  • Invalid scope defined as Default Scope in the data source OAuth configuration;

  • Invalid scope defined as Refresh Scope in the data source OAuth configuration;

  • A Snowflake role is specified in the connection URI and does not map to an allowed scope on the authorization server;

  • The authorization step is canceled by some UI control on an authorization, authentication, or consent screen;

  • The authorization step is canceled due to an authentication failure.

Token request failed following successful authorization.

In some cases after successful authorization, the request for tokens can fail. This message will be followed by further details.

There was a problem extracting username information following successful authorization and token retrieval. Please check the OAuth settings for the data source.

The causes of this error may be:

  • The OAuth configuration specifies that the username should be extracted as a JWT claim and:

    • the access token cannot be decoded as a JWT;

    • no JWT claim is specified in the OAuth configuration;

    • the specified claim does not exist in the token’s claim set;

  • The OAuth configuration specifies that the username should not be extracted as a JWT claim and:

    • no username field is specified in the OAuth config;

    • the specified field does not exist in the response returned from the authorization server upon token request.

No default role has been assigned to the user, contact a local system administrator to assign a default role and retry.

Displayed if there is no default role assigned to the user in Snowflake and none is specified in the connection URI.

Role ‘<role_name>’ specified in the connect string is not granted to this user. Contact your local system administrator, or attempt to login with another role, e.g. PUBLIC.

Displayed if the role is not accessible to the user and this role is specified in the connection URI and is authorized either explicitly in the scope or via SESSION:ROLE-ANY scope.

The role requested in the connection or the default role if none was requested in the connection (‘SYSADMIN’) is not listed in the Access Token or was filtered. Please specify another role, or contact your OAuth Authorization server administrator.

Displayed if either:

  • the role is assigned as the default role to a user but this role is not authorized because it is not specified in the URI or in the Default Scope;

  • the role is specified in the connection URI but the user does not have access to that role.

User’s configured default role ‘SYSADMIN’ is not granted to this user. Contact your local system administrator, or attempt to login using a CLI client with a connect string selecting another role, e.g. PUBLIC.

Displayed if the role is not accessible to the user in Snowflake but this role is authorized either via Default Scope by Alation or a role-related scope on the authorization server.
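
The username extraction error in the table above lists several JWT-related causes. To check them by hand, you can decode the payload of an access token and look for the configured claim. The sketch below does this with the Python standard library only. It does not verify the token signature, so it is suitable for inspection only, and it is not Alation's implementation. The function names and the sample token are made up for illustration.

```python
import base64
import json


def extract_claim(access_token, claim):
    """Decode a JWT payload WITHOUT verifying the signature and return one claim."""
    parts = access_token.split(".")
    if len(parts) != 3:
        raise ValueError("access token cannot be decoded as a JWT")
    payload = parts[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    padded = payload + "=" * (-len(payload) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError as exc:
        raise ValueError("access token cannot be decoded as a JWT") from exc
    if not isinstance(claims, dict) or claim not in claims:
        raise KeyError(f"claim {claim!r} does not exist in the token's claim set")
    return claims[claim]


def _segment(data):
    """Encode a dictionary as an unpadded base64url JWT segment (sample data only)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# A made-up, unsigned token used purely to exercise the function.
sample_token = ".".join([_segment({"alg": "none"}), _segment({"email": "user@example.com"}), "sig"])
sample_username = extract_claim(sample_token, "email")
```

If the function raises ValueError, the token is probably not a JWT. If it raises KeyError, the claim configured in the Username/Claim field is not present in the token.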

Where to Find Logs

The log entries for OAuth authorization can be found in /opt/alation/site/logs/uwsgi.log (path inside the Alation shell).
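
To narrow the log down to OAuth-related entries, a simple keyword filter is often enough. The sketch below is a minimal example that assumes relevant lines contain the word "oauth" in some letter case. The actual log format is not documented here, so the sample lines are invented, and the keyword may need adjusting.

```python
LOG_PATH = "/opt/alation/site/logs/uwsgi.log"  # path inside the Alation shell


def filter_log_lines(lines, keyword="oauth"):
    """Return the lines that contain `keyword`, ignoring letter case."""
    needle = keyword.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


def read_oauth_entries(path=LOG_PATH, keyword="oauth"):
    """Read the log file and return matching lines (run inside the Alation shell)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return filter_log_lines(handle, keyword)


# Invented sample lines, not real Alation log output.
sample_lines = [
    "request served in 12 ms\n",
    "OAuth callback received for data source\n",
    "token request failed: datasource_auth/oauth/callback\n",
]
matches = filter_log_lines(sample_lines)
```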