Snowflake OCF Connector: Install and Configure

Preliminaries

Firewall Configuration

Open the outbound TCP port 443 to the Snowflake server.

Driver

Refer to the Support Matrix for your Alation release to find out the version of the available driver for Snowflake.

Authentication

Create a Snowflake Service Account

Under your Snowflake account, set up a user account for Alation. Security and account administrators for Snowflake (users with the SECURITYADMIN or ACCOUNTADMIN roles) can create users.

You can also use an existing account.

Alation will use the service account for metadata extraction (MDE), profiling and sampling, and query log ingestion (QLI). It must be assigned a role that has enough privileges for these data jobs.

A Snowflake connection requires a warehouse and a role. A Snowflake user may have multiple roles that grant different access. As a best practice, set a default role and a default warehouse for the Alation service account when you create it in Snowflake.

Example:

  • Create role for the Alation service account:

    USE ROLE ACCOUNTADMIN;
    CREATE ROLE <alation_role>;
    
  • Create Alation service account user:

    USE ROLE ACCOUNTADMIN;
    CREATE USER <alation_user> PASSWORD='*****' DEFAULT_ROLE = <alation_role> MUST_CHANGE_PASSWORD = TRUE;
    
  • Add a default warehouse:

    USE ROLE ACCOUNTADMIN;
    ALTER USER <alation_user> SET DEFAULT_WAREHOUSE=<warehouse_name>;
    GRANT ROLE <alation_role> TO USER <alation_user>;
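
To verify the setup, you can check the new user's defaults and grants from a Snowflake worksheet. A quick sanity check, using the placeholder names above:

USE ROLE ACCOUNTADMIN;
-- confirm DEFAULT_ROLE and DEFAULT_WAREHOUSE are set as expected
DESCRIBE USER <alation_user>;
-- confirm <alation_role> appears among the granted roles
SHOW GRANTS TO USER <alation_user>;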
    

Permissions for Metadata Extraction

  • Grant access to the warehouse if MDE will be performed on a warehouse other than the default warehouse of the Alation service account:

    USE ROLE ACCOUNTADMIN;
    GRANT USAGE ON WAREHOUSE <warehouse_name> TO ROLE <alation_role>;
    
  • Grant access to the views of the Snowflake ACCOUNT_USAGE schema:

    USE ROLE ACCOUNTADMIN;
    GRANT IMPORTED PRIVILEGES ON DATABASE "SNOWFLAKE" TO ROLE <alation_role>;
    

    This permission gives access to all the views in the ACCOUNT_USAGE schema. If you want to provide access to only a limited set of ACCOUNT_USAGE views, use the following grants instead:

    USE ROLE ACCOUNTADMIN;
    USE SNOWFLAKE;
    GRANT DATABASE ROLE OBJECT_VIEWER TO ROLE <alation_role>;
    GRANT DATABASE ROLE GOVERNANCE_VIEWER TO ROLE <alation_role>;
    
  • To extract primary key and foreign key metadata, grant permissions for database(s), schema(s), and table(s):

    • Grant USAGE on specific database(s):

      GRANT USAGE ON DATABASE <database> TO ROLE <alation_role>;
      
    • Grant USAGE on specific schema(s):

      -- grants limited by database
      GRANT USAGE ON ALL SCHEMAS IN DATABASE <database> TO ROLE <alation_role>;
      -- OR
      -- grants limited by a named schema
      GRANT USAGE ON SCHEMA <database.schema> TO ROLE <alation_role>;
      -- OPTIONAL: grant access to future schemas in database
      GRANT USAGE ON FUTURE SCHEMAS IN DATABASE <database> TO ROLE <alation_role>;
      
    • Grant REFERENCES on specific table(s):

      -- grants limited by database
      GRANT REFERENCES ON ALL TABLES IN DATABASE <database> TO ROLE <alation_role>;
      -- OR
      -- grants limited by a named schema
      GRANT REFERENCES ON ALL TABLES IN SCHEMA <database.schema> TO ROLE <alation_role>;
      -- OPTIONAL: grant access to future tables in database
      GRANT REFERENCES ON FUTURE TABLES IN DATABASE <database> TO ROLE <alation_role>;
      -- OR
      -- OPTIONAL: grant access to future tables in a specific schema
      GRANT REFERENCES ON FUTURE TABLES IN SCHEMA <database.schema> TO ROLE <alation_role>;
      
    • Grant REFERENCES on specific view(s):

      -- grants limited by database
      GRANT REFERENCES ON ALL VIEWS IN DATABASE <database> TO ROLE <alation_role>;
      -- OR
      -- grants limited by a named schema
      GRANT REFERENCES ON ALL VIEWS IN SCHEMA <database.schema> TO ROLE <alation_role>;
      -- OPTIONAL: grant access to future views in database
      GRANT REFERENCES ON FUTURE VIEWS IN DATABASE <database> TO ROLE <alation_role>;
      -- OR
      -- OPTIONAL: grant access to future views in a specific schema
      GRANT REFERENCES ON FUTURE VIEWS IN SCHEMA <database.schema> TO ROLE <alation_role>;
      
  • Grant permissions to extract external tables:

    -- grants limited by database
    GRANT REFERENCES ON ALL EXTERNAL TABLES IN DATABASE <database> TO ROLE <alation_role>;
    -- OR
    -- grants limited by a named schema
    GRANT REFERENCES ON ALL EXTERNAL TABLES IN SCHEMA <database.schema> TO ROLE <alation_role>;
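
Before running MDE, you can sanity-check the grants above from a Snowflake worksheet. A minimal check, assuming the placeholder names used in this section:

-- switch to the Alation role and warehouse
USE ROLE <alation_role>;
USE WAREHOUSE <warehouse_name>;
-- succeeds only if ACCOUNT_USAGE access was granted
SELECT COUNT(*) FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES WHERE DELETED IS NULL;
-- lists the privileges granted to the role
SHOW GRANTS TO ROLE <alation_role>;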
    

Permissions for Profiling

Grant the Alation role SELECT on the tables to be profiled:

GRANT SELECT ON schema_name.table_name TO ROLE <alation_role>;
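
To avoid granting tables one by one, you can grant SELECT in bulk at the schema level. A sketch using the placeholder names from the sections above; the FUTURE grant is optional:

USE ROLE ACCOUNTADMIN;
GRANT SELECT ON ALL TABLES IN SCHEMA <database.schema> TO ROLE <alation_role>;
-- optional: cover tables created after the grant is issued
GRANT SELECT ON FUTURE TABLES IN SCHEMA <database.schema> TO ROLE <alation_role>;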

Permissions for Data Policies and Tags

Users must be granted access to the ACCOUNT_USAGE schema to extract policies and tags.
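
If you granted the limited database roles instead of IMPORTED PRIVILEGES (see Permissions for Metadata Extraction), the GOVERNANCE_VIEWER database role is the one that covers the policy and tag views. A minimal sketch, assuming the role created earlier:

USE ROLE ACCOUNTADMIN;
GRANT DATABASE ROLE SNOWFLAKE.GOVERNANCE_VIEWER TO ROLE <alation_role>;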

See Snowflake Tags for more information on how to configure and use tags in Alation.

JDBC URI

When building the URI, include the following components:

  • Hostname:

    • For instances on AWS: <account_name>.<region_id>.snowflakecomputing.com

      Example:

      alation_partner.us-east-1.snowflakecomputing.com

    • For instances on Azure: <account_name>.<region_id>.azure.snowflakecomputing.com

      Example:

      lj16805.east-us-2.azure.snowflakecomputing.com

    • For instances on GCP: <account_name>.<region_id>.gcp.snowflakecomputing.com

      Example:

      mj26341.europe-west2.gcp.snowflakecomputing.com

  • Port - Provide the port number as 443.

  • Warehouse name - The warehouse that the service account will use in Alation. It must be a running (or auto-resuming) warehouse that the service account has privileges to access; see the check after this list.

  • Role - This is an optional parameter. Provide the role of the service account user.

  • Database name - This is an optional parameter. The parameter name db is case-insensitive, but the database name value is case-sensitive.

    Note

    If the Database Name is specified, then the metadata will be extracted for this specific database only.

    If the Database Name is not provided, then the metadata will be extracted for all the databases that the user account has access to.
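
To confirm the warehouse name and state referenced in the Warehouse name item above, you can run a quick check in Snowflake; the RESUME statement is only needed if the warehouse is suspended and auto-resume is disabled:

SHOW WAREHOUSES LIKE '<warehouse_name>';
-- resume the warehouse if it is suspended and auto-resume is off
ALTER WAREHOUSE <warehouse_name> RESUME;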

URI format

snowflake://<hostname>:<port>/?warehouse=<warehouse_name>&db=<db_name>&role=<role_name>

Example:

snowflake://alation_partner.us-east-1.snowflakecomputing.com:443/?warehouse=TEST&db=TEST_DB&role=admin

URI to Extract Policies

To extract policies from a particular database, the Snowflake URI must include the default database name as shown below. Otherwise, policies will be extracted from the entire Snowflake data source. See Policy Center for how to extract and use policies in Alation.

snowflake://alation_partner.us-east-1.snowflakecomputing.com:443/?warehouse=OXY_WH&db=PC_DB

Key Pair Authentication

Alation supports the key pair authentication method for the service account connection to Snowflake. Key pair authentication requires preliminary configuration on the Snowflake side and on the Alation server, as described below.

Configuration on the Snowflake Side

  1. Generate a private key and a public key. Assign the public key to the service account you created for Alation. Refer to Key Pair Authentication & Key Pair Rotation in Snowflake documentation for details.

  2. Store the private key securely on your local machine. You will need to upload it to Alation when configuring key pair authentication.
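
For reference, the key assignment in step 1 is done in Snowflake with ALTER USER. A minimal sketch, assuming the service account created earlier; the key value shown is a hypothetical placeholder for the Base64-encoded public key with its header and footer lines removed:

USE ROLE ACCOUNTADMIN;
-- assign the public key to the Alation service account
ALTER USER <alation_user> SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';
-- verify: the RSA_PUBLIC_KEY_FP property shows the key fingerprint
DESCRIBE USER <alation_user>;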

The following error may be seen when connecting to Snowflake over JDBC with key pair authentication. The error is specific to OpenSSL version 1.1.1g running on Windows or Ubuntu.

Error:

Private key provided is invalid or not supported: rsa_key.p8: ObjectIdentifier() -- data isn't an object ID (tag = 48) null

or, when run from a JDBC application:

Exception in thread "main" java.security.NoSuchAlgorithmException: 1.2.840.113549.1.5.13 SecretKeyFactory not available

To resolve this issue, use the following command to generate the private key:

openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -v1 PBE-SHA1-RC4-128 -out rsa_key.p8

Configuration on the Alation Server Side

You will need to upload the private key file in the General Settings > Snowflake Key Pair Authentication section of your Snowflake OCF data source after adding this data source to the Catalog.

Note

  • Key Pair Authentication for individual user account connections from Compose is currently not supported.

  • Key Pair Authentication for Snowflake OAuth in Compose is currently not supported. Snowflake OAuth uses authentication based on the provided client ID and secret.

Configuration in Alation

STEP 1: Install the Connector

Alation On-Prem

Important

Installation of OCF connectors requires Alation Connector Manager to be installed as a prerequisite.

To install an OCF connector:

  1. If this has not been done on your instance, install the Alation Connector Manager: Install Alation Connector Manager.

  2. Ensure that the OCF connector Zip file that you received from Alation is available on your local machine.

  3. Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.

Alation Cloud Service

Connecting to Cloud Databases

Note

On Alation Cloud Service instances, Alation Connector Manager is available by default.

To install an OCF connector:

  1. Ensure that the OCF connector Zip file that you received from Alation is available on your local machine.

  2. Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.

Connecting to On-Prem Databases

Important

To connect to an on-prem database from an Alation Cloud Service instance, you may need to use Alation Agent.

  1. Ensure that the Alation Agent functionality is enabled on your Alation instance. Create a Support ticket with Alation for an Alation representative to enable the Alation Agent feature on your instance and to receive the Alation Agent installer.

  2. Install Alation Agent using the information in Alation Agent.

  3. Install the OCF connector on Alation Agent.

STEP 2: Create and Configure a New Snowflake Data Source

In Alation, add a new data source:

  1. Log in to Alation as a Server Admin.

  2. Expand the Apps menu on the right of the main toolbar and select Sources.

  3. On the Sources page, click +Add on the top right of the page and in the list that opens, click Data Source. This will open the Add a Data Source wizard.

  4. On the first screen of the wizard, specify a name for your data source, assign additional Data Source Admins, if necessary, and click the Continue Setup button on the bottom. The Add a Data Source screen will open.

  5. On the Add a Data Source screen, the only field you should populate is Database Type. From the Database Type dropdown, select the connector name. After that you will be navigated to the Settings page of your new data source.

The name of this connector is Snowflake OCF connector.

Access

On the Access tab, set the data source visibility using these options:

  • Public Data Source—The data source will be visible to all users of the catalog.

  • Private Data Source—The data source will be visible to the users allowed access to the data source by Data Source Admins.

You can add new Data Source Admin users in the Data Source Admins section.

General Settings

Perform the configuration on the General Settings tab:

  1. Specify Application Settings:

    • BI Connection Info: Not applicable

    • Disable Automatic Lineage Generation: Not applicable

  2. Click Save.

  3. Specify Connector Settings:

    Data Source Connection:

    • JDBC URI: Provide the JDBC URI constructed in JDBC URI.

    • Username: Provide the service account username.

    • Password: Provide the service account password.

    Snowflake Key Pair Authentication:

    • Enable Snowflake Key Pair Auth: Select this checkbox to enable key pair authentication.

    • Username: Provide the username of the service account that has the public key assigned to it. Make sure that the username provided here is the same as the service account username provided in the Data Source Connection section.

    • Upload Private Key File: Click the Upload button to upload the private key file.

    • Private Key Passphrase: Provide the passphrase of the private key file.

    Logging Information:

    • Log Level: Select the log level at which to generate logs. The available log levels are based on the log4j framework.

  4. Click Save.

  5. Obfuscate Literals - Enable this toggle to hide literal values in the queries shown on catalog pages that are ingested via QLI or executed in Compose. This toggle is disabled by default.

  6. Under Test Connection, click Test to validate network connectivity.

../../../_images/SnowflakeOCF_02.png

Add-On OCF Connector for dbt

The dbt connector can be used as an add-on connector with a primary OCF connector to extract and catalog descriptions and lineage from dbt models, sources, and columns in dbt Core or dbt Cloud. See the Add-On OCF Connector for dbt documentation to use this connector.

Metadata Extraction

Note

There is a latency of 45 minutes to 3 hours before tables, columns, views, primary keys, foreign keys, functions, and procedures are updated in ACCOUNT_USAGE views, so newly created items are only extracted by metadata extraction after this latency has passed. Refer to Account Usage — Snowflake Documentation for the latency of each item.

You can perform a default extraction or configure extraction based on custom queries.

  • Default extraction: This type of MDE is based on default SQL queries that are built in the connector code.

  • Query-based extraction: This type of MDE is based on custom SQL queries that can be provided by a Data Source Admin. See Query Based Extraction below.

Note

It is recommended not to combine default extraction and query-based extraction; use one method or the other.

Note

Alation supports S3 storage and Azure storage for external tables.

Note

If the database name is mentioned in the JDBC URI, then the metadata will be extracted only from that database. If the database name is not mentioned in the JDBC URI, then all the data of the Snowflake instance will be extracted, which increases the extraction time.

Application Settings

  • Enable Raw Metadata Dump or Replay: The options in this drop-down list can be used to dump the extracted metadata into files in order to debug extraction issues before ingesting the metadata into Alation. Use it during testing if MDE runs into issues. It splits extraction into two steps: first, the extracted metadata is dumped into files where it can be reviewed; second, it is ingested from those files into Alation. Keep this feature enabled only while debugging.

    • Enable Raw Metadata Dump: Select this option to save the extracted metadata into a folder for debugging purposes. The dumped data is saved in four files (attribute.dump, function.dump, schema.dump, table.dump) in the folder opt/alation/site/tmp/ inside the Alation shell.

    • Enable Ingestion Replay: Select this option to ingest the metadata from the dump files into Alation.

    • Off: Disable the Raw Metadata Dump or Replay feature. Extracted metadata is ingested into Alation directly.

Connector Settings

Query Based Extraction

Query-Based Extraction allows users to customize metadata extraction from the source down to the level of specific metadata types, such as tables, columns, and views, by using custom queries.

Tables, columns, and views are enabled by default at the backend, and users cannot disable them.

By default, the following additional metadata type is enabled:

  • Functions

Users can disable the metadata types that are not required by clearing the corresponding checkboxes.

Primary Keys and Foreign Keys are disabled by default. Users can enable their extraction if required. If the corresponding checkboxes are selected, Primary Keys and Foreign Keys are extracted without custom queries.

To use this feature, you will need to write custom queries to extract the metadata. Alation expects that these queries conform to a specific structure and use the expected reserved identifiers. The sections below provide the expected query structure for each metadata type.

Catalog

Make sure that your query has a column labeled as CATALOG in the SELECT statement.

Example:

SELECT
  DATABASE_NAME AS CATALOG
FROM
  SNOWFLAKE.ACCOUNT_USAGE.DATABASES
WHERE
  DELETED IS NULL AND DATABASE_NAME = '<catalog_name>'
ORDER BY
  DATABASE_NAME

Schema

Make sure that your query has columns labeled as CATALOG and SCHEMA in the SELECT statement.

Example:

SELECT
  CATALOG_NAME AS CATALOG,
  SCHEMA_NAME AS SCHEMA
FROM
  SNOWFLAKE.ACCOUNT_USAGE.SCHEMATA
WHERE
 DELETED IS NULL AND CONCAT(CATALOG_NAME,'.',SCHEMA_NAME) IN ('<catalog_name>.<schema_name>') AND CATALOG_NAME = '<catalog_name>'
ORDER BY
  CATALOG_NAME

Table

Make sure that your query has columns labeled as CATALOG, SCHEMA, TABLE_NAME, TABLE_TYPE, and REMARKS in the SELECT statement.

Example:

SELECT
  TABLE_CATALOG AS CATALOG,
  TABLE_SCHEMA AS SCHEMA,
  TABLE_NAME,
  TABLE_TYPE,
  COMMENT AS REMARKS
FROM
  SNOWFLAKE.ACCOUNT_USAGE.TABLES
WHERE
  DELETED IS NULL AND TABLE_TYPE = 'BASE TABLE' AND CONCAT(TABLE_CATALOG,'.',TABLE_SCHEMA) IN ('<catalog_name>.<schema_name>') AND TABLE_CATALOG = '<catalog_name>' AND TRIM(TABLE_NAME) !=''
ORDER BY
  TABLE_CATALOG

View

Make sure that your query has columns labeled as CATALOG, SCHEMA, VIEW_NAME, VIEW_CREATE_STATEMENT, VIEW_TYPE, and REMARKS in the SELECT statement.

Example:

SELECT
  TABLE_CATALOG AS CATALOG,
  TABLE_SCHEMA AS SCHEMA,
  TABLE_NAME AS VIEW_NAME,
  VIEW_DEFINITION AS VIEW_CREATE_STATEMENT,
  'VIEW' AS VIEW_TYPE,
  COMMENT AS REMARKS
FROM
  SNOWFLAKE.ACCOUNT_USAGE.VIEWS
WHERE
  DELETED IS NULL AND CONCAT(TABLE_CATALOG,'.',TABLE_SCHEMA) IN ('<catalog_name>.<schema_name>') AND TABLE_CATALOG = '<catalog_name>'
ORDER BY
  TABLE_CATALOG

Column

Make sure that your query has columns labeled as CATALOG, SCHEMA, TABLE_NAME, TYPE_NAME, DATA_TYPE, COLUMN_NAME, ORDINAL_POSITION, IS_NULLABLE, REMARKS, and COLUMN_DEFAULT in the SELECT statement.

Example:

SELECT
  TABLE_CATALOG AS CATALOG,
  TABLE_SCHEMA AS SCHEMA,
  TABLE_NAME,
  DECODE(DATA_TYPE,'TEXT','VARCHAR' || '(' || CHARACTER_MAXIMUM_LENGTH || ')','NUMBER',DATA_TYPE || '(' || NUMERIC_PRECISION ||','|| NUMERIC_SCALE || ')', DATA_TYPE) AS TYPE_NAME,
  DATA_TYPE,
  COLUMN_NAME,
  ORDINAL_POSITION,
  IS_NULLABLE,
  COMMENT AS REMARKS,
  COLUMN_DEFAULT
FROM
  SNOWFLAKE.ACCOUNT_USAGE.COLUMNS
WHERE
  DELETED IS NULL AND CONCAT(TABLE_CATALOG,'.',TABLE_SCHEMA) IN ('<catalog_name>.<schema_name>') AND TABLE_CATALOG = '<catalog_name>' AND TRIM(COLUMN_NAME) NOT IN ('','SYS_MV_SOURCE_PARTITION') AND TRIM(TABLE_NAME) !=''
ORDER BY
  TABLE_CATALOG

Primary Key

Make sure that your query has columns labeled as PK_DATABASE_NAME, PK_SCHEMA_NAME, PK_TABLE_NAME, and PK_COLUMN_NAME in the SELECT statement.

Example:

SHOW PRIMARY KEYS IN ACCOUNT

Foreign Key

Make sure that your query has columns labeled as PK_DATABASE_NAME, PK_SCHEMA_NAME, PK_TABLE_NAME, PK_COLUMN_NAME, FK_DATABASE_NAME, FK_SCHEMA_NAME, FK_TABLE_NAME, and FK_COLUMN_NAME in the SELECT statement.

Example:

SHOW EXPORTED KEYS IN ACCOUNT

Function

Make sure that your query has columns labeled as CATALOG, SCHEMA, FUNCTION_NAME, and REMARKS in the select list.

Example:

SELECT
  FUNCTION_CATALOG AS CATALOG,
  FUNCTION_SCHEMA AS SCHEMA,
  FUNCTION_NAME,
  COMMENT AS REMARKS
FROM
  SNOWFLAKE.ACCOUNT_USAGE.FUNCTIONS
WHERE
  DELETED IS NULL AND CONCAT(FUNCTION_CATALOG,'.',FUNCTION_SCHEMA) IN ('<catalog_name>.<schema_name>') AND FUNCTION_CATALOG = '<catalog_name>'
ORDER BY
  FUNCTION_CATALOG

Selective Extraction

On the Metadata Extraction tab, you can apply a selective extraction filter to include or exclude a list of schemas from extraction.

Enable the Selective Extraction toggle if you want only a subset of schemas to be extracted.

  1. Click Get List of Schemas to first fetch the list of schemas. The status of the Get Schemas action will be logged in the Extraction Job Status table at the bottom of the Metadata Extraction page.

  2. When Schema synchronization is complete, a drop-down list of the Schemas will become enabled.

  3. Select one or more schemas as required.

  4. Check if you are using the desired filter option. Available filter options are described below:

• Extract all Schemas except: Extract metadata from all schemas except the selected schemas.

• Extract only these Schemas: Extract metadata only from the selected schemas.

  5. Click Run Extraction Now to extract metadata. The status of the extraction action is also logged in the Job History table at the bottom of the page.

Automated Extraction

If you wish to automatically update the metadata extracted into the catalog, turn on the Enable Automated Extraction switch under Automated and Manual Extraction and select the day and time when metadata must be extracted. Metadata extraction will then run automatically on the selected schedule.

Data Policies

To enable the Policy Center and data policy extraction, see Policy Center. Additionally, for the Snowflake OCF connector to extract data policies, enable the following alation_conf flags:

alation_conf alation.ocf.mde.policy.enable_extraction -s True
alation_conf alation.ocf.mde.policy.enable_ingestion -s True
alation_conf alation.ocf.mde.policy.enable_search -s True

After changing the values of these parameters, restart Alation Supervisor with the command alation_supervisor restart all.

Compose

On the Compose tab, an admin can enable the use of the Compose tool for this data source.

../../../_images/SnowflakeOCF_03.png
  1. Enable or disable the Allow Export and Download toggle to control whether users can export or download the results from this data source.

  2. Enable the Enabled in Compose toggle to enable Compose for this data source.

    • Provide the JDBC URI in the Default Connection field, which Compose will use as the default connection, and click Save.

    • Select Compose Connection Sharing option based on the description in the table:

      • Shared connections across tabs: This option lets users use the same connection across multiple Compose tabs.

      • Separate connection per tab: Users can use different connections for each Compose tab, which enables them to run multiple queries at the same time.

    • Select the Enable OAuth 2.0 in Compose checkbox to enable OAuth in Compose. Selecting this checkbox reveals a number of parameters for OAuth setup. Refer to Enable OAuth to configure the OAuth parameters.

  3. Select a Data Uploader option based on the description below:

    • Use Global Setting (True) or Use Global Setting (False): Use the global setting configured in alation_conf with the alation.data_uploader.enabled flag. If the flag is set to True, users can upload data; if it is set to False, users cannot upload data for any data source.

    • Enable for this data source: Enable data upload for this data source, overriding the global setting if the global setting is False.

    • Disable for this data source: Disable data upload for this data source, overriding the global setting if the global setting is True.

Data Sampling

Sampling and profiling are supported. For details, see Configure Sampling and Profiling for OCF Data Sources.

Note

From Snowflake OCF connector version 1.0.1, dynamic sampling and profiling uses the Compose URI. This applies for both basic and OAuth authentication methods.

Query Log Ingestion

You can either create a table for Alation to pull the query logs from or use a custom query to query the logs from the Snowflake data source.

Connector Settings

Table-Based QLI

Use the query format below to create a new view that pulls the query history from the Snowflake data source.

CREATE VIEW <View_Name> AS
SELECT
     user_name as "userName",
     CASE
     WHEN SCHEMA_NAME IS NULL
     THEN DATABASE_NAME ||'.'|| ''
     ELSE DATABASE_NAME ||'.'|| SCHEMA_NAME
     END as "defaultDatabases",
     TRIM(QUERY_TEXT) as "queryString",
     TRIM(SESSION_ID ||'/' || TO_CHAR(START_TIME,'YYYY-MM-DD HH:MI:SS.SSSSSSSSS') || '/' || USER_NAME) as "sessionID",
     ROUND(TOTAL_ELAPSED_TIME/1000,3) as "seconds",
     false as "cancelled",
     to_char(start_time,'YYYY-MM-DD HH:MI:SS.US') as "startTime",
     to_char(start_time,'YYYY-MM-DD HH:MI:SS.US') as "sessionStartTime"
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY;

In the Table Name field, provide the name of the table or view in which the query logs are available. The name must be provided in the following format:

schema.table_name or schema.view_name

Note

Make sure that the user has privileges to access the table/view.
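
For example, if the view was created in a dedicated database and schema, grants along these lines would make it readable by the Alation role (a sketch using the placeholder names from the sections above):

GRANT USAGE ON DATABASE <database> TO ROLE <alation_role>;
GRANT USAGE ON SCHEMA <database.schema> TO ROLE <alation_role>;
GRANT SELECT ON VIEW <database.schema.view_name> TO ROLE <alation_role>;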

Custom Query-Based QLI

When you cannot create a table or view, you can use a custom QLI query to perform QLI. Provide a query with the expected structure, as shown below, and click Save:

SELECT
      user_name as "userName",
CASE
      WHEN SCHEMA_NAME IS NULL
      THEN DATABASE_NAME ||'.'|| ''
      ELSE DATABASE_NAME ||'.'|| SCHEMA_NAME
END as "defaultDatabases",
TRIM(QUERY_TEXT) as "queryString",
TRIM(SESSION_ID ||'/' || TO_CHAR(START_TIME,'YYYY-MM-DD HH:MI:SS.SSSSSSSSS') || '/' || USER_NAME) as "sessionID",
ROUND(TOTAL_ELAPSED_TIME/1000,0) as "seconds",
false as "cancelled",
to_char(start_time,'YYYY-MM-DD HH:MI:SS.US') as "startTime",
to_char(start_time,'YYYY-MM-DD HH:MI:SS.US') as "sessionStartTime"
FROM
    SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY WHERE START_TIME BETWEEN STARTTIME AND ENDTIME;

Or

SELECT
      user_name as "userName",
CASE
      WHEN SCHEMA_NAME IS NULL
      THEN DATABASE_NAME ||'.'|| ''
      ELSE DATABASE_NAME ||'.'|| SCHEMA_NAME
END as "defaultDatabases",
TRIM(QUERY_TEXT) as "queryString",
TRIM(SESSION_ID ||'/' || TO_CHAR(START_TIME,'YYYY-MM-DD HH:MI:SS.SSSSSSSSS') || '/' || USER_NAME) as "sessionID",
ROUND(TOTAL_ELAPSED_TIME/1000,0) as "seconds",
false as "cancelled",
to_char(start_time,'YYYY-MM-DD HH:MI:SS.US') as "startTime",
to_char(start_time,'YYYY-MM-DD HH:MI:SS.US') as "sessionStartTime"
FROM
    SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY WHERE DATABASE_NAME = '<Database_Name>' AND USER_NAME = '<User_Name>' AND START_TIME BETWEEN STARTTIME AND ENDTIME;

Automated and Manual QLI

Users can either perform manual Query Log Ingestion or enable automated Query Log Ingestion:

  1. To perform manual query log ingestion, make sure that the Enable Automated Query Log Ingestion toggle is Off, then click the Import button.

  2. Set the Enable Automated Query Log Ingestion toggle to On to perform the automated Query Log Ingestion.

    1. Set a schedule in the corresponding fields of the schedule section, specifying values for week, day, and time.

    2. Click Import.

Custom Settings

This configuration option is available if Profiling V2 is enabled.

To profile a column, Alation runs a default query if no custom query is specified for this column. Default queries for column profiling differ depending on the data type: numeric or non-numeric.

The default query supplied by Alation can be modified based on user requirements and the data source type. The following default query templates can be used to revert from a custom query back to the default query:

Numeric columns:

SELECT
  MIN({column_name}) AS MIN,
  MAX({column_name}) AS MAX,
  AVG({column_name}) AS MEAN,
  (COUNT(*) - COUNT({column_name})) AS "#NULL",
  (CASE WHEN COUNT(*) > 0 THEN ((COUNT(*) -  COUNT({column_name})) * 100.0 / COUNT(*)) ELSE 0.0 END) AS "%NULL"
  FROM {schema_name}.{table_name};

Non-numeric columns:

SELECT
      ((SUM(CASE WHEN {column_name} IS NULL THEN 1 ELSE 0 END))) AS "#NULL",
      (CASE WHEN COUNT(*) > 0 THEN ((((SUM(CASE WHEN {column_name} IS NULL THEN 1 ELSE 0 END))) * 100.0 / count(*)) ) ELSE 0.0 END ) AS "%NULL"

FROM {schema_name}.{table_name};

Important

The profiling queries MUST contain {column_name}, {schema_name}, and {table_name}. Users must check the compatibility of the default query with the data source type and modify it if required.

The default profiling query calculates the Profiling stats that are displayed on the Overview tab of the Column catalog page. When you customize the query, you can also customize the statistics that should be calculated and displayed:
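
For example, a custom query for numeric columns could add a distinct-value count alongside the default statistics. A sketch based on the default template above; the "#DISTINCT" label is a hypothetical addition:

SELECT
  MIN({column_name}) AS MIN,
  MAX({column_name}) AS MAX,
  AVG({column_name}) AS MEAN,
  COUNT(DISTINCT {column_name}) AS "#DISTINCT",
  (COUNT(*) - COUNT({column_name})) AS "#NULL",
  (CASE WHEN COUNT(*) > 0 THEN ((COUNT(*) - COUNT({column_name})) * 100.0 / COUNT(*)) ELSE 0.0 END) AS "%NULL"
FROM {schema_name}.{table_name};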

../../../_images/MySQLOCF_06.png

Lineage

Note

This feature requires the Snowflake Column-Level Lineage parser add-on. Contact Alation about purchasing this add-on.

Enable the Column Level Lineage

To enable the Column Level Lineage:

  1. Go to Settings > Feature Configuration.

  2. Turn on the toggle Automatically extracts Column Level Lineage for Snowflake data sources.

  3. Click the Save changes button to save the changes.

../../../_images/DS_Snowflake08.png
  4. In the Verify Feature Configuration Change dialog, click the Save Configuration button.

    ../../../_images/DS_Snowflake09.png

View Column Level Lineage

When you enable the Column Level Lineage add-on, Alation will automatically generate column-level lineage data based on the existing metadata that was previously extracted with MDE, QLI, and the Compose query log.

Note

Once the feature flag is enabled, column-level lineage will be automatically generated only for view definitions that are complete and already added to table-level lineage.

Complete view definition:

CREATE VIEW adbc_database_01.query_execution_main.table_view AS
  SELECT src.*
    FROM (SELECT col1, col2, col3 FROM tpch.sf1.customer LIMIT 10) src;

Incomplete view definition:

SELECT src.* FROM
  (SELECT col1, col2, col3 FROM tpch.sf1.customer LIMIT 10) src;

More column-level lineage data will be created after you perform metadata extraction (MDE), query log ingestion (QLI), or use Compose to run queries that create or update column objects.

To view the column-level lineage data:

  1. Go to the Lineage tab on the catalog page of a table object that should have column-level lineage.

    ../../../_images/DS_Snowflake10.png
  2. Expand the table or view lineage node to see the columns of the table or view and links to the dataflow object.

    ../../../_images/DS_Snowflake11.png

When you expand a table or view lineage node and click one of the child columns, the column-level lineage link for that particular column becomes highlighted.

../../../_images/DS_Snowflake12.png

Note

If the Column Level Lineage feature flag is turned off, the column-level lineage links that were generated previously are not removed; only future queries and query history will not be processed.

Troubleshooting

Refer to Troubleshooting.

Query Log Ingestion

If, while performing custom query-based QLI, Preview Data does not display results and the log shows the message Missing result set column: Username, perform the following steps:

Log message:

../../../_images/SnowflakeOCF_05.png
  1. Sign in to the Snowflake with an admin account.

  2. If the user parameter QUOTED_IDENTIFIERS_IGNORE_CASE is set to true, set it to false (see the example after these steps).

    ../../../_images/SnowflakeOCF_06.png
  3. In Alation, run QLI again and check that it completes successfully. If the issue persists, contact Alation Support.
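
A sketch of the check and the fix in SQL, assuming the service account name used earlier in this guide:

-- check the current value of the parameter for the user
SHOW PARAMETERS LIKE 'QUOTED_IDENTIFIERS_IGNORE_CASE' FOR USER <alation_user>;
-- set it to false so quoted column aliases such as "userName" keep their case
ALTER USER <alation_user> SET QUOTED_IDENTIFIERS_IGNORE_CASE = FALSE;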