Vertica OCF Connector: Install and Configure

Alation Cloud Service Applies to Alation Cloud Service instances of Alation

Customer Managed Applies to customer-managed instances of Alation

Network Configuration

Open outbound TCP port 5433 to the Vertica server.

Create Service Account

Sample SQL to Create an Account

When using the SQL below, substitute <service_account> and <password> with actual values.

CREATE USER <service_account> IDENTIFIED BY '<password>';

Permissions for Metadata Extraction

GRANT SELECT ON V_CATALOG.SCHEMATA TO <service_account>;
GRANT SELECT ON V_CATALOG.ALL_TABLES TO <service_account>;
GRANT SELECT ON V_CATALOG.TABLES TO <service_account>;
GRANT SELECT ON V_CATALOG.ODBC_COLUMNS TO <service_account>;
GRANT SELECT ON V_CATALOG.VIEWS TO <service_account>;
GRANT SELECT ON V_CATALOG.PRIMARY_KEYS TO <service_account>;
GRANT SELECT ON V_CATALOG.FOREIGN_KEYS TO <service_account>;
GRANT SELECT ON V_CATALOG.USER_FUNCTIONS TO <service_account>;
GRANT SELECT ON V_CATALOG.USER_TRANSFORMS TO <service_account>;
GRANT SELECT ON V_INTERNAL.ODBC_PROCEDURE_COLUMNS TO <service_account>;

Permission

Purpose

GRANT SELECT ON V_CATALOG.SCHEMATA

Required for schema extraction

GRANT SELECT ON V_CATALOG.ALL_TABLES

Required for table extraction

GRANT SELECT ON V_CATALOG.TABLES

Required for table extraction

GRANT SELECT ON V_CATALOG.ODBC_COLUMNS

Required for column extraction

GRANT SELECT ON V_CATALOG.VIEWS

Required for view extraction

GRANT SELECT ON V_CATALOG.PRIMARY_KEYS

Required for primary key extraction

GRANT SELECT ON V_CATALOG.FOREIGN_KEYS

Required for foreign key extraction

GRANT SELECT ON V_CATALOG.USER_FUNCTIONS

Required for function extraction

GRANT SELECT ON V_CATALOG.USER_TRANSFORMS

Required for function extraction

GRANT SELECT ON V_INTERNAL.ODBC_PROCEDURE_COLUMNS

Required for function definition extraction

Permissions for Profiling and Sampling

GRANT USAGE ON SCHEMA [schema] TO <service_account>;
GRANT SELECT ON ALL TABLES IN SCHEMA [schema] TO <service_account>;

Permissions for Query Log Ingestion

If using table-based QLI, grant the service account the SELECT permissions for the QLI view you create on Vertica. If using custom query-based QLI, grant the service account admin-level privileges as this type of QLI involves querying system tables. For details, see Query Log Ingestion below.

Configure Kerberos for Vertica Data Source

If your Vertica database is kerberized, make sure that the Alation service account has been granted permissions to use Kerberos authentication. You will also need the krb5.conf and the keytab files for your Vertica database. Both files will need to be uploaded in the data source settings in Alation user interface. For details on granting accounts permissions to use Kerberos authentication, refer to Create the Vertica Principals and Keytabs on Linux KDC in Vertica documentation.

Testing Kerberos Connection

Run the following commands to obtain a ticket and check connectivity from the Vertica server:

  • Kerberos without keytab

    sudo -i -u  dbadmin kinit <principal>/<service_account>
    

    Enter the password created during adding the principal.

  • Kerberos with keytab

    sudo -i -u  dbadmin kinit -kt <keytab_file> <principal>/<service_account>
    
  • Verifying the connection

    vsql -U <service_account> -k <KerberosServiceName> -K <krb5-host> -h <host name> -c "select client_authentication_name, authentication_method from sessions;"
    
    vsql -U <service_account> -k <KerberosServiceName> -K <krb5-host> -h <host name> -c "select KERBEROS_CONFIG_CHECK();"
    

JDBC URI

Format

Basic Authentication

vertica://<hostname_or_ip>:<port>/<database_name>

Example

vertica://10.13.41.154:5433/my_database

Kerberos Authentication

vertica://<hostname>:<port>/<database_name>?KerberosHostName=<kerberos_host_name>&KerberosServiceName=<service_name>

Example

vertica://ip-10-13-20-137:5433/alation?KerberosHostName=ip-10-13-20-137.alation-test.com&KerberosServiceName=alation

Configuration in Alation

Step 1: Install the Connector

Alation On-Premise

Important

Installation of OCF connectors requires Alation Connector Manager to be installed as a prerequisite.

To install an OCF connector:

  1. If this has not been done on your instance, install the Alation Connector Manager: Install Alation Connector Manager.

  2. Ensure that the OCF connector Zip file is available on your local machine.

  3. Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.

Alation Cloud Service

Note

On Alation Cloud Service instances, Alation Connector Manager is available by default.

Depending on your network configuration, you may need to use Alation Agent to connect to databases.

Connection via Alation Agent
  1. Ensure that Alation Agent is enabled on your Alation instance. If necessary, create a Support ticket with Alation for an Alation representative to enable the Alation Agent feature on your instance and to receive the Alation Agent installer.

  2. Install the Alation Agent.

  3. Install the OCF connector on Alation Agent.

Connection Without Agent

To install an OCF connector:

  1. Ensure that the OCF connector Zip file is available on your local machine.

  2. Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.

STEP 2: Create and Configure a New Data Source

In Alation, add a new data source:

  1. Log in to Alation as a Server Admin.

  2. Expand the Apps menu on the right of the main toolbar and select Sources.

  3. On the Sources page, click +Add on the top right of the page and in the list that opens, click Data Source. This will open the Add a Data Source wizard.

  4. On the first screen of the wizard, specify a name for your data source, assign additional Data Source Admins, if necessary, and click the Continue Setup button on the bottom. The Add a Data Source screen will open.

  5. On the Add a Data Source screen, the only field you should populate is Database Type. From the Database Type dropdown, select the connector name. After that you will be navigated to the Settings page of your new data source.

    Note

    Agent-based connectors will have the Agent name appended to the connector name.

The name of this connector is Vertica OCF connector.

Access

On the Access tab, set the data source visibility using these options:

  • Public Data Source—The data source will be visible to all users of the catalog.

  • Private Data Source—The data source will be visible to the users allowed access to the data source by Data Source Admins.

You can add new Data Source Admin users in the Data Source Admins section.

General Settings

Perform the configuration on the General Settings tab.

Application Settings

Specify Application Settings if applicable. Click Save to save the changes after providing the information.

Parameter

Description

BI Connection Info

This parameter is used to generate lineage between the current data source and another source in the catalog, for example a BI source that retrieves data from the underlying database. The parameter accepts host and port information of the corresponding BI data source connection.

Use the following format: host:port

You can provide multiple values as a comma-separated list:

10.13.71.216:1541,sever.com:1542

Find more details in BI Connection Info.

Disable Automatic Lineage Generation

Select this checkbox to disable automatic lineage generation from QLI, MDE, and Compose queries. By default, automatic lineage generation is enabled.

Connector Settings

Data Source Connection

Populate the data source connection information and save the values by clicking Save in this section.

Parameter

Description

JDBC URI

Specify the JDBC URI in the required format.

Username

Specify the service account username.

Password

Specify the service account password.

Enable Kerberos authentication

Select this checkbox if using Kerberos authentication and upload the krb5.conf file using the upload link under the checkbox.

Use keytab

Select this checkbox if using keytabs and upload the keytab file for the service account using the upload link under the checkbox.

Enable SSL

Enable or disable SSL authentication by selecting or clearing the Enable SSL checkbox.

If the Enable SSL checkbox is enabled, upload the SSL certificate using the upload link below.

Truststore password

Specify the password for the SSL certificate.

The password will be deleted if the data source connection is deleted.

Logging Configuration

Select the logging level for the connector logs and save the values by clicking Save in this section. The available log levels are based on the Log4j framework.

Parameter

Description

Log level

Select the log level to generate logs. The available options are INFO, DEBUG, WARN, TRACE, ERROR, FATAL, ALL.

You can view the connector logs in Admin Settings > Manage Connectors > Vertica OCF connector.

Obfuscate Literals

Obfuscate Literals—Enable this toggle to hide actual values in the query statements that are ingested during query log ingestion or executed in Compose. This toggle is disabled by default.

Test Connection

Under Test Connection, click Test to validate network connectivity.

Metadata Extraction

You can configure metadata extraction (MDE) for an OCF data source on the Metadata Extraction tab of the Settings page. Refer to Configure Metadata Extraction for OCF Data Sources for information about the available configuration options.

The default queries that the connector uses to extract metadata can be found in Extraction Queries for Vertica. You can customize these queries to adjust the extraction to your needs.

MDE Extraction from Compose

After creation of a table or view in Compose, users will see the corresponding table or view on the respective catalog page without re-running MDE. In Compose, users should use the following query format for Alation to ingest the metadata incrementally:

  • Table

    CREATE TABLE "SCHEMA_NAME"."TABLE_NAME" ({column properties});
    
  • View

    CREATE VIEW "SCHEMA_NAME"."VIEW_NAME" AS {view condition}
    

Sampling and Profiling

Sampling and profiling is supported. For details, see Configure Sampling and Profiling for OCF Data Sources.

Compose

For details about configuring the Compose tab of the Settings, refer to Configure Compose for OCF Data Sources.

Query Log Ingestion

QLI Configuration on Vertica

Vertica logs queries to a number of system tables. Selecting rows from these tables directly requires an admin account. For Alation to ingest the query information, you can either give the Alation service account admin access to select from the relevant system tables or request the DBA to create a view that Alation can be permitted to access.

Use the query below as an example of creating such a view. You can choose any name for the view. You will need to enter the view name in the Alation user interface when configuring query log ingestion on the Query Log Ingestion tab.

CREATE view public.alation_qli_view AS
       SELECT
      ri.session_id AS sessionId,
      ri.user_name AS userName,
      qr.start_timestamp AS startTime,
      qr.end_timestamp AS endTime,
      qr.request_duration_ms/1000 AS seconds,
      ri.request_id,
      qr.start_timestamp AS sessionStartTime,
      ri.request AS queryString,
      '' as defaultDatabases,
      CASE WHEN qr.success THEN null
          ELSE 'FAILED'
      END AS cancelled,
      qr.search_path
  FROM v_internal.dc_requests_issued ri
    JOIN v_monitor.query_requests qr
      ON ri.transaction_id = qr.transaction_id
      AND ri.request_id = qr.request_id
      AND ri.session_id = qr.session_id;

Configuration in Alation

On the Query Log Ingestion tab, you can select QLI options for your data source and schedule the QLI job if necessary.

Connector Settings > Query Extraction

You can choose to use table-based QLI or custom query-based QLI.

If you select table-based QLI, before you set up QLI in Alation, perform the configuration on the database as is described in QLI Configuration on Vertica.

If you select query-based QLI, make sure that the Alation service account has admin-level access.

Table-Based QLI

To configure table-based QLI:

In the Table Name field, specify the name of the QLI view where the query logs are available. Make sure that the service account has the permissions to access this view. The view name must be provided in the following format: database.schema.view_name.

Custom Query-Based QLI

If you cannot create the view for QLI, for example, due to access restrictions, you can use a custom QLI query to extract query history into Alation. If you opt for custom query-based QLI, Alation will query the system table storing query history or the table you’ve created to enable QLI every time you manually run QLI or when the QLI job runs on schedule.

For custom query-based QLI to succeed, ensure that the service account has enough permissions to select from the QLI table.

The template for the QLI query is given below. You can customize it by adding, removing, or changing the filter, but the columns and their aliases must remain as is since the connector expects this query structure.

Note

When using the QLI query template, do not substitute the STARTTIME and ENDTIME parameters in the WHERE filter. These parameters are not actual column names and should stay as is. They are expected by the connector and will be substituted with the start and end date of the QLI range selected in the user interface when QLI is run manually or on schedule.

To configure query-based QLI:

  1. Go to the Query Log Ingestion tab of the Settings page of your OCF data source.

  2. Under Connector Settings > Query Extraction, in the Custom QLI Query field, provide the QLI query.

  3. Click Save.

QLI Query Template
SELECT
  ri.session_id AS sessionId,
  ri.user_name AS userName,
  qr.start_timestamp AS startTime,
  qr.end_timestamp AS endTime,
  qr.request_duration_ms/1000 AS seconds,
  ri.request_id,
  qr.start_timestamp AS sessionStartTime,
  ri.request AS queryString,
  '' AS defaultDatabases,
  CASE WHEN qr.success THEN null
      ELSE 'FAILED'
  END AS cancelled,
  qr.search_path
FROM
  v_internal.dc_requests_issued ri
  JOIN v_monitor.query_requests qr
    ON ri.transaction_id = qr.transaction_id
         AND ri.request_id = qr.request_id
          AND ri.session_id = qr.session_id
WHERE date(starttime) >= STARTTIME
  AND date(endtime) <= ENDTIME
ORDER BY
  sessionId,
  startTime;

Perform QLI

You can either perform QLI manually on demand or enable automated QLI:

  1. To perform manual QLI, under the Automated and Manual Query Log Ingestion section of the Query Log Ingestion tab, ensure that the Enable Automated Query Log Ingestion toggle is disabled.

    Note

    Metadata extraction must be completed first before running QLI.

  2. Click Preview to get a sample of the query history data to be ingested.

  3. Click the Import button to perform QLI on demand.

  4. To schedule QLI, enable the Enable Automated Query Log Ingestion toggle.

  5. Set a schedule under Automated Query Log Ingestion Time by specifying values in the week, day, and time fields. The next QLI job will run on the schedule you have specified.

    Note

    The hourly schedule for automated QLI is not supported.

Troubleshooting

Refer to Troubleshooting for information about logs.