Db2 OCF Connector: Install and Configure

Alation Cloud Service Applies to Alation Cloud Service instances of Alation

Customer Managed Applies to customer-managed instances of Alation

Network Configuration

Open outbound TCP port 50000 to the Db2 Database (default).

Note

Db2 instances can listen on non-standard ports. Open the appropriate outbound TCP ports from the Alation server.

Service Account

The DB2 service account is a Linux account. When the account is created, grant it the required permissions to access the database and perform MDE, profiling, and QLI.

Create an Account

Example SQL to create an account:

group -g 999 db2iadm1

useradd -u 1004 -g db2iadm1 -m -d /home/alation alation

password alation

Example SQL to grant database access to the account:

-- SQL: enable new user to connect to database

GRANT CONNECT ON DATABASE TO alation;

-- This grants \`alation\` access to all data

GRANT DATAACCESS ON DATABASE TO alation;

Permissions for Metadata Extraction

GRANT SELECT ON SYSCAT.SCHEMATA TO alation;
GRANT SELECT ON SYSCAT.TABLES TO alation;
GRANT SELECT ON SYSCAT.VIEWS TO alation;
GRANT SELECT ON SYSCAT.COLUMNS TO alation;
GRANT SELECT ON SYSIBM.SYSCOLUMNS TO alation;
GRANT SELECT ON SYSCAT.REFERENCES TO alation;
GRANT SELECT ON SYSCAT.KEYCOLUSE TO alation;
GRANT SELECT ON SYSCAT.INDEXES TO alation;
GRANT SELECT ON SYSIBM.SYSROUTINEPARMS TO alation;
GRANT SELECT ON SYSCAT.ROUTINES TO alation;

Permissions for Sampling and Profiling

For profiling, the service account requires the SELECT permission on tables that you want to sample.

Permissions for Query Log Ingestion

See QLI Configuration on Db2 below.

Additional Setup: Enable Explain in Compose

To support the Explain feature for Db2 connections, follow the steps below on your Db2 system. This is a one-time process.

  1. Connect to the DB2 instance.

    CONNECT TO <database-name>
    
  2. Run the procedure given below.

    CALL SYSPROC.SYSINSTALLOBJECTS('EXPLAIN', 'C', CAST (NULL AS VARCHAR(128)),
    CAST (NULL AS VARCHAR(128)))
    

JDBC URI

Format

db2://<hostname_or_ip>:<port>/<service_name>

Example

db2:/my_ibm.net:50000/TESTDB

Configuration in Alation

STEP 1: Install the Connector

Alation On-Premise

Important

Installation of OCF connectors requires Alation Connector Manager to be installed as a prerequisite.

To install an OCF connector:

  1. If this has not been done on your instance, install the Alation Connector Manager: Install Alation Connector Manager.

  2. Ensure that the OCF connector Zip file is available on your local machine.

  3. Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.

Alation Cloud Service

Note

On Alation Cloud Service instances, Alation Connector Manager is available by default.

Depending on your network configuration, you may need to use Alation Agent to connect to databases.

Connection via Alation Agent
  1. Ensure that Alation Agent is enabled on your Alation instance. If necessary, create a Support ticket with Alation for an Alation representative to enable the Alation Agent feature on your instance and to receive the Alation Agent installer.

  2. Install the Alation Agent.

  3. Install the OCF connector on Alation Agent.

Connection Without Agent

To install an OCF connector:

  1. Ensure that the OCF connector Zip file is available on your local machine.

  2. Install the connector on the Connectors Dashboard page using the steps in Manage Connectors.

STEP 2: Create and Configure a New Data Source

In Alation, add a new data source:

  1. Log in to Alation as a Server Admin.

  2. Expand the Apps menu on the right of the main toolbar and select Sources.

  3. On the Sources page, click +Add on the top right of the page and in the list that opens, click Data Source. This will open the Add a Data Source wizard.

  4. On the first screen of the wizard, specify a name for your data source, assign additional Data Source Admins, if necessary, and click the Continue Setup button on the bottom. The Add a Data Source screen will open.

  5. On the Add a Data Source screen, the only field you should populate is Database Type. From the Database Type dropdown, select the connector name. After that you will be navigated to the Settings page of your new data source.

    Note

    Agent-based connectors will have the Agent name appended to the connector name.

The name of this connector is DB2 OCF Connector.

Access

On the Access tab, set the data source visibility using these options:

  • Public Data Source—The data source will be visible to all users of the catalog.

  • Private Data Source—The data source will be visible to the users allowed access to the data source by Data Source Admins.

You can add new Data Source Admin users in the Data Source Admins section.

General Settings

Note

This section describes configuring settings for credentials and connection information stored in the Alation database. If your organization has configured Azure KeyVault or AWS Secrets Manager to hold such information, the user interface for the General Settings page will change to include the following icons to the right of most options:

../../../_images/VaultOrDB.png

By default, the database icon is selected, as shown. In the vault case, instead of the actual credential information, you enter the ID of the secret. See Configure Secrets for OCF Connector Settings for details.

Perform the configuration on the General Settings tab.

Application Settings

Specify Application Settings if applicable. Click Save to save the changes after providing the information.

Parameter

Description

BI Connection Info

This parameter is used to generate lineage between the current data source and another source in the catalog, for example a BI source that retrieves data from the underlying database. The parameter accepts host and port information of the corresponding BI data source connection.

Use the following format: host:port

You can provide multiple values as a comma-separated list:

10.13.71.216:1541,sever.com:1542

Find more details in BI Connection Info.

Disable Automatic Lineage Generation

Select this checkbox to disable automatic lineage generation from QLI, MDE, and Compose queries. By default, automatic lineage generation is enabled.

Connector Settings

Data Source Connection

Populate the data source connection information and save the values by clicking Save in this section.

Parameter

Description

JDBC URI

Specify the JDBC URI in the required format.

Username

Specify the service account username.

Password

Specify the service account password.

Logging Configuration

Select the logging level for the connector logs and save the values by clicking Save in this section. The available log levels are based on the Log4j framework.

Parameter

Description

Log level

Select the log level to generate logs. The available options are INFO, DEBUG, WARN, TRACE, ERROR, FATAL, ALL.

You can view the connector logs in Admin Settings > Manage Connectors > DB2 OCF Connector.

Obfuscate Literals

Obfuscate Literals—Enable this toggle to hide actual values in the query statements that are ingested during query log ingestion or executed in Compose. This toggle is disabled by default.

Test Connection

Under Test Connection, click Test to validate network connectivity.

Metadata Extraction

You can configure metadata extraction (MDE) for an OCF data source on the Metadata Extraction tab of the Settings page. Refer to Configure Metadata Extraction for OCF Data Sources for information about the available configuration options.

The default queries that the connector uses to extract metadata can be found in Extraction Queries for IBM Db2 Database. You can customize these queries to adjust the extraction to your needs.

MDE Extraction from Compose

After users create tables or views in Compose, Alation incrementally extracts their metadata and adds it to the corresponding schemas in the catalog. A Data Source Admin does not need to rerun MDE to have these new objects represented in Alation.

For incremental MDE from Compose to be successful, the CREATE statements should use this format:

CREATE TABLE "SCHEMA_NAME"."TABLE_NAME" ({column properties});

CREATE VIEW "SCHEMA_NAME"."VIEW_NAME" AS {view condition};

For lineage to be generated successfully for views, the fully qualified name of the table should be used in the view condition.

Note

DB2 stores table names in uppercase. Ensure that you use double quotes to use the necessary case.

Sampling and Profiling

Sampling and profiling is supported. For details, see Configure Sampling and Profiling for OCF Data Sources.

Compose

For details about configuring the Compose tab of the Settings, refer to Configure Compose for OCF Data Sources.

Query Log Ingestion

QLI Configuration on Db2

Query Log Ingestion (QLI) requires creating an event monitor and tables on your database to capture query history.

Alation supports two QLI options for Db2. Depending on which type of QLI you opt for, you may need to create an additional view for QLI. See Configure Table-Based QLI or Configure Query-Based QLI below.

Use the following example SQL to enable query history on Db2 and grant the service account permissions to select from the QLI tables:

CREATE EVENT MONITOR alationQueryLogMonitor FOR STATEMENTS
    WRITE TO TABLE connheader(
    TABLE alationDb2QueryLogger.connection_header INCLUDES(sequence_no, auth_id, appl_id, conn_time)
    ),
    stmt(
        TABLE alationDb2QueryLogger.statement INCLUDES(
          appl_id,
          stmt_operation,
          stmt_text,
          start_time,
          stop_time,
          system_cpu_time,
          user_cpu_time
          )
    )
MANUALSTART BUFFERSIZE 512 NONBLOCKED;
SET EVENT MONITOR alationQueryLogMonitor STATE = 1;

GRANT SELECT ON alationDb2QueryLogger.statement TO alation;
GRANT SELECT ON alationDb2QueryLogger.connection_header TO alation;

Configure Table-Based QLI

If you opt for table-based QLI, in a schema of your choice create a view on top of the QLI tables using the SQL example below and grant the service account the SELECT rights on this view. You will need to provide the name of this view in Alation when configuring QLI in the data source settings.

When using the example below:

  • Substitute the placeholder value <schema_name> with your schema name.

  • You can give the view any name of your choice. Substitute the placeholder name <view_name> with the name you choose.

CREATE VIEW <schema_name>.<view_name> AS
  SELECT
      TRIM(alationDb2QueryLogger.connection_header.auth_id) AS userName,
      alationDb2QueryLogger.connection_header.conn_time AS sessionstartTime,
      alationDb2QueryLogger.statement.start_time AS startTime,
      TRIM(alationDb2QueryLogger.connection_header.auth_id || '/' || to_char(alationDb2QueryLogger.statement.start_time,
          'YYYY-MM-DD HH:MI:SS.SSSSS') || '/' || alationDb2QueryLogger.statement.appl_id) AS sessionId,
      alationDb2QueryLogger.statement.stmt_text AS queryString,
      'N' AS cancelled,
      '' AS defaultDatabases,
      (CASE
          WHEN timestampdiff(2, char(alationDb2QueryLogger.statement.stop_time-alationDb2QueryLogger.statement.start_time)) > 10*60
              THEN cast(timestampdiff(2, char(alationDb2QueryLogger.statement.stop_time-alationDb2QueryLogger.statement.start_time)) AS decimal(31,3))
          ELSE cast(timestampdiff(1, char(alationDb2QueryLogger.statement.stop_time-alationDb2QueryLogger.statement.start_time))/1000000.0 AS decimal(31,3))
      END) as seconds_taken,
      CAST(alationDb2QueryLogger.statement.system_cpu_time/1000000.0 AS decimal(31,3)) AS system_cpu_time,
      CAST(alationDb2QueryLogger.statement.user_cpu_time/1000000.0 AS decimal(31,3)) AS user_cpu_time
FROM alationDb2QueryLogger.statement
INNER JOIN alationDb2QueryLogger.connection_header
    ON alationDb2QueryLogger.connection_header.appl_id = alationDb2QueryLogger.statement.appl_id
WHERE alationDb2QueryLogger.statement.stmt_operation IN (3,6);

GRANT SELECT ON <schema_name>.<view_name> TO alation;

On the Query Log Ingestion tab, in the Table Name field under Connector Settings > Query Extraction, specify the name of the view in which the query logs are available. Make sure that the service account has permissions to select from this view. The table name must be provided in the following format: database.schema.view_name.

Configure Query-Based QLI

If you cannot create the view for QLI, for example, due to access restrictions, you can use a custom QLI query to extract query history into Alation. If you opt for query-based QLI, Alation will query the system table storing query history every time you manually run QLI or when the QLI job runs on schedule.

For custom query-based QLI to succeed, ensure that the service account has enough permissions to select from the system table.

The template for the QLI query is given below. You can customize it by adding, removing, or changing the filter, but the columns and their aliases must remain as is since Alation expects this query structure.

Note

  • The query must include the following fields: userName, startTime, sessionStartTime, sessionId, queryString, cancelled, defaultDatabases, seconds_taken.

  • When using the QLI query template, do not substitute the STARTTIME and ENDTIME parameters in the WHERE filter. These parameters are not actual column names and should stay as is. They are expected by the connector and will be substituted with the start and end date of the QLI range selected in the user interface when QLI is run manually or on schedule.

To configure query-based QLI:

  1. Go to the Query Log Ingestion tab of the Settings page of your OCF data source.

  2. Under Connector Settings > Query Extraction, in the Custom QLI Query field, specify the QLI query.

  3. Click Save.

QLI Query Template

SELECT
  TRIM(alationDb2QueryLogger.connection_header.auth_id) AS userName,
  alationDb2QueryLogger.connection_header.conn_time AS sessionstartTime,
  alationDb2QueryLogger.statement.start_time as startTime,
  TRIM(alationDb2QueryLogger.connection_header.auth_id || '/' || to_char(alationDb2QueryLogger.statement.start_time,'YYYY-MM-DD HH:MI:SS.SSSSS') || '/' || alationDb2QueryLogger.statement.appl_id) AS sessionId,
  alationDb2QueryLogger.statement.stmt_text AS queryString,
  'N' as cancelled,
  '' as defaultDatabases,
  (CASE WHEN timestampdiff(2, CHAR(alationDb2QueryLogger.statement.stop_time-alationDb2QueryLogger.statement.start_time)) > 10*60
      THEN cast(timestampdiff(2, CHAR(alationDb2QueryLogger.statement.stop_time-alationDb2QueryLogger.statement.start_time)) AS decimal(31,3))
      ELSE cast(timestampdiff(1, CHAR(alationDb2QueryLogger.statement.stop_time-alationDb2QueryLogger.statement.start_time))/1000000.0 AS decimal(31,3))
  END) AS seconds_taken,
  CAST(alationDb2QueryLogger.statement.system_cpu_time/1000000.0 AS decimal(31,3)) AS system_cpu_time,
  CAST(alationDb2QueryLogger.statement.user_cpu_time/1000000.0 AS decimal(31,3)) AS user_cpu_time
FROM alationDb2QueryLogger.statement
INNER JOIN alationDb2QueryLogger.connection_header
  ON alationDb2QueryLogger.connection_header.appl_id = alationDb2QueryLogger.statement.appl_id
WHERE alationDb2QueryLogger.statement.stmt_operation IN (3,6)
  AND alationDb2QueryLogger.statement.stmt_text IS NOT NULL
  AND TRIM(alationDb2QueryLogger.statement.stmt_text) <> ''
  AND start_time > STARTTIME
  AND start_time < ENDTIME
ORDER BY
  sessionId,
  startTime;

Perform QLI

You can either perform QLI manually on demand or enable automated QLI:

  1. To perform manual QLI, under the Automated and Manual Query Log Ingestion section of the Query Log Ingestion tab, ensure that the Enable Automated Query Log Ingestion toggle is disabled.

    Note

    Metadata extraction must be completed first before running QLI.

  2. Click Preview to get a sample of the query history data to be ingested.

  3. Click the Import button to perform QLI on demand.

  4. To schedule QLI, enable the Enable Automated Query Log Ingestion toggle.

  5. Set a schedule under Automated Query Log Ingestion Time by specifying values in the week, day, and time fields. The next QLI job will run on the schedule you have specified.

    Note

    The hourly schedule for automated QLI is not supported.

Log Location

Refer to Troubleshooting for information about logs.