Azure Power BI Scanner OCF Connector: Install and Configure

Alation Cloud Service: Applies to Alation Cloud Service instances of Alation

Customer Managed: Applies to customer-managed instances of Alation

Follow these steps to perform the necessary configuration for the Azure Power BI Scanner connector on the Power BI side and in Alation.

Network Configuration

Open outbound TCP port 443 on the Azure Power BI server for the REST API communication.

Note

If the Azure Power BI server is connected through a proxy, open the inbound TCP port 3128.

I. Azure Power BI Scanner Setup and Authentication

Step 1: Register an Application in Azure Portal

Register an application in Azure Portal to use as a service principal to access Power BI from Alation:

  1. Navigate to App registrations in Microsoft Azure Portal.

  2. Click New registration:

    ../../../_images/powerb2.png
  3. Provide the following information:

    • A name for the application

    • Select the account type Accounts in this organizational directory only (<Org Name> only - Single tenant)

    • The Power BI URL as the Redirect URI: https://api.powerbi.com, or https://api.powerbigov.us for Azure Government Cloud

      ../../../_images/powerb3.png
  4. Click Register.

  5. On the next screen, locate Application (client) ID and Directory (tenant) ID. Save them for future reference.

    ../../../_images/powerb4.png
  6. On the left pane, click Certificates & secrets to generate the Client Secret.

    ../../../_images/powerb5.png
  7. Click New client secret under the Client secrets section.

    ../../../_images/powerb6.png
  8. On the popup window, provide the Description for the client secret and click Add.

    ../../../_images/powerb7.png
  9. Save the Client Secret for future reference.
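To sanity-check the registration before continuing, you can request a token from Azure AD with the OAuth 2.0 client-credentials grant, using the tenant ID, client ID, and client secret saved above. The sketch below only builds the request; `build_token_request` is an illustrative helper, not part of the connector. POST the returned body to the returned URL with any HTTP client.

```python
from urllib.parse import urlencode

# Placeholder values: substitute the IDs saved in steps 5 and 9.
TOKEN_URL_TEMPLATE = "https://login.microsoftonline.com/{tenant}/oauth2/token"
# Use https://analysis.usgovcloudapi.net/powerbi/api for Azure Government Cloud.
RESOURCE = "https://analysis.windows.net/powerbi/api"

def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client-credentials token request."""
    url = TOKEN_URL_TEMPLATE.format(tenant=tenant_id)
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": RESOURCE,
    })
    return url, body
```

POST the body with Content-Type application/x-www-form-urlencoded; a JSON response containing access_token confirms that the registration and secret are valid.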

Step 2: Assign Workspace Permissions

Grant the application you created in Step 1 member-level access to the workspaces you plan to catalog in Alation. Member-level permissions enable the connector to extract report dimensions and dataflows. For steps, refer to Give users access to workspaces in Power BI in Power BI documentation.
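If you prefer to script the permission grant rather than click through the Power BI UI, the Groups AddGroupUser REST API can add the service principal as a workspace member. A minimal sketch that builds the request; `build_add_member_request` is a hypothetical helper, and the body must be POSTed with an admin user's access token:

```python
import json

API_BASE = "https://api.powerbi.com/v1.0/myorg"  # api.powerbigov.us for Gov Cloud

def build_add_member_request(workspace_id: str, app_client_id: str):
    """Return (url, json_body) for the Groups AddGroupUser REST call that
    grants the service principal Member access to one workspace."""
    url = f"{API_BASE}/groups/{workspace_id}/users"
    body = json.dumps({
        "identifier": app_client_id,       # Application (client) ID from Step 1
        "principalType": "App",
        "groupUserAccessRight": "Member",  # Member level is needed for dataflows
    })
    return url, body
```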

Step 3: Create a Security Group for Azure Power BI Scanner

  1. Navigate to the Groups management section of Azure Portal.

  2. Click New group.

    ../../../_images/powerb8.png
  3. Enter the following details:

    • Select Security for Group Type.

    • Enter the Group Name and Group Description.

    • Under Members, add the app created in Step 1 as the member of the group.

    • Click Create.

    ../../../_images/powerb9.png

Step 4: Enable Azure Power BI Scanner Service Admin Settings

For an Azure application to access Power BI content and APIs, a Power BI admin must enable service principal access in the Power BI admin portal.

  1. Log in to Power BI.

  2. From the Settings menu, select the Admin portal:

    ../../../_images/powerb10.png
  3. Navigate to Tenant Settings > Developer Settings and perform the following configuration:

    3.1 Enable Allow service principals to use Power BI APIs.

    3.2 Select the Specific security groups and select the security group created in Step 3.

    3.3 Click Apply to apply the settings.

    ../../../_images/PowerBI_39.png
  4. Navigate to Tenant Settings > Admin API Settings and perform the following configuration:

    4.1 Enable Allow service principals to use read-only Power BI Admin APIs.

    4.2 Select the Specific security groups and select the security group created in Step 3.

    4.3 Click Apply to apply the settings.

    4.4 Similarly, enable the following flags:

    • Enhance admin APIs responses with detailed metadata

    • Enhance admin APIs responses with DAX and mashup expressions

    ../../../_images/powerb11.png
  5. In Tenant Settings, navigate to Gen1 Dataflow Settings:

    5.1 Enable Create and use Gen1 dataflows to extract the dataflow objects in Alation.

    5.2 Click Apply to save the settings.

    ../../../_images/powerb19.png

    Note

    The service principal requires member-level access to workspaces in Power BI to extract dataflows. A dataflow connection missing error may occur during extraction if workspace permissions have not been granted.

  6. Under Tenant Settings, navigate to Download Reports:

    6.1 Enable Download Reports to extract report fields into Alation. This setting is also required for Column Level Lineage.

    6.2 Click Apply to save the settings.

    ../../../_images/powerb21.png
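Once the tenant settings are applied (changes can take a few minutes to propagate), you can verify that the service principal is allowed to call the read-only Admin APIs. The sketch below builds a minimal GetGroupsAsAdmin request; `build_admin_check_request` is an illustrative helper. A 200 response confirms the settings, while a 401 suggests the security group is not yet authorized.

```python
def build_admin_check_request(api_base: str, access_token: str):
    """Return (url, headers) for a minimal read-only Admin API call
    (GetGroupsAsAdmin with $top=1)."""
    url = f"{api_base}/admin/groups?$top=1"
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers
```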

II. Alation Setup and Authentication

Step 1: Install the Connector

Important

Installation of an OCF connector requires Alation Connector Manager to be installed as a prerequisite.

  1. If this has not been done on your instance, install Alation Connector Manager using the steps in: Install Alation Connector Manager.

  2. Make sure that the OCF connector Zip file that you received from Alation is available on your local machine.

  3. Install the connector on the Connectors Dashboard page: refer to Manage Connector Dashboard.

Step 2: Create and Configure a New BI Server Source

Add a New BI Server Source

This configuration requires the Server Admin role.

  1. Log in to the Alation instance and add a new BI Server source: Apps > Sources > Add > BI Server. The Register a Business Intelligence Server screen will open.

  2. On this screen, from the Select a Business Intelligence Server type list, select the PowerBIScanner connector name.

    Note

    The connector name appears in this list only after it is installed.

    ../../../_images/powerb12.png
  3. Provide the Title and a Description (optional) for your PowerBI Scanner data source.

    ../../../_images/powerb13.png

Configure the Azure Power BI Scanner Source

  1. Configure Access:

    Applies from release 2023.3.5

    On the Access tab, set the BI source visibility as follows:

    • Public BI Server—The BI source will be visible to all users of the catalog.

    • Private BI Server—The BI Source will be visible to users that have been assigned the BI Server Admin or Viewer role. It will be hidden for all other users.

    ../../../_images/AccessMenu_BISource.png

    You can add more BI Admins or Viewers in the BI Server Admins section if required.

    For more information, see Configure Access to OCF BI Sources.

  2. Perform the configuration on the General Settings tab.

    Note

    This section describes configuring settings for credentials and connection information stored in the Alation database. If your organization has configured Azure KeyVault or AWS Secrets Manager to hold such information, the user interface for the General Settings page will change to include the following icons to the right of most options:

    ../../../_images/VaultOrDB.png

    By default, the database icon is selected, as shown. In the vault case, instead of the actual credential information, you enter the ID of the secret. See Configure Secrets for OCF Connector Settings for details.

    Specify Application Settings:

    ../../../_images/powerb14.png

Parameter

Description

Enable Raw Dump or Replay

The options in this drop-down list can be used to dump the extracted metadata into files in order to debug extraction issues before ingesting the metadata into Alation. Use this feature during testing if there are issues with metadata extraction (MDE). It splits extraction into two steps: first, the extracted metadata is dumped into files, where it can be reviewed; second, the metadata is ingested from the files into Alation. Keep this feature enabled only while debugging.

  • Enable Raw Metadata Dump: Select this option to save extracted metadata into a folder for debugging purposes. The dumped data will be saved in four files (attribute.dump, function.dump, schema.dump, table.dump) in folder opt/alation/site/tmp/ inside Alation shell.

  • Enable Ingestion Replay: Select this option to ingest the metadata from the dump files into Alation.

  • Off: Disable the Raw Metadata Dump or Replay feature. Extracted metadata will be ingested into Alation.

Disable Automatic Lineage Generation

Select the Disable Automatic Lineage Generation checkbox to skip the creation of automatic lineage after extraction. When automatic lineage generation is disabled, Alation does not calculate lineage data for this BI source during extraction.

For more information, see Disable Automatic Lineage Generation.

Disable Permission Enforcement

Leave this checkbox selected.

Permission enforcement is not supported between Azure Power BI Scanner and Alation.

Disable Certification

Leave this checkbox selected.

Certification of BI objects in Azure Power BI Scanner from Alation is not supported.

Server URI

Enter the server URI used to access Azure Power BI Scanner.

  1. Click Save to save the information you have entered.

  2. Specify Connector Settings:

../../../_images/powerb15.png

Parameter

Description

Power BI Web URL

Specify the URL to access Power BI. The default values are: https://app.powerbi.com/ and https://app.powerbigov.us/ - for Gov Cloud

Power BI API URL

Specify the API URL to access Power BI APIs. The default values are: https://api.powerbi.com/v1.0/myorg/ and https://api.powerbigov.us/v1.0/myorg/ - for Gov Cloud

API URL for Access Token

Specify the API URL to generate the Access Token. The default value is https://login.microsoftonline.com/common/oauth2/token

Power BI Resource URL

Specify the Power BI Resource URL to grant access scope to Power BI APIs. The default values are: https://analysis.windows.net/powerbi/api and https://analysis.usgovcloudapi.net/powerbi/api - for Gov Cloud

Power BI Tenant ID

Specify the Power BI tenant ID.

Power BI Client ID

Specify the Power BI client ID.

Power BI Client Secret

Specify the Power BI client secret.

Proxy URL

Specify the URL to access Power BI via a proxy server. This is an optional field and should be used only if Azure Power BI Scanner is connected through a proxy.

Proxy Port

Specify the proxy port number. This is an optional field and should be used only if Azure Power BI Scanner is connected through a proxy.

Proxy Username

Specify the proxy server username. This is an optional field and should be used only if Azure Power BI Scanner is connected through a proxy.

Proxy Password

Specify the proxy server password. This is an optional field and should be used only if Azure Power BI Scanner is connected through a proxy.

Power BI Workspace Extraction Batch Size

Provide the workspace extraction batch size. Alation supports a maximum batch size of 100. Reducing this value increases the number of API calls made against Azure Power BI Scanner but decreases the amount of data fetched with each call.

Enable Apps Extraction

Enable this checkbox to extract the apps from Azure Power BI Scanner.

Enable Report Fields Extraction

Enable this checkbox to extract the report fields from Azure Power BI Scanner. Additionally, enabling this option also enables Column Level Lineage (CLL) to work in supported Alation versions (2024.1.2 onwards). However, enabling this option increases the extraction time.

Enable Extraction For Service Principal

Select this checkbox to enable service principal-based extraction and the listing of projects that the service principal has access to view.

Log Level

Select the log level to generate logs. The available options are INFO, DEBUG, WARN, TRACE, ERROR, FATAL, ALL.

Pause Extraction if API limit is reached

Enable this checkbox to pause extraction when the API limit is reached. Extraction pauses until the API limit is refreshed, which takes a maximum of one hour. If this checkbox is disabled, extraction completes when the API limit is reached: the metadata extracted up to that point is added to the catalog, and extraction does not resume when the API limit is refreshed.

  1. Click Save.

  2. Under Test Connection, click Test to validate network connectivity.

    ../../../_images/powerb16.png
  3. Under Extraction Settings, turn on Selective Extraction if required. Selective extraction settings are used to apply a filter to include or exclude a list of projects.

    3.1 Click Get List of Projects to first fetch the list of projects.

    3.2 The status of the Get Projects action is logged in the Job History table at the bottom of the Settings page.

    3.3 Once the project synchronization is successful, a drop-down list of projects will become enabled. Select one or more projects to apply the filter.

    3.4 Check that you are using the desired filter option. The available filter options are described below:

    Filter Option

    Description

    Extract all Projects except

    Extract metadata from all projects except the selected projects.

    Extract only these Projects

    Extract metadata from only the selected projects.

    3.5 Click Run Extraction Now to extract metadata. The status of the extraction action is also logged in the Job History table at the bottom of the page.

    Note

    Extraction of source description is supported from connector version 1.4.3.

    Note

    Extraction of dataflow objects is supported from connector version 2.0.1. See Dataflows for more details.

  4. If you wish to automatically update the metadata in the Catalog, under Automated and Manual Extraction, turn on Enable Automated Extraction and select the day and time when metadata must be extracted. The metadata extraction will be automatically scheduled to run at the selected day and time.

Lineage

The OCF connector for Azure Power BI automatically calculates lineage information during metadata extraction. The lineage information you get in Alation depends on the Alation version and the connector version:

  • Lineage for Power BI dataflow objects is supported from Alation version 2023.3.3

  • Column-level lineage is supported from Alation version 2024.1.2

By default, the connector generates report-level lineage (dataset > report > dashboard). You can perform additional configuration to get more lineage information on lineage charts:

Configure Cross-System Lineage

You can configure cross-system lineage to generate lineage between your Power BI Scanner BI source and the RDBMS data sources supported by the connector for lineage:

  • Amazon Redshift

  • Azure Databricks

  • Azure SQL DB

  • Azure Synapse Analytics

  • Databricks on AWS

  • Databricks Unity Catalog

  • Google BigQuery

  • MySQL

  • Oracle

  • PostgreSQL

  • Snowflake

  • SQL Server

  • Teradata

Note

RDBMS data sources must be cataloged in Alation and be connected with the corresponding OCF connector.

Cross-system lineage is configured in the settings of an RDBMS data source. To generate cross-system lineage, configure the BI Connection Info field (Additional datasource connections field) on the RDBMS connector’s General Settings tab in the format host:port (for example, adb-8443049157651279.19.azuredatabricks.net:443). Find more information in Configure Cross-Source Lineage.

The screenshot below shows cross-system lineage configuration on the General Settings tab of a supported RDBMS data source:

../../../_images/powerb17.png
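The host:port value can usually be derived from a fuller connection string. A simplified illustration of that normalization; `to_host_port` is a hypothetical helper for this document, not connector code:

```python
from urllib.parse import urlparse

def to_host_port(connection: str, default_port: int) -> str:
    """Reduce a connection string to the host:port form expected by the
    BI Connection Info field. Hypothetical helper for illustration only."""
    # urlparse only fills hostname/port when the string has a netloc part.
    parsed = urlparse(connection if "//" in connection else "//" + connection)
    port = parsed.port or default_port
    return f"{parsed.hostname}:{port}"
```

For example, a bare Databricks hostname with the default HTTPS port becomes adb-8443049157651279.19.azuredatabricks.net:443, matching the format shown above.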

Column-Level Lineage

From Alation version 2024.1.2, you can enable lineage calculation for columns. With column-level lineage enabled, Alation generates lineage information for Power BI reports at the report field level. With both column-level lineage and cross-system lineage configured, users can trace lineage from a specific RDBMS column to a specific report field in Power BI:

../../../_images/powerb20.png

Enable Column-Level Lineage

Applies from version 2024.1.2

Column-level lineage can be activated by a Server Admin in Admin Settings > Feature Configuration.

Note

On versions before 2024.1.2, Power BI Scanner OCF sources only support report-level and table-level lineage.

To enable column-level lineage:

  1. Log in to Alation as a Server Admin.

  2. Click the three gears icon on top right to open the Admin Settings page.

  3. Under the Server Admin section, click Feature Configuration.

  4. Locate the toggle Automatically extracts Column Level Lineage from Power BI Scanner data source. Click the toggle to activate the feature.

  5. Click Save changes on the top right of the page to apply the change.

  6. Go to connector General Settings page for your BI Source.

  7. Select the Enable Report Fields Extraction checkbox and click Save.

  8. Click Run Extraction Now to extract metadata.

Now, lineage for Power BI Scanner OCF sources will be generated at the column level.

Note

Enabling report field extraction can increase the extraction time because it depends on exporting and parsing PBIX reports, which can take longer for larger files.

Lineage Limitations

All Lineage

  • You must upgrade to Alation version 2023.3.2 or higher to be able to generate cross-system lineage from Databricks Unity Catalog with connector version 2.1.0 or higher.

  • To use lineage with UC Databricks in Alation version 2023.3.2, enable the flag alation.resolution.DEV_bi_connector_returns_db_for_two_part_schema. No flag is required from Alation version 2023.3.3 onwards for cross-system lineage to be generated between Databricks Unity Catalog lineage and Power BI.

  • Lineage for paginated reports (RDL) is not supported. Paginated reports do not contain the dataset ID required to show lineage between a Power BI dataset and a Power BI paginated report.

  • Lineage between data source tables and datasets is built through parsing dataset expressions which are Power BI M Queries (Power Queries). Alation gets these expressions as a response from the getScanResult API. See Expressions Supported by Lineage for examples of supported expressions.

  • Power BI Scanner API doesn’t return datasets with object-level security. As a result, Alation will not show connections, fields, and lineage for such datasets.
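To see why only certain expression shapes produce lineage, consider a much-simplified version of the parsing: a pattern match over the navigation steps of an M query, applied after JSON-decoding the expression. This is an illustration of the idea only, not the connector's actual parser:

```python
import re

# Matches navigation steps such as {[Schema="dbo",Item="Customer_Orders"]}
# in a JSON-decoded M expression. A simplified illustration, not the
# connector's actual parser.
_STEP = re.compile(r'\{\[Schema="([^"]+)",Item="([^"]+)"\]\}')

def extract_tables(m_expression: str):
    """Return (schema, table) pairs referenced by Sql.Databases-style steps."""
    return _STEP.findall(m_expression)
```

An expression that does not contain such a navigation step (for example, one built entirely from variables) yields no matches, and no table-level lineage can be derived from it.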

Limitations on Column-Level Lineage

  • Power BI column-level lineage is not supported for the following objects:

    • Power BI dashboards

      • Column-level lineage is not shown between a report and a dashboard or a tile and a dashboard

    • Power BI tiles

      • Column-level lineage is not shown between an upstream source and a tile

    • Power BI report measures

  • If a data source column name is modified in Power BI in a dataset or a dataflow, then column-level lineage is supported only if Alation gets the renamed column information in one of the supported formats as part of the dataset expression in the scanner API response.

Note

From Power BI Scanner OCF connector version 2.2.1, the connector supports the parsing of queries that contain table names with periods (.) in them. The table names must be enclosed in quotes to ensure correct parsing, for example:

SELECT a.name, a.id FROM schema.\"Table.name\" a

Troubleshooting

For general recommendations on troubleshooting OCF connectors, refer to Troubleshooting.

Find specific troubleshooting scenarios below:

Missing Power BI Report Fields

Problem

Report fields are only partially extracted. The logs show an API limit exceeded error with the 429 response code for the PBIX Download API. The message in the logs is similar to the following:

{"timeMillis":1711699692373,"thread":"grpc-default-executor
0","level":"WARN","loggerName":"org.springframework.web.client.RestTemplate","message":"GET request for \"https://api.powerbi.com/v1.0/myorg
groups/c87a7b55-08f2-4565-84b8-e717263c7741/reports/09b2a05b-c8fc-4edc-80ba-43369b897565/Export\" resulted in 429 (null); invoking error
handler","endOfBatch":false,"loggerFqcn":"org.apache.commons.logging.impl.Log4JLogger","contextMap":{},"threadId":17,"threadPriority":5,"source
:{"class":"org.springframework.web.client.RestTemplate","method":"handleResponseError","file":"RestTemplate.java","line":559,"classLoaderName":
app"},"timestamp":"2024-03-29T08:08:12.373+0000"}

Solution

The issue is caused by Power BI REST API throttling. You may have to reach out to the Azure Power BI Support team about increasing the REST API limit for your Power BI server.
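If you call the Power BI REST APIs yourself while diagnosing throttling, a client-side pause-and-retry, similar in spirit to the connector's Pause Extraction if API limit is reached option, can help. In this sketch, `RateLimited` and `fetch` are stand-ins for real HTTP handling:

```python
import time

class RateLimited(Exception):
    """Stand-in for an HTTP 429 response from the Power BI REST API."""

def call_with_rate_limit_pause(fetch, max_pause_seconds=3600, interval=60,
                               sleep=time.sleep):
    """Call fetch(); on RateLimited, wait and retry until the throttling
    window (at most one hour) has passed, then re-raise if still limited."""
    waited = 0
    while True:
        try:
            return fetch()
        except RateLimited:
            if waited >= max_pause_seconds:
                raise
            sleep(interval)
            waited += interval
```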

Missing Tiles

Problem

Some of the tiles under the dashboard are not cataloged.

Solution

If a tile has no title, Alation skips cataloging it. Ensure that tiles have titles in Power BI.

Cross-System Lineage Is Not Displayed

Problem

Cross-system lineage has been configured on a data source, but the lineage chart does not display upstream lineage.

Solution

Verify the following:

  • Make sure that the data source on which the cross-system lineage is configured is supported by this connector. See the Expressions Supported by Lineage table for the data sources supported by this connector.

  • Make sure that the value in the BI Connection Info (Additional datasource connections) field uses the format host:port.

  • Make sure that the connection details in the Azure Power BI Scanner connector and the data source are in the same format: catalog.schema.tablename or schema.tablename.

    ../../../_images/powerb18.png

Existing Tables Appear as TMP Objects in Lineage

Problem

Tables on lineage charts appear as TMP objects when a schema’s name includes the period symbol.

Cause

The issue is due to a limitation in Alation’s framework, where the convention for schema names is not to include periods and to follow the format catalogName.schemaName.tableName. If a schema name includes periods, Alation can’t differentiate between schemaName and tableName.

Solution

This is a known issue.

Columns Do Not Appear in the Dropdown for a Dataset or Dataflow in Lineage

Problem

Columns don’t appear in the dropdown for a dataset or a dataflow node on the lineage chart.

Cause

Columns are displayed on lineage charts only if the connector has extracted enough metadata to establish a relationship between a Power BI report field and the Power BI dataset field from which it is created. The mapping between fields is done internally in the connector code but depends on the PBIX report export and parsing. For example, if Alation failed to download a Power BI PBIX report due to a limitation such as the 429 throttling error, the connector won't be able to extract report fields and display column-level lineage.

Solution

Check the connector logs for errors while extracting and downloading reports. The logs will point to the specific cause of the issue.

Missing Columns in Lineage after Selective MDE

Problem

Lineage is not displayed for some reports after selective extraction (MDE). However, all lineage is displayed after full MDE.

Solution

The report that doesn't have lineage after selective MDE may have been created from a dataset or dataflow in a different workspace. For cross-workspace lineage to work, you must catalog all the workspaces where the datasets populating the reports are located. Ensure that you select all relevant workspaces when configuring selective extraction.

Column Level Lineage Appears Even After Disabling

Problem

If you enabled report fields extraction and ran extraction with Column Level Lineage enabled in Feature Configuration, and later turned off these settings due to increased extraction time, the columns remain in the lineage graph even after re-running the extraction.

Solution

This is a known issue.

Migrating Logical Metadata

You can migrate the logical metadata from a Power BI instance connected using the older Azure Power BI connector to the same Power BI instance connected using the Azure Power BI Scanner connector.

Prerequisites

Prepare the following to migrate the metadata:

  1. The BI source that is configured with the old Power BI OCF Connector.

  2. The Power BI migration script: create a ticket with Alation Support about receiving the Power BI migration script.

  3. Make sure that the required Azure Power BI Scanner OCF connector has been installed on the Alation server. The installed OCF connector is displayed on the Admin Settings > Server Admin > Manage Connectors page.

  4. Configure the same Power BI instance (as in step 1) as a new BI Server source with the Azure Power BI Scanner connector and run the extraction.

Scope

The following information will be migrated from the old Power BI connector to the Azure Power BI Scanner connector:

  • Data source ID - The data source ID will remain the same.

  • Logical metadata such as values of the custom fields, data quality flags, @-mentions in the custom fields, @-mentions in articles, object sets, people sets, pickers, multi-select pickers, the title and description, tags, top users, and fields shared through catalog sets.

Migration

Perform the following steps to migrate the metadata:

  1. Copy the migration script that you received from Alation to the /tmp directory on the Alation host. If you use the example below, replace <ssh-user> with your username.

scp copy_powerbi_logical_data_to_scanner.py <ssh-user>@{alation-instance-ip}:/tmp

  2. Use SSH to connect to the Alation host.

  3. Enter the Alation shell:

sudo /etc/init.d/alation shell

  4. Stage the migration script in the one_off_scripts directory:

sudo cp /tmp/<Migration_Script>.py /opt/alation/alation/opt/alation/django/rosemeta/one_off_scripts

  5. Change user to alation:

sudo su alation

  6. Navigate to the one_off_scripts directory:

cd opt/alation/django/rosemeta/one_off_scripts/

  7. Find the SourceID of the connector. The SourceID of the data source can be viewed in the browser URL when you open the data source catalog page. In the example below, the SourceID is 9.

Example:

https://qualified_domain_name/bi/v2/server/9/

  8. Run the migration script with the command below:

python <Migration_Script>.py -s <Old Power BI Connector_SourceID> -t <New Azure Power BI Scanner Connector_SourceID>

  9. Exit the shell:

exit

  10. The migration status is not stored in logs; it is displayed in the console only during the migration process.
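The SourceID lookup above can also be scripted. A small helper, assuming the /bi/v2/server/<id>/ URL shape shown in the example:

```python
import re

def source_id_from_url(catalog_url: str) -> int:
    """Extract the numeric SourceID from a BI server catalog page URL,
    e.g. https://<host>/bi/v2/server/9/ gives 9."""
    match = re.search(r"/bi/v2/server/(\d+)/?", catalog_url)
    if match is None:
        raise ValueError("not a BI server catalog page URL")
    return int(match.group(1))
```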

Expressions Supported by Lineage

Lineage between data source tables and Power BI datasets is built through parsing dataset expressions which are Power BI M Queries (Power Queries). Alation gets these expressions as responses from the getScanResult API. Lineage between tables and datasets will only be generated if expressions are in one of the formats shown below.

Database Name

Connection Type

Sample Expression

Azure Databricks

Table

"let\n    Source = Databricks.Contents(\"adb-8651250466505288.8.azuredatabricks.net\", \"sql/protocolv1/o/8651250466505288/
0622-214404-awash362\", [Database=null, BatchSize=null]),\n    SPARK_Database = Source{[Name=\"SPARK\",Kind=\"Database\"]}
[Data],\n    default_Schema = SPARK_Database{[Name=\"default\",Kind=\"Schema\"]}[Data],\n    diamonds_Table =
default_Schema{[Name=\"diamonds\",Kind=\"Table\"]}[Data]\nin\n    diamonds_Table";

AWS Databricks

Table

"let\n Source = Databricks.Catalogs(\"dbc-25e69bfd-44ed.cloud.databricks.com\", \"sql/protocolv1/o/7841352139603430/
0118-050057-4xra3flu\", [Database=null, BatchSize=null]),\n       hive_metastore_Database = Source{[Name=\"hive_metastore\",
Kind=\"Database\"]}[Data],\n           technology_Schema = hive_metastore_Database{[Name=\"default\",Kind=\"Schema\"]}[Data],
\n               strm_dems_Table = technology_Schema{[Name=\"student2\",Kind=\"Table\"]}[Data]\n          in\n
strm_dems_Table\n";

"let\n    Source = DatabricksMultiCloud.Catalogs(\"dbc-25e69bfd-44ed.cloud.databricks.com\", \"sql/protocolv1/o/
7841352139603430/0210-085930-oexhpgse\", [Catalog=null, Database=null, EnableAutomaticProxyDiscovery=null]),\n
hive_metastore_Database = Source{[Name=\"hive_metastore\",Kind=\"Database\"]}[Data],\n    default_Schema = hive_metastore_
Database{[Name=\"default\",Kind=\"Schema\"]}[Data],\n    student2_Table = default_Schema{[Name=\"student2\",Kind=\"Table\"]}
[Data]\n in\n    student2_Table\n";

Unity Databricks

Table

"let\n    Source = Databricks.Catalogs(\"adb-8443049157651279.19.azuredatabricks.net\", \"sql/protocolv1/o/
 900788168547414/1031-122656-8zrkv0jk\", [Catalog=\"default\", Database=null, EnableAutomaticProxyDiscovery=null]),\n
 hive_metastore_Database = Source{[Name=\"hive_metastore\",Kind=\"Database\"]}[Data],\n  default_Schema = hive_metastore_
 Database{[Name=\"default\",Kind=\"Schema\"]}[Data],\n    t1_Table = default_Schema{[Name=\"t1\",Kind=\"Table\"]}
 [Data]\nin\n    t1_Table";

SQL Server

Table

"let\n    Source = Sql.Databases(\"ifac-sqlsrv.ceeyrlqdpprr.us-west-2.rds.amazonaws.com\"),\n    Sales = Source{[Name=\"Sales\
"]}[Data],\n    dbo_Customer_Orders = Sales{[Schema=\"dbo\",Item=\"Customer_Orders\"]}[Data]\nin\n    dbo_Customer_Orders";

"let\n    Source = Sql.Databases(\"tcp:ifac-sqlsrv.ceeyrlqdpprr.us-west-2.rds.amazonaws.com,1433\"),\n    AdventureWorks =
Source{[Name=\"AdventureWorks\"]}[Data],\n    Sales_vPersonDemographics =
AdventureWorks{[Schema=\"Sales\",Item=\"vPersonDemographics\"]}[Data]\nin\n    Sales_vPersonDemographics";

"let\n Source = Sql.Databases(\"tcp:sqlwrhlondonprod.b05aaf70da1f.database.windows.net\"),\n LONDON = Source{[Name=\"london\"]}
[Data],\n london_pbi_SHIFTREPORT_REACHSTAT_Allinone = LONDON{[Schema=\"london\",Item=\"pbi_SHIFTREPORT_REACHSTAT_Allinone\"]}
[Data],\n #\"Filtered Rows\" = Table.SelectRows(london_pbi_SHIFTREPORT_REACHSTAT_Allinone, each [P_sysdate] >= RangeStart and
[P_sysdate] <= RangeEnd)\nin\n #\"Filtered Rows\"";

Query

"let\n    Source = Sql.Database(\"ifac-sqlsrv.ceeyrlqdpprr.us-west-2.rds.amazonaws.com\", \"pubs\", [Query=\"select * from
jobs\"])\nin\n    Source";

"let\n    Source = Sql.Database(\"synapse-edw-d.sql.azuresynapse.net\", \"edwsyndsql\", [Query=\"SeLeCt#(lf)    s.f_name,#(lf)
t.f_name#(lf)from #(lf)    tmp.src_test s#(lf)inner JOIN  #(lf)    tmp.tgt_test t#(lf)ON  #(lf)    s.f_name = t.f_name\"])\nin
\n    Source";

"let\n    Source = Sql.Database(\"synapse-edw-d.sql.azuresynapse.net\", \"edwsyndsql\", [Query=\"select#(lf)    s.f_name,#(lf)
t.f_name#(lf)from #(lf)    tmp.src_test s#(lf)inner JOIN  #(lf)    tmp.tgt_test t#(lf)ON  #(lf)    s.f_name = t.f_name\"])\nin
\n    Source";

"let\n    Source = Sql.Database(\"10.13.12.200:1433\", \"test_database\", [Query=\"select * from
\"\"test_profilling_main.arcs.test\"\".\"\"arcstable\"\"\"])\nin\n   Source";

MySQL

Table

"let\n    Source = MySQL.Database(\"ifac-mysql.ceeyrlqdpprr.us-west-2.rds.amazonaws.com:3306\", \"employees\",
[ReturnSingleDatabase=true]),\n    employees_departments = Source{[Schema=\"employees\",Item=\"departments\"]}[Data]\nin\n
employees_departments";

Query

"let\n    Source = MySQL.Database(\"ifac-mysql.ceeyrlqdpprr.us-west-2.rds.amazonaws.com:3306\", \"crm\", [ReturnSingleDatabase
=true, Query=\"select c.customerNumber , c.customername, c.city, c.country from customers c , orders o where
c.customernumber=o.customernumber\"])\nin\n    Source";

Azure SQL

Table

"let\n    Sql.Database(\"tf-testal-94619nimeshkuma-17.database.windows.net\", \"SqlServerAzDB_1\"),\n
schemaWithViews01_testView01 = Source{[Schema=\"schemaWithViews01\",Item=\"testView01\"]}[Data]\nin\n
schemaWithViews01_testView01";

Azure Synapse

Table

"let\n    Source = Sql.Database(\"synaptestal125371ayush24-ondemand.sql.azuresynapse.net\", \"master\"),\n
dbo_MSreplication_options = Source{[Schema=\"dbo\",Item=\"MSreplication_options\"]}[Data]\nin\n
dbo_MSreplication_options";

Oracle

Table

"let\n    Source = Oracle.Database(\"ifac-orcl.ceeyrlqdpprr.us-west-2.rds.amazonaws.com:1521/orcl\", [HierarchicalNavigation=
true]),\n    IFAC_ADMIN = Source{[Schema=\"IFAC_ADMIN\"]}[Data],\n    ORDER_ITEMS1 = IFAC_ADMIN{[Name=\"ORDER_ITEMS\"]}[Data]\
nin\n    ORDER_ITEMS1";

Postgres

Table

"let\n    Source = PostgreSQL.Database(\"ifac-pgsql.ceeyrlqdpprr.us-west-2.rds.amazonaws.com:5432\", \"postgres\"),\n
public_events = Source{[Schema=\"public\",Item=\"events\"]}[Data]\nin\n    public_events";

Amazon Redshift

Table

"let\n    Source = AmazonRedshift.Database(\"redshift-cluster-1.csjsqfswsudr.us-east-1.redshift.amazonaws.com:5439\", \"dev\",
[BatchSize=null]),\n    public = Source{[Name=\"public\"]}[Data],\n    category1 = public{[Name=\"category\"]}[Data]\nin\n
category1";

Snowflake

Table

"let\n    Source = Snowflake.Databases(\"alation_partner.us-east-1.snowflakecomputing.com\", \"LOAD_WH\", [Role=null,
CreateNavigationProperties=null, ConnectionTimeout=null, CommandTimeout=null]),\n    TABSFDC_Database =
Source{[Name=\"TABSFDC\",Kind=\"Database\"]}[Data],\n    PUBLIC_Schema = TABSFDC_Database{[Name=\"PUBLIC\",Kind=\"Schema\"]}
[Data],\n    ACCOUNT_Table = PUBLIC_Schema{[Name=\"ACCOUNT\",Kind=\"Table\"]}[Data]\nin\n    ACCOUNT_Table";

Non-Quoted Strings:

"let\n    Source = Snowflake.Databases(DS_CONN,DW_CONN,[Role=DS_ROLE]),\n    ANALYTICS_DB_Database =
Source{[Name=DB_CONN,Kind=\"Database\"]}[Data],\n    INSIGHT_REPORT_Schema = ANALYTICS_DB_Database{[Name=DSCH_CONN,Kind=\
"Schema\"]}[Data],\n    CUSTOMER_DIM_G_AGREEMENT_DIM_G_VW_View =
INSIGHT_REPORT_Schema{[Name=\"CUSTOMER_DIM_G_AGREEMENT_DIM_G_GBI_RGG_VW\",Kind=\"View\"]}[Data],\n    #\"Removed Other Columns
\" = Table.SelectColumns(CUSTOMER_DIM_G_AGREEMENT_DIM_G_VW_View,{\"CustomerKey\", \"CustomerCd\", \"CustomerDesc\",
\"MasterAgreementDesc\"})\nin\n    #\"Removed Other Columns\"";

Query

"let\n    Source = Value.NativeQuery(Snowflake.Databases(\"hg51401.snowflakecomputing.com\",\"RESTAURANTS\"){[Name=\"FIVETRAN\
"]}[Data], \"select * from fivetran.restaurants_global_postsales.sc_new_monthly_churn\", null, [EnableFolding=true])\nin\n
Source";

Google BigQuery

Table

Extraction from a Table:

"let\n   Source = GoogleBigQuery.Database(),    #\"eng-scene-228201\" = Source{[Name=\"eng-scene-228201\"]}[Data],
HR_Data_Schema = #\"eng-scene-228201\"{[Name=\"HR_Data\",Kind=\"Schema\"]}[Data],   HR_Recruiting_Table = HR_Data_Schema{
[Name=\"HR_Recruiting\",Kind=\"Table\"]}[Data]
in
HR_Recruiting_Table";

Extraction From a View:

"let\n  Source =
GoogleBigQuery.Database(), #\"test-alation-database-1\" = Source{[Name=\"test-alation-database-1\"]}[Data], profiling_Schema =
#\"test-alation-database-1\"{[Name=\"profiling\",Kind=\"Schema\"]}[Data],\n   gbq_profile_View = profiling_Schema{[Name=
\"gbq_profile\",Kind=\"View\"]}[Data]\n
in\n  gbq_profile_View";

Google BigQuery uses the default hostname www.googleapis.com.

Query

"let\n    Source = Value.NativeQuery(GoogleBigQuery.Database(){[Name=\"test-alation-database-1\"]}[Data], \"select * from
`test-alation-database-1.CAPITALDATASET.columnprofiler`\", null, [EnableFolding=true])\nin\n  Source";

Teradata

Table

"let\n    Source = Teradata.Database(\"10.13.25.7\", [HierarchicalNavigation=true]),\n    test_query_ingestion =
 Source{[Schema=\"test_query_ingestion\"]}[Data],\n    test2 = test_query_ingestion{[Name=\"test1\"]}[Data]\nin\n    test2";

Query

"let\n    Source = Teradata.Database(\"10.13.25.7\", [HierarchicalNavigation=true, Query=\"SELECT * from test_query_ingestion.
test1 t1 left join test_query_ingestion.test2 t2 on 1=1\"])\nin\n    Source";

Parameterized Expressions Supported by Lineage

The Azure Power BI Scanner connector supports the parameterized expressions listed in the table below for lineage:

Note

All parameterized expressions in the table below work with all supported data sources mentioned in Expressions Supported by Lineage.

The table below lists the types of parameterized expressions, each with an example parameter as used in a dataset expression.

Parameterized expression referenced by name

Example: BQEnv

let
Source = GoogleBigQuery.Database(),
#"test-alation-database-1" = Source{[Name=BQEnv]}[Data],
PowerBiSchema_Schema = #"test-alation-database-1"{[Name="PowerBiSchema",Kind="Schema"]}[Data],
joinedTable_Table = PowerBiSchema_Schema{[Name="joinedTable",Kind="Table"]}[Data]
in
joinedTable_Table

Parameterized expression prefixed with #

Example: #"BQEnv"

let
Source = GoogleBigQuery.Database(),
#"test-alation-database-1" = Source{[Name=#"BQEnv"]}[Data],
PowerBiSchema_Schema = #"test-alation-database-1"{[Name="PowerBiSchema",Kind="Schema"]}[Data],
joinedTable_Table = PowerBiSchema_Schema{[Name="joinedTable",Kind="Table"]}[Data]
in
joinedTable_Table

Parameterized expression concatenated with the ampersand operator &

Example: ""&BQEnv&""

let
Source = GoogleBigQuery.Database(),
#"test-alation-database-1" = Source{[Name=""&BQEnv&""]}[Data],
PowerBiSchema_Schema = #"test-alation-database-1"{[Name="PowerBiSchema",Kind="Schema"]}[Data],
joinedTable_Table = PowerBiSchema_Schema{[Name="joinedTable",Kind="Table"]}[Data]
in
joinedTable_Table

Scenarios Supported by the Connector for Parameter Usage

The connector is expected to accurately parse the parameters in critical parts of expressions and establish lineage. The critical parts of expressions are the host, database, schema, table, query, and any renamed columns. Parameters placed in these parts of an expression are essential for extracting the connection information required for lineage tracing. Parameters situated outside of these parts of an expression are less likely to impact the accuracy of the lineage.

Direct Usage

Parameters can act as placeholders for different variables in a dataset expression. For example, the expression below includes placeholder parameters HostParameter, DatabaseParameter, SchemaParameter, and TableParameter. The connector is expected to parse such parameters and extract the necessary connection information for lineage.

"let\n    Source = MySQL.Database(HostParameter, #\"DatabaseParameter\",
[ReturnSingleDatabase=true]),\n    employees_departments =
Source{[Schema=\"\"&SchemaParameter&\"\",Item=TableParameter]}[Data]\nin\n
employees_departments";

Usage of the Ampersand Operator

The ampersand operator & may be used in expressions to concatenate text within hostnames or queries. In the example below, the ampersand operator is used to concatenate parts of a hostname and a query. The connector is expected to parse such expressions accurately.

"let\n    Source = Value.NativeQuery(Snowflake.Databases(\"hg51401.\"
& hostParameter & \".com\",\"RESTAURANTS\"){[Name=\"FIVETRAN\"]}[Data],
\"select * from \" & SchemaNameParameter & \".restaurants_global_postsales.sc_new_monthly_churn\",
null, [EnableFolding=true])\nin\n    Source";

Note

Using the ampersand operator at the beginning or end of a hostname or query may result in incorrect parsing by the connector, potentially causing incomplete lineage.
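As an illustration, the sketch below shows the unsupported pattern this note describes. The parameters HostPrefixParameter and QueryParameter are hypothetical: both the hostname and the query string begin with a parameter followed by the ampersand operator, so the connector has no leading literal text to anchor parsing on and may produce incomplete lineage.

"let\n    Source = Value.NativeQuery(Snowflake.Databases(HostPrefixParameter & \".snowflakecomputing.com\",\"RESTAURANTS\"){[Name=\"FIVETRAN\"]}[Data],\n    QueryParameter & \" from fivetran.restaurants_global_postsales.sc_new_monthly_churn\", null, [EnableFolding=true])\nin\n    Source";

To keep lineage complete, start the hostname or query with a literal string and concatenate parameters in the middle of the text instead.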

Column-Level Lineage Expressions

Datasets

"let\n    Source = Sql.Databases(\"ifac-sqlsrv.ceeyrlqdpprr.us-west-2.rds.amazonaws.com:1433\"),\n
Sales = Source{[Name=\"Sales\"]}[Data],\n
dbo_Orders_Details = Sales{[Schema=\"dbo\",Item=\"Orders_Details\"]}[Data],\n
#\"Renamed Columns\" = Table.RenameColumns(dbo_Orders_Details,{{\"Discount\",
\"Discount_field_of_dataset\"}, {\"OrderID\", \"OrderID_field_of_dataset\"},
{\"ProductID\", \"ProductID_field_of_dataset\"}})\nin\n    #\"Renamed Columns\"";
"let\r\n    Source = Sql.Database(#\"Warehouse server\", #\"Warehouse name\"),\r\n
Warehouse_Date = Source{[Schema=\"Warehouse\",Item=\"Date\"]}[Data],\r\n
#\"Renamed Columns\" = Table.RenameColumns(#\"Removed Other Columns\",
{{\"Calendar YWD Week Of Year_Caption_1033\", \"YWD Week Of Year\"},
{\"Calendar YWD Week_Caption_1033\", \"YWD Week - Year\"},
{\"Calendar YWD Year_Caption_1033\", \"YWD Year\"}, {\"Calendar YQMD Month Of Year_Caption_1033\",
\"YQMD Month Of Year\"}, {\"Calendar YQMD Month_Caption_1033\",
\"YQMD Month - Year\"}, {\"Calendar YQMD Quarter Of Year_Caption_1033\",
\"YQMD Quarter Of Year\"}})\r\nin\r\n    #\"Renamed Columns\"";
"let\r\n    Source = Cds.Entities(#\"Dynamics 365 URL\", [ReorderColumns=null,
UseFormattedValue=null]),\r\n    entities = Source{[Group=\"entities\"]}[Data],\r\n
msfp_questions = entities{[EntitySetName=\"msfp_questions\"]}[Data],\r\n
#\"Renamed Columns\" = Table.RenameColumns(msfp_questions,{\r\n{\"createdby\",
\"Created By\"},\r\n{\"createdon\",\"Created On\"}, \r\n{\"createdonbehalfby\",\"Created By
(Delegate)\"},\r\n{\"importsequencenumber\",\"Import Sequence Number\"},\r\n{\"modifiedby\",
\"Modified By\"},\r\n{\"modifiedon\",\"Modified On\"},\r\n{\"modifiedonbehalfby\",
\"Modified By (Delegate)\"}\r\n})\r\nin\r\n    #\"Renamed Columns\"";
"let\n    Source = OData.Feed(Company1, null, [Implementation=\"2.0\"]),\n
PBICloseIncStmtGLEntries = Source[PBIClosingGLEntries],\n
#\"Renamed Columns\" = Table.RenameColumns(PBICloseIncStmtGLEntries,\n
{{\"G_L_Account_No\", \"G/L Account No.\"}}),
#\"Changed Type\" = Table.TransformColumnTypes(#\"Renamed Columns\",
{{\"Posting_Date\", type date}}),\n
#\"Renamed Columns1\" = Table.RenameColumns(#\"Changed Type\",
{{\"Posting_Date\", \"Posting Date\"}, {\"Amount\", \"Amt.\"}}),\n
#\"Renamed Columns2\" = Table.RenameColumns(#\"Changed Type2\",
{{\"SmallIconUri\", \"Thumbnail\"}, {\"Description\",
\"Description Full\"}, {\"ReleaseDate\", \"Release Date\"}, {\"AppVersion\",
\"Version\"}, {\"NumberOfRatings\", \"# of Ratings\"}, {\"AverageRating\",
\"Average Rating\"}, {\"Tags\", \"Certified\"},
{\"ShortDescription\", \"Description\"}})\nin\n    #\"Changed Type\"";

Dataflows

"section Section1;\r\nshared Query = let\r\n  Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText(\"i45WMlTSUTIyVorViVYyBjJNzZRiYwE=\",
BinaryEncoding.Base64), Compression.Deflate)),
let _t = ((type nullable text) meta [Serialized.Text = true]) in type table
[#\"test 1\" = _t, #\"test 2\" = _t]),
\r\n  #\"Changed column type\" = Table.TransformColumnTypes(Source, {{\"test 1\", Int64.Type},
{\"test 2\", Int64.Type}}),\r\n  #\"Renamed columns\" = Table.RenameColumns(#\"Changed column type\",
{{\"test 2\", \"test 4\"}})\r\nin\r\n
#\"Renamed columns\";\r\nshared #\"Query (2)\" = let\r\n
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText(\"i45WMlTSUTICYmOlWJ1oMMsEiM2UYmMB\",
BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true])
in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
\r\n  #\"Renamed columns\" = Table.RenameColumns(#\"Changed column type\",
{{\"Column3\", \"Column34\"}, {\"Column2\", \"Column23\"}})\r\nin\r\n
#\"Renamed columns\";\r\nshared Table = let\r\n
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText(\"i45WMlTSUTICYmOl2FgA\",
BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true])
in type table [#\"A 1 \" = _t, #\"A 2\" = _t, A3 = _t]),
\r\n  #\"Changed column type\" = Table.TransformColumnTypes(Source, {{\"A 1 \", Int64.Type}, {\"A 2\",
Int64.Type}, {\"A3\", Int64.Type}}),\r\n
#\"Renamed columns\" = Table.RenameColumns(#\"Changed column type\",
{{\"A 1 \", \"B1\"}, {\"A 2\", \"B2\"}, {\"A3\", \"B3\"}})\r\nin\r\n  #\"Renamed columns\";\r\n";