Importing a Data Dictionary from a CSV or TSV File

Versions V R3 (5.6.x) and above

You can import a data dictionary into Alation to bulk-edit field values for a data object. Uploading a data dictionary can update the values of existing fields, but it cannot add fields to or remove fields from the data object templates.

Note that Alation also offers an API for uploading the field values in bulk: Custom Field API.

By uploading a data dictionary, you can:

  • Fill empty custom fields with new values from the imported dictionary

  • Overwrite the existing field values with new ones from the imported dictionary.

You can use CSV/TSV or ER/Studio files as source files for your data dictionary upload. If you prefer using CSV, you can download the existing data dictionary, import it into a table-editing tool as a table, bulk-update values, save as CSV/TSV and upload the new version of the dictionary back into Alation.

The source file can only include the keys of data objects that need to be updated and the fields to be updated. There is no need to import the full dictionary. For example, if you only want to update one field on one data object, you can create a source file that only includes the key of this data object, the name of the field to be updated, and the new value.

To import successfully, make sure your data dictionary CSV/TSV source file conforms to several format requirements.

Important

If your source CSV file includes Unicode characters, make sure it is encoded in UTF-8.

Important

Source files must have unique field names. Duplicate field names are not permitted and validation includes verifying imported field names against both custom field names and built-in field names. A warning message appears when an uploaded file conflicts with an existing field name.

Example: An uploaded data dictionary that contains a “description” field name would be disallowed as this conflicts with the catalog built-in field name “description”.

Requirements for CSV/TSV Source File

The source CSV/TSV file must include the head with field names in the required format and the field values for each of the data objects to be updated.

Custom Field Format for Upload

Object Set

The definition for the Object Set custom field must include the name of the field and the object type of the value in the following format: "field name:object type"

When uploading values for the Object Set field, you need to be aware of the Permitted Types included into this field. Overall, an Object Set custom field can refer to such objects as:

  • Data source

  • Schema

  • Table

  • Column

  • User

  • Group

  • Article

For each Permitted Object that can included in the field, you will need to add the associated object type . Use the following definitions:

Alation Object

Associated object type to be used in the CSV/TSV head

data source

data

schema

schema

table

table

column

attribute

user

user

group

groupprofile

article

article

Important

  • As the name of the Object Set field in the source file, use its Name Singular property.

  • For each Permitted Type included in the field, add an individual entry to the head of the file. For example, if the object set field "test" can reference columns and articles, the head should include individual definitions for either of them - if one type of value is being updated or both - if both types of value are updated: "test:attribute","test:article"

  • If you are uploading data object names as values, you need to use the qualified name of the object as the value, including the data source ID. For example:

    • 70.schema1 - value for schema1 in data source with id =``70``

    • 70.schema1.tableA - value for table tableA in schema1 in data source with id =``70``

    • 70.schema1.tableA.volleyball - value for column volleyball in tableA in schema1 in data source with id =``70``

Example:

Assume that there is an Object Set field "Test" on the catalog page of a Data Source A. The field includes all Permitted Types.

You want to import values for this field using a data dictionary. The definition of the field "Test" that you need to include into the file head will look like this:

"test:data","test:schema","test:table", "test:attribute","test:article","test:groupprofile","test:user"

The actual values may look like this:

"70","70.schema1","70.schema1.tableA","70.schema1.tableA.volleyball","Article About Schema1","qagroup","allie.robot@alation.com"

Multiple values for the same object type must use the following format:

"[""70.schema1"",""70.schema2"",""80.schema1""]"

People Set

The definition for the People Set custom field must include the name of the field and the definition for the object name that is referenced by the value in the following format: "field name:object type"

For People Set, you can use two object types:

  • user

  • groupprofile

Each of these types, must have an individual entry in the file head.

Example:

Assume that there is a People Set field "Expert" on the catalog page of a Data Source A.

You want to import a value for this field "Bill Travis". Bill has an account in Alation with the username "bill.travis@email.com".

The definition of the field “Expert” that you need to include into the file head will look like this:

"expert:user"

The actual value to be uploaded will look like this:

"bill.travis@emal.com"

If the set of values to be uploaded includes both users and groups, then the head should include individual definitions for both the object types and will look like this:

"expert:user","expert:groupprofile"

The actual value to be uploaded will look like this:

"bill.travis@emal.com", "qa group"

Multiple values for one type of value must use the following format:

"[""bill.travis@emal.com"",""andy.smith@email.com"",""terry.rauch@email.com""]"

Reference

For the Reference type of field, the head must use the following format: "field name:object type"

Permitted object types and the format are the same as for Object Set.

Picker

For the Picker type of field, the head must use the following format: "field name"

Example:

Head:"country"

Value:"Romania"

Multi-Select Picker

Head requirements for this field type are the same as for Picker. Because there can be multiple values added to the field of this type, the multiple values must use the following format: "[""Value1"", ""Value2"", ""Value3""]"

Example:

The field Countries is a multi-select picker:

Head: "countries"

Values: "[""Germany"", ""Canada"", ""Korea""]"

Rich Text

For the Rich Text type of field, the head must use the following format: "field name"

The value must use the "value" format. For example: "This is a short summary."

Date

For the Date type of field, the head must use the following format: "field name"

The value must use the "YYYY/MM/DD" format. For example: "2019/02/17".

Summary Table

Custom Field Type

Format for Head

Example for Head

Format for Uploaded Value

Value Example

Object Set

"field name:permitted object type1" "field name:permitted object type2"

"test:article" "test:attribute"

"value for object type1" "value for object type2"

"nothing useful" “70.uppercasedb1.mixedcase.volleybal”`

People Set

"field name:user" "field name:groupprofile"

"steward:user" "steward:groupprofile"

"Alation username" "Alation group name"

"allie.robot@alation.com"

Reference

"field name:permitted object type"

"library:article"

"value"

"nothing useful"

Picker

"field name"

"country"

"value"

"Japan"

Multi-Select Picker

"field name"

"countries"

"[""Value1"",""Value2"",""Value3""]"

"[""Germany"",""Canada"",""Korea""]"

Rich Text

"field name"

"summary"

"value"

"Here is a test summary"

Date

"field name"

"creation date"

"YYYY/MM/DD"

"2019/01/29"

Sample CSV

key,title,description,closure rule,dq conformity,has pii,stewards:groupprofile,stewards:user,risk & sensitivity,gold master:attribute,gdpr data compliance,policy:article,relevant dataset:table,pii compliance
census.annres_2010_2014,,,N/A,N/A,N/A,,,N/A,,N/A,,N/A,
census.annres_2010_2014.pop_2010,,,,,,,,,N/A,,,,
census.annres_2010_2014.pop_2014,,,,,,,,,N/A,,,,
census.annres_2010_2014.pop_2012,,,,,,,,,N/A,,,,
census.annres_2010_2014.pop_2011,,,,,,,,,N/A,,,,
census.annres_2010_2014.pop_2013,,,,,,,,,N/A,,,,
census.annres_2010_2014.cbsa,Core-Based Statistical Area,,,,,,,,N/A,,,,

Importing a Data Dictionary From a CSV/TSV File

To import a data dictionary from a CSV/TSV file,

  1. Sign in to Alation and open the catalog page of a data source.

  2. In the upper-right corner, click More then click Upload Dictionary. The Import Data Dictionary page will open.

  3. From the Type of File list, select the type of your source file: CSV/TSV

  4. Drag and drop or upload your source file in the drag-and-drop area. Note that the best performance max file size is 5MB, and overall max file size is 25MB. After you drag and drop or upload, Alation will parse the source file and display the file validation page with the description of any errors there may be. The error messages will appear on top of the page and in the Error column of the file preview.

Import Data Dictionary:

../../_images/Screen_Shot_2019-02-28_at_3.58.57_PM.png
  1. Fix any errors there may be and re-import. When validation succeeds, specify how you want to treat the field values that currently exist in the catalog:

  • Will be kept if this option is selected, only the new values for empty fields will be uploaded from the data dictionary. This option gives precedence to values that currently exist in the catalog over the values for these fields contained in the imported data dictionary. In other words, this option loads the new values for previously empty fields, and does not change the values that already exist in the catalog.

  • Will be replaced if this option is selected, the values in the dictionary will overwrite the values that currently exist in the catalog. This option uploads both the new values for empty fields and updates the existing values with values from the dictionary.

Validation Page - No Errors:

../../_images/Screen_Shot_2019-02-28_at_4.02.17_PM.png

Validation Page - Error:

../../_images/Screen_Shot_2019-02-28_at_4.11.14_PM.png

There are a few things to pay attention to in the data preview of the imported dictionary: The currently existing values - if they exist for a field - appear on top of the imported dictionary values. The preview shows what is going to be uploaded vs. what is going to be ignored:

  • The “active” values that are going to be uploaded appear in darker font color.

  • The “disabled” values that are going to be ignored or overwritten appear in gray font color. If you switch between the upload options, the font color will be switching between “active” and “disabled” values.

Current vs. New Values:

../../_images/Screen_Shot_2019-02-28_at_5.05.44_PM.png
  1. Click Confirm to finish the import. Field values in Alation will be updated from your imported data dictionary.