Requirements for CSV/TSV Source Files

Alation Cloud Service Applies to Alation Cloud Service instances of Alation

Customer Managed Applies to customer-managed instances of Alation

To upload a data dictionary successfully, make sure your CSV or TSV source file conforms to the format requirements below.

Important

Using third-party CSV file editing tools like Excel or Google Sheets to edit data dictionary files carries a risk of unintentional content modifications, specifically to the header line (the first line of the file). When editing your CSV file with these tools, ensure the export settings are configured to use commas as delimiters. It is also essential to preserve the original formatting of the column header line exactly as it was in the file downloaded from Alation.

For example, if a custom field name contains any of these special characters: ,, :, ;, |, they will be enclosed within additional double quotes in the downloaded file. A field name will appear as follows: """subject to:"":attribute". Quotes are removed by tools like Excel when the file is opened, edited, and saved. Verify the header format if you have fields with special characters.

Learn more in Special Characters in Custom Field Names.

Custom Field Format

The definition for the Object Set custom field must include the name of the field and the object type of the value in the following format: "field name:object type".

When uploading values for the Object Set field, you need to be aware of the Permitted Types included into this field. Overall, an Object Set custom field can refer to any combination of the following object types:

  • Data source

  • Schema

  • Table

  • Column

  • User

  • Group

  • Article

  • Term (2023.3.4 and later)

  • Policy (2023.3.4 and later)

For each Permitted Object that can be included in the field, you will need to add the associated object type. Use the following definitions:

Alation Object

Associated object type to be used in CSV/TSV head

Object set value format

Example

data source

data

<datasource_id>

70

schema

schema

<datasource_id>.<schema_name>

70.schema1

table

table

<datasource_id>.<schema_name>.<table_name>

70.schema1.tableA

column

attribute

<datasource_id>.<schema_name>.<table_name>.<col_name>

70.schema1.tableA.volleyball

user

user

<username>

john.doe@example.com

group

groupprofile

<groupname>

Group 1

article

article

<article_name>

Test Article

term

glossary_term

<term_name>

Privacy

policy

business_policy

<policy_name>

Customer Access Policy

Important

  • As the name of the Object Set field in the source file, use its Name Singular property. From Alation version 2023.3.3, you can also specify the object set’s object ID, or a combination of the name and object ID, in the form object_id | object_name. Using an object ID can improve performance in updating from an uploaded data dictionary.

  • For each Permitted Type included in the field, add an individual entry to the head of the file. For example, if the object set field test can reference columns and articles, the head should include individual definitions for either columns (test:attribute) or articles (test:article) (if one type of value is being updated) or both (if both types of value are updated): test:attribute,test:article.

  • If you are uploading data object names as values, you need to use the qualified name of the object as the value, including the data source ID. For example:

    • 70.schema1 - value for schema1 in data source with id = 70

    • 70.schema1.tableA - value for table tableA in schema1 in data source with id = 70

    • 70.schema1.tableA.volleyball - value for column volleyball in tableA in schema1 in data source with id = 70

Example

Assume that there is an Object Set field Test on the catalog page of a Data Source A. The field includes all Permitted Types.

You want to import values for this field using a data dictionary. The definition of the field Test that you need to include into the file head will look like this:

test:data,test:schema,test:table,test:attribute,test:article,test:groupprofile,test:user,test:glossary_term,test:business_policy

The actual values may look like this:

70,70.schema1,70.schema1.tableA,70.schema1.tableA.volleyball,Article About Schema1,qagroup,allie.robot@alation.com,Privacy,Customer Access Policy

Multiple values for the same object type should be separated by semicolons:

70.schema1;70.schema2;80.schema1

The definition for the People Set custom field must include the name of the field and the definition for the object name that is referenced by the value in the following format: field name:object type.

For People Set, you can use two object types:

  • user

  • groupprofile

Each of these types must have an individual entry in the file head.

Example

Assume that there is a People Set field Expert on the catalog page of a Data Source A.

You want to import a value for this field Bill Travis. Bill has an account in Alation with the username bill.travis@example.com.

The definition of the field Expert that you need to include into the file head will look like this:

expert:user

The actual value to be uploaded will look like this:

bill.travis@example.com

If the set of values to be uploaded includes both users and groups, then the head should include individual definitions for both the object types and will look like this:

expert:user,expert:groupprofile

The actual value to be uploaded will look like this:

bill.travis@example.com,qa group

Multiple values for one type of value must be separated by semicolons:

bill.travis@example.com;andy.smith@example.com;terry.rauch@example.com

Note

In the data dictionary file downloaded from Alation, you’ll notice that the People Set type of values have the format of <object_ID>|<value>, for example: 524|sally@alationmail.com. This is a more technical format that reduces the time to identify a catalog object during the processing of the source file. Also, if there are conflicting values in the catalog, IDs are the only way to differentiate them.

Alation also supports simpler formats, such as just <object_ID> and <value>.

To find an object ID, open the catalog page of this object. The ID will be present in the URL, for example:

  • For a user https://<base_URL>/user/524/—The object ID is 524. The corresponding value in the data dictionary would be 524|sally@alationmail.com.

  • For a group https://https://<base_URL>/group/22/—The object ID is 22. The corresponding value in the data dictionary would be 22|curators.

For the Reference type of field, the head must use the following format: field name:object type.

Permitted object types and the format are the same as for Object Set.

For the Picker type of field, the head must use the following format: field name.

Example

Head:country

Value:Romania

Head requirements for this field type are the same as for Picker. Because there can be multiple values added to the field of this type, the multiple values must use the following format: "[""Value1"",""Value2"",""Value3""]".

The special value "[]" resets the value of the multipicker field.

Example

The field Countries is a multi-select picker:

Head: countries

Values: "[""Germany"",""Canada"",""Korea""]"

For the Rich Text type of field, the head must use the following format: field name.

The value must use the value format. For example: This is a short summary.

For the Date type of field, the head must use the following format: field name.

The value must use the YYYY/MM/DD format. For example: 2019/02/17.

Formats Cheat Sheet

Custom Field Type

Format for Head

Example for Head

Format for Uploaded Value

Value Example

Object Set

field name:permitted object type1 field name:permitted object type2

test:article test:attribute

value for object type1 value for object type2

nothing useful 70.uppercasedb1.mixedcase.volleyball

People Set

field name:user

field name:groupprofile

steward:user

steward:groupprofile

Alation username

ID|Alation username

Alation group name

ID|Alation group name

allie.robot@alation.com

524|sally@alationmail.com

curators

22|curators

Reference

field name:permitted object type

library:article

value

nothing useful

Picker

field name

country

value

Japan

Multi-Select Picker

"field name"

"countries"

"[""Value1"",""Value2"",""Value3""]"

"[""Germany"",""Canada"",""Korea""]"

Rich Text

field name

summary

value

Here is a test summary

Date

field name

creation date

YYYY/MM/DD

2019/01/29

Special Characters in Custom Field Names

The data dictionary upload process considers some characters as special for its own formatting. Custom field names containing such characters require special formatting:

  • If a custom field name contains a pipe (|) or colon (:) character, the whole field name should be wrapped in quotes ("").

  • If a custom field name contains a comma (,), the whole field name should be wrapped in quotes ("") and the comma should be escaped with a backslash (\).

The following table shows how some sample custom field names should appear on the column header line for different use cases:

Custom Field name

Column Header format

sample-custom-field-name

...,sample-custom-field-name,...

space separated custom field name

...,space separated custom field name,...

Role, Responsibility, Duty

...,"role\\, responsibility\\, duty",...

Owner | Custodian (a people-set field)

...,"owner | custodian":user,"owner | custodian":groupprofile,...

Related: Objects (an object-set field)

...,"related: objects":data,"related: objects":schema,"related: objects":glossary_term

Example CSV

2023.3.3 and Newer

al_datadict_item_properties,al_datadict_item_column_data_type,key,title,description,contains pii,critical data element,data quality policies,data quality status,data quality summary,derived term:glossary_term,expert:groupprofile,expert:user,status,steward:groupprofile,steward:user
oid=42;otype=table,,"""alation_edw.finance"".bnk_cust_dim",Bank Customer Details,This table is most often used by the Sales Operation Team.  Data in this table is from customers in the banks North America region.,,_,_,,,_,_,_,_,,11|sally@alationmail.com
oid=11846;otype=attribute,"NUMBER(38,0)","""alation_edw.finance"".bnk_cust_dim.account_number",Account Number,,,,,,_,_,,,,,
oid=7667;otype=attribute,VARCHAR(31),"""alation_edw.finance"".bnk_cust_dim.city",City,,,,,,_,_,,,,,
oid=7854;otype=attribute,"NUMBER(38,0)","""alation_edw.finance"".bnk_cust_dim.custid",Customer unique ID,,,,,,_,_,,,,,
oid=8325;otype=attribute,VARCHAR(10),"""alation_edw.finance"".bnk_cust_dim.firstname",First Name,,,,,,_,_,,,,,
oid=8192;otype=attribute,VARCHAR(12),"""alation_edw.finance"".bnk_cust_dim.lastname",Lastname,,,,,,_,_,,,,,
oid=7549;otype=attribute,VARCHAR(2),"""alation_edw.finance"".bnk_cust_dim.state",State,,,,,,_,_,,,,,

2023.3.1 and Earlier

key,title,description,closure rule,dq conformity,has pii,stewards:groupprofile,stewards:user,risk & sensitivity,gold master:attribute,gdpr data compliance,policy:article,relevant dataset:table,pii compliance
census.annres_2010_2014,,,N/A,N/A,N/A,,,N/A,,N/A,,N/A,
census.annres_2010_2014.pop_2010,,,,,,,,,N/A,,,,
census.annres_2010_2014.pop_2014,,,,,,,,,N/A,,,,
census.annres_2010_2014.pop_2012,,,,,,,,,N/A,,,,
census.annres_2010_2014.pop_2011,,,,,,,,,N/A,,,,
census.annres_2010_2014.pop_2013,,,,,,,,,N/A,,,,
census.annres_2010_2014.cbsa,Core-Based Statistical Area,,,,,,,,N/A,,,,