Downloading a Data Dictionary¶
Alation Version V R3 (5.6.x) and above
When Alation users fill their catalog with content - titles, descriptions, and custom field values - all this information about data becomes an important asset on its own.
If you want to export this information for analysis, you can use the data dictionary download to do that. Moreover, you can upload a data dictionary from a source file to bulk-update field values on a data source.
A Data Dictionary in Alation is a consolidated summary of all titles, descriptions, and custom field values which exist for a data source and its child data objects. It is a file that is generated on demand and “pulls in” all the information about this data source that is added to the catalog by users.
Note
Data dictionaries are available only for sources of the Data Source type, that is, sources that connect Alation to databases.
You can download a data dictionary for all the cataloged schemas of a data source, or narrow it down to a particular child data object of that data source, as far as the table level. For example, you can download a data dictionary for one schema and its child tables, or for one table and its child columns. You cannot download the dictionary for a single data object without its child objects if it has them.
Any data dictionary you download will include the key, title, description, and custom fields.
Key¶
key is the qualified name of a data object in Alation. It both identifies the object and points to its “place” in the structure of the parent data object.
For example, ins.all_claims.claim_id is the key for the column claim_id in the table all_claims in the schema ins in the data source whose data dictionary is exported. The names of parent and child objects are separated with a dot.
The data source itself uses an empty key value.
Key Format Summary¶
Object | Key
data (for data sources) | “” (empty)
schema | “schema_name”
table | “schema_name.table_name”
attribute (for columns) | “schema_name.table_name.attribute_name”
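The key format above can be turned into a small helper that infers the object type from a key. This is a minimal sketch, not part of Alation: the function name classify_key is hypothetical, and it assumes that schema, table, and column names do not themselves contain dots.

```python
def classify_key(key: str) -> str:
    """Infer the object type from a data dictionary key.

    Assumption: object names contain no dots, so the number of
    dot-separated parts equals the depth of the object.
    """
    key = key.strip()
    if not key:
        # The data source itself uses an empty key.
        return "data"
    depth = key.count(".") + 1
    types = {1: "schema", 2: "table", 3: "attribute"}
    if depth not in types:
        raise ValueError(f"unexpected key depth {depth} for key {key!r}")
    return types[depth]


if __name__ == "__main__":
    for sample in ["", "ins", "ins.all_claims", "ins.all_claims.claim_id"]:
        print(repr(sample), "->", classify_key(sample))
```

If a source uses names with embedded dots, counting separators is not reliable and the object type would have to be determined some other way.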
Title, Description, Custom Fields¶
title is the Title field of the data object.
description is the Description field of the data object.
custom fields are the custom fields on the templates of the parent object and its child objects (schemas, tables, columns). For more information about templates and custom fields, see Creating Custom Fields for Catalog Pages.
The data dictionary does not include any technical metadata (for example, the database name, schema types, table types, or data types).
Example:
Assume there is a data source in Alation that has one schema with two tables, and each table has two columns. The data dictionary you can download from the catalog page of this data source will include the keys, titles, descriptions, and the custom fields associated with the catalog templates of the data source itself, its schema, the two tables, and the four columns.
Note
Custom field values that are propagated from Catalog Sets are not included in data dictionaries. For more on Catalog Sets, see Catalog Sets.
Data Dictionary Download Formats¶
You can download the data dictionary as:
CSV
XML
JSON
Downloading a Dictionary¶
To download a data dictionary:
Sign in to Alation and open the catalog page of a data source. If you want to download the complete data dictionary for this data source and all its child data objects, download from the data source page. If you need the data dictionary for one particular child data object (for example, one of the schemas or tables), then navigate to the catalog page of this data object and download from there.
In the upper-right corner, click More, then click Download a Dictionary. The Download Dictionary dialog will open.
In the dialog, select the file format (CSV, XML, or JSON) and click Download. In releases V R3 (5.6.x) to V R5 (5.9.x), the data dictionary will be downloaded to your computer. Starting from V R6 (5.10.x), the download will happen asynchronously; see Asynchronous Data Dictionary Download. The name of the data dictionary file will include the data source ID, the object type, and the object ID. For example:
data_dict_72_data_72 means it is a data dictionary for the data source (object type data) with ID 72; and
data_dict_72_table_1810 means it is a data dictionary for the table with ID 1810 in the data source with ID 72.
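The file naming scheme in the examples above can be decoded programmatically, for instance when sorting a folder of downloaded dictionaries. The sketch below is an illustration only: the pattern data_dict_<data source ID>_<object type>_<object ID> is inferred from the two example names, and the names DictFileName and parse_dict_filename are hypothetical.

```python
import re
from typing import NamedTuple


class DictFileName(NamedTuple):
    data_source_id: int
    object_type: str
    object_id: int


# Pattern inferred from the documented examples; any file extension
# after the object ID is ignored.
_PATTERN = re.compile(r"^data_dict_(\d+)_([a-z]+)_(\d+)")


def parse_dict_filename(name: str) -> DictFileName:
    """Split a data dictionary file name into its ID and type parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized data dictionary file name: {name!r}")
    ds_id, obj_type, obj_id = match.groups()
    return DictFileName(int(ds_id), obj_type, int(obj_id))


if __name__ == "__main__":
    print(parse_dict_filename("data_dict_72_data_72"))
    print(parse_dict_filename("data_dict_72_table_1810"))
```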
N/A vs. Empty Values¶
In the data dictionary file that you download, you may see empty values for some of the fields and N/A values for others.
An empty value (an empty space between two delimiters) means that the field does exist for the data object described by the current line of the dictionary, but the value has not been filled in.
Values propagated from Catalog Sets also appear as empty values.
The N/A value can mean either of the following:
The field is not associated with the template of this data object, so no value exists for this data object.
The user does not have permission to view this field.
Example¶
The following data dictionary lines (CSV):
key,title,description,executive summary,short description,countries,department
,Redshift,,N/A,N/A,"[""Germany"", ""Canada"", ""Korea""]",N/A
mean that:
There is no “key” value for the data object described by this row of values (it is a data source)
There is also no description (empty value)
The fields “executive summary”, “short description”, and “department” either do not exist on the page template of this data object in Alation, or the user does not have permission to view them.
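When a downloaded CSV dictionary is analyzed with a script, it helps to keep the two cases apart while reading the file. The following is a minimal sketch using only the Python standard library and the two sample lines above. The names NOT_AVAILABLE, interpret_cell, and read_dictionary are hypothetical, and treating bracketed values as JSON arrays is an assumption based on how the “countries” value looks in this example.

```python
import csv
import io
import json

SAMPLE = '''key,title,description,executive summary,short description,countries,department
,Redshift,,N/A,N/A,"[""Germany"", ""Canada"", ""Korea""]",N/A
'''

# Marker for N/A: the field is not on the template, or the user cannot view it.
NOT_AVAILABLE = "<not available>"


def interpret_cell(raw: str):
    """Map a raw CSV cell to None (empty), NOT_AVAILABLE, a list, or text."""
    if raw == "N/A":
        return NOT_AVAILABLE
    if raw == "":
        return None  # the field exists but has no value
    if raw.startswith("["):
        # Assumption: multi-value fields are serialized as JSON arrays.
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            return raw
    return raw


def read_dictionary(text: str):
    """Parse CSV dictionary text into a list of per-object dicts."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {name: interpret_cell(value) for name, value in row.items()}
        for row in reader
    ]


if __name__ == "__main__":
    for row in read_dictionary(SAMPLE):
        print(row)
```

Note that a field whose real value is the literal text N/A cannot be told apart from an unavailable field with this approach.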