Auto-Titling and Lexicon
Lexicon is a dictionary used by ALLIE to auto-title cataloged data objects. It lists mappings between abbreviations and expansions found in your catalog data.
Abbreviations are parts of names of data objects found in the metadata of your Catalog. When Lexicon runs, it parses the catalog metadata and makes a list of all abbreviations it finds.
If the name of a column cataloged in Alation is rgnl_sls, the Lexicon will register two abbreviations from this name: rgnl and sls. Lexicon will treat the underscore as the separator between these abbreviations.
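The underscore splitting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Alation's implementation; the function name `split_abbreviations` is hypothetical.

```python
def split_abbreviations(name: str) -> list[str]:
    """Split an object name into candidate abbreviations on underscores.

    Hypothetical helper for illustration; empty parts (from leading,
    trailing, or doubled underscores) are dropped.
    """
    return [part for part in name.lower().split("_") if part]


print(split_abbreviations("rgnl_sls"))  # ['rgnl', 'sls']
```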
The Lexicon job also computes expansions for the abbreviations it found. Expansions are meaningful words or expressions that are algorithmically mapped to their respective abbreviations. The list of all abbreviation-expansion matches can be found on the dedicated Lexicon page that is available to users with the Server or Catalog Admin roles.
The Lexicon runs on an automatic schedule every weekend (Sundays, 8 am). It parses catalog metadata, re-computes existing abbreviation-expansion mappings, and adds new matches. Auto-titles are re-computed at the same time. Starting with V R6 (5.10.x), Suggested Terms for glossaries are computed by Lexicon too. If required, you can run the Lexicon job manually on demand.
The Lexicon job triggers an auto-titling algorithm, or ALLIE, that will suggest titles for Schema, Table, and Column objects in your catalog based on the abbreviation-expansion matches in the Lexicon dictionary.
The Lexicon job parses the names of extracted schemas, tables, and columns for separate words and abbreviations. For example, if a column name is lu_drg, then Alation will infer two abbreviations from this name: lu and drg, the underscore being the separator. If a name consists of full words joined together, like shippingdata, Alation will infer two separate words from this name: shipping and data. All abbreviations and words Alation discovers from data object names will enrich the Lexicon dictionary (Admin Settings > Catalog Admin > Lexicon).
The Lexicon job also parses the text fields on data objects and article objects to find expansions for the abbreviations it has added to the Lexicon dictionary. The abbreviations are then matched to expansions.
Auto-titles for cataloged objects are generated based on the abbreviation-expansion matches found in the Lexicon. For example, if a column name is lu_drg, and Alation has registered such matches as lu = “lookup” and drg = “diagnosis related group”, then it will suggest the title “Lookup Diagnosis Related Group” for this column.
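The lu_drg example can be reproduced with a minimal sketch: split the name on underscores, look up each abbreviation, and capitalize the result. The dictionary `LEXICON` and the function `suggest_title` are hypothetical names used for illustration, and keeping unmatched abbreviations as-is is an assumption, not documented ALLIE behavior.

```python
# Assumed abbreviation-expansion matches, taken from the example above.
LEXICON = {"lu": "lookup", "drg": "diagnosis related group"}


def suggest_title(name: str, lexicon: dict[str, str]) -> str:
    """Build a title by expanding each underscore-separated abbreviation.

    Abbreviations with no match are kept unchanged (an assumption).
    """
    parts = [part for part in name.lower().split("_") if part]
    expanded = [lexicon.get(part, part) for part in parts]
    return " ".join(expanded).title()


print(suggest_title("lu_drg", LEXICON))  # Lookup Diagnosis Related Group
```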
Auto-suggested titles appear in the Title column of the table with metadata objects on a catalog page. There is a “robot head” icon next to auto-suggested titles that indicates that the title is auto-generated by Alation and is waiting for your confirmation. Depending on the level of confidence of the guess, the icon of new titles will either be red (low-confidence guess) or yellow (high-confidence guess).
You can teach the ALLIE auto-titling algorithm to make better guesses by confirming or discarding the auto-titles directly on the Lexicon page (Server and Catalog Admins).
To apply accumulated confirmations and rejections of auto-titles on demand, Server or Catalog Admins must run the Lexicon job. See Running the Lexicon Job Manually. The Lexicon job will also run automatically over the weekend and apply all changes made before its scheduled run.