Pentaho Data Catalog

Business Glossaries & Terms

Standardise your business terminology across the organisation.


Last updated 11 months ago

How to Organize Business Glossaries in Your Organization

To streamline communication and enhance understanding within your organization, consider developing functional business glossaries. These glossaries can include terms frequently used across different departments. Here's how you can structure your business terms effectively:

  • Business Term Creation: You can define a business term within a specific domain and category for easy navigation and categorization. This hierarchical arrangement allows for better organization and retrieval of terms.

  • Placement Options: A term can be categorized under a domain, placed within a category under that domain, or established as a standalone term.

  • Unassigned Terms: In cases where a term is created without assigning it to a domain or category, it will be labeled as unassigned. This ensures that no term is left out, even if it's not yet categorized.

Organizing your business glossary with these methods can significantly aid in maintaining clarity and consistency across your organization.
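The three placement options can be sketched as a small data model. This is an illustrative sketch only: the `BusinessTerm` class and `placement` function are hypothetical names, not part of the Data Catalog product, and the rule that a category requires a domain is an assumption drawn from the wording "a category under that domain".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BusinessTerm:
    """Hypothetical model of a glossary term with an optional domain and category."""
    name: str
    domain: Optional[str] = None
    category: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: a category only exists under a domain.
        if self.category is not None and self.domain is None:
            raise ValueError("a category must belong to a domain")


def placement(term: BusinessTerm) -> str:
    """Classify where a term sits in the glossary hierarchy."""
    if term.domain is None:
        # No domain and no category: the term is labeled as unassigned.
        return "unassigned"
    if term.category is None:
        # Categorized directly under a domain.
        return "domain"
    # Placed within a category under that domain.
    return "domain/category"


standalone = BusinessTerm("Patient")
in_category = BusinessTerm("Patient", domain="Healthcare", category="Clinical")
print(placement(standalone), placement(in_category))
```

Keeping the domain and category optional mirrors the point made above: a term is never rejected for lacking a placement, it is simply reported as unassigned.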

Managing Business Glossaries in Pentaho Data Catalog

The Pentaho Data Catalog streamlines the process of managing your data environment through business glossaries, providing a unified platform for the creation, organization, curation, and identification of key glossary items such as domains, categories, and terms. This enables efficient navigation and access to the right data.

Role-Based Access and Permissions

Interaction with the business glossaries is governed by the roles and permissions assigned to users, ensuring secure and targeted access to data and metadata:

  • Business Steward: Users with this role can import, export, create, update, and delete glossaries, allowing for the expansion and refinement of the business glossary structure.

  • Analyst: Users with this role can enrich glossaries by adding terms within categories and creating associations for existing terms in accessible glossaries. This role facilitates detailed and comprehensive glossary content development.
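As a rough illustration, the two role descriptions above can be read as a lookup from role to permitted actions. The action names below are hypothetical labels taken from the verbs in the descriptions; they are not identifiers from Data Catalog, and the sketch models only the actions named above, not the full permission set of either role.

```python
# Hypothetical action names that mirror the verbs in the role descriptions.
ROLE_ACTIONS = {
    "Business Steward": {"import", "export", "create", "update", "delete"},
    "Analyst": {"add_term", "associate_term"},
}


def can(role: str, action: str) -> bool:
    """Return True if the role is listed as allowed to perform the action."""
    # Unknown roles get an empty set, so they are denied by default.
    return action in ROLE_ACTIONS.get(role, set())


print(can("Business Steward", "import"))  # True
print(can("Analyst", "delete"))           # False
```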

Leveraging the Business Glossary

The business glossary not only aids in efficient data navigation but also serves as a critical tool in role-based access control. This ensures that valuable data and metadata are secured, properly segmented, and prevented from reaching unintended recipients.

By effectively utilizing the Pentaho Data Catalog and its business glossary capabilities, organizations can enhance their data governance, improve data understanding, and secure sensitive information.

Import the 'Healthcare' Glossary

Ensure you have logged in as: Business Steward.

Username: business_steward@hv.com

Password: Welcome123!

  1. Click Glossary in the left navigation menu & select Import from the drop-down Actions menu.

  2. In the Glossary Items field, browse and select the file you want to import. You can also download a template if needed.

In Data Catalog, you can import a glossary from a file in one of the following file types:

• JavaScript Object Notation (application/json)

• Comma Separated Values (text/csv)

• Microsoft Excel (application/vnd.ms-excel)
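Before importing, it can help to check that a file matches one of the media types listed above. The following is a minimal sketch, not part of Data Catalog: the function name `glossary_mime_type` is hypothetical, and the mapping from file extensions (`.json`, `.csv`, `.xls`) to media types is an assumption, since the list above names only the media types.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumed extension-to-media-type mapping for the three supported import formats.
SUPPORTED_TYPES = {
    ".json": "application/json",
    ".csv": "text/csv",
    ".xls": "application/vnd.ms-excel",
}


def glossary_mime_type(filename: str) -> Optional[str]:
    """Return the media type for a supported glossary file, or None otherwise."""
    suffix = PurePosixPath(filename).suffix.lower()
    return SUPPORTED_TYPES.get(suffix)


print(glossary_mime_type("healthcare-glossary.json"))  # application/json
print(glossary_mime_type("notes.txt"))                 # None
```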

The Glossary is located at:

~/Workshop--Pentaho-Data-Catalog/Glossary/healthcare-glossary.json

You have successfully imported the glossary!

  3. View the imported terms in the left glossary item tree.

Create a Glossary

Establishing a hierarchical structure by categorizing business terms into domains and specific categories simplifies data navigation and management. This organized structure supports efficient data discovery and strengthens governance through role-based access controls. Business terms in a data catalog help users identify, access, and use data in line with organizational goals and compliance mandates.

Ensure you have logged in as: Business Steward.

Username: business_steward@hv.com

Password: Welcome123!

  1. Click Glossary in the left navigation menu & select 'Add New Glossary'.

  1. Enter 'Test' and click 'Create'.

  1. Enter a Definition & Purpose by clicking on the Edit option.

  1. Click 'Save Changes'.


The Properties panel enables you to track and audit any changes to the Glossary.

Property: Value(s)

  • Sensitivity: LOW, MEDIUM, HIGH

  • Domain: Enter Domain name

  • Custodian: Search & Select from list of users

  • Business Steward: Search & Select the Business Steward

  • Critical Data Element: This property is usually applied to columns. These columns should be critical pieces of information that are necessary for decision making and so need to be governed with the highest care.

  • Status: Accepted, Draft, Review, Deprecated
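Two of the properties above, Sensitivity and Status, take a value from a fixed list. A minimal sketch of how such values could be validated in a script that prepares glossary metadata is shown below. The class and function names are hypothetical, and the dictionary keys simply reuse the property labels above; none of this reflects the Data Catalog API.

```python
from enum import Enum


class Sensitivity(Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


class Status(Enum):
    ACCEPTED = "Accepted"
    DRAFT = "Draft"
    REVIEW = "Review"
    DEPRECATED = "Deprecated"


def parse_properties(raw: dict) -> dict:
    """Validate the two enumerated properties; other keys pass through unchanged."""
    checked = dict(raw)
    if "Sensitivity" in raw:
        # Enum lookup by value raises ValueError for anything outside the list.
        checked["Sensitivity"] = Sensitivity(raw["Sensitivity"])
    if "Status" in raw:
        checked["Status"] = Status(raw["Status"])
    return checked


props = parse_properties({"Sensitivity": "HIGH", "Status": "Draft", "Domain": "Test"})
print(props["Sensitivity"].name, props["Status"].value)
```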

Besides organizing your Glossary by Domain & Category, the Data Catalog allows you to assign tags to your resources. A tag is a label you can use to describe an element and to retrieve it later when browsing or searching.

  1. You can manually add a Tag: Healthcare

  1. Save & Test by searching for: 'Test'

You can select the colour & change the icon.

The UI also lets you set or add:

• A star rating

• Associations with Business Rule(s)

• Comments

• Owners

• Tags

  1. Select 'Test' Domain & then 'Add New Category'

  1. Enter the Category Name: 'Test Category' & select Parent: 'Test'.

  1. Click 'Create'.

  2. Enter a Definition & Purpose by clicking on the Edit option.

The UI also lets you set or add:

• A star rating

• Associations with Business Rule(s)

• Comments

• Owners

• Tags

  1. Click 'Save Changes'.

  1. Select 'Test Category' & then 'Add New Term'.

  1. Enter the Term Name: 'Test Term' & select Parent: 'Test Category'.

  1. Click 'Create'.

  2. Enter a Definition & Purpose by clicking on the Edit option.


In a data catalog, a Business Term refers to metadata that describes the business aspects of a data asset. For example, a business term might indicate whether the data represents customer demographics, financial transactions, or product inventory.

Custom Term

A Business Term can also be associated with 'free text' or 'custom values'.

For example, you could create a Term, Marital Status, and associate it either with a Rule or with custom (database) values: M, S, D, and so on.

Under our 'Test Category', let's create a Business Term - - and associate it with a custom property.

  1. Click the 'Custom' tab.

  2. Click the “+ Add Custom Property” button.

  1. Enter the Label, default value and select either Free text or Select Value that will be associated with the Term.

  1. Click 'Save'.
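The difference between the Free text and Select Value options can be sketched as follows. This is an illustrative model under stated assumptions: the `CustomProperty` class is hypothetical, the values M, S, D come from the Marital Status example above, and the default of "S" is chosen only for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CustomProperty:
    """Hypothetical model of a custom property attached to a Business Term."""
    label: str
    default: str
    # None models the Free text option; a tuple models the Select Value option.
    allowed_values: Optional[Tuple[str, ...]] = None

    def accepts(self, value: str) -> bool:
        """Free text accepts anything; Select Value accepts only listed values."""
        if self.allowed_values is None:
            return True
        return value in self.allowed_values


marital_status = CustomProperty("Marital Status", default="S", allowed_values=("M", "S", "D"))
notes = CustomProperty("Notes", default="")

print(marital_status.accepts("M"), marital_status.accepts("X"), notes.accepts("anything"))
```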


Data Elements

A data element within a data catalog plays a crucial role in organizing and managing an organization’s data assets.

These assets can include structured (tabular) data, unstructured data (such as documents, web pages, and social media content), reports, query results, data visualizations, dashboards, machine learning models, and connections between databases.

A Data Element refers to an entity in your data source. With JDBC data sources the options are:

• SCHEMA

• TABLE

• COLUMN

In this example, we're going to associate our Test Term with the Social Security Number (SSN), a column in the Patients table.

  1. Click the 'Data Elements' tab.

  2. Click the “+ ADD DATA ELEMENT” button.

  1. Select: postgresql:synthea -> synthea -> patients -> ssn

Resource -> Schema -> Table -> Column

You can also search for the entity.

  1. Click 'Add' then 'Close'.

  1. Click on: 'View' to verify that the Business Term: 'Test Term' has now been successfully associated with patients -> ssn column.

In practice, you would rename the Business Term to: Social Security Number.
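The arrow-separated path used in the selection step maps onto the SCHEMA, TABLE, and COLUMN levels described earlier. The sketch below splits such a path and reports which level it points at. The `DataElement` type and `parse_data_element` function are hypothetical helpers for illustration; the path string is the one used in this walkthrough.

```python
from typing import NamedTuple, Optional


class DataElement(NamedTuple):
    """Hypothetical representation of an entity in a JDBC data source."""
    resource: str
    schema: str
    table: Optional[str] = None
    column: Optional[str] = None

    @property
    def kind(self) -> str:
        # The deepest level present decides the element type.
        if self.column is not None:
            return "COLUMN"
        if self.table is not None:
            return "TABLE"
        return "SCHEMA"


def parse_data_element(path: str) -> DataElement:
    """Split 'resource -> schema [-> table [-> column]]' into its parts."""
    parts = [part.strip() for part in path.split("->")]
    if not 2 <= len(parts) <= 4 or not all(parts):
        raise ValueError(f"expected 2 to 4 non-empty levels, got: {path!r}")
    return DataElement(*parts)


element = parse_data_element("postgresql:synthea -> synthea -> patients -> ssn")
print(element.kind, element.table, element.column)
```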

For further information on:

Properties, Tags & Styles