Pentaho Data Catalog

Business Rules

Data Quality Rules: identify non-compliant rows in your data.


Last updated 11 months ago

The quality and timeliness of data matter less than what you can do with that data. The goal is to show business people more value from higher-quality, more up-to-date data.

Business Rules translate business requirements into logic-based rules you can use to tag your data. You can define business rules to manage your data and track its quality by designating whether or not that data is compliant.

Use the Business Rules page to define compliant and non-compliant data and data formats.

Using these definitions, you can use business rules to apply SQL commands (called Data Quality Rules) that identify non-compliant rows in your data. You can add any number of data quality rules to a Business Rule.

If a rule returns rows, some of the data is non-compliant (of poor quality) and should be investigated and remedied at source. Once that is done, the number of non-compliant rows should fall, ideally to zero, which increases overall data quality.
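As a minimal sketch of this cycle, the example below runs a rule query, remedies the flagged rows, and runs the rule again. It uses Python's built-in sqlite3 module rather than Data Catalog itself; the table, the sample values, and the rule ("passport is NULL means non-compliant") are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical in-memory table standing in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, passport TEXT)")
conn.executemany(
    "INSERT INTO patients (id, passport) VALUES (?, ?)",
    [(1, "X1234567"), (2, None), (3, "X7654321"), (4, None)],
)

# Assumed rule for illustration: a row is non-compliant when passport is NULL.
RULE_SQL = "SELECT id FROM patients WHERE passport IS NULL ORDER BY id"


def non_compliant_ids(connection):
    """Return the ids of the rows flagged by the rule query."""
    return [row[0] for row in connection.execute(RULE_SQL)]


before = non_compliant_ids(conn)

# Remedy the data at source, then re-run the same rule.
conn.execute("UPDATE patients SET passport = 'X0000002' WHERE id = 2")
conn.execute("UPDATE patients SET passport = 'X0000004' WHERE id = 4")
after = non_compliant_ids(conn)

print(before, after)
```

The rule query stays unchanged between runs; only the underlying data changes, so the drop in returned rows reflects the remediation.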

Business Rules can be assigned any number of Data Quality (DQ) rules:

• You can choose whether or not to enable data quality rules.

• You can also decide if a rule requires supervisor approval before being deployed.

• Use custom tags to track and group business rules according to your needs.

• For data quality type rules, you can further define one of the 7 standard dimensions of DQ.

Accessing Your Catalog

To access your catalog, please follow these steps:

  1. Open the Google Chrome web browser and click the bookmark, or navigate to https://pdc.pentaho.example/.

  2. Enter the following email and password, then click Sign In.

Username: data_developer@hv.com

Password: Welcome123!

Security Advisory: Handling Login Credentials

For enhanced security, it is strongly recommended that users avoid saving their login details directly in web browsers. Browsers may inadvertently autofill these credentials in unrelated fields, posing a security risk.

Best Practice

• Disable Autofill: To mitigate potential risks, users should disable the autofill functionality for login credentials in their browser settings. This preventive measure ensures that sensitive information is not unintentionally exposed or misused.

  1. From the Business Rules card click Add New and select: Add Business Rule.

Business rules relate to data quality in at least two fundamental ways. First, they can automate the decisions that the company makes in its day-to-day operations. Second, they can be used to audit data produced by existing processes for compliance with external regulations as well as internal business policies and goals.

  1. Click Management in the left navigation menu.

  2. In the Business Rules card, click Add Business Rule.

  3. In the Create Business Rule page, enter the following information.

Field: Description

Business Rule Name (Required): Enter the unique name of the rule that your users will recognize. Names must start with a letter, and contain only letters, digits, hyphens, or underscores. White spaces are supported, but trailing spaces are not allowed in names.

Created by: Select the username of the owner of the rule. The default value of this field is the logged-in user.

Description: Enter a description for this rule. For example, you may want to indicate the purpose of the rule to assist other users.

Note: Enter additional comments for the rule. For example, you may want to describe the workflow or use case of the rule.

Custom Tag: Add Tag enables you to group the rules.

Rule Enabled: By default, a new rule is enabled. Clear the check box to disable the rule. When a Rules Execution job is run, disabled rules are skipped and are not evaluated.

Rule Approved: Select to approve the rule. This option is only available to users with the Data Quality Administrator role.

If you've taken a look at the synthea:patients table you will have noticed that not all the patient passport numbers have been entered.

Let's create a Business Rule with a Data Quality Rule that checks for missing passport numbers.

When the Data Quality Rule is applied and the non-compliant passport numbers are >= 10% of the total count, the corresponding Business Rule status is set to FAILED.

Field: Setting

Business Rule Name: synthea:patients:passport

Created by: System Operator

Description: DQ - passport number

Note: Applies to all records

Custom Tag: (blank)

Rule Enabled: Checked

Rule Approved: Unchecked. If you approve the rule, you will not be able to configure it.

  4. Click Create Business Rule to save your rule.

After you create a Business Rule, you can configure it in the Configuration view of the Business Rule page.

  1. Locate the business rule you want to configure in the table of rules and select the View Details button (>) in its row.

In this example, the Business Rule has already been executed and modified.

  1. If you have a large number of rules, select Show Filters to help you find the rule you want to edit.

  2. Click the Configuration tab and enter the following details:

Business Rule Type: Internal DQ

Data Quality Dimension: Completeness

Schedule: None

Update Row Count: Enable

  1. Select the Data Quality Dimension to differentiate between the quality rules that you create.

Dimension
Description

Accuracy

Uniqueness

Consistency

Timeliness

Conformity

Completeness

Validity

  1. Set the schedule to run the business rule: None.

As the Business Rule has been executed previously, the Last Run details are displayed.

  1. Set the rule scope and condition:

    • Select resources on which you want the rule to be evaluated and applied. Select the target table or column.

    • Define the rule's condition for evaluation using an SQL query. Write the query so that it identifies non-compliant data, that is, rows that do not match the compliant definition, and returns the scope count.

In the following SQL example, the patients table is queried to identify patients with missing (NULL) passport numbers.

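SELECT
	COUNT(*) AS total_count,          -- all rows in the patients table
	COUNT(passport) AS scopeCount,    -- rows with a non-NULL passport
	SUM(CASE
		WHEN passport IS NULL THEN 1  -- count each missing passport
		ELSE 0
	END) AS nonCompliant
FROM synthea.patients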

total_count: the total number of rows in the patients table.

scopeCount: the number of patients with a passport number value.

nonCompliant: the number of patients whose passport number is NULL. Each such row is counted as 1.
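To see what the three counts look like on concrete data, here is a small sketch that runs an equivalent query against an in-memory SQLite database from Python. The sample rows are made up, and SQLite stands in for the real data source; attaching a database named synthea is only a device to let the schema-qualified table name resolve.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database named "synthea" so that the
# schema-qualified name synthea.patients resolves as in the query above.
conn.execute("ATTACH DATABASE ':memory:' AS synthea")
conn.execute(
    "CREATE TABLE synthea.patients (id INTEGER PRIMARY KEY, passport TEXT)"
)

# Made-up sample: 10 patients, 2 of them (ids 3 and 7) without a passport.
rows = [(i, None if i in (3, 7) else f"P{i:07d}") for i in range(1, 11)]
conn.executemany(
    "INSERT INTO synthea.patients (id, passport) VALUES (?, ?)", rows
)

QUERY = """
SELECT
    count(*) total_count,
    count(passport) scopeCount,
    SUM(CASE WHEN passport IS NULL THEN 1 ELSE 0 END) nonCompliant
FROM synthea.patients
"""

total_count, scope_count, non_compliant = conn.execute(QUERY).fetchone()
print(total_count, scope_count, non_compliant)
```

With this sample, the scope count and the non-compliant count add up to the total, because every row either has a passport value or does not.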

Finally, configure some rule actions:

  1. Set the rule: IF NON COMPLIANT ITEMS >= 10% then set the Business Rule Status to: FAILED

  2. Run the rule.

  3. Check the status of the job.

  4. Take a look at the details.
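The rule action configured in the steps above can be sketched as a simple threshold check. This is an illustration only, not Data Catalog's implementation: the function name is hypothetical, the percentage is assumed to be taken against the total row count, and the behaviour for an empty table is an assumption.

```python
def rule_fails(non_compliant: int, total_count: int,
               threshold_pct: float = 10.0) -> bool:
    """Return True when non-compliant rows reach the threshold percentage.

    Hypothetical helper: the percentage is assumed to be measured
    against total_count.
    """
    if total_count <= 0:
        # Assumption: an empty table cannot breach the threshold.
        return False
    # Compare without dividing, to avoid floating-point rounding at the boundary.
    return non_compliant * 100 >= threshold_pct * total_count


# 2 of 10 rows are non-compliant (20%), so the rule status would be FAILED.
print(rule_fails(2, 10))
# 9 of 100 rows (9%) stays under the 10% threshold.
print(rule_fails(9, 100))
```

Because the comparison is >=, a table with exactly 10% non-compliant rows also counts as a failure.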

To organize the business rules, you can create a rule group and add your rules. This helps you keep your rules organized and easily accessible.

  1. Click Management in the left navigation menu.

  2. Select Rules Group.

  3. Enter the Rule group name, Description, and Note.

  4. Set the schedule to run all the business rules in the group by selecting one of the following schedules: Daily, Weekly, or Monthly.

  5. Click Assign Rules, select the rules you want to add to a group, and then click Assign.

  6. Click Save Changes.
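As a rough mental model of what a rule group holds, the sketch below mirrors the fields from the steps above in a small data structure. Every class, field, and method name here is hypothetical, the default schedule is an arbitrary choice, and the group name is made up. This is not Data Catalog's API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Schedule(Enum):
    """The three schedules a group can be set to run on."""
    DAILY = "Daily"
    WEEKLY = "Weekly"
    MONTHLY = "Monthly"


@dataclass
class RuleGroup:
    """Hypothetical model of a rule group: name, description, note, schedule, rules."""
    name: str
    description: str = ""
    note: str = ""
    schedule: Schedule = Schedule.DAILY  # arbitrary default for this sketch
    rules: List[str] = field(default_factory=list)

    def assign(self, *rule_names: str) -> None:
        """Add rules to the group, skipping any that are already assigned."""
        for rule_name in rule_names:
            if rule_name not in self.rules:
                self.rules.append(rule_name)


group = RuleGroup(
    name="patients-dq",
    description="Patient data quality checks",
    schedule=Schedule.WEEKLY,
)
group.assign("synthea:patients:passport")
group.assign("synthea:patients:passport")  # duplicate is ignored
print(group.schedule.value, group.rules)
```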

Navigate to: https://pdc.pentaho.example/

In the confirmation window, click Configure to configure a rule now, or click Close to configure it later.

This task assumes you have completed "Create a rule" and are on the Business Rules page.