Process the data

Collect statistical and informative summaries about the data.

Processing and Profiling the 'Patient' Table

After successfully ingesting the metadata from our database schemas, the next step focuses on the data itself, starting with the 'Patient' table. This phase is essential for evaluating the data's structure, quality, and integrity, and so for keeping the database efficient and effective. At this point the catalog holds only database metadata; Data Profiling will offer deeper insights, which we will explore later in this workshop.
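To make "statistical and informative summaries" concrete, the sketch below computes a few typical profiling measures (row count, null count, distinct count, minimum, maximum, most common value) for each column of a small in-memory table. It is an illustration only, not how the Data Catalog implements profiling; the sample rows, the column names, and the `profile_column` helper are all hypothetical.

```python
from collections import Counter


def profile_column(values):
    """Return simple summary statistics for one column of values."""
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


# Hypothetical sample rows standing in for a patient table.
patients = [
    {"id": 1, "city": "Boston", "passport": "X123"},
    {"id": 2, "city": "Boston", "passport": None},
    {"id": 3, "city": "Salem", "passport": "X456"},
]

# Profile every column by collecting its values across all rows.
profile = {
    col: profile_column([row[col] for row in patients])
    for col in patients[0]
}

for col, stats in profile.items():
    print(col, stats)
```

A high null count or an unexpectedly low distinct count in a column is the kind of quality signal that profiling is meant to surface.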

Processing the data

Accessing Your Catalog

To access your catalog, please follow these steps:

  1. Open the Google Chrome web browser and click the bookmark, or

    Navigate to: https://pdc.pentaho.example/

  2. Enter the following email and password, then click Sign In.

    Password: Welcome123!



Process the Data

  1. Select 'Data Canvas' from the left menu option.

  2. Click the checkbox to select the 'synthea' schema.

Process synthea schema
  3. Click 'Process'.

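Processing structured and unstructured data involves two distinct steps: Metadata Ingest and Data Profiling. Metadata Ingest records what data exists and how it is structured, while Data Profiling summarizes the data values themselves. Keeping the two apart is essential for ensuring data quality and accessibility.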

Ingest Metadata

Metadata ingest is a foundational process in data management within a Data Catalog. It involves the automatic collection of metadata (the data about data) from a database schema, file, or object. This step is crucial for understanding and organizing the data, making it easily accessible for further analysis and data profiling.
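As a rough illustration of what "collecting metadata from a database schema" means, the sketch below reads table and column definitions from a database without touching any row data. It uses Python's built-in `sqlite3` module with an in-memory SQLite database as a stand-in; the Data Catalog performs this step through its own UI and connectors, and the `ingest_metadata` helper and the example table definition here are assumptions made for the example.

```python
import sqlite3


def ingest_metadata(conn):
    """Collect table and column metadata without reading any row data."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [
            {
                "column": name,
                "type": col_type,
                "nullable": not notnull,  # based on the declared NOT NULL flag
                "primary_key": bool(pk),
            }
            for _cid, name, col_type, notnull, _default, pk in columns
        ]
    return catalog


# Hypothetical schema standing in for the workshop database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients ("
    "id INTEGER PRIMARY KEY, city TEXT NOT NULL, passport TEXT)"
)

catalog = ingest_metadata(conn)
for column in catalog["patients"]:
    print(column)
```

The resulting dictionary describes names, types, and constraints only, which is why profiling is still needed afterwards to learn anything about the values stored in those columns.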

  1. Navigate to the metadata ingest section of your Data Catalog tool and initiate the process by clicking the Start button.

Metadata Ingest
  2. Select the specific tables or datasets for metadata ingestion. For example, if you are interested in patient information, you might expand the 'patients' table and select relevant fields such as 'passport'.

  3. After starting the ingest process, monitor its progress on the Manage Workers page. This page provides real-time updates on the ingestion task.

Metadata Ingest - Worker
