In Pega, Data Model Design refers to defining how data is stored, organized, and accessed within applications. Efficient data models improve application performance, scalability, and maintainability.
Data relationships define how data elements interact with one another. By designing relationships correctly, you can ensure that data is reliable, organized, and reusable across applications.
The Single Source of Truth (SSOT) ensures that all data originates from a single, authoritative source. This principle avoids data duplication and ensures consistency.
Scenario: Customer Data Management
Implementation:
Data relationships determine how records in one data entity (class) relate to records in another data entity. Pega supports the following types of relationships:
In a One-to-One relationship, each record in one data class maps to exactly one record in another data class.
Scenario: Each customer has one primary address.
| Customer Table | Address Table |
|---|---|
| CustomerID: 001 | AddressID: A001 |
| Name: John Doe | Street: 123 Elm Street |
| Email: [email protected] | City: Springfield |
Implementation in Pega:
Address (Single Page property) stores the address for the customer.
Use a data class (Data-Address) as the source.
In a One-to-Many relationship, a single parent record links to multiple child records.
Scenario: A customer can place multiple orders.
| Customer Table | Order Table |
|---|---|
| CustomerID: 001 | OrderID: O001, CustomerID: 001 |
| Name: John Doe | OrderID: O002, CustomerID: 001 |
| Email: [email protected] | OrderID: O003, CustomerID: 001 |
Implementation in Pega:
Orders (Page List) contains all orders for a customer.
Use a data class (Data-Order) for the Page List property.
In a Many-to-Many relationship, multiple records in one class relate to multiple records in another class. This is typically managed using intermediate tables or joins.
Scenario: Students enroll in multiple courses, and each course can have multiple students.
| Student Table | Course Table | Enrollment Table |
|---|---|---|
| StudentID: S001 | CourseID: C101 | StudentID: S001, CourseID: C101 |
| Name: Alice | CourseName: Math | StudentID: S001, CourseID: C102 |
| | | StudentID: S002, CourseID: C101 |
Implementation in Pega:
Define the data classes Data-Student and Data-Course.
Use a join class (Data-Enrollment) to store the relationship.
| Relationship | Definition | Pega Implementation | Example |
|---|---|---|---|
| One-to-One | Each record links to one record in another class. | Single Page Property | Customer → Address |
| One-to-Many | One parent record links to multiple child records. | Page List Property | Customer → Orders |
| Many-to-Many | Multiple records in two classes relate to each other. | Join Table and Page Lists | Students ↔ Courses |
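The three relationship shapes in the table can be made concrete with a plain-Java analogy. This is only a sketch: the class and field names below (RelationshipSketch, Customer, Address, Order, Enrollment) are assumptions that mirror the scenarios above, not Pega APIs, and in Pega these structures would be configured as properties rather than written as code.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java analogy of the three relationship types; all names are illustrative.
class RelationshipSketch {
    static class Address {
        final String street;
        final String city;
        Address(String street, String city) { this.street = street; this.city = city; }
    }

    static class Order {
        final String orderId;
        Order(String orderId) { this.orderId = orderId; }
    }

    static class Customer {
        final String customerId;
        Address address;                               // one-to-one, like a Single Page property
        final List<Order> orders = new ArrayList<>();  // one-to-many, like a Page List property
        Customer(String customerId) { this.customerId = customerId; }
    }

    // Many-to-many: each Enrollment links one student to one course.
    static class Enrollment {
        final String studentId;
        final String courseId;
        Enrollment(String studentId, String courseId) {
            this.studentId = studentId;
            this.courseId = courseId;
        }
    }

    // Returns the course IDs a student is enrolled in, in input order.
    static List<String> coursesFor(String studentId, List<Enrollment> enrollments) {
        List<String> result = new ArrayList<>();
        for (Enrollment e : enrollments) {
            if (e.studentId.equals(studentId)) {
                result.add(e.courseId);
            }
        }
        return result;
    }
}
```

The join class is what lets both directions be answered from the same rows: filtering by student gives courses, and filtering by course would give students.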
In Pega, Data Classes organize data logically, while Rule Types (such as properties, data transforms, and data pages) define, manage, and manipulate the data.
A Data Class is a Pega class used to store, organize, and manage data. It acts as a blueprint for defining data structure and attributes.
System-Defined Classes:
Data- classes for reusable data structures.
Custom Data Classes:
Data-Customer, Data-Order, or Data-Product.
Derived Data Classes:
Data-Customer: Represents customer information.
Data-Order: Represents order information.
Data-Address: Represents address details.
Pega provides various rule types to define and manage data in a data class.
Properties define the attributes or fields of a data class. Each property represents a specific type of data, such as text, numbers, or dates.
Single Value: Holds a single piece of data.
Example: Name (Text), Age (Integer).
Page: Stores a complex data object or single-page structure.
Example: Address (Single Page) containing Street, City, State.
Page List: Stores a list of complex objects.
Example: Orders (Page List) containing multiple orders for a customer.
Page Group: Stores a set of pages grouped under a key.
| Property Name | Type | Description |
|---|---|---|
| CustomerID | Integer | Unique identifier for customer |
| Name | Text | Customer's full name |
| Email | Text | Customer's email address |
| Phone | Text | Contact phone number |
| Address | Single Page | Reference to Address details |
A Data Page is a rule that retrieves and caches data for use in applications. It acts as an abstraction layer between the application and the data source.
Scope: Determines who can access the data.
Data Loading:
Refresh Strategies:
Scenario: Fetch customer data in real time from an external CRM.
Data Page: D_CustomerData.
Data Transforms are rules that map, manipulate, and transform data between structures or classes.
Scenario: Map customer data from an external API to the Data-Customer class.
| Source Property | Target Property | Mapping Logic |
|---|---|---|
| external_customer_id | CustomerID | Direct mapping |
| external_full_name | Name | Direct mapping |
| external_email | Email | Direct mapping |
| external_phone_number | Phone | Set default if null |
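The mapping table can be read as a small function from source fields to target properties. The sketch below expresses it in plain Java as an analogy for what a Data Transform does; the class name, the use of a Map, and the DEFAULT_PHONE value are assumptions for illustration, not part of Pega.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Data Transform; not a Pega API.
class CustomerMappingSketch {
    // Assumed placeholder used when the source phone number is missing.
    static final String DEFAULT_PHONE = "UNKNOWN";

    // Copies external field values onto Data-Customer style property names.
    static Map<String, String> mapCustomer(Map<String, String> external) {
        Map<String, String> customer = new HashMap<>();
        customer.put("CustomerID", external.get("external_customer_id")); // direct mapping
        customer.put("Name", external.get("external_full_name"));         // direct mapping
        customer.put("Email", external.get("external_email"));            // direct mapping
        String phone = external.get("external_phone_number");
        customer.put("Phone", phone != null ? phone : DEFAULT_PHONE);     // set default if null
        return customer;
    }
}
```

Keeping the default in one named constant means the fallback rule is stated once, which is the same reason to keep it inside a single Data Transform.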
Data Classes: Logical structures for storing and managing data.
Examples: Data-Customer, Data-Order, Data-Address.
Properties: Attributes of data (e.g., Single Value, Page, Page List).
Data Pages: Retrieve and cache data from external or internal sources.
Data Transforms: Map, manipulate, and initialize data across classes.
A Data Page is a rule in Pega that retrieves and caches data for use across the application. It provides an abstraction layer for data access, allowing developers to fetch data without worrying about its source or implementation details.
A Data Page fetches and temporarily stores data for use in an application. Most Data Pages are read-only, although Pega also offers editable and savable modes. A Data Page can load data from external sources (e.g., REST APIs, databases) or internal sources (e.g., reports, case data).
The scope of a Data Page determines its visibility and lifecycle within the application. Pega provides three scopes:
Example:
Example:
Example:
| Scope | Visibility | Use Case |
|---|---|---|
| Thread | Single thread (case/task-specific) | Loan details for a single case |
| Requestor | Single user session | Logged-in user profile |
| Node | All users on a server node | Static product catalogs |
Data Pages can load data from various sources. The loading mechanism can be manual or automatic.
Manual Load
Automatic Load
Scenario: Fetch customer data from an external CRM system.
| Configuration | Details |
|---|---|
| Name | D_CustomerDetails |
| Scope | Thread |
| Source | REST Connector to fetch data via API |
| Refresh Strategy | Reload if older than 30 minutes |
Data Page: D_CustomerDetails.
The refresh strategy determines when the Data Page reloads its data. This is important for ensuring data accuracy and performance.
Reload if older than:
Do not reload:
Reload per interaction:
| Scenario | Strategy |
|---|---|
| Fetch customer data (CRM) | Reload if older than 30 minutes |
| Static product catalog | Do not reload |
| Live stock price data | Reload per interaction |
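The "Reload if older than" strategy is essentially a time-based cache check. The following is a minimal sketch of that idea in plain Java, not a description of how Pega implements Data Pages; the class name, the injected clock, and the millisecond units are assumptions made so the behavior can be exercised outside Pega.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Analogy for a "reload if older than" refresh strategy; not Pega's implementation.
class ExpiringCacheSketch<T> {
    private final Supplier<T> loader;   // fetches fresh data from the source
    private final long maxAgeMillis;    // e.g. 30 minutes for the CRM scenario
    private final LongSupplier clock;   // injected so the age check can be tested
    private T value;
    private long loadedAt;
    private boolean loaded;

    ExpiringCacheSketch(Supplier<T> loader, long maxAgeMillis, LongSupplier clock) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    // Returns the cached value, reloading first if nothing is cached or it is too old.
    T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt > maxAgeMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

In this sketch a very large maxAgeMillis behaves like "Do not reload", and a negative value forces a load on every call, which is loosely comparable to "Reload per interaction".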
Static Data (Node Scope):
D_LoanProducts: List of loan products fetched from a database.
User-Specific Data (Requestor Scope):
D_UserProfile: Logged-in user’s profile information.
Case-Specific Data (Thread Scope):
D_LoanDetails: Loan details for a specific case.
Choose the Correct Scope:
Minimize Data Queries:
Define Clear Refresh Strategies:
Reuse Data Pages:
Parameterize Data Pages:
Monitor Performance:
Data Integration in Pega allows your application to interact with external systems, ensuring real-time access to accurate and up-to-date data. Integrations are essential for enabling seamless communication between Pega and third-party applications, databases, or APIs.
Pega provides several integration methods to connect to external data sources, such as APIs, databases, and middleware. These options are implemented using connectors.
A Connector is a rule in Pega that allows you to interact with external systems to send, receive, or query data.
REST (Representational State Transfer) Connectors integrate with RESTful APIs to exchange data using HTTP methods such as GET, POST, PUT, and DELETE.
Scenario: Fetch customer data from an external CRM API.
API Endpoint: https://crm.company.com/api/customers/{CustomerID}
HTTP Method: GET
Request Parameter: CustomerID
Response: JSON format:
{
"CustomerID": "001",
"Name": "John Doe",
"Email": "[email protected]"
}
Use the Integration Wizard in Dev Studio:
Provide the following details:
Define the Data Page to use the REST Connector as its source.
Map the API response to Pega properties using a Data Transform.
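To show what the connector in this scenario sends over the wire, the sketch below builds the equivalent GET request with the JDK's java.net.http API. It illustrates the HTTP call only; it is not how a Pega REST Connector is configured, and the class name and the Accept header are assumptions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch of the GET request described in the scenario; not a Pega API.
class CrmRequestSketch {
    // Endpoint template taken from the scenario.
    static final String ENDPOINT = "https://crm.company.com/api/customers/{CustomerID}";

    // Builds (but does not send) the request for one customer.
    // Assumes the ID contains only URL-safe characters.
    static HttpRequest buildRequest(String customerId) {
        URI uri = URI.create(ENDPOINT.replace("{CustomerID}", customerId));
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json") // assumed: the API returns JSON
                .GET()
                .build();
    }
}
```

For example, buildRequest("001") targets https://crm.company.com/api/customers/001, the resource whose JSON response is shown above.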
SOAP (Simple Object Access Protocol) Connectors integrate with older, XML-based web services.
Scenario: Retrieve an insurance policy from a SOAP web service.
WSDL File: Provided by the external system.
Request: Input policy ID.
Response: XML format:
<Policy>
<PolicyID>12345</PolicyID>
<CustomerName>John Doe</CustomerName>
<Premium>500</Premium>
</Policy>
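Once the XML response arrives, its fields have to be read into properties. As a rough illustration of that step outside Pega, the sketch below extracts values from the Policy payload with the JDK's DOM parser. The class and method names are assumptions, a real SOAP response would also carry an envelope around this payload, and the input is assumed to come from a trusted system.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Sketch of reading fields from the Policy payload; not a Pega API.
class PolicyXmlSketch {
    // Returns the text of the first element named tag, or null if there is none.
    static String readField(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
        } catch (Exception e) {
            // Malformed XML is reported as a bad argument in this sketch.
            throw new IllegalArgumentException("Could not parse policy XML", e);
        }
    }
}
```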
Connect SQL allows Pega to execute SQL queries on relational databases for data integration.
Scenario: Retrieve all orders for a specific customer from a relational database.
| SQL Query |
|---|
| SELECT * FROM Orders WHERE CustomerID = '001'; |
Data Virtualization allows Pega to fetch data in real time from external systems without storing it locally in the Pega database. This avoids redundant data storage and ensures the most up-to-date information is always retrieved.
Pega integrates with external systems via:
Middleware Tools:
APIs:
Databases:
| Integration Method | Use Case | Key Tool |
|---|---|---|
| REST Connectors | Integrate with RESTful APIs | REST Connector Rule |
| SOAP Connectors | Integrate with legacy systems | SOAP Connector Rule |
| Connect SQL | Interact with relational databases | Connect SQL Rule |
| Middleware (Kafka, MuleSoft) | Enterprise integrations | Integration Middleware |
| Data Virtualization | Real-time external data access | Data Pages + Connectors |
Data Validation in Pega ensures that data entered by users or retrieved from external sources meets specific rules, constraints, and formats. Proper validation prevents errors, enhances data quality, and ensures consistency across the application.
A Validation Rule in Pega defines constraints or conditions that data must meet. These rules are applied to ensure user inputs or external data are valid before progressing in the case flow.
Field-Level Validation:
Example: an email address must contain @ and a domain name.
Form-Level Validation:
Business Logic Validation:
Scenario: Validate that the "Email Address" field contains a valid email format.
| Property | Validation | Message |
|---|---|---|
| Email | Must match pattern: *@*.com | "Please enter a valid email." |
Steps to Configure:
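Check that Email contains @ and ends with .com. Because this pattern rejects valid addresses on other top-level domains, treat .com here as a simplification for the example.
Scenario: Validate that all required fields in a loan application form are filled.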
| Field | Validation | Message |
|---|---|---|
| Name | Required | "Name cannot be empty." |
| Loan Amount | Required | "Loan Amount is required." |
| Phone Number | Must be numeric | "Phone Number must be valid." |
Steps to Configure:
Definition:
An Edit Validate rule checks user-entered data against a specific pattern or condition during input.
Scenario: Ensure the phone number contains exactly 10 digits.
Pattern: [0-9]{10}.
Result: If the user enters anything other than exactly 10 digits, the system displays an error message.
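The 10-digit check can be tried out as a standalone Java method. This is a sketch of the validation logic only, using the same [0-9]{10} pattern; the class and method names are assumptions, and the way an Edit Validate rule receives its value inside Pega is not shown.

```java
import java.util.regex.Pattern;

// Standalone sketch of the 10-digit phone check; names are illustrative.
class PhoneValidateSketch {
    private static final Pattern TEN_DIGITS = Pattern.compile("[0-9]{10}");

    // True only when the whole value is exactly ten ASCII digits.
    static boolean isValidPhone(String value) {
        return value != null && TEN_DIGITS.matcher(value).matches();
    }
}
```

Using matches() rather than find() matters here: the pattern must cover the entire input, so an 11-digit value is rejected instead of matching on its first ten digits.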
Definition:
An Edit Input rule automatically reformats user input into a standardized format.
Example target format: (123)-456-7890.
Scenario: Convert the entered phone number 1234567890 to the format (123)-456-7890.
Create an Edit Input Rule:
Logic to reformat the input:
// Assumes input holds exactly 10 digits, e.g. 1234567890 becomes (123)-456-7890.
return "(" + input.substring(0, 3) + ")-" + input.substring(3, 6) + "-" + input.substring(6);
Attach the rule to the Phone Number property.
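The reformatting logic assumes the input is already ten digits long. A slightly more defensive variant, sketched here as a standalone Java method, strips punctuation first and leaves unexpected input untouched so that validation can reject it. The class name, method name, and the decision to return the input unchanged are assumptions, not Pega behavior.

```java
// Defensive sketch of the phone reformatting step; names are illustrative.
class PhoneFormatSketch {
    // Formats ten digits as (123)-456-7890; returns other input unchanged.
    static String formatPhone(String input) {
        if (input == null) {
            return null;
        }
        String digits = input.replaceAll("[^0-9]", ""); // drop spaces, dashes, parentheses
        if (digits.length() != 10) {
            return input; // leave it for a validation rule to flag
        }
        return "(" + digits.substring(0, 3) + ")-" + digits.substring(3, 6) + "-" + digits.substring(6);
    }
}
```

Separating the two concerns this way keeps formatting from throwing on short input, while the decision about whether the value is acceptable stays with validation.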
Declarative Constraints in Pega ensure that data meets specific conditions or business rules automatically without requiring manual checks.
Scenario: Ensure the Loan Amount does not exceed the Annual Income.
| Condition | Message |
|---|---|
| Loan Amount > Annual Income | "Loan Amount cannot exceed income." |
Steps to Configure:
Condition: LoanAmount > AnnualIncome.
Email Field Validation:
Email must contain @ and .com.
Numeric Validation:
Business Rule Validation:
Data Transformation:
Reformat phone numbers as (123)-456-7890.
Form-Level Validation:
Performance optimization focuses on improving data retrieval, reducing redundant processing, and ensuring efficient database interactions. Proper optimization techniques enhance scalability, responsiveness, and user experience.
Efficient data retrieval minimizes the number of queries to the database or external systems, thereby reducing response times and improving application performance.
Avoid Unnecessary Queries:
Pagination:
Optimize Report Definitions:
Lazy Loading:
Use Declarative Data:
Scenario: Display orders for a customer in a dashboard.
| Before Optimization | After Optimization |
|---|---|
| Fetch all columns and all orders at once. | Fetch only required columns (OrderID, Date, Total) and filter by CustomerID. |
Optimized Query:
Configure a Report Definition with only these columns and filters. The equivalent SQL is roughly:
SELECT OrderID, OrderDate, OrderTotal
FROM Orders
WHERE CustomerID = '001'
ORDER BY OrderDate DESC
LIMIT 20;
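The LIMIT 20 clause returns only the first page of results. To fetch later pages with the same query, the row offset has to be derived from the page number. The sketch below shows that arithmetic in Java; the class name, the parameterized SQL string, and the 1-based page numbering are assumptions, and LIMIT/OFFSET syntax varies between databases.

```java
// Sketch of page-offset arithmetic for a paged query; names are illustrative.
class PagedQuerySketch {
    // Parameterized form of the query; placeholders are bound by the caller.
    static final String SQL =
            "SELECT OrderID, OrderDate, OrderTotal FROM Orders "
          + "WHERE CustomerID = ? ORDER BY OrderDate DESC LIMIT ? OFFSET ?";

    // Returns how many rows to skip for a 1-based page number.
    static int offsetFor(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be at least 1");
        }
        return (pageNumber - 1) * pageSize;
    }
}
```

With a page size of 20, page 1 skips 0 rows and page 3 skips 40, so each request transfers only the rows that will actually be displayed.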
Caching improves performance by reducing the need for frequent data retrieval from databases or external systems.
Thread-Level Cache:
Requestor-Level Cache:
Node-Level Cache:
Reload if older than X minutes:
Do not reload:
Reload per interaction:
Scenario: Fetch product details from a database and cache them for 1 hour.
Benefit: The product data is fetched once and shared across all users, reducing database queries.
Avoid storing duplicate or unnecessary data, as it leads to performance bottlenecks and inconsistency.
Normalization:
Use Data Virtualization:
Reference Data Pages:
Database indexing improves query performance by enabling faster searches and retrieval of data.
An index is a database structure that allows faster lookup of specific rows in a table.
Scenario: Frequently query the CustomerID field in the Customer table.
Add an index on the CustomerID column:
CREATE INDEX idx_customer_id ON Customer (CustomerID);
Result: Queries using WHERE CustomerID = '001' will run faster.
Indexed column: CustomerID.
We have now covered all the key topics in Data Model Design.
In complex many-to-many relationships, a Join Table class (e.g., Data-Enrollment) is commonly used to relate two entities (e.g., Students and Courses). To make this relationship more reusable and extensible, consider modeling the association using a Keyed Page Group.
Allows the system to index relationships using a meaningful key, such as Course ID or Student ID.
Supports easy access and iteration over specific related records.
Enhances clarity and reuse across multiple case types or modules.
Property: Enrollments (Page Group)
Class: Data-Enrollment
Key: CourseID
You can now access enrollment details using .Enrollments(C101).Status, .Enrollments(C102).Grade, and so on.
This design is particularly powerful when you need structured, indexed access to relationship data in reports, UI sections, or integrations.
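A Page Group keyed by CourseID behaves much like a map from key to page. The Java sketch below uses that analogy to show keyed access; the class names and the status and grade values are made-up sample data, not taken from any Pega application.

```java
import java.util.HashMap;
import java.util.Map;

// Map-based analogy for a Page Group keyed by CourseID; names are illustrative.
class EnrollmentGroupSketch {
    static class Enrollment {
        final String status;
        final String grade;
        Enrollment(String status, String grade) { this.status = status; this.grade = grade; }
    }

    // Builds a small group with hypothetical sample values.
    static Map<String, Enrollment> sample() {
        Map<String, Enrollment> enrollments = new HashMap<>();
        enrollments.put("C101", new Enrollment("Active", "A"));
        enrollments.put("C102", new Enrollment("Completed", "B"));
        return enrollments;
    }
}
```

Here sample().get("C101").status plays the role of .Enrollments(C101).Status: the lookup goes straight to the entry for that key instead of scanning a list.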
To avoid duplication and promote reuse, logic and properties encapsulated in Data Classes should be shared across multiple Case Types using Page or Page List references.
Create a reusable Data Class, e.g., Data-Product.
In your Case Type, define a Single Page or Page List:
.ProductDetails (Page) → Data-Product
.ProductList (Page List) → Data-Product
This enables centralized updates and ensures data standardization across the application.
A common real-world pattern involves looping through a Page List to process or aggregate values.
Loop: OrderItems()
.TotalAmount += OrderItems().ItemAmount
This logic in a Data Transform accumulates item values into a single .TotalAmount property — often used in invoicing or summary screens.
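The accumulation loop can be written out as a small Java method to make its two easy-to-miss details explicit: the total starts from zero before the loop, and missing item amounts are skipped. The class and method names are assumptions, and BigDecimal is used because the values represent money.

```java
import java.math.BigDecimal;
import java.util.List;

// Sketch of summing item amounts across a list; names are illustrative.
class OrderTotalSketch {
    // Adds up all non-null item amounts, starting from zero.
    static BigDecimal totalAmount(List<BigDecimal> itemAmounts) {
        BigDecimal total = BigDecimal.ZERO; // reset before looping, or stale totals carry over
        for (BigDecimal amount : itemAmounts) {
            if (amount != null) {
                total = total.add(amount);
            }
        }
        return total;
    }
}
```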
Pega supports two structures for Data Pages:
| Structure | Description | Example |
|---|---|---|
| Page | Holds a single object | D_CustomerDetails for one customer |
| Page List | Holds multiple objects | D_AllLoanTypes returning all loan types |
This is frequently tested in exams — especially in scope + structure pairings like:
Node + Page List → Static reference data
Thread + Page → Case-specific dynamic fetch
Use parameters in Data Pages to fetch context-specific data.
Example:
D_CustomerDetails[CustomerID: "C1001"]
This pattern supports:
Real-time queries to external systems
Dynamic data fetches for form population
Reuse without hardcoded values
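One way to picture a parameterized Data Page is as a cache that keeps a separate instance per parameter value. The Java sketch below illustrates that idea only; it is not Pega's implementation, and the class name and the loader function are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Analogy for a parameterized Data Page: one cached instance per parameter value.
class ParameterizedPageSketch<V> {
    private final Map<String, V> instances = new ConcurrentHashMap<>();
    private final Function<String, V> loader; // fetches data for one customer ID

    ParameterizedPageSketch(Function<String, V> loader) {
        this.loader = loader;
    }

    // Loads the value on first access for a given ID and reuses it afterwards.
    V get(String customerId) {
        return instances.computeIfAbsent(customerId, loader);
    }
}
```

Asking twice for "C1001" triggers one load, while asking for a different ID triggers a separate load, which is why one parameterized definition can serve many customers without hardcoded values.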
When integrations (e.g., REST or SOAP connectors) fail, it is critical to handle them gracefully and predictably.
Configure Flow-level error handling via Error Paths or Alternate Stages.
Use StepStatusFail in activities or HasMessages condition in Data Transforms to detect failures.
Redirect users to:
A “Retry or Contact Support” screen
A manual processing queue via assignment
This approach aligns with robust enterprise-grade resilience design, and is a common exam scenario.
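The underlying pattern is to catch the integration failure at the boundary and turn it into an explicit routing decision. The Java sketch below shows that shape in isolation; the class names and the step names "Continue" and "RetryOrContactSupport" are hypothetical, and in Pega the routing itself would be configured in the flow rather than coded.

```java
import java.util.function.Supplier;

// Sketch of converting an integration failure into a routing decision; names are hypothetical.
class IntegrationFallbackSketch {
    static class Result {
        final String data;      // response payload, or null on failure
        final boolean failed;   // true when the call threw
        final String nextStep;  // where the flow should go next
        Result(String data, boolean failed, String nextStep) {
            this.data = data;
            this.failed = failed;
            this.nextStep = nextStep;
        }
    }

    // Runs the call; on failure, returns a result that routes to the fallback step.
    static Result callWithFallback(Supplier<String> integration) {
        try {
            return new Result(integration.get(), false, "Continue");
        } catch (RuntimeException e) {
            return new Result(null, true, "RetryOrContactSupport");
        }
    }
}
```

Returning a result object instead of letting the exception propagate keeps the failure visible to the caller, which can then show the retry screen or create a manual-processing assignment.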
Sometimes, validation requires composite logic: e.g., "Field A and Field B must be filled together, or both left blank."
IF (IsMissing(.A) AND NOT IsMissing(.B)) OR (NOT IsMissing(.A) AND IsMissing(.B))
THEN Show Error: "Please fill both A and B or leave both empty."
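The condition above is an exclusive-or: it fires when exactly one of the two fields is missing. A compact way to express that, sketched here in Java, is to compare the two "is missing" results for inequality. The class and method names are assumptions, and treating blank strings as missing is a choice made for this example.

```java
// Sketch of the "both or neither" check; names are illustrative.
class CompositeValidationSketch {
    // Treats null and blank strings as missing (an assumption of this sketch).
    static boolean isMissing(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Returns an error message when exactly one field is filled, otherwise null.
    static String validate(String a, String b) {
        if (isMissing(a) != isMissing(b)) {
            return "Please fill both A and B or leave both empty.";
        }
        return null;
    }
}
```

The single inequality covers both branches of the original OR expression, which makes it harder to get one of the two cases wrong.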
Declarative Constraints are implemented via:
A When rule (evaluating the condition)
Bound to a property through a Constraint rule
Upon violation, Pega automatically displays an error message
This is reactive validation, triggered whenever data changes — and commonly appears in rule resolution and form validation questions.
Use tools like:
Clipboard tool: To inspect whether Data Pages are overused, stale, or large.
Pega PDC (Predictive Diagnostic Cloud): To detect high-frequency queries, slow report execution, or repeated page reloads.
These tools support data-driven optimization of performance.
Exam questions often focus on how to design efficient reports. Emphasize:
Filter and sort by indexed properties (e.g., .SubmissionDate, .Status)
Avoid filters on non-indexed computed properties
Use database views or pre-aggregated tables when joining multiple large tables
Failing to do so leads to full table scans — a common performance bottleneck.
Strong data modeling governance ensures:
Avoidance of rule duplication
Modular and maintainable data structures
Compliance with standards and enterprise architecture policies
Use built-on applications to centralize reusable Data- classes.
Organize reusable assets in shared RuleSets (e.g., Common-Data, Enterprise-Rules).
Enforce naming conventions:
Page names: CustomerInfo, ProductList
Property names: CamelCase or enterprise-standard styles
LSA exam scenarios often test design consistency and governance, especially in modular application setups and reuse strategies across business lines.
What is the role of data pages in Pega data modeling?
Data pages act as a caching and data access layer that provides reusable, centralized data to applications.
They decouple data retrieval from business logic, enabling efficient data reuse and reducing redundant integrations. Data pages can be configured for different scopes (node, requestor, thread) based on usage. A common mistake is directly calling integrations instead of using data pages, which leads to performance issues and duplication. Proper use ensures consistency and scalability.
Demand Score: 83
Exam Relevance Score: 90
How should a case data model be designed to support reuse and integrity?
A case data model should use normalized structures, reusable data objects, and clear relationships to ensure consistency and maintainability.
Designers should avoid embedding redundant data within cases and instead reference shared data objects. This improves data integrity and simplifies updates. A common mistake is duplicating data across cases, which leads to inconsistencies. Proper relationships and validation rules ensure reliable data management across the application.
Demand Score: 82
Exam Relevance Score: 91
When should polymorphism be used in Pega data modeling?
Polymorphism should be used when multiple classes share common behavior but require different implementations.
It allows flexible data handling through inheritance and dynamic class references. This is useful in scenarios where similar entities have variations. A common mistake is overcomplicating the model with unnecessary inheritance, which reduces clarity. Proper use improves extensibility and reuse while maintaining a clean structure.
Demand Score: 81
Exam Relevance Score: 88
What are the benefits of using a data virtualization layer in Pega?
A data virtualization layer abstracts data sources, enabling unified access without duplicating or physically storing data.
It allows applications to interact with multiple data sources through a consistent interface, often implemented via data pages. This reduces integration complexity and improves flexibility. A common mistake is tightly coupling applications to specific data sources, which limits scalability and maintainability. Virtualization supports easier changes and integration reuse.
Demand Score: 80
Exam Relevance Score: 87
How should existing or industry data models be extended in Pega?
Existing data models should be extended using inheritance and specialization while preserving the base structure.
This ensures compatibility with upgrades and reuse of standard definitions. Designers should avoid modifying base classes directly and instead create extensions in higher layers. A common mistake is altering foundational models, which complicates maintenance and upgrades. Proper extension maintains consistency and leverages industry best practices.
Demand Score: 79
Exam Relevance Score: 89