Definition

The Patient data model provides a comprehensive, longitudinal record of individuals registered with healthcare organizations. It serves as the foundational dataset for patient-centric information, containing demographic details, registration status, and other key administrative data. This model is essential for accurately identifying and managing patient information across various healthcare services and systems.

Information

The Patient data model is a structured representation of all essential information related to an individual receiving healthcare services. It typically includes patient identifiers, demographic details, patient type, registration status, patient data preferences, and access preferences. This model ensures accurate, secure, and efficient management of patient data across healthcare systems.

Each patient record is uniquely identified by a combination of patient_id and organisation, ensuring a distinct entry for every individual within a specific healthcare organization.
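
As a rough illustration of the composite key, the sketch below checks a list of rows for repeated (patient_id, organisation) pairs. The row layout and the sample values are assumptions made for the example; only the two column names come from the model.

```python
from collections import Counter


def find_duplicate_keys(rows):
    """Return the (patient_id, organisation) pairs that occur more than once."""
    counts = Counter((row["patient_id"], row["organisation"]) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Illustrative rows only; the values are invented for this sketch.
rows = [
    {"patient_id": 123, "organisation": "ORG1"},
    {"patient_id": 123, "organisation": "ORG2"},  # same id, different organisation
    {"patient_id": 98, "organisation": "ORG1"},
]

duplicates = find_duplicate_keys(rows)
print(duplicates)
```

The same patient_id may appear under different organisations without being a duplicate, because uniqueness is defined on the pair rather than on patient_id alone.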

The model includes several key data points to provide a complete view of the patient:

Key Data Fields

Some of the key data fields are as follows:

Patient Identifiers

patient_id: A unique identifier for the patient within the system.
organisation: The unique identifier for the healthcare organization where the patient is registered.

Demographics

date_of_birth: The patient’s date of birth.
nhs_gender: The patient’s gender.
ethnicity_emis_code_id: Identifier for the patient’s ethnic background, used to look up ethnicity details in the relevant extended data model.
postcode: The patient’s postal code, providing geographical context.

Registration Information

registration_start_date: The date the patient’s registration with the organization began.
registration_end_date: The date the patient’s registration ended, if applicable.
patient_status_description: The current status of the patient’s registration (e.g., Registered, Left, Dead).
patient_type_description: The type of patient registration (e.g., Regular, Dummy).

Vital Status

is_deceased: A boolean flag indicating if the patient is deceased.
date_of_death: The recorded date of death for the patient.

Data Management

is_deleted: Indicates whether the record has been marked as deleted in the source system.
transform_datetime: The timestamp indicating when the record was last processed and updated in the data warehouse. This field is crucial for understanding the currency of the patient’s data.

Overview

flowchart TB
    subgraph container["Data Collection"]
        n10["Patient Address"]
        n11["Patient NDOP status"]
        n12["Patient Registration Data"]
    end

    n17["Organisation 1"] --> n7
    n17["Organisation 1"] --> n5
    n18["Organisation 2"] --> n6
    n18["Organisation 2"] --> n8
    n18["Organisation 2"] --> n4

    n7["Patient 123"] --> container
    n5["Patient 98"] --> container
    n6["Patient 456"] --> container
    n4["Patient 20"] --> container
    n8["Patient 47"] --> container

    container --> n13["Gather Patient Data"]
    n13 --> |"Unique Patient IDs"|n16["123, 98, 456, 20, 47"]
    n16 --> n14["ETL"]
    n14 --> n15["Patient Data Model"]

    n7["Patient 123"]:::rect
    n5["Patient 98"]:::rect
    n6["Patient 456"]:::rect
    n4["Patient 20"]:::rect
    n8["Patient 47"]:::rect

    n10:::rect
    n11:::rect
    n12:::rect
    n13:::extract
    n14:::event
    n15:::database
    n16:::rect

    classDef rect rect
    classDef extract circle
    classDef event path
    classDef database cylinder

Changes in iPCV v2

Deceased Flag Logic Improvement

In iPCV v2 of the patient data models, the logic for determining whether a patient is deceased has been refined and unified.

The logic now uses all available fields from EMIS Web, such as patient status and date of death. Because the date of death is optional in EMIS Web, it will be used in calculations when present; if not, other available fields will be used. This ensures we don’t rely only on optional fields that may be missing for some patients.
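
The fallback behaviour described above might look something like the following sketch. The exact rule used by iPCV v2 is not published here, so the function name, the status value "Dead" as the deciding status, and the order of the checks are all assumptions for illustration.

```python
from datetime import date
from typing import Optional

# Assumption: "Dead" is the patient status that marks a deceased patient.
DECEASED_STATUSES = {"dead"}


def is_deceased(patient_status: Optional[str], date_of_death: Optional[date]) -> bool:
    """Hypothetical unified check: use the date of death when present,
    otherwise fall back to the patient status."""
    if date_of_death is not None:
        return True
    return (patient_status or "").strip().lower() in DECEASED_STATUSES


# A patient with no recorded date of death is still caught through the status.
print(is_deceased("Dead", None))
print(is_deceased("Registered", None))
```

The point of the fallback is that a missing optional date_of_death no longer hides a deceased patient.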

Why?

Previously, iPCV v1 used inconsistent logic between patient and patient opt out models, leading to potential discrepancies. Now, both use the same criteria for accuracy and consistency.

In addition, the previous logic sometimes missed deceased patients due to missing date of death values. The new approach provides a more reliable and consistent method.

Customer benefit

Improved Accuracy: More reliable identification of deceased patients across datasets
Simplified Integration: Unified logic reduces the need for custom handling or cross-referencing multiple fields
Better Reporting: Enhances downstream analytics and reporting consistency

Customer action

Review any custom logic or filters that rely on deceased status and update them to align with the new unified logic in v2
Validate downstream systems or dashboards to ensure they reflect the updated logic

Refined Age Calculation Logic in Patient Data Model

In iPCV v1 of the patient data model, the ‘age’ field was calculated by comparing the current date with the patient’s date of birth, and only updated when there was a change in the patient record. iPCV v2 introduces a more dynamic and accurate approach:

Age is now calculated using either the current date or the date of death (if populated), ensuring correct age representation for all patients.
The age field is updated not only when the patient record changes, but also when the patient’s birthday passes, ensuring age remains current without manual intervention.
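
A minimal sketch of the calculation described above, assuming age is counted in completed years. The function name and the `today` parameter are hypothetical and exist only to make the example deterministic.

```python
from datetime import date
from typing import Optional


def age_in_years(
    date_of_birth: date,
    date_of_death: Optional[date] = None,
    today: Optional[date] = None,
) -> int:
    """Completed years between birth and the reference date.

    The reference date is the date of death when populated, otherwise today.
    """
    reference = date_of_death or today or date.today()
    # Has the birthday already occurred in the reference year?
    had_birthday = (reference.month, reference.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return reference.year - date_of_birth.year - (0 if had_birthday else 1)


dob = date(1950, 6, 15)
print(age_in_years(dob, today=date(2024, 6, 14)))  # day before the birthday
print(age_in_years(dob, today=date(2024, 6, 15)))  # on the birthday
print(age_in_years(dob, date_of_death=date(2000, 1, 1), today=date(2024, 6, 15)))
```

In the last call the age stops at the date of death rather than continuing to increase, which is the behaviour v2 introduces.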

Why?

In some cases iPCV v1 did not show the correct patient age: it continued to increment age even after a patient was marked as deceased at the source. iPCV v2 corrects this to improve data accuracy.

Customer benefit

Improved Accuracy: Ensures age is correctly calculated for both living and deceased patients
Timely Updates: Automatically reflects age changes as time progresses, reducing stale data
Better Analytics: Enhances age-based reporting and segmentation, especially in longitudinal studies

Customer action

Review any logic or filters that rely on the ‘age’ field and ensure they align with the updated calculation method
Validate downstream systems or reports that use age to confirm they reflect the new logic
No immediate action required if consuming the age field directly from the v2 model

Refined Registration Status Logic in Patient Data Model

iPCV v1 included patients with both actual registration and pre-registration statuses as “registered”.

In iPCV v2, only patients with an actual registration status are considered “registered”; pre-registration statuses are excluded.

Why?

This change is to ensure that only genuinely registered patients are counted, avoiding premature inclusion of pre-registered patients.

Customer benefit

Improved Accuracy: More precise identification of registered patients
Better Segmentation: Enables clearer distinction between registered and pre-registered patients
Enhanced Reporting: Supports more reliable registration-based metrics and analysis

Customer action

Review any logic or filters that rely on is_registered and update them to reflect the new definition
Validate downstream systems or reports that use registration status to ensure consistency with v2
No immediate action required if consuming the field directly from the v2 model

Accurate Registration Start & End Date in Patient Data Model

In iPCV v1, the registration start date could be set to the patient record creation date, even if the patient was not yet registered, and the registration end date might not accurately reflect when the patient actually left the practice.

In iPCV v2, the registration start date is set to the actual date when the patient is registered at the practice, not just when the record is created. The registration end date reflects the true date when the patient leaves the practice, ensuring accurate lifecycle tracking.

Why?

This change is to provide precise tracking of patient registration periods, avoiding errors from premature or inaccurate date assignment.

Customer benefit

Improved Accuracy: Eliminates false positives for registration start dates
Cleaner Data: Reduces noise in registration-related reporting and analysis
Better Insights: Supports more meaningful metrics around patient onboarding and registration timelines

Customer action

Review any logic or reports that rely on registration_start or registration_end and ensure they reflect the updated definition
Validate downstream systems or dashboards to confirm they exclude unregistered patients from registration-based metrics
No immediate action required if consuming the field directly from the v2 model

Updated Active Status Logic in Patient Data Model

In iPCV v1, the active patient flag could be set based on patient status changes, including cases where a patient had died but subsequent administrative actions (like issuing a death certificate) altered their status, leading to incorrect active flag assignment.

In iPCV v2, the logic has been refined so that the active patient flag is now set only for patients who are truly active at the practice. This explicitly excludes deceased patients—i.e., administrative or notification status changes after death no longer affect the active flag. Additionally, the definition of “active” now includes patients with a pre-registration status, not just those fully registered, ensuring that patients in the process of joining the practice are appropriately flagged.
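
To make the difference between the v2 registered and active definitions concrete, here is a small sketch. The status labels "registered" and "pre-registered" and both function signatures are hypothetical; the real model derives these flags from EMIS Web statuses that are not listed in this document.

```python
# Hypothetical status labels, used only for this illustration.
REGISTERED_STATUSES = {"registered"}
ACTIVE_STATUSES = {"registered", "pre-registered"}


def _normalise(status: str) -> str:
    return status.strip().lower()


def is_registered(registration_status: str) -> bool:
    """v2 definition: pre-registration does not count as registered."""
    return _normalise(registration_status) in REGISTERED_STATUSES


def is_active(registration_status: str, is_deceased: bool) -> bool:
    """v2 definition: registered or pre-registered, and never deceased."""
    if is_deceased:
        return False
    return _normalise(registration_status) in ACTIVE_STATUSES


# A pre-registered, living patient is active but not registered.
print(is_registered("Pre-registered"), is_active("Pre-registered", False))
# A deceased patient is never active, whatever the status says.
print(is_active("Registered", True))
```

Checking the deceased flag first means later administrative status changes cannot turn the active flag back on.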

Why?

This change prevents deceased patients from being incorrectly counted as active due to post-mortem administrative updates, and ensures that the active flag accurately reflects patients who are either currently registered or in the process of registration, but are alive and associated with the practice.

Customer benefit

Improved Accuracy: Prevents deceased patients from being incorrectly marked as active
Operational Clarity: Aligns active status with real-world patient engagement stages
Better Filtering: Supports more meaningful segmentation for active caseloads

Customer action

Review any logic or filters that rely on the active field and update them to reflect the new definition
Validate downstream systems or dashboards to ensure deceased patients are excluded from active patient views
No immediate action required if consuming the field directly from the v2 model

Refined Patient Left Status Logic in Patient Data Model

In iPCV v1, the logic for identifying patients who had left the practice could include deceased patients or be affected by inconsistent status handling, leading to inaccurate counts.

In iPCV v2, the “left” status is now assigned only to patients whose caseload or patient status indicates they have left, explicitly excluding deceased patients. The logic is now standardized to ensure only appropriate patients are marked as “left,” improving data consistency.

Why?

This change avoids misclassifying deceased patients as having simply left the practice.

Customer benefit

Improved Accuracy: Clearer distinction between patients who have left and those who are deceased
Reliable Reporting: Enhances service exit metrics and patient flow analysis
Operational Clarity: Supports better caseload management and follow-up processes

Customer action

Review any logic or reports that rely on the left field and ensure they reflect the updated definition
Validate downstream systems or dashboards to confirm deceased patients are excluded from “left” status views
No immediate action required if consuming the field directly from the v2 model

Expanded Ethnicity Classification in Patient Data Model

In iPCV v1, ethnicity was determined using national codes for the ‘Ethnic Group’ field, which provided a basic classification.

iPCV v2 enhances this by:

Continuing to use national codes for consistency
Adding SNOMED code 397731000 (Ethnic group)
Including all child codes under EMIS hierarchy code 141291000000111
The following ethnicity-specific fields are no longer part of the patient model; customers can instead retrieve this information from the extended model using ethnicity_emis_code_id:
- ethnic_category_snomed_concept_id
- ethnic_group_description
- ethnic_group_id
- ethnic_group_snomed_description

Why?

This expanded logic allows for a more granular and clinically relevant classification of ethnicity, improving alignment with healthcare standards and terminology.

Customer benefit

Improved Accuracy & Coverage: Captures a broader and more precise range of ethnic classifications
Enhanced Interoperability: Aligns with SNOMED and EMIS standards for better integration across systems
Better Reporting & Insights: Supports more detailed demographic analysis and equity monitoring

Customer action

Review any logic or filters that rely on ethnicity codes and ensure they accommodate the expanded SNOMED and EMIS hierarchy
Validate downstream systems or dashboards to confirm they reflect the enhanced classification
No immediate action required if consuming the field directly from the v2 model

Standardised Observation-Based Attribute Logic in Patient Data Model

In iPCV v1, the technique for determining the latest instance of observation-based attributes varied across fields, leading to inconsistencies—especially in migrated data.

iPCV v2 introduces a standardized and prioritized approach for selecting the most recent observation:

Effective date (if available)
Availability time
Observation ID (as a fallback)

This method is now consistently applied across key patient attributes, including:

Opt-outs (data consent)
Email/SMS consent
Ethnicity
Sexual Orientation
Language
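
The selection order described above (effective date, then availability time, then observation ID) can be sketched as a sort key. The `Observation` type and its field names are hypothetical, and the treatment of a missing effective date (ranked below any populated date) is an assumption, since the document does not specify it.

```python
from datetime import date, datetime
from typing import NamedTuple, Optional, Sequence


class Observation(NamedTuple):
    # Hypothetical record shape for this sketch.
    observation_id: int
    effective_date: Optional[date]
    availability_time: datetime
    value: str


def latest_observation(observations: Sequence[Observation]) -> Observation:
    """Pick the most recent observation using the three-level priority."""

    def priority(obs: Observation):
        # Assumption: a missing effective date sorts before any real date.
        return (
            obs.effective_date or date.min,
            obs.availability_time,
            obs.observation_id,
        )

    return max(observations, key=priority)


observations = [
    Observation(1, date(2023, 1, 1), datetime(2023, 1, 2, 9, 0), "A"),
    Observation(2, date(2023, 1, 1), datetime(2023, 1, 3, 9, 0), "B"),
    Observation(3, None, datetime(2024, 1, 1, 9, 0), "C"),
]
print(latest_observation(observations).value)
```

Observations 1 and 2 tie on effective date, so availability time decides between them; the observation ID only matters when both earlier fields are equal.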

Why?

This results in a more accurate and reliable representation of patient data, particularly for systems with historical or migrated records.

Customer benefit

Improved Accuracy: Ensures the most relevant and recent observation is used
Consistency Across Fields: Reduces discrepancies in how different attributes are handled
Better Data Quality: Enhances trust in reporting and analytics, especially for consent and demographic data

Customer action

Review any logic or reports that rely on observation-based attributes and ensure they reflect the updated selection method
Validate downstream systems or dashboards to confirm they are using the latest and most accurate data
No immediate action required if consuming the fields directly from the v2 model

Corrected Identifier Logic for Usual GP and External Usual GP in Patient Data Model

In iPCV v1, the model attempted to provide the GUID for both Usual GP and External Usual GP. However, this approach was incorrect for External Usual GP, leading to mismatches or missing links when joining to user data.

iPCV v2 corrects this by:

Providing the EMIS IDs usual_gp_user_in_role_id and external_usual_gp_id for Usual GP and External Usual GP respectively
These IDs are designed to be joined to the User in Role model, ensuring accurate linkage and representation
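
As an illustration of joining on the new ID, the sketch below attaches a GP name to each patient row. Only usual_gp_user_in_role_id comes from the model; the user-row keys `user_in_role_id` and `display_name`, the output key `usual_gp_name`, and the sample data are hypothetical.

```python
def attach_usual_gp(patients, users_in_role):
    """Left-join patient rows to user rows on the usual GP ID."""
    # Index the user rows by ID so each lookup is a dictionary access.
    users_by_id = {user["user_in_role_id"]: user for user in users_in_role}
    joined = []
    for patient in patients:
        gp = users_by_id.get(patient["usual_gp_user_in_role_id"])
        # Keep the patient even when no matching user row exists.
        joined.append({**patient, "usual_gp_name": gp["display_name"] if gp else None})
    return joined


# Invented sample rows for the sketch.
patients = [
    {"patient_id": 123, "usual_gp_user_in_role_id": 7},
    {"patient_id": 98, "usual_gp_user_in_role_id": None},
]
users_in_role = [{"user_in_role_id": 7, "display_name": "Dr Example"}]

for row in attach_usual_gp(patients, users_in_role):
    print(row["patient_id"], row["usual_gp_name"])
```

A left join is used here so that patients without a usual GP are retained with an empty name rather than dropped.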

Why?

This change was necessary to address a fundamental flaw in how GP identifiers were handled in the previous model. By incorrectly using GUIDs for External Usual GPs, the system introduced inconsistencies that compromised data integrity and made it difficult to reliably associate patients with their correct GP records. Switching to EMIS IDs aligns the model with the actual data structure used in downstream systems, enabling more dependable joins and ensuring that the representation of GP relationships reflects real-world practice configurations.

Customer benefit

Improved Accuracy: Ensures correct identification and linkage of GP records
Better Integration: Supports reliable joins to user data for reporting and analysis
Reduced Errors: Eliminates mismatches caused by incorrect GUID usage

Customer action

Update any joins or lookups that previously relied on GUIDs to use EMIS IDs instead
Ensure downstream systems or reports referencing GP data are aligned with the new identifier logic
No immediate action required if consuming the field directly from the v2 model and joining via EMIS ID

Simplified Carer Representation in Patient Data Model

In iPCV v1, the model attempted to provide either carer details or a carer flag, but this approach was limited and did not account for patients having multiple carers.

iPCV v2 introduces a simplified and more scalable approach:

Provides a has_carer flag to indicate whether a patient has one or more carers
Removed deprecated columns - carer_name, carer_relation
A separate carer model is being considered for future releases to represent carer details and relationships more accurately

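Why?

The v1 approach could not represent patients with more than one carer. A single has_carer flag avoids misrepresenting those relationships until a dedicated carer model is available.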

Customer benefit

Improved Accuracy: Reflects the presence of carers without oversimplifying or misrepresenting relationships
Scalability: Prepares the data model for future enhancements with a dedicated carer structure
Cleaner Design: Reduces ambiguity and clutter in the patient view

Customer action

Review any logic or reports that previously relied on detailed carer data and adjust to use the has_carer flag
Prepare for future integration with a separate carer model if detailed carer data is required
No immediate action required if consuming the field directly from the v2 model

Expanded Patient Identifier Coverage in Patient Data Model

In iPCV v1, the patient model provided only the NHS Number as the primary identifier.

iPCV v2 enhances this by including additional identifiers to support broader interoperability and regional coverage:

NHS Number (existing)
CHI Number (Scotland)
HC Number (Northern Ireland)
Hospital Number
SSD Number
GHA Number

Why?

This expansion ensures that patients can be accurately identified across different healthcare systems and geographies.

Customer benefit

Improved Coverage: Supports identification across UK regions and healthcare settings
Enhanced Interoperability: Facilitates smoother integration with external systems and datasets
Better Data Matching: Reduces duplication and improves linkage accuracy in multi-source environments

Customer action

Review any logic or matching processes that rely solely on NHS Number and consider incorporating additional identifiers
Validate downstream systems or reports to ensure they can handle and benefit from the expanded identifier set
No immediate action required if consuming the fields directly from the v2 model

Removal of Redundant or Deprecated Fields

In iPCV v1, several attributes were directly included in the patient model. In iPCV v2, the below fields have been removed from the patient data model, and relevant IDs are provided to retrieve this information from dedicated extended data models.

Why?

This change simplifies the core patient model, making it lighter and easier to manage. By moving specific attributes to their own models, data redundancy is reduced, and maintainability is improved. This allows for more detailed and focused information within the extended models without cluttering the primary patient record, reducing the complexity of multiple joins.

Decommissioned Column	Relevant ID / Column	Reason / Extended Data Model to Refer To
`language_code`	`language_emis_code_id`	`mkb_mapping_attributes_v2`
`language_preferred`	`language_emis_code_id`	`mkb_mapping_attributes_v2`
`language_preferred_snomed_concept_id`	`language_emis_code_id`	`mkb_mapping_attributes_v2`
`named_gp`	`usual_gp_user_in_role_id`	`user_in_role_v2`
`nhs_no_status` / `nhs_number_status`	N/A	Removed as it was deprecated in iPCV v1
`sexual_orientation_nationalcode`	`sexual_orientation_emis_code_id`	`mkb_mapping_ethnicity_v2`
`external_usualgp_guid`	`external_usual_gp_id`	`user_in_role_v2`
`external_usualgp_name`	`external_usual_gp_id`	`user_in_role_v2`
`usualgp_displayname`	`usual_gp_user_in_role_id`	`user_in_role_v2`
`emis_usualgp_userinrole_guid`	`usual_gp_user_in_role_id`	`user_in_role_v2`
`regular_current_active_and_inactive_flag`	`is_regular`	`patient_v2`
`processing_id`	N/A	Removed as it was deprecated in iPCV v1
`data_filter`	N/A	Removed as it is deprecated in v2
`_record_version`	N/A	No longer available in v2
`_update_date`	N/A	Changes tracked differently in v2
`_update_hour`	N/A	Changes tracked differently in v2

Customer benefit

Improved Performance: A leaner patient model can lead to faster query performance.
Enhanced Scalability: Decoupled models are easier to maintain and extend independently.
Data Consistency: Centralizing specific attributes in their own models ensures a single source of truth.

Customer action

Identify any queries or reports that use the removed fields.
Update them to join with the appropriate extended data models, using the new IDs, to retrieve the required information.

Addition of New Columns to the Flavours for Enhanced Data Accuracy

iPCV v2 introduces several new columns across different flavours to improve data accuracy, provide clearer insights into data lineage, and support future enhancements. These columns are included only where relevant to the specific flavour as presented in the schema.

The consolidated list of new columns across all flavours:

is_active
is_merged
is_national_data_opted_in
is_registered
sex
postcode_no_space
registration_organisation_id

Why?

These additions provide more granular flags for patient status, consent, and data lineage. The introduction of new ID-based columns aligns the model with modern identifier practices, paving the way for deprecating older fields and improving system consistency.

Customer benefit

Improved Data Lineage: The is_merged flag offers greater transparency into the history of patient records affected by organizational changes
Enhanced Consistency: Standardizes the use of identifiers and status flags across different model flavours
Richer Analytics: Enables more precise filtering and analysis based on registration status, consent, and active status
Future-Proofing: Adopting new ID-based columns ensures a smoother transition as older identifiers are phased out

Customer action

Review any logic that handles patient data to incorporate the new flags and IDs for more accurate reporting
Begin planning the migration from older identifiers (e.g., registration_organisation_guid) to their new ID-based counterparts (e.g., registration_organisation_id) in any custom queries or downstream systems
Validate downstream systems or reports to ensure they can utilize the new columns effectively
No immediate action required if consuming the fields directly from the v2 model

Examples

Currently active patients

SELECT
  patient_id,
  patient_guid,
  patient_name,
  date_of_birth,
  registration_start_datetime,
  registration_end_datetime,
  has_carer,
  full_address,
  organisation
FROM hive.explorer_ipcv.srv_patient
WHERE
  is_active

Deceased Patients

SELECT
  patient_id,
  patient_guid,
  patient_name,
  date_of_birth,
  registration_start_datetime,
  registration_end_datetime,
  has_carer,
  full_address,
  organisation
FROM hive.explorer_ipcv.srv_patient
WHERE
  is_deceased

Patients opted in to national data sharing

SELECT
  patient_id,
  patient_guid,
  patient_name,
  date_of_birth,
  registration_start_datetime,
  registration_end_datetime,
  has_carer,
  full_address,
  organisation
FROM hive.explorer_ipcv.srv_patient
WHERE
  is_national_data_opted_in