Partner Developer Portal

Definition

The Patient data model provides a comprehensive, longitudinal record of individuals registered with healthcare organizations. It serves as the foundational dataset for patient-centric information, containing demographic details, registration status, and other key administrative data. This model is essential for accurately identifying and managing patient information across various healthcare services and systems.

The Patient data model is a structured representation of the essential information about an individual receiving healthcare services. It typically includes patient identifiers, demographic details, patient type, registration status, patient data preferences, and access preferences. This model ensures accurate, secure, and efficient management of patient data across healthcare systems.

Each patient record is uniquely identified by a combination of patient_id and organisation, ensuring a distinct entry for every individual within a specific healthcare organization.
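As a minimal sketch of what this composite key implies, the check below treats the pair (patient_id, organisation) as the record key and reports any pair that appears more than once. The sample rows and the function name duplicate_keys are hypothetical and only illustrate the idea; they are not part of the model.

```python
from collections import Counter

# Hypothetical sample rows; only the two key fields are taken from the model description.
rows = [
    {"patient_id": 123, "organisation": "ORG1"},
    {"patient_id": 98, "organisation": "ORG1"},
    {"patient_id": 456, "organisation": "ORG2"},
]


def duplicate_keys(records):
    """Return the (patient_id, organisation) pairs that occur more than once."""
    counts = Counter((r["patient_id"], r["organisation"]) for r in records)
    return [key for key, n in counts.items() if n > 1]


# An empty list means every record has a distinct composite key.
print(duplicate_keys(rows))
```

When joining patient data to other datasets, this suggests joining on both fields rather than on patient_id alone.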

The model includes several key data fields that together provide a complete view of the patient. Some of the key fields are as follows:

  • patient_id: A unique identifier for the patient within the system.
  • organisation: The unique identifier for the healthcare organization where the patient is registered.
  • date_of_birth: The patient’s date of birth.
  • nhs_gender: The patient’s gender.
  • ethnicity_emis_code_id: An identifier for the patient’s ethnic background, used to retrieve ethnicity details from the relevant extended data model.
  • postcode: The patient’s postal code, providing geographical context.
  • registration_start_date: The date the patient’s registration with the organization began.
  • registration_end_date: The date the patient’s registration ended, if applicable.
  • patient_status_description: The current status of the patient’s registration (e.g., Registered, Left, Dead).
  • patient_type_description: The type of patient registration (e.g., Regular, Dummy).
  • is_deceased: A boolean flag indicating if the patient is deceased.
  • date_of_death: The recorded date of death for the patient.
  • is_deleted: Indicates whether the record has been marked as deleted in the source system.
  • transform_datetime: The timestamp indicating when the record was last processed and updated in the data warehouse. This field is crucial for understanding the currency of the patient’s data.
flowchart TB
    subgraph container["Data Collection"]
        n10["Patient Address"]
        n11["Patient NDOP status"]
        n12["Patient Registration Data"]
    end

    n17["Organisation 1"] --> n7
    n17["Organisation 1"] --> n5
    n18["Organisation 2"] --> n6
    n18["Organisation 2"] --> n8
    n18["Organisation 2"] --> n4

    n7["Patient 123"] --> container
    n5["Patient 98"] --> container
    n6["Patient 456"] --> container
    n4["Patient 20"] --> container
    n8["Patient 47"] --> container

    container --> n13["Gather Patient Data"]
    n13 --> |"Unique Patient IDs"|n16["123, 98, 456, 20, 47"]
    n16 --> n14["ETL"]
    n14 --> n15["Patient Data Model"]

    n7["Patient 123"]:::rect
    n5["Patient 98"]:::rect
    n6["Patient 456"]:::rect
    n4["Patient 20"]:::rect
    n8["Patient 47"]:::rect

    n10:::rect
    n11:::rect
    n12:::rect
    n13:::extract
    n14:::event
    n15:::database
    n16:::rect

    classDef rect rect
    classDef extract circle
    classDef event path
    classDef database cylinder

In iPCV v2 of the patient data models, the logic for determining whether a patient is deceased has been refined and unified.

The logic now uses all available fields from EMIS Web, such as patient status and date of death. Because the date of death is optional in EMIS Web, it will be used in calculations when present; if not, other available fields will be used. This ensures we don’t rely only on optional fields that may be missing for some patients.

Previously, iPCV v1 used inconsistent logic between patient and patient opt out models, leading to potential discrepancies. Now, both use the same criteria for accuracy and consistency.

In addition, the previous logic sometimes missed deceased patients due to missing date of death values. The new approach provides a more reliable and consistent method.
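The exact v2 rule is not spelled out here, so the following is only an illustrative sketch of the idea of not relying on the optional date of death alone. It assumes two inputs, a patient status label and an optional date of death, and assumes that a status of "Dead" (one of the example values of patient_status_description) marks a deceased patient. The function name is hypothetical.

```python
from datetime import date
from typing import Optional


def is_deceased_sketch(patient_status: Optional[str], date_of_death: Optional[date]) -> bool:
    """Illustrative only: combine an optional date of death with the patient status."""
    # Use the date of death when it is present.
    if date_of_death is not None:
        return True
    # Otherwise fall back to the status label (assumed value: "Dead").
    return (patient_status or "").strip().lower() == "dead"


print(is_deceased_sketch("Dead", None))
print(is_deceased_sketch("Registered", None))
```

The fallback on status is what catches patients whose date of death was never recorded at source.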

Customer benefit

  • Improved Accuracy: More reliable identification of deceased patients across datasets

  • Simplified Integration: Unified logic reduces the need for custom handling or cross-referencing multiple fields

  • Better Reporting: Enhances downstream analytics and reporting consistency

Customer action

  • Review any custom logic or filters that rely on deceased status and update them to align with the new unified logic in v2

  • Validate downstream systems or dashboards to ensure they reflect the updated logic

Refined Age Calculation Logic in Patient Data Model

In iPCV v1 of the patient data model, the ‘age’ field was calculated by comparing the current date with the patient’s date of birth, and only updated when there was a change in the patient record. iPCV v2 introduces a more dynamic and accurate approach:

  • Age is now calculated using either the current date or the date of death (if populated), ensuring correct age representation for all patients.

  • The age field is updated not only when the patient record changes, but also when the patient’s birthday passes, ensuring age remains current without manual intervention.
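The calculation described above can be sketched as follows. This is a minimal illustration, not the warehouse implementation: the function name age_in_years and the today parameter are assumptions added so the example can be run and tested with fixed dates.

```python
from datetime import date
from typing import Optional


def age_in_years(date_of_birth: date,
                 date_of_death: Optional[date] = None,
                 today: Optional[date] = None) -> int:
    """Completed years from birth to the date of death if populated, else to today."""
    end = date_of_death or today or date.today()
    years = end.year - date_of_birth.year
    # Subtract one year if the birthday has not yet been reached in the end year.
    if (end.month, end.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Living patient: age changes on the birthday itself.
print(age_in_years(date(1950, 6, 15), today=date(2024, 6, 14)))
print(age_in_years(date(1950, 6, 15), today=date(2024, 6, 15)))
# Deceased patient: age stops at the date of death regardless of today's date.
print(age_in_years(date(1950, 6, 15), date_of_death=date(2000, 1, 1), today=date(2024, 6, 15)))
```

Because the end date is fixed once a date of death is populated, the age of a deceased patient no longer increments over time.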

Why?

In some cases iPCV v1 did not show the correct patient age; for example, it continued to increment the age even after a patient was marked as deceased at the source. iPCV v2 corrects this to keep the data accurate.

Customer benefit

  • Improved Accuracy: Ensures age is correctly calculated for both living and deceased patients

  • Timely Updates: Automatically reflects age changes as time progresses, reducing stale data

  • Better Analytics: Enhances age-based reporting and segmentation, especially in longitudinal studies

Customer action

  • Review any logic or filters that rely on the ‘age’ field and ensure they align with the updated calculation method

  • Validate downstream systems or reports that use age to confirm they reflect the new logic

  • No immediate action required if consuming the age field directly from the v2 model

Refined Registration Status Logic in Patient Data Model

iPCV v1 counted patients with either an actual registration status or a pre-registration status as “registered”.

In iPCV v2, only patients with an actual registration status are considered “registered”; pre-registration statuses are excluded.

Why?

This change is to ensure that only genuinely registered patients are counted, avoiding premature inclusion of pre-registered patients.

Customer benefit

  • Improved Accuracy: More precise identification of registered patients

  • Better Segmentation: Enables clearer distinction between registered and pre-registered patients

  • Enhanced Reporting: Supports more reliable registration-based metrics and analysis

Customer action

  • Review any logic or filters that rely on is_registered and update them to reflect the new definition

  • Validate downstream systems or reports that use registration status to ensure consistency with v2

  • No immediate action required if consuming the field directly from the v2 model

Accurate Registration Start & End Date in Patient Data Model

In iPCV v1, the registration start date could be set to the patient record creation date, even if the patient was not yet registered, and the registration end date might not accurately reflect when the patient actually left the practice.

In iPCV v2, the registration start date is set to the actual date on which the patient is registered at the practice, not the date on which the record is created. The registration end date reflects the true date on which the patient leaves the practice, ensuring accurate lifecycle tracking.

Why?

This change is to provide precise tracking of patient registration periods, avoiding errors from premature or inaccurate date assignment.

Customer benefit

  • Improved Accuracy: Eliminates false positives for registration start dates

  • Cleaner Data: Reduces noise in registration-related reporting and analysis

  • Better Insights: Supports more meaningful metrics around patient onboarding and registration timelines

Customer action

  • Review any logic or reports that rely on registration_start or registration_end and ensure they reflect the updated definition

  • Validate downstream systems or dashboards to confirm they exclude unregistered patients from registration-based metrics

  • No immediate action required if consuming the field directly from the v2 model

Updated Active Status Logic in Patient Data Model

In iPCV v1, the active patient flag could be set based on patient status changes, including cases where a patient had died but subsequent administrative actions (like issuing a death certificate) altered their status, leading to incorrect active flag assignment.

In iPCV v2, the logic has been refined so that the active patient flag is set only for patients who are truly active at the practice. Deceased patients are explicitly excluded, so administrative or notification status changes after death no longer affect the active flag. Additionally, the definition of “active” now includes patients with a pre-registration status, not just those fully registered, ensuring that patients in the process of joining the practice are appropriately flagged.

Why?

This change prevents deceased patients from being incorrectly counted as active due to post-mortem administrative updates, and ensures that the active flag accurately reflects patients who are either currently registered or in the process of registration, but are alive and associated with the practice.
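To make the difference between the registered and active definitions concrete, here is a small sketch. The status labels "registered" and "pre-registered" and the function name patient_flags are hypothetical stand-ins, since the actual source status values are not listed here. The sketch also leaves is_registered independent of deceased status, because no interaction between the two is stated.

```python
# Hypothetical status labels; the real source values may differ.
REGISTERED = "registered"
PRE_REGISTERED = "pre-registered"


def patient_flags(status: str, deceased: bool) -> dict:
    """Derive is_registered and is_active from a status label and a deceased flag."""
    # Registered: actual registration only; pre-registration is excluded.
    is_registered = status == REGISTERED
    # Active: alive, and either registered or in the process of registering.
    is_active = (not deceased) and status in (REGISTERED, PRE_REGISTERED)
    return {"is_registered": is_registered, "is_active": is_active}


print(patient_flags(PRE_REGISTERED, deceased=False))
print(patient_flags(REGISTERED, deceased=True))
```

A pre-registered patient is therefore active but not registered, which is why filters on the two flags should not be treated as interchangeable.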

Customer benefit

  • Improved Accuracy: Prevents deceased patients from being incorrectly marked as active

  • Operational Clarity: Aligns active status with real-world patient engagement stages

  • Better Filtering: Supports more meaningful segmentation for active caseloads

Customer action

  • Review any logic or filters that rely on the active field and update them to reflect the new definition

  • Validate downstream systems or dashboards to ensure deceased patients are excluded from active patient views

  • No immediate action required if consuming the field directly from the v2 model

Refined Patient Left Status Logic in Patient Data Model

In iPCV v1, the logic for identifying patients who had left the practice could include deceased patients or be affected by inconsistent status handling, leading to inaccurate counts.

In iPCV v2, the “left” status is assigned only to patients whose caseload or patient status indicates they have left, explicitly excluding deceased patients. The logic is now standardized to ensure only appropriate patients are marked as “left,” improving data consistency.

Why?

This change avoids misclassifying deceased patients as having simply left the practice.

Customer benefit

  • Improved Accuracy: Clearer distinction between patients who have left and those who are deceased

  • Reliable Reporting: Enhances service exit metrics and patient flow analysis

  • Operational Clarity: Supports better caseload management and follow-up processes

Customer action

  • Review any logic or reports that rely on the left field and ensure they reflect the updated definition

  • Validate downstream systems or dashboards to confirm deceased patients are excluded from “left” status views

  • No immediate action required if consuming the field directly from the v2 model

Expanded Ethnicity Classification in Patient Data Model

In iPCV v1, ethnicity was determined using national codes for the ‘Ethnic Group’ field, which provided a basic classification.

iPCV v2 enhances this by:

  • Continuing to use national codes for consistency

  • Adding SNOMED code 397731000 (Ethnic group)

  • Including all child codes under EMIS hierarchy code 141291000000111

  • The ethnicity-specific fields listed below are no longer provided as part of the patient model; instead, customers can retrieve them from the extended model using ethnicity_emis_code_id:

    • ethnic_category_snomed_concept_id
    • ethnic_group_description
    • ethnic_group_id
    • ethnic_group_snomed_description

Why?

This expanded logic allows for a more granular and clinically relevant classification of ethnicity, improving alignment with healthcare standards and terminology.

Customer benefit

  • Improved Accuracy & Coverage: Captures a broader and more precise range of ethnic classifications

  • Enhanced Interoperability: Aligns with SNOMED and EMIS standards for better integration across systems

  • Better Reporting & Insights: Supports more detailed demographic analysis and equity monitoring

Customer action

  • Review any logic or filters that rely on ethnicity codes and ensure they accommodate the expanded SNOMED and EMIS hierarchy

  • Validate downstream systems or dashboards to confirm they reflect the enhanced classification

  • No immediate action required if consuming the field directly from the v2 model

Standardised Observation-Based Attribute Logic in Patient Data Model

In iPCV v1, the technique for determining the latest instance of observation-based attributes varied across fields, leading to inconsistencies—especially in migrated data.

iPCV v2 introduces a standardized and prioritized approach for selecting the most recent observation:

  • Effective date (if available)

  • Availability time

  • Observation ID (as a fallback)

This method is now consistently applied across key patient attributes, including:

  • Opt-outs (data consent)

  • Email/SMS consent

  • Ethnicity

  • Sexual Orientation

  • Language
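The prioritised selection above can be sketched as a sort on three keys. This is a minimal illustration under stated assumptions: the Observation class and its field names are hypothetical, and an observation with no effective date is treated as older than any observation that has one, which is one possible reading of "if available".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Observation:
    observation_id: int
    availability_time: datetime
    effective_date: Optional[datetime]
    value: str


def latest_observation(observations: List[Observation]) -> Observation:
    """Pick the most recent observation by effective date, then availability time, then ID."""
    def sort_key(o: Observation):
        # Assumption: a missing effective date sorts before every real date.
        return (o.effective_date or datetime.min, o.availability_time, o.observation_id)
    return max(observations, key=sort_key)


observations = [
    Observation(1, datetime(2024, 1, 1), datetime(2023, 5, 1), "A"),
    Observation(2, datetime(2023, 1, 1), datetime(2023, 6, 1), "B"),
    Observation(3, datetime(2024, 6, 1), None, "C"),
]
print(latest_observation(observations).value)
```

Using the observation ID as the final tie-breaker makes the result deterministic when two observations share the same effective date and availability time, which can happen in migrated data.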

Why?

This results in a more accurate and reliable representation of patient data, particularly for systems with historical or migrated records.

Customer benefit

  • Improved Accuracy: Ensures the most relevant and recent observation is used

  • Consistency Across Fields: Reduces discrepancies in how different attributes are handled

  • Better Data Quality: Enhances trust in reporting and analytics, especially for consent and demographic data

Customer action

  • Review any logic or reports that rely on observation-based attributes and ensure they reflect the updated selection method

  • Validate downstream systems or dashboards to confirm they are using the latest and most accurate data

  • No immediate action required if consuming the fields directly from the v2 model

Corrected Identifier Logic for Usual GP and External Usual GP in Patient Data Model

In iPCV v1, the model attempted to provide the GUID for both Usual GP and External Usual GP. However, this approach was incorrect for External Usual GP, leading to mismatches or missing links when joining to user data.

iPCV v2 corrects this by:

  • Providing EMIS IDs: usual_gp_user_in_role_id for the Usual GP and external_usual_gp_id for the External Usual GP

  • These IDs are designed to be joined to the User in Role model, ensuring accurate linkage and representation
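As a rough sketch of this linkage, the example below resolves both IDs against a lookup that stands in for the User in Role model. The lookup contents, the display_name attribute, and the function names are hypothetical; only usual_gp_user_in_role_id and external_usual_gp_id come from the patient model.

```python
from typing import Optional

# Hypothetical stand-in for the User in Role model, keyed by its EMIS ID.
user_in_role = {
    9001: {"display_name": "Dr A Example"},
    9002: {"display_name": "Dr B Example"},
}

patients = [
    {"patient_id": 123, "usual_gp_user_in_role_id": 9001, "external_usual_gp_id": 9002},
    {"patient_id": 98, "usual_gp_user_in_role_id": None, "external_usual_gp_id": None},
]


def gp_name(gp_id: Optional[int]) -> Optional[str]:
    """Resolve a GP ID against the lookup; None when the ID is absent or unmatched."""
    row = user_in_role.get(gp_id) if gp_id is not None else None
    return row["display_name"] if row else None


def with_gp_names(patient: dict) -> dict:
    """Return a copy of the patient row with both GP names attached (a left join)."""
    return {
        **patient,
        "usual_gp_name": gp_name(patient["usual_gp_user_in_role_id"]),
        "external_usual_gp_name": gp_name(patient["external_usual_gp_id"]),
    }


for p in patients:
    print(with_gp_names(p))
```

The left-join behaviour keeps patients who have no Usual GP recorded rather than dropping them from the result.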

Why?

This change was necessary to address a fundamental flaw in how GP identifiers were handled in the previous model. By incorrectly using GUIDs for External Usual GPs, the system introduced inconsistencies that compromised data integrity and made it difficult to reliably associate patients with their correct GP records. Switching to EMIS IDs aligns the model with the actual data structure used in downstream systems, enabling more dependable joins and ensuring that the representation of GP relationships reflects real-world practice configurations.

Customer benefit

  • Improved Accuracy: Ensures correct identification and linkage of GP records

  • Better Integration: Supports reliable joins to user data for reporting and analysis

  • Reduced Errors: Eliminates mismatches caused by incorrect GUID usage

Customer action

  • Update any joins or lookups that previously relied on GUIDs to use EMIS IDs instead

  • Ensure downstream systems or reports referencing GP data are aligned with the new identifier logic

  • No immediate action required if consuming the field directly from the v2 model and joining via EMIS ID

Simplified Carer Representation in Patient Data Model

In iPCV v1, the model attempted to provide either carer details or a carer flag, but this approach was limited and did not account for patients having multiple carers.

iPCV v2 introduces a simplified and more scalable approach:

  • Provides a has_carer flag to indicate whether a patient has one or more carers

  • Removed deprecated columns - carer_name, carer_relation

  • A separate carer model is being considered for future releases to represent carer details and relationships more accurately

Why?

The v1 approach could not represent patients who have more than one carer. A single has_carer flag indicates the presence of carers without misrepresenting those relationships, and leaves detailed carer data to a dedicated carer model in a future release.

Customer benefit

  • Improved Accuracy: Reflects the presence of carers without oversimplifying or misrepresenting relationships

  • Scalability: Prepares the data model for future enhancements with a dedicated carer structure

  • Cleaner Design: Reduces ambiguity and clutter in the patient view

Customer action

  • Review any logic or reports that previously relied on detailed carer data and adjust to use the has_carer flag

  • Prepare for future integration with a separate carer model if detailed carer data is required

  • No immediate action required if consuming the field directly from the v2 model

Expanded Patient Identifier Coverage in Patient Data Model

In iPCV v1, the patient model provided only the NHS Number as the primary identifier.

iPCV v2 enhances this by including additional identifiers to support broader interoperability and regional coverage:

  • NHS Number (existing)

  • CHI Number (Scotland)

  • HC Number (Northern Ireland)

  • Hospital Number

  • SSD Number

  • GHA Number

Why?

This expansion ensures that patients can be accurately identified across different healthcare systems and geographies.

Customer benefit

  • Improved Coverage: Supports identification across UK regions and healthcare settings

  • Enhanced Interoperability: Facilitates smoother integration with external systems and datasets

  • Better Data Matching: Reduces duplication and improves linkage accuracy in multi-source environments

Customer action

  • Review any logic or matching processes that rely solely on NHS Number and consider incorporating additional identifiers

  • Validate downstream systems or reports to ensure they can handle and benefit from the expanded identifier set

  • No immediate action required if consuming the fields directly from the v2 model

In iPCV v1, several attributes were directly included in the patient model. In iPCV v2, the below fields have been removed from the patient data model, and relevant IDs are provided to retrieve this information from dedicated extended data models.

Why?

This change simplifies the core patient model, making it lighter and easier to manage. By moving specific attributes to their own models, data redundancy is reduced, and maintainability is improved. This allows for more detailed and focused information within the extended models without cluttering the primary patient record, reducing the complexity of multiple joins.

| Decommissioned Column | Relevant ID / Column | Reason / Extended Data Model to be Referred |
| --- | --- | --- |
| language_code | language_emis_code_id | mkb_mapping_attributes_v2 |
| language_preferred | language_emis_code_id | mkb_mapping_attributes_v2 |
| language_preferred_snomed_concept_id | language_emis_code_id | mkb_mapping_attributes_v2 |
| named_gp | usual_gp_user_in_role_id | user_in_role_v2 |
| nhs_no_status / nhs_number_status | N/A | Removed as it was deprecated in iPCV v1 |
| sexual_orientation_nationalcode | sexual_orientation_emis_code_id | mkb_mapping_ethnicity_v2 |
| external_usualgp_guid | external_usual_gp_id | user_in_role_v2 |
| external_usualgp_name | external_usual_gp_id | user_in_role_v2 |
| usualgp_displayname | usual_gp_user_in_role_id | user_in_role_v2 |
| emis_usualgp_userinrole_guid | usual_gp_user_in_role_id | user_in_role_v2 |
| regular_current_active_and_inactive_flag | is_regular | patient_v2 |
| processing_id | | Removed as it is deprecated in iPCV v1 |
| data_filter | | Removed as it is deprecated in v2 |
| _record_version | | No longer available in v2 |
| _update_date | | Changes tracked differently in v2 |
| _update_hour | | Changes tracked differently in v2 |
_update_hourChanges tracked differently in v2

Customer benefit

  • Improved Performance: A leaner patient model can lead to faster query performance.
  • Enhanced Scalability: Decoupled models are easier to maintain and extend independently.
  • Data Consistency: Centralizing specific attributes in their own models ensures a single source of truth.

Customer action

  • Identify any queries or reports that use the removed fields.

  • Update these queries and reports to join with the appropriate extended data models, using the new IDs, to retrieve the required information.

Addition of New Columns to the Flavours for Enhanced Data Accuracy

iPCV v2 introduces several new columns across different flavours to improve data accuracy, provide clearer insights into data lineage, and support future enhancements. These columns are included only where relevant to the specific flavour as presented in the schema.

The consolidated list of new columns across all flavours:

  • is_active
  • is_merged
  • is_national_data_opted_in
  • is_registered
  • sex
  • postcode_no_space
  • registration_organisation_id

Why?

These additions provide more granular flags for patient status, consent, and data lineage. The introduction of new ID-based columns aligns the model with modern identifier practices, paving the way for deprecating older fields and improving system consistency.

Customer benefit

  • Improved Data Lineage: The is_merged flag offers greater transparency into the history of patient records affected by organizational changes
  • Enhanced Consistency: Standardizes the use of identifiers and status flags across different model flavours
  • Richer Analytics: Enables more precise filtering and analysis based on registration status, consent, and active status
  • Future-Proofing: Adopting new ID-based columns ensures a smoother transition as older identifiers are phased out

Customer action

  • Review any logic that handles patient data to incorporate the new flags and IDs for more accurate reporting

  • Begin planning the migration from older identifiers (e.g., registration_organisation_guid) to their new ID-based counterparts (e.g., registration_organisation_id) in any custom queries or downstream systems

  • Validate downstream systems or reports to ensure they can utilize the new columns effectively

  • No immediate action required if consuming the fields directly from the v2 model

Currently active patients

SELECT
    patient_id,
    patient_guid,
    patient_name,
    date_of_birth,
    registration_start_datetime,
    registration_end_datetime,
    has_carer,
    full_address,
    organisation
FROM hive.explorer_ipcv.srv_patient
WHERE
    is_active

Deceased Patients

SELECT
    patient_id,
    patient_guid,
    patient_name,
    date_of_birth,
    registration_start_datetime,
    registration_end_datetime,
    has_carer,
    full_address,
    organisation
FROM hive.explorer_ipcv.srv_patient
WHERE
    has_died

Patients opted in to national data sharing

SELECT
    patient_id,
    patient_guid,
    patient_name,
    date_of_birth,
    registration_start_datetime,
    registration_end_datetime,
    has_carer,
    full_address,
    organisation
FROM hive.explorer_ipcv.srv_patient
WHERE
    is_national_data_opted_in