The Simulacrum
Overview
The Simulacrum is a synthetic dataset of artificial, patient-like cancer data that lets researchers explore record-level cancer data without access to real patient records.
The Simulacrum imitates some of the data held securely by NHS England’s National Disease Registration Service (previously part of Public Health England). The data is synthetic and does not contain any information about real patients. It is free to use and allows anyone who wants to use record-level cancer data to do so, safe in the knowledge that while the data feels like the real thing, there is no danger of breaching patient confidentiality.
Simulacrum version 2 is the latest version of this synthetic dataset. It was released in April 2023; more information is available at Simulacrum v2 – healthdatainsight.org.uk
In collaboration with:
The Simulacrum was developed by HDI in partnership with AstraZeneca and IQVIA.
Timeframe
Simulacrum v1.1.0 was released on November 28, 2018.
Simulacrum v1.2.0 was released on January 21, 2021.
Simulacrum v2.0 was released in April 2023.
For further details, visit the Simulacrum website.
See also our intern project, ‘Testing the Simulacrum’.
Associated Publications:
Real-world evidence for patient outcomes and mutational burden in non-small cell lung (NSCLC) cancer patients in England using EGFR biomarker test data from routine clinical care
Real-world outcomes and biomarker testing in cancer patients: exploration of a novel genetic database from routine clinical practice in England
An overview on synthetic administrative data for research
Lora Frayling speaking at the HDR UK Synthetic Data Special Interest Group, December 2020 (video).