
The Simulacrum


The Simulacrum is a dataset that contains artificial patient-like cancer data to help researchers gain insights.

The Simulacrum imitates some of the data held securely by Public Health England’s National Cancer Registration and Analysis Service. The data is synthetic and does not contain any information about real patients. It is free to use and allows anyone who wants to work with record-level cancer data to do so, safe in the knowledge that, while the data feels like the real thing, there is no danger of breaching patient confidentiality. The Simulacrum was developed by HDI in partnership with AstraZeneca and IQVIA and was first released on November 28, 2018.

For further details, see the Simulacrum website.

In collaboration with:

AstraZeneca; IQVIA




Simulacrum v1.1.0 was released on November 28, 2018.

Simulacrum v1.2.0 was released on January 21, 2021. 

See also our intern project ‘Testing the Simulacrum’.

Lora Frayling speaking at the HDR UK Synthetic Data Special Interest Group, December 2020.
