Testing the Simulacrum
The Simulacrum is a database of synthetic data on cancer patients that imitates some of the data held securely by Public Health England’s National Cancer Registration and Analysis Service. It was created to mimic the real data about cancer patients so that it can be used for research without compromising patient confidentiality. But generating artificial data that faithfully represents the real data is difficult.
Edward spent his internship with us in the summer of 2019 testing and improving the Simulacrum. He compared outputs from the simulated data with outputs from the real data to check that the simulated data can give accurate answers about cancer while still protecting patient confidentiality.
Edward created a test visualisation tool. The tool applies tests to distributions in the Simulacrum and automatically creates visualisations of the results. It allows us to quickly examine distributions of interest and look for misrepresentations or missing data in the Simulacrum.
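To give a flavour of what a distribution test of this kind might look like, here is a minimal Python sketch. It is an illustration only, not Edward's tool: the text above does not say which tests the tool applies, so the use of total variation distance, the function names `proportions` and `compare_distributions`, and the example values are all assumptions made for this sketch.

```python
from collections import Counter


def proportions(values):
    """Return each category's share of the total as a dict."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute proportions of an empty column")
    return {category: n / total for category, n in counts.items()}


def compare_distributions(real_values, simulated_values):
    """Compare two categorical columns (hypothetical helper).

    Returns the total variation distance between the two distributions
    (0 means identical, 1 means no overlap) and a sorted list of the
    categories that appear in the real data but not in the simulated data.
    """
    real = proportions(real_values)
    simulated = proportions(simulated_values)
    categories = set(real) | set(simulated)
    tvd = 0.5 * sum(
        abs(real.get(c, 0.0) - simulated.get(c, 0.0)) for c in categories
    )
    missing = sorted(c for c in real if c not in simulated)
    return tvd, missing


# Made-up example values, not taken from real or Simulacrum data.
real_stage = ["1", "1", "2", "2", "3", "4"]
simulated_stage = ["1", "1", "1", "2", "2", "2"]

tvd, missing = compare_distributions(real_stage, simulated_stage)
print(f"total variation distance: {tvd:.3f}")
print(f"categories missing from the simulated data: {missing}")
```

A larger distance, or a non-empty list of missing categories, would mark a distribution as worth a closer look; a real tool would presumably also plot the two distributions side by side.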
Edward’s presentation video:
“Edward Pearce” by Health Data Insight CIC, on Vimeo.
Edward’s end of attachment video:
“Edward Pearce” by Kim Whittlestone, on Vimeo.