BRCA Challenge development resource for pseudonymisation
The BRCA Challenge is a global initiative to understand all the mutations in the BRCA genes, which can predispose individuals to developing breast cancer. HDI has built a secure online platform which allows the National Cancer Registration and Analysis Service (NCRAS) to collect data about BRCA mutations from NHS laboratories, and to contribute to the BRCA Challenge. The HDI software also pseudonymises the incoming data to protect patient confidentiality and allows linkage to cancer registration records.
Around 5-10% of breast cancers arise due to an inherited (‘germline’) mutation in a cancer susceptibility gene, most commonly the BRCA1 or BRCA2 gene. Women carrying a mutation in one of these genes have around an 80% risk of developing breast cancer over the course of their lifetime (compared to the general population risk of around 12%), and may also have an elevated risk of developing ovarian cancer.
The BRCA Challenge is a global effort to catalogue and understand all DNA sequence variants in the BRCA1 and BRCA2 genes. One of the big challenges facing geneticists is how to distinguish between a damaging gene variant and a harmless change in the gene. We all have sequence variants in our DNA, the vast majority of which are not harmful. The process of ‘variant interpretation’ relies on multiple threads of highly technical evidence; for this to be successful, it’s crucial that geneticists share knowledge and data with one another.
HDI works in partnership with the National Cancer Registration and Analysis Service (NCRAS) within Public Health England (PHE). Using a secure online interface built by HDI developers, NCRAS collects NHS laboratory data on germline BRCA gene variants. The HDI software generates a pseudonym upon upload, so the patient’s identity is protected, but their data can be accurately linked to their cancer registration record. Because all NHS labs use different databases, the genetic data from each lab arrives in a unique format, which presents a complex data challenge. HDI is developing a bespoke bioinformatics pipeline to process each lab’s data and standardise it into a consistent format.
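The two steps described above, generating a pseudonym on upload and mapping each lab's format onto a common one, can be sketched roughly as follows. This is an illustration under stated assumptions, not the HDI implementation: the source does not say how pseudonyms are derived, so a keyed hash (HMAC-SHA256) over the NHS number and date of birth is assumed here, and the lab names, column names and secret key are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret key; a real system would load this from secure storage.
SECRET_KEY = b"example-key-not-for-production"

# Hypothetical column names for two labs, mapped onto one common schema.
LAB_FIELD_MAPS = {
    "lab_a": {"Gene": "gene", "cDNA change": "variant",
              "NHS No": "nhs_number", "DOB": "date_of_birth"},
    "lab_b": {"gene_symbol": "gene", "hgvs_c": "variant",
              "nhsnumber": "nhs_number", "birth_date": "date_of_birth"},
}


def pseudonymise(nhs_number: str, date_of_birth: str, key: bytes = SECRET_KEY) -> str:
    """Return a deterministic keyed pseudonym (assumed scheme, for illustration)."""
    # Normalise so that differently formatted identifiers give the same pseudonym.
    cleaned = nhs_number.replace(" ", "").replace("-", "")
    message = f"{cleaned}|{date_of_birth.strip()}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def standardise(lab: str, row: dict) -> dict:
    """Rename one lab's columns to the common schema and replace identifiers."""
    mapping = LAB_FIELD_MAPS[lab]
    record = {target: row[source].strip() for source, target in mapping.items()}
    record["gene"] = record["gene"].upper()
    # Identifiers are removed from the record; only the pseudonym is kept.
    record["pseudo_id"] = pseudonymise(
        record.pop("nhs_number"), record.pop("date_of_birth")
    )
    return record
```

Because the pseudonym is deterministic for a given key, the same patient reported by two labs in different formats ends up with the same `pseudo_id`, which is what makes later linkage possible without storing the identifiers themselves.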
Collection and standardisation of this data has a number of uses:
- We can count the number of times each DNA sequence variant has been identified in an NHS lab, and share this information with the Cancer Variant Interpretation Group (CanVIG), a committee of NHS geneticists from across the UK and Ireland. Having access to the anonymised data, these doctors and scientists can combine their expertise to classify each variant as harmful, benign or uncertain. This ensures that patients are given accurate information and advice based on all the best available evidence. When a variant is reclassified, genetic counsellors can invite the affected families back to clinic to update them on their cancer risk and their options. The anonymised, aggregated data can also be shared as the UK’s contribution to the global BRCA Challenge.
- We can link each patient’s DNA sequence variant data to their tumour records. This will improve our understanding of the types of cancer that occur in BRCA mutation carriers, and might generate interesting findings on the biology of BRCA-associated tumours. We can also correlate BRCA-associated tumours with treatment and survival data. Ultimately, this will help the NHS to provide increasingly personalised care to individuals and families with a genetic susceptibility to cancer.
- We are extending this approach to other familial cancer syndromes, for example, Lynch syndrome, which involves a hereditary predisposition to a number of cancers, primarily colon (large bowel) and endometrium (lining of the womb).
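The first two uses in the list above, counting how often each variant is seen and linking variant data to tumour records, can be illustrated with a short sketch. It assumes standardised records held as dictionaries with `gene`, `variant` and `pseudo_id` fields, and registry entries that carry the same `pseudo_id`; these field names and both function names are hypothetical and not taken from the HDI pipeline.

```python
from collections import Counter


def count_variants(records):
    """Count how many records report each (gene, variant) pair."""
    return Counter((r["gene"], r["variant"]) for r in records)


def link_to_registry(records, registry):
    """Attach registry entries to each variant record sharing its pseudonym."""
    by_pseudonym = {}
    for entry in registry:
        by_pseudonym.setdefault(entry["pseudo_id"], []).append(entry)
    # Records with no registry match get an empty list rather than being dropped.
    return [{**r, "tumours": by_pseudonym.get(r["pseudo_id"], [])} for r in records]


# Made-up example data; the pseudonyms are shortened for readability.
records = [
    {"gene": "BRCA1", "variant": "c.68_69del", "pseudo_id": "p1"},
    {"gene": "BRCA1", "variant": "c.68_69del", "pseudo_id": "p2"},
    {"gene": "BRCA2", "variant": "c.5946del", "pseudo_id": "p3"},
]
registry = [
    {"pseudo_id": "p1", "site": "breast"},
    {"pseudo_id": "p1", "site": "ovary"},
]

counts = count_variants(records)
linked = link_to_registry(records, registry)
```

Only the aggregated counts would need to leave the secure environment for variant classification, while the linked records stay keyed by pseudonym.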
This work is a collaborative project between HDI, PHE, and the NHS clinical and laboratory genetics community. The work has already brought benefits to some families whose BRCA variants are now understood (1). The project was showcased at the meeting of the European Hereditary Tumour Group (EHTG) in 2018, where it won a prize. It has also been highlighted in the scientific literature (2) as a pioneering approach to data collection.
(1) Miriam J. Smith, Emma R. Woodward, George J. Burghel, Catherine Banks, Robert D. Morgan, Andrew J. Wallace, Clare Turnbull, D. Gareth Evans
Rapid reversal of clinical down‐classification of a BRCA1 splicing variant avoiding psychological harm
Clinical Genetics, 2019, 95(4); 532-3. https://doi.org/10.1111/cge.13488