Cancer data to be analysed with Cambridge technology
Cancer Research UK and Cambridge company Linguamatics are collaborating to apply Linguamatics’ natural language processing (NLP) text analytics platform, I2E, to automatically extract clinical attributes from cancer pathology reports.
An additional benefit of the collaboration will be improved annotation of clinical samples relating to Cancer Research UK’s Stratified Medicine Programme (SMP). The project will allow detailed patient characteristics to be analysed alongside large volumes of genetic data, enabling more effective research into the causes and personalised treatment of cancer.
Dr Ian Walker, director of clinical research and strategic partnerships at Cancer Research UK, said: “Pathology reports tell us a range of important information about a patient’s cancer, but the way this data is recorded can vary widely, which makes it harder to spot trends or other significant information that could have a bearing on treatment decisions or prognosis.
“This collaboration should help translate these reports into more meaningful data, which should help our researchers better understand the disease and accelerate advances in personalised medicine.”
SMP was set up to examine how genetic profiles can inform cancer treatment decisions, with a view to how personalised medicine would be implemented in the NHS. It is a forerunner of the Genomics England 100,000 Genomes Project.
The first project (SMP1) looked at breast, colorectal, lung, prostate, melanoma and ovarian cancers across eight hospital groups and 9,000 patients. The second project (SMP2) is focused on lung cancer.
Due to the complexity and variability of pathology reports, capturing key cancer characteristics (clinical attributes) as discrete data is currently a challenging and time-consuming manual task. The collaboration will use NLP to automatically extract key clinical attributes such as tumour size, TNM stage (Classification of Malignant Tumours), topography, histology grade and category, excision margin and use of biomarkers from pathology reports.
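To give a rough sense of what turning free text into discrete data involves, the short Python sketch below pulls three of the attributes listed above (tumour size, TNM stage and histological grade) out of a made-up report sentence. It is not Linguamatics’ I2E and does not represent how that platform works; it uses simple regular expressions as a stand-in, and the function name, field names, patterns and sample text are all hypothetical. Real reports vary far more in wording and layout than these patterns can handle.

```python
import re

# Illustrative patterns only (assumptions, not taken from I2E or any real system).
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s?(mm|cm)\b", re.IGNORECASE)
TNM_RE = re.compile(r"\b[pc]?T(?:is|[0-4X])[a-d]?\s?N[0-3X][a-c]?(?:\s?M[01X])?\b")
GRADE_RE = re.compile(r"\bgrade\s*:?\s*([1-3])\b", re.IGNORECASE)


def extract_attributes(report: str) -> dict:
    """Return a dict of discrete attributes; a value is None if not found."""
    size_mm = None
    size = SIZE_RE.search(report)
    if size:
        value = float(size.group(1))
        # Normalise centimetres to millimetres so sizes are comparable.
        if size.group(2).lower() == "cm":
            value *= 10
        size_mm = round(value, 1)

    tnm = TNM_RE.search(report)
    grade = GRADE_RE.search(report)

    return {
        "tumour_size_mm": size_mm,
        "tnm_stage": tnm.group(0) if tnm else None,
        "histological_grade": grade.group(1) if grade else None,
    }


if __name__ == "__main__":
    # Invented example text, not a real pathology report.
    sample = (
        "Invasive ductal carcinoma, histological grade 2. "
        "Tumour measures 23 mm in maximum dimension. Stage: pT2 N1 M0."
    )
    print(extract_attributes(sample))
```

Fixed patterns like these break as soon as a hospital phrases or lays out its reports differently, which is the variability the collaboration aims to address.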
“As the healthcare industry moves towards precision medicine, rapid transformation of unstructured patient data, such as pathology reports, into structured insights is vital,” said Simon Beaulah, director of healthcare strategy for Linguamatics.
“This project will also demonstrate how to address the challenges from variable report structure and use of language across hospitals. We are delighted to be working with Cancer Research UK on such an innovative project as the Stratified Medicine Programme.
“Using NLP in this way will yield huge benefits to the cancer community by improving understanding of patient populations and ultimately cancer care.”