Artificial Intelligence in Oncology - Supporting scientific research
UMC Utrecht
In short, Edwin Cuppen's research entails the following:
Each year, about 3% of cancer patients are diagnosed with a cancer of unknown primary (CUP), and a significant further fraction of patients receive indeterminate, uncertain, or differential diagnoses, especially for metastatic or poorly differentiated tumors. Patients for whom the primary tumor type is unknown have a worse prognosis, often due to complicated and laborious diagnostics and a lack of therapeutic options, as treatment options are almost exclusively driven by primary tumor type classification. Whole genome sequencing (WGS) is an emerging diagnostic approach and has already proven useful for identifying matching personalized targeted treatments, but the genome-wide knowledge it yields also offers important potential for identifying the tumor tissue of origin. In the proposed project, we will use machine learning approaches to develop a WGS-based CUP classifier that can annotate cancers for which the primary tumor mass could not be identified. For this, we will consolidate, integrate and analyse the largest pan-cancer whole-genome-sequenced datasets worldwide, consisting of more than 8,000 patients. This project will be performed in close collaboration between the research group of Edwin Cuppen at the UMC Utrecht, the pathology department of the Netherlands Cancer Institute and the Hartwig Medical Foundation, and should drive implementation of the established algorithm in diagnostic patient reporting.
Cancer of unknown primary (CUP) is a rare cancer indication (3% of all new cancer cases), yet it affects thousands of patients annually in the Netherlands. CUPs present as advanced-stage metastatic cancer and typically involve a long diagnostic odyssey, including step-wise histopathological and immunohistochemistry analyses that often still fail to be conclusive about the tumor tissue of origin. This results in uncertainty for the patient as well as a lack of standardized treatment options.
To aid CUP patient diagnostics, we have developed CUPLR (Cancer of Unknown Primary Location Resolver), a machine learning-based classifier that predicts the tissue of origin of cancers of unknown primary from whole-genome mutation features. To develop CUPLR, we leveraged simple and complex mutation features from the world's largest set of publicly available whole-genome-sequenced tumors. CUPLR is able to classify tissue of origin with high accuracy (90% overall recall), and outperforms current state-of-the-art classification tools for certain cancer types, both in terms of accuracy and in the number of tumor (sub)types that can be discerned. CUPLR also provides human-interpretable explanations alongside each prediction, which help pathologists understand the outcomes and resolve complex cases. We demonstrated that CUPLR can predict the tumor tissue of origin with high accuracy for hundreds of patients diagnosed with CUP. The developed algorithms are currently being used by Hartwig Medical Foundation for implementation in their routine WGS-based diagnostic test (see www.oncoact.nl for details).
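To make the recall figure above more concrete, the following minimal sketch shows one common way such a number can be computed for a multi-class tissue-of-origin classifier. It is an illustration only, not CUPLR's evaluation code: the summary does not define "overall recall", so the sketch assumes it means the fraction of tumors with a known tissue of origin that are assigned the correct label. The function names and the toy labels are hypothetical and were made up for this example.

```python
from collections import Counter


def recall_per_class(true_labels, predicted_labels):
    """Per tissue type: fraction of tumors of that type predicted correctly."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {tissue: hits[tissue] / totals[tissue] for tissue in totals}


def overall_recall(true_labels, predicted_labels):
    """Fraction of all tumors whose predicted tissue matches the known one.

    Assumption: this is one plausible reading of "overall recall";
    the source text does not spell out its definition.
    """
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy, invented labels (NOT real patient data or CUPLR output).
true_labels = ["Breast", "Breast", "Lung", "Lung", "Lung",
               "Colorectal", "Colorectal", "Colorectal"]
predicted_labels = ["Breast", "Lung", "Lung", "Lung", "Lung",
                    "Colorectal", "Colorectal", "Breast"]

per_class = recall_per_class(true_labels, predicted_labels)
overall = overall_recall(true_labels, predicted_labels)

for tissue, value in per_class.items():
    print(f"{tissue}: recall = {value:.2f}")
print(f"Overall: {overall:.2f}")
```

Reporting recall per tissue type alongside the overall figure matters in this setting, because a classifier can score well overall while still performing poorly on rarer tumor types.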
At this moment, the brief summary of progress is only available in Dutch. You can find the Dutch summary here.