Lung cancer is a major global health challenge, and early diagnosis plays a pivotal role in improving patient outcomes. Current diagnostic methods, such as radiography, computed tomography (CT), positron emission tomography (PET), histopathology, and next-generation sequencing, are often time-consuming or costly, or they require expert interpretation. These challenges are especially pronounced in resource-constrained countries like India, where access to panel-based next-generation sequencing is limited. In addition, inadequate tissue in lung core biopsies and inherent intratumoral heterogeneity complicate molecular testing. Together, these issues underscore the need for an AI-driven solution that can autonomously detect and learn nuanced lung nodule features from PET-CT and histopathological images and predict the likelihood of mutations in critical oncogenes such as EGFR.
In this study, we aimed to use radiological features from PET-CT scans and histopathological features from whole slide images (WSIs) to predict lung malignancy, cancer type, and the presence of mutations, particularly in the EGFR gene. Building on a prototype system, we describe the expansion of our initial work into an end-to-end AI pipeline ready for deployment in oncology hospitals. This pipeline predicts essential lung nodule features, including radiological characteristics (e.g., malignancy, texture, margin) and histopathological patterns (e.g., acinar, lepidic, papillary), and associates them with mutational information.
To date, we have assembled a resource comprising extracted lung cancer CT/PET scan datasets and annotations from 2000 Indian lung cancer patients, as well as an independent dataset sourced from the Cancer Image Archive and other repositories. We have also developed a user-friendly AI web tool tailored for clinical use; in preliminary tests it achieved a peak AUC of 91% in specific use cases. In the next phase, we anticipate acquiring an additional 2000 images, along with annotations and mutational information, from our collaborators. Our system is a machine learning (ML) platform tailored for Indian lung cancer patients. It incorporates multiple modules, including a Region-based Convolutional Neural Network (R-CNN) and other deep learning components. Our ultimate goal is to create a cost-effective AI platform that enables oncologists, radiologists, and patients in resource-limited settings to achieve effective early lung cancer screening, thereby reducing costs and broadening treatment options for patients.