By: Ritvik Pulya
Every year, almost 30% of all breast cancer cases are diagnosed late, with a 68% difference in five-year survival rates between Stage 2 and Stage 4 patients (American Cancer Society). However, current breast cancer risk estimation models such as the Tyrer-Cuzick and Gail models have only moderate predictive accuracy (barely better than the 50% expected from choosing an outcome at random) and consider only risk factors such as BRCA mutations, family history, and progesterone/estrogen levels. Since breast cancer imaging is already performed routinely (typically every two years), I wanted to develop an image-based breast cancer risk quantification model using MR and mammogram scans, the modalities that account for 91.3% of all breast cancer diagnoses.
Working with researchers at Harvard Medical School, I developed an algorithm with two main functions. First, it takes FFDM and MR images and segments out fibroglandular tissue, the dense (non-fatty) tissue that has been associated with higher breast cancer risk. The MR images were separated into different types (acquired before and after adding a contrast agent). Second, I trained CNNs (convolutional neural networks), deep learning models that predict the outcome of interest, in this case high risk versus low risk for breast cancer. Overall, I built seven models (one per individual image type, plus combined ensemble models) and achieved a 94% validation accuracy for the ensemble combining pre-contrast MR images and mammograms, indicating that contrast agents, some of which have harmful effects, don't necessarily need to be used with my model.
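To illustrate the ensemble idea, here is a minimal sketch of one common way to combine two classifiers: average the risk probabilities produced by the per-modality CNNs and threshold the result. This is an assumption about the combination scheme, not the actual code used in the project; the function name and inputs are hypothetical stand-ins for the pre-contrast MR and mammogram models' outputs.

```python
import numpy as np

def ensemble_risk(mr_probs, mammo_probs, threshold=0.5):
    """Combine per-modality risk probabilities by simple averaging.

    mr_probs, mammo_probs: per-patient P(high risk) scores from two
    classifiers (hypothetical stand-ins for the pre-contrast MR and
    mammogram CNNs). Returns 1 (high risk) where the averaged
    probability meets the threshold, else 0 (low risk).
    """
    avg = (np.asarray(mr_probs) + np.asarray(mammo_probs)) / 2.0
    return (avg >= threshold).astype(int)

# Example: three patients scored by both modality models
mr = [0.80, 0.30, 0.55]
mammo = [0.90, 0.20, 0.40]
print(ensemble_risk(mr, mammo))  # -> [1 0 0]
```

Averaging probabilities is only one option; a learned combiner (e.g. a small network over both models' features) is another common design for image ensembles.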
This project won 3rd place overall in the Region IV Science Fair as well as the 1st place Computer Science/Math award. It has also been submitted to several conferences for publication.