A Machine Learning Approach for Early Pancreatic Cancer Risk Stratification Using Circulating MicroRNA Profiles
DOI:
https://doi.org/10.58445/rars.2610
Keywords:
Pancreatic Cancer, MicroRNA, Machine learning, Early Detection, Biomarkers, Precision medicine, Bioinformatics, Cancer diagnostics, Cancer
Abstract
Pancreatic cancer is one of the deadliest cancers, with a five-year survival rate under 11%, largely because reliable early detection tools are lacking. Most cases are diagnosed at advanced stages, when treatment options are limited. Recent research has highlighted the potential of circulating microRNAs (miRNAs) as non-invasive biomarkers for early cancer detection. This study aims to identify miRNA expression signatures in pre-diagnostic plasma samples that predict pancreatic cancer risk, using machine learning methods.
We analyzed a nested case-control dataset (GSE262260) with 924 samples collected up to five years before diagnosis. After rigorous preprocessing, including missing value imputation, near-zero variance filtering, and recursive feature elimination, we selected 100 informative miRNAs. Logistic Regression, Random Forest, and Gradient Boosting classifiers were trained and evaluated on a 70/30 train-test split. Logistic Regression performed best, achieving an ROC AUC of 0.6765. Feature importance analysis revealed that certain miRNAs, such as miR-147a and miR-494-5p, were associated with a reduced risk of pancreatic cancer, while others, like miR-202-3p, miR-10a-5p, and miR-885-5p, were linked to increased risk. These findings are consistent with prior literature suggesting tissue- and context-specific roles for these miRNAs in cancer biology.
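The workflow described above (imputation, near-zero variance filtering, recursive feature elimination, then three classifiers compared by ROC AUC on a 70/30 split) can be sketched roughly as follows. This is a minimal illustration with scikit-learn on synthetic data, not the study's actual code or the GSE262260 data. The median imputation, the variance threshold, the use of logistic regression as the ranking model inside RFE, the scaled-down feature count (20 rather than 100), and the choice to fit all preprocessing on the training split only are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE, VarianceThreshold
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a miRNA expression matrix (NOT the study data):
# 300 samples x 60 features; only the first 5 features carry signal.
n_samples, n_features = 300, 60
X = rng.normal(size=(n_samples, n_features))
y = (X[:, :5].sum(axis=1) + rng.normal(scale=2.0, size=n_samples) > 0).astype(int)
X[:, -3:] = 1.0                         # constant columns for the variance filter
X[rng.random(X.shape) < 0.02] = np.nan  # about 2% missing values

# 70/30 split, stratified so both sets keep the case/control ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

def build_pipeline(classifier, n_keep=20):
    """Preprocessing plus classifier; every step is fit on training data only."""
    return Pipeline([
        ("impute", SimpleImputer(strategy="median")),      # assumed strategy
        ("variance", VarianceThreshold(threshold=1e-3)),   # assumed cutoff
        ("scale", StandardScaler()),
        ("rfe", RFE(LogisticRegression(max_iter=1000),
                    n_features_to_select=n_keep, step=5)),
        ("clf", classifier),
    ])

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

aucs, fitted = {}, {}
for name, clf in models.items():
    pipe = build_pipeline(clf).fit(X_train, y_train)
    scores = pipe.predict_proba(X_test)[:, 1]  # predicted probability of "case"
    aucs[name] = roc_auc_score(y_test, scores)
    fitted[name] = pipe
    print(f"{name}: ROC AUC = {aucs[name]:.3f}")

# Map the surviving features back to original column indices, then read
# the sign of each logistic regression coefficient.
lr = fitted["logistic_regression"]
kept = np.arange(n_features)[lr.named_steps["variance"].get_support()]
kept = kept[lr.named_steps["rfe"].get_support()]
coef = lr.named_steps["clf"].coef_[0]
for idx, c in sorted(zip(kept, coef), key=lambda t: -abs(t[1]))[:5]:
    direction = "higher" if c > 0 else "lower"
    print(f"feature {idx}: coef = {c:+.3f} ({direction} predicted risk)")
```

Because the features are standardized before the final model, a positive coefficient means that higher expression raises the predicted odds of being a case, and a negative coefficient lowers them. That is one way the "increased risk" and "reduced risk" labels in the abstract could be read, though the abstract does not state which importance measure was used.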
This study demonstrates the feasibility of using machine learning and circulating miRNAs to stratify pancreatic cancer risk years before clinical diagnosis. While further validation is needed in larger and more diverse cohorts, the results support the promise of miRNA-based, non-invasive screening tools to improve early detection and, potentially, patient outcomes in pancreatic cancer.
License
Copyright (c) 2025 Krithik Alluri

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.