Mortality prediction with adjuvant tamoxifen in breast cancer: Machine learning-integrated explainable artificial intelligence and Bayesian model results
Original scientific article
DOI:
https://doi.org/10.5599/admet.3321
Keywords:
Selective estrogen receptor modulator, personalized medicine, extreme gradient boosting, hormonal therapy
Abstract
Background and purpose: Tamoxifen is a cornerstone of adjuvant endocrine therapy for breast cancer, yet significant inter-individual variability in treatment response and mortality exists. Identifying robust predictors of outcomes remains a critical need. This study integrated machine learning, explainable artificial intelligence (XAI) and Bayesian modelling to predict mortality and identify key prognostic factors in breast cancer patients receiving adjuvant tamoxifen.
Experimental approach: We analysed data from 568 patients from the International Tamoxifen Pharmacogenomics Consortium database. The outcome was all-cause mortality, with predictors including age, race, menopausal status, tumour size, estrogen receptor status, radiation treatment, and CYP2D6 metabolizer status. Four algorithms, logistic regression, random forest, eXtreme Gradient Boosting (XGBoost) and support vector machine, were developed and validated. Model performance was assessed using accuracy and area under the receiver operating characteristic curve (AUC). SHapley Additive exPlanations (SHAP) analysis provided interpretability for the XGBoost model, and Bayesian logistic regression with weakly informative priors was employed for probabilistic inference.
Key results: The overall mortality rate was 19.4 %. XGBoost demonstrated the highest discriminative ability (AUC 0.833; 95 % confidence interval: 0.725 to 0.941), while random forest exhibited superior sensitivity for identifying deceased patients (83.3 %). SHAP analysis revealed that white race, increased age, absence of radiation treatment, larger tumour size and the CYP2D6 poor metabolizer (PM/PM) genotype were associated with elevated mortality risk, whereas the extensive metabolizer (EM/EM) genotype was protective. Significant variability was observed in exploratory subgroup analyses, with the model achieving excellent discrimination in patients without radiation treatment (AUC 0.901) and those with the EM/PM genotype (AUC 0.956) but failing to identify any mortality events in the Caucasian subgroup. Bayesian logistic regression yielded comparable performance to frequentist methods (AUC 0.820), with tumour size emerging as a consistently strong predictor in partial dependence plots.
Conclusion: Integrating machine learning with XAI and Bayesian approaches effectively identified key predictors of mortality in tamoxifen-treated breast cancer patients. However, marked heterogeneity in model performance across subgroups highlights the critical need for external validation and careful evaluation of algorithmic fairness before clinical implementation.
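The abstract reports discrimination as an AUC with a 95 % confidence interval. As a rough illustration of what such a figure summarises, the following minimal sketch computes an AUC from predicted risks and attaches a percentile bootstrap interval. The abstract does not state how the interval was obtained, so the bootstrap is an assumption made here for illustration, and the function names (auc, bootstrap_auc_ci) and the toy labels and scores are hypothetical, not study data.

```python
import random


def auc(labels, scores):
    """AUC as the share of (deceased, survivor) pairs in which the deceased
    patient received the higher predicted risk; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = deceased, 0 = alive; scores are predicted risks.
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.35, 0.60, 0.70, 0.90]

point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f}, 95 % bootstrap CI: {lower:.3f} to {upper:.3f}")
```

With few events the resampled AUCs vary widely, which is one reason a subgroup AUC such as those quoted above can look excellent while resting on a small number of deaths.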
License
Copyright (c) 2026 Kannan Sridharan, Gowri Sivaramakrishnan

This work is licensed under a Creative Commons Attribution 4.0 International License.



