2024 | Original Paper | Book Chapter

BCSNP-ML: A Novel Breast Cancer Prediction Model Base on LightGBM and Estrogen Metabolic Enzyme Genes

Authors: Tianlei Zheng, Shi Geng, Wei Yan, Fengjun Guan, Na Yang, Lei Zhao, Bei Zhang, Xueyan Zhou, Deqiang Cheng

Published in: Proceedings of the 2nd International Conference on Internet of Things, Communication and Intelligent Technology

Publisher: Springer Nature Singapore


Abstract

Polymorphisms in estrogen-related metabolic enzyme genes have been shown to be associated with breast cancer. This paper develops a novel noninvasive breast cancer prediction model using machine learning algorithms that incorporate single nucleotide polymorphisms (SNPs) of estrogen metabolic enzyme genes. To predict susceptibility to breast cancer, the coded data of 14 SNPs from enrolled breast cancer patients and normal women were randomly shuffled, with 80% of the data designated as the training set and the remaining 20% reserved as the test set for validation. Single-factor analysis was performed to screen for independent risk factors, and the Breast Cancer with Single Nucleotide Polymorphisms - Machine Learning (BCSNP-ML) prediction model was then built using the Light Gradient Boosting Machine (LightGBM) algorithm. In total, 14 SNP variables from 280 subjects were used in this study. Single-factor analysis indicated a meaningful association between SULT1A1 rs1042028, CYP1A1 rs1048943, CYP1B1 rs1056827, CYP1A1 rs1056836 and the incidence of breast cancer. The model using all 14 variables achieved an area under the receiver operating characteristic curve (AUROC) of 0.809, while the BCSNP-ML model constructed from four variables achieved an AUROC of 0.831. Additionally, BCSNP-ML is visualized and interpreted using SHapley Additive exPlanations (SHAP) analysis, which further supports the model's potential as a robust tool for clinical prediction of breast cancer.
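The data handling and evaluation metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `shuffle_split` and `auroc`, the fixed seed, and the placeholder subject IDs are assumptions made for illustration, and the LightGBM model itself is deliberately left out. The sketch only shows a random shuffle with an 80/20 split of 280 subjects and a pairwise definition of AUROC.

```python
import random


def shuffle_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and split it into train and test parts."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def auroc(labels, scores):
    """AUROC as the fraction of (case, control) pairs in which the case
    receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical placeholder IDs standing in for the 280 enrolled subjects.
subjects = list(range(280))
train_ids, test_ids = shuffle_split(subjects)
print(len(train_ids), len(test_ids))

# Toy labels and scores, not data from the paper.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With 280 subjects, the split yields 224 training and 56 test records. In practice the scores passed to `auroc` would be the predicted probabilities of a trained classifier on the test set.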

Copyright year: 2024
DOI: https://doi.org/10.1007/978-981-97-2757-5_66
