Revolutionizing gastric cancer diagnosis through advanced machine learning approaches
Abstract
Early detection of gastric cancer through a Computer-Aided Detection (CAD) system could substantially reduce the mortality associated with this disease. This study investigates how class imbalance affects the performance of machine learning classifiers in this context. Using a dataset of 145,787 screening records from NHS Liverpool Hospital, we employed stratified sampling to create balanced and unbalanced datasets and evaluated four machine learning algorithms (Logistic Regression, Support Vector Machine, Naive Bayes, and Multilayer Perceptron) under five different test conditions. The study's novelty lies in its detailed examination of class imbalance in gastric cancer diagnosis, emphasizing the role of balanced datasets in machine learning-based early detection systems. For the Multilayer Perceptron (MLP) model under 10-fold cross-validation, the Class 0 sensitivity (non-cancer cases) was higher on the unbalanced dataset (0.968) than on the balanced dataset (0.902). However, the Class 1 sensitivity (cancer cases) and Positive Predictive Value (PPV) were much lower on the unbalanced dataset (0.383 and 0.527, respectively) than on the balanced dataset (0.959 and 0.907, respectively), indicating that a balanced dataset substantially improves the identification of true positive cases. These findings highlight the negative effect of class imbalance on prediction accuracy for positive cancer cases and underscore the importance of addressing this imbalance to obtain more reliable predictions in medical diagnosis and screening. This approach may improve patient outcomes and contribute to strategies aimed at reducing the mortality associated with gastric cancer.
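The per-class metrics reported above can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the toy labels, and the random undersampling step are assumptions introduced for illustration only (the abstract states that stratified sampling was used and gives no implementation details). The sketch shows how overall accuracy can stay high on an imbalanced sample while Class 1 sensitivity and PPV remain low.

```python
import random
from collections import Counter


def per_class_sensitivity(y_true, y_pred, label):
    """Fraction of records whose true class is `label` that were predicted as `label`."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    if not relevant:
        return float("nan")
    return sum(p == label for p in relevant) / len(relevant)


def positive_predictive_value(y_true, y_pred, label=1):
    """Fraction of records predicted as `label` whose true class is `label`."""
    predicted = [t for t, p in zip(y_true, y_pred) if p == label]
    if not predicted:
        return float("nan")
    return sum(t == label for t in predicted) / len(predicted)


def undersample_majority(records, labels, seed=0):
    """Randomly drop records from larger classes until every class matches
    the smallest one. This is one common balancing technique, assumed here
    for illustration; it is not necessarily the procedure used in the study."""
    rng = random.Random(seed)
    by_class = {}
    for rec, lab in zip(records, labels):
        by_class.setdefault(lab, []).append(rec)
    n_min = min(len(recs) for recs in by_class.values())
    balanced = [
        (rec, lab)
        for lab, recs in by_class.items()
        for rec in rng.sample(recs, n_min)
    ]
    rng.shuffle(balanced)
    recs, labs = zip(*balanced)
    return list(recs), list(labs)


# Toy labels (invented, not the study's data): 95 non-cancer and 5 cancer records.
y_true = [0] * 95 + [1] * 5
# Hypothetical predictions: one false alarm, and only 2 of the 5 cancer cases found.
y_pred = [0] * 94 + [1] + [1, 1, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
sens_class0 = per_class_sensitivity(y_true, y_pred, 0)
sens_class1 = per_class_sensitivity(y_true, y_pred, 1)
ppv_class1 = positive_predictive_value(y_true, y_pred, 1)

print(f"accuracy={accuracy:.3f}  class 0 sensitivity={sens_class0:.3f}")
print(f"class 1 sensitivity={sens_class1:.3f}  class 1 PPV={ppv_class1:.3f}")

# Balance the toy sample so that both classes have the same number of records.
balanced_ids, balanced_labels = undersample_majority(list(range(100)), y_true)
print(Counter(balanced_labels))
```

In this toy sample the classifier is right on 96 of 100 records, yet it finds only 2 of the 5 positive cases, which is the pattern the abstract describes for the unbalanced dataset. A real evaluation would compute these metrics inside each cross-validation fold rather than on a single sample.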
DOI: https://doi.org/10.32629/jai.v7i4.1021
Copyright (c) 2024 Danish Jamil, Sellappan Palaniappan, Muhammad Numan Ali Khan, Syed Mehr Ali Shah
License URL: https://creativecommons.org/licenses/by-nc/4.0/