Machine Learning Techniques for Lung Cancer Risk Prediction using Text Dataset

Authors

  • Kumar Mohan, Department of Information Technology, University of Technology and Applied Sciences-Shinas, Al Aqar, Oman
  • Bhraguram Thayyil, Department of Information Technology, University of Technology and Applied Sciences-Shinas, Al Aqar, Oman

DOI:

https://doi.org/10.59461/ijdiic.v2i3.73

Keywords:

Machine learning, Prediction, Lung cancer, Random forest

Abstract

Lung cancer is a serious threat to human health, and its early symptoms resemble those of the common cold and bronchitis. Clinicians can use machine learning (ML) techniques to tailor screening and prevention strategies to each patient's needs, potentially saving lives and improving patient care. To predict the development of lung cancer accurately, researchers must identify the relevant clinical and demographic variables in patient records, then pre-process and prepare the dataset for training an ML model. This study aims to develop an accurate and interpretable ML model for early lung cancer prediction from demographic and clinical variables, and to contribute to the growing body of ML applications in medical research that may improve healthcare outcomes. To find the most effective and accurate predictive model, this article applies Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, K-Nearest Neighbor (KNN), and Naive Bayes.
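
The model-comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, covers just two of the six listed classifiers (KNN and Gaussian Naive Bayes), and runs on synthetic records generated in place of a real pre-processed patient table. All function names, the three numeric features, the 70/30 split, and k=5 are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def make_synthetic_records(n, seed=0):
    """Hypothetical stand-in for a pre-processed patient table.

    Returns a list of (features, label) pairs; the data are synthetic,
    not taken from the study.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 2.0 if label == 1 else 0.0  # positive class is shifted
        features = [rng.gauss(shift, 1.0) for _ in range(3)]
        records.append((features, label))
    return records


def knn_predict(train, x, k=5):
    """Majority vote among the k training records closest to x."""
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fit_gaussian_nb(train):
    """Estimate a class prior and per-feature mean/variance for each class."""
    by_class = {}
    for features, label in train:
        by_class.setdefault(label, []).append(features)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(train)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model


def nb_predict(model, x):
    """Return the class with the highest log-posterior under the model."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(predict, test):
    """Fraction of held-out records whose label is predicted correctly."""
    return sum(1 for x, y in test if predict(x) == y) / len(test)


def compare_models(n=200, seed=0):
    """Train on the first 70% of records and score both models on the rest."""
    records = make_synthetic_records(n, seed)
    split = int(0.7 * n)
    train, test = records[:split], records[split:]
    nb_model = fit_gaussian_nb(train)
    return {
        "knn": accuracy(lambda x: knn_predict(train, x), test),
        "naive_bayes": accuracy(lambda x: nb_predict(nb_model, x), test),
    }


if __name__ == "__main__":
    for name, acc in compare_models().items():
        print(f"{name}: {acc:.3f}")
```

Scoring every candidate on the same held-out split keeps the comparison fair. A fuller study would likely add cross-validation and metrics beyond accuracy, such as recall, which matters when missed cases are costly.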

Published

25-09-2023

How to Cite

Kumar Mohan, & Bhraguram Thayyil. (2023). Machine Learning Techniques for Lung Cancer Risk Prediction using Text Dataset. International Journal of Data Informatics and Intelligent Computing, 2(3), 47–56. https://doi.org/10.59461/ijdiic.v2i3.73

Section

Regular Issue