Machine learning approaches using correlation filters for heart failure diagnosis: a comparative study of supervised techniques

Authors

  • Vitória S. Souza Federal Institute of Education, Science and Technology of the Triângulo Mineiro (IFTM), Computational Intelligence and Robotics Laboratory (LICRO), Patrocínio Campus, Brazil. https://orcid.org/0000-0001-9057-9738
  • Danielli A. Lima Federal Institute of Education, Science and Technology of the Triângulo Mineiro (IFTM), Computational Intelligence and Robotics Laboratory (LICRO), Patrocínio Campus, Brazil. https://orcid.org/0000-0003-0324-6690

DOI:

https://doi.org/10.59461/ijdiic.v4i4.229

Keywords:

Classification, Machine Learning, Cardiology, Heart disease, Prediction diagnosis

Abstract

This study addresses how machine learning can be used to identify factors that influence the probability of survival of patients with heart failure, based on a dataset with 12 attributes collected from 299 patients. Correlation filters are applied to select the attributes that may be most relevant to the disease, which could support new forms of treatment and help reduce diagnostic costs. We evaluated the accuracy of eight data mining algorithms for predicting heart disease on the heart failure dataset. The methodology includes 100 simulations with 10 correlation filter variations to ensure reliable and robust results. Among the eight classification algorithms, Support Vector Machine and Random Forest achieved the best accuracy (84.18%). Considering the averages over all correlation filter variations, Random Forest had the highest average (80.07%) and the Probabilistic Neural Network the lowest (69.43%). Analysis of other evaluation metrics indicated that a Multilayer Perceptron with a correlation filter of 0.10 was the best alternative, with 83.50% accuracy. With this configuration, the diagnosis of heart failure required only four attributes: creatinine phosphokinase, serum sodium, sex, and hospitalization time. This streamlined approach saves time and resources and improves diagnostic efficiency, unlike previous works that use all attributes of the dataset for classification. Our findings suggest that data mining techniques, and the proposed method, can be a useful tool for predicting heart disease.
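
The abstract does not state how the correlation filter is defined, so the following is only a minimal sketch under stated assumptions: the filter is taken to be a greedy pairwise redundancy filter (a column is kept only if its absolute Pearson correlation with every column already kept does not exceed the threshold), and the 10 variations are taken to be thresholds from 0.10 to 1.00. The column names and values are toy placeholders, not the heart failure dataset, and the functions `pearson` and `correlation_filter` are hypothetical helpers written for this illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def correlation_filter(columns, threshold):
    """Assumed filter: keep a column only if |r| with every kept column <= threshold.

    `columns` maps a column name to its list of values. Columns are visited
    in insertion order, so the result depends on that order.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy placeholder data (not taken from the study).
toy = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 4, 6, 8, 10, 12],   # exact multiple of "a", fully redundant
    "c": [1, -1, 1, -1, 1, -1],  # weakly correlated with "a"
    "d": [1, -1, 0, 0, -1, 1],   # uncorrelated with "a" and "c"
}

# Assumed sweep of 10 thresholds, 0.10 through 1.00.
for step in range(1, 11):
    threshold = step / 10
    print(f"{threshold:.2f}", correlation_filter(toy, threshold))
```

Under this assumed definition, a low threshold such as 0.10 removes the most columns, which is at least consistent with the abstract reporting only four retained attributes at that setting. In a full experiment, each filtered attribute set would then be passed to the classifiers and evaluated over repeated runs.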

Published

24-10-2025

How to Cite

Vitória S. Souza, & Danielli A. Lima. (2025). Machine learning approaches using correlation filters for heart failure diagnosis: a comparative study of supervised techniques. International Journal of Data Informatics and Intelligent Computing, 4(4), 11–27. https://doi.org/10.59461/ijdiic.v4i4.229

Section

Regular Issue