Comparison of Machine Learning Algorithms Using Chi-square Feature Selection for Heart Disease Classification
DOI:
https://doi.org/10.33020/saintekom.v15i1.815
Keywords:
heart disease, cardiovascular, feature selection, chi-square, hyperparameter
Abstract
Heart disease is one of the deadliest diseases worldwide. Its symptoms often do not cause severe effects immediately, so early detection is crucial. To reduce deaths from heart disease and other cardiovascular disorders, a system is needed that identifies the main contributing factors so that they can be minimized. This study therefore applies the Chi-square feature selection method to determine the key features that influence the accuracy of Machine Learning models, and compares five algorithms: K-Nearest Neighbor (KNN), Naïve Bayes, Logistic Regression, Support Vector Machine, and Random Forest. The comparison aims to find the most accurate algorithm, since higher accuracy yields a more precise heart disease classification system. The findings indicate that the eight key features selected with the Chi-square method yield the highest accuracy, 93.51%, with the KNN algorithm. These results show that using only the relevant features improves classification accuracy and system efficiency compared with using all available features. The study thus contributes to the selection of essential features for Machine Learning algorithms through the Chi-square technique, supporting a more effective heart disease classification system.
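To make the feature selection step concrete, the following is a minimal sketch of ranking categorical features by the Pearson Chi-square statistic and keeping the top k. It is not the authors' implementation, which the abstract does not describe. The feature names, the toy data, and the helper names `chi_square_score` and `select_top_k` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Pearson chi-square statistic of one categorical feature against the class labels."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))  # contingency table counts
    feature_totals = Counter(feature_values)         # row totals
    label_totals = Counter(labels)                   # column totals
    score = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            # Expected count if the feature and the label were independent.
            expected = f_count * c_count / n
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


def select_top_k(features, labels, k):
    """Return the k feature names with the highest chi-square score, plus all scores."""
    scores = {name: chi_square_score(values, labels) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: 1 = heart disease, 0 = no heart disease.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "chest_pain":      [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly associated with the label
    "high_sugar":      [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
    "exercise_angina": [1, 1, 1, 0, 0, 0, 0, 1],  # partially associated
}

selected, scores = select_top_k(features, labels, k=2)
print(selected)
print(scores)
```

In a full pipeline, only the selected columns would then be passed to each classifier (for example KNN), and the accuracy would be compared against a model trained on all features. Note that library implementations of Chi-square feature scoring may compute the statistic differently from this contingency-table form.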
Copyright (c) 2025 Hirmayanti Hirmayanti, Ema Utami

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright:
By submitting a manuscript to Jurnal Saintekom : Sains, Teknologi, Komputer dan Manajemen, authors agree to this policy. No separate approval document is required.
- The copyright in each article belongs to the author.
- Authors retain all their rights to the published work, not limited to the rights set forth in this page.
- Authors acknowledge Jurnal Saintekom : Sains, Teknologi, Komputer dan Manajemen as the first publisher of the work under the Creative Commons Attribution-ShareAlike 4.0 International license (CC BY-SA).
- Authors may enter into separate, non-exclusive arrangements to distribute the version of the manuscript published in this journal (e.g. depositing it in an institutional repository or publishing it in a book), provided they acknowledge that it was first published in Jurnal Saintekom : Sains, Teknologi, Komputer dan Manajemen.
- The author warrants that the article is original, was written by the named author, has not been previously published, contains no unlawful statements, does not infringe the rights of others, and is subject to copyright held exclusively by the author.
- If the article is jointly prepared by more than one author, each author submitting the manuscript warrants that he or she has been authorized by all co-authors to agree to copyright and license notices (agreements) on their behalf, and agrees to inform co-authors of the terms of this policy. Jurnal Saintekom : Sains, Teknologi, Komputer dan Manajemen will not be held liable for anything that may arise due to internal author disputes.
License:
Jurnal Saintekom : Sains, Teknologi, Komputer dan Manajemen is published under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA). This license permits anyone to:
- Share - copy and redistribute the material in any medium or format;
- Adapt - remix, transform, and build upon the material for any purpose.
These permissions apply under the following terms:
- Attribution - you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- ShareAlike - if you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.