PCSVD: A hybrid feature extraction technique based on principal component analysis and singular value decomposition

Vineeta Gulati, Neeraj Raheja

Abstract


Feature extraction plays an important role in accurate preprocessing and in real-world applications. High-dimensional features in the data have a significant impact on a machine learning classification system. Extracting relevant features is a fundamental step, not only to reduce dimensionality but also to improve the performance of the classifier. In this paper, the authors propose PCSVD, a hybrid dimensionality reduction technique that combines principal component analysis (PCA) and singular value decomposition (SVD) in a machine learning classification system with a support vector classifier (SVC). To evaluate the performance of PCSVD, its results are compared with those obtained without feature extraction and with the existing methods of independent component analysis (ICA), PCA, linear discriminant analysis (LDA), and SVD. The PCSVD method increases accuracy by 1.54%, sensitivity by 2.70%, specificity by 3.71%, and precision by 3.58%. It also reduces dimensionality by 15% and RMSE by 40.60%, which is better than the existing techniques found in the literature.
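
For readers less familiar with the evaluation measures named above, the following minimal sketch shows how accuracy, sensitivity, specificity, precision, and RMSE are conventionally computed for a binary classifier. It is not taken from the paper: the function name `binary_metrics` is hypothetical, the labels are made-up illustrative values rather than results from the chronic kidney disease dataset, and it assumes 0/1 labels with both classes present in the data and in the predictions.

```python
import math


def binary_metrics(y_true, y_pred):
    """Conventional binary-classification metrics from 0/1 labels.

    Hypothetical helper for illustration only. Assumes both classes occur
    in y_true and at least one positive prediction exists, so that no
    denominator below is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    n = len(pairs)
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),   # recall on the positive class
        "specificity": tn / (tn + fp),   # recall on the negative class
        "precision": tp / (tp + fp),
        # RMSE between true and predicted labels
        "rmse": math.sqrt(sum((t - p) ** 2 for t, p in pairs) / n),
    }


# Made-up labels (1 = disease, 0 = no disease), not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With 0/1 labels, the squared error of each prediction is 1 for a misclassification and 0 otherwise, so the RMSE computed this way equals the square root of the error rate.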

Keywords


support vector classifier; machine learning; independent component analysis; linear discriminant analysis; chronic kidney disease dataset; dimensionality reduction


DOI: https://doi.org/10.32629/jai.v6i2.586



Copyright (c) 2023 Vineeta Gulati, Neeraj Raheja

License URL: https://creativecommons.org/licenses/by-nc/4.0