Sentiment analysis and classification of COVID-19 tweets using machine learning classifier
Abstract
In March 2020, the World Health Organization declared COVID-19 a pandemic. The virus spread rapidly across many countries. As the pandemic progressed, social networking sites such as Twitter generated substantial volumes of data that helped improve the quality of decisions in health care applications. In this paper, we propose a sentiment classification system for social media data that uses several feature extraction and machine learning techniques. The system is divided into four phases: data collection; preprocessing and normalization; feature extraction and feature selection; and classification. In the first phase, we collect data from social media sources such as Twitter using the Twitter API. In the second phase, the tweet data are preprocessed and sorted into three categories: positive, neutral, and negative. In the third phase, features are extracted from the tweets using several widely used approaches, including bag of words (BoW), Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, and FastText, each yielding a distinct feature dataset. In the final phase, different machine learning classification algorithms are applied to detect sentiment. In the extensive experimental analysis, BoW features combined with a modified support vector machine (mSVM) gave better results than the existing machine learning algorithms. The proposed mSVM outperformed the other classifiers, reaching an accuracy of 98.15%. Once tweets are correctly classified as COVID-19 tweets, they are further categorized into three sentiments: positive, negative, and neutral. The proposed mSVM achieves 93% accuracy for positive sentiment, which is better than the other machine learning (ML) classifiers.
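To make the feature extraction phase concrete, the following is a minimal sketch of bag-of-words counting and one common unsmoothed TF-IDF weighting, written in plain Python. It is not the authors' implementation: the example tweets, the tokenizer, and the function names `tokenize`, `bag_of_words`, and `tf_idf` are assumptions made for illustration, and the paper's exact preprocessing and weighting scheme are not given in the abstract.

```python
import math
import re
from collections import Counter

# Hypothetical example tweets; these are NOT taken from the paper's dataset.
TWEETS = [
    "Vaccines are working, great news!",
    "Lockdown again, terrible news",
    "New COVID-19 cases reported today",
]


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens (inner hyphens allowed)."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(vectors):
    """Weight raw counts by log(N / df), an assumed unsmoothed TF-IDF variant."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for vec in vectors
    ]


vocab, bow = bag_of_words(TWEETS)
weighted = tf_idf(bow)
```

Here `bow` holds the BoW count vectors and `weighted` the TF-IDF vectors over the same vocabulary. A term such as "news", which appears in two of the three example tweets, receives a lower weight than a term that appears in only one. Either matrix could then be passed to a classifier in the final phase.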
DOI: https://doi.org/10.32629/jai.v7i2.801
Copyright (c) 2023 Chataparti Suvarna Lakshmi, Sameer Saxena, Billakurthi Suresh Kumar
License URL: https://creativecommons.org/licenses/by-nc/4.0/