An Ensemble Machine Learning Approach for Fake News Detection and Classification Using a Soft Voting Classifier
Fake news has become more prevalent and spreads more widely as a result of increased insecurity, political events, and pandemics, among other factors. This study used an ensemble machine learning technique to better predict fake news on social media based on the content of news articles. The proposed model used a soft voting classifier to aggregate four machine learning algorithms, including Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression, to classify news articles as fake or real. GridSearchCV was used to tune the hyperparameters of the algorithms during training to obtain the best results. The experiment used a Kaggle dataset containing both false and true news. Accuracy, precision, recall, and F1-score were used to measure the performance of the base learners and of the proposed ensemble technique on the dataset. The results show that the proposed ensemble approach achieved the highest accuracy, precision, recall, and F1-score on the dataset (93%, 94%, 92%, and 93%, respectively), outperforming the individual learners. This approach may also be applied to other classification tasks, such as spam detection, sentiment analysis, and prediction of loan eligibility.
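To make the soft voting step concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each base learner has already produced class probabilities for one article; the function names `soft_vote` and `hard_vote`, the label order, and the example probabilities are all hypothetical and chosen only for illustration. Soft voting averages the per-class probabilities and picks the class with the highest mean, whereas hard voting counts each learner's top class.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Hypothetical label order; index 0 is "fake", index 1 is "real".
LABELS = ["fake", "real"]


def soft_vote(
    probabilities: List[List[float]],
    weights: Optional[List[float]] = None,
) -> Tuple[str, List[float]]:
    """Average per-class probabilities across learners and return the top class."""
    if weights is None:
        weights = [1.0] * len(probabilities)  # equal weight for every learner
    total = sum(weights)
    averaged = [
        sum(w * p[i] for w, p in zip(weights, probabilities)) / total
        for i in range(len(LABELS))
    ]
    best = max(range(len(LABELS)), key=lambda i: averaged[i])
    return LABELS[best], averaged


def hard_vote(probabilities: List[List[float]]) -> str:
    """Majority vote over each learner's most probable class."""
    votes = [LABELS[max(range(len(LABELS)), key=lambda i: p[i])] for p in probabilities]
    # On a tie, most_common keeps the class that was voted for first.
    return Counter(votes).most_common(1)[0][0]


# Made-up probabilities [P(fake), P(real)] from three base learners for one article.
example = [
    [0.45, 0.55],  # learner 1 leans "real", weakly
    [0.40, 0.60],  # learner 2 leans "real", weakly
    [0.95, 0.05],  # learner 3 is very confident the article is "fake"
]

soft_label, soft_probs = soft_vote(example)
print("soft vote:", soft_label)
print("hard vote:", hard_vote(example))
```

With these made-up numbers the two schemes disagree: two of three learners lean "real", so hard voting returns "real", but the mean probability of "fake" is higher because one learner is very confident, so soft voting returns "fake". In scikit-learn, the same idea is available through `VotingClassifier` with `voting="soft"`, which requires every base estimator to expose `predict_proba` (for `SVC` this means setting `probability=True`).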