
Fake news has become more prevalent and spreads more widely as a result of increased insecurity, political events, and pandemics, among other factors. This study used an ensemble machine learning technique to better predict fake news on social media based on the content of news articles. The proposed model used a soft voting classifier to aggregate four machine learning algorithms, including Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression, to classify news articles as fake or real. GridSearchCV was used to tune the hyperparameters of each algorithm during training. The experiment used a Kaggle dataset containing both fake and real news. Standard evaluation metrics were used to compare the base learners with the proposed ensemble on this dataset. The ensemble achieved the highest accuracy, precision, recall, and F1-score (93%, 94%, 92%, and 93%, respectively) compared with the individual learners. The approach may also be applied to other classification tasks such as spam detection, sentiment analysis, and loan eligibility prediction.
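To make the soft voting step concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The function names and the per-model probabilities are hypothetical and chosen only for illustration, and only the three base learners named above are shown. Soft voting averages each class's predicted probability across the base learners and picks the class with the highest average, whereas hard voting counts each learner's top label.

```python
from statistics import mean


def soft_vote(probabilities_per_model):
    """Average each class's probability across models; return (label, averages)."""
    classes = probabilities_per_model[0].keys()
    averaged = {c: mean(p[c] for p in probabilities_per_model) for c in classes}
    label = max(averaged, key=averaged.get)
    return label, averaged


def hard_vote(probabilities_per_model):
    """Each model votes for its most probable class; the majority label wins."""
    votes = [max(p, key=p.get) for p in probabilities_per_model]
    return max(set(votes), key=votes.count)


# Hypothetical probabilities for a single article (illustrative values only).
article_probs = [
    {"fake": 0.90, "real": 0.10},  # e.g. Naive Bayes
    {"fake": 0.40, "real": 0.60},  # e.g. SVM
    {"fake": 0.45, "real": 0.55},  # e.g. Logistic Regression
]

label, averaged = soft_vote(article_probs)
print("soft vote:", label, averaged)
print("hard vote:", hard_vote(article_probs))
```

With these made-up numbers the two schemes disagree: two of the three learners lean slightly towards "real", but one learner is very confident the article is fake, so the averaged probability favours "fake". In scikit-learn, the same aggregation is available through `VotingClassifier` with `voting="soft"`, which requires every base estimator to expose `predict_proba`.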

