Mitigating Risk in Financial Industry by Analyzing Social-Media with Machine Learning Technology


Twitter hosts a large volume of data that financial institutions can use to manage different types of risk. This paper shows how machine learning algorithms can be applied to large unstructured datasets to train a model that categorizes tweets by risk type and uses sentiment analysis to characterize the risk each tweet expresses. The model reads each tweet, assigns it a risk category using a specified dictionary, and adds sentiment analysis to show the risk type seen in the tweet. Logistic regression was used to build the prediction model. Twitter data from 2019 was used to train and test the supervised machine learning algorithm; once the model was predicting tweets efficiently, it was applied to Twitter data from 2022 in our experimental research. Our experiment confirmed that, with the right modeling and machine learning techniques, Twitter data can be used to manage risk.
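
The pipeline described in the abstract (dictionary-based risk categorization, sentiment analysis, and a logistic regression classifier) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the risk dictionary, the sentiment word lists, the toy tweets, their labels, and all function names below are hypothetical, chosen only to show how the three steps might fit together.

```python
import math
import re

# Hypothetical risk dictionary and sentiment lexicons (illustrative only;
# the paper's actual dictionary is not reproduced here).
RISK_TERMS = {
    "credit": {"default", "loan", "bankruptcy"},
    "operational": {"outage", "fraud", "breach"},
    "market": {"crash", "volatility", "selloff"},
}
NEGATIVE = {"warns", "fear", "loss", "fraud", "breach", "crash", "default"}
POSITIVE = {"great", "happy", "nice", "strong"}
ALL_RISK_TERMS = set().union(*RISK_TERMS.values())


def tokenize(text):
    """Lowercase the tweet and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def categorize(text):
    """Return the risk category with the most dictionary hits, or 'none'."""
    tokens = tokenize(text)
    counts = {cat: sum(t in terms for t in tokens)
              for cat, terms in RISK_TERMS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "none"


def sentiment(text):
    """Crude lexicon sentiment: positive hits minus negative hits."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def features(text):
    """Feature vector: [bias, risk-term count, negative-word count]."""
    tokens = tokenize(text)
    return [1.0,
            float(sum(t in ALL_RISK_TERMS for t in tokens)),
            float(sum(t in NEGATIVE for t in tokens))]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, x):
    """Probability that a tweet is risk-relevant under the fitted weights."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        errors = [predict_proba(weights, x) - label for x, label in zip(X, y)]
        for j in range(len(weights)):
            grad = sum(e * x[j] for e, x in zip(errors, X)) / n
            weights[j] -= lr * grad
    return weights


# Toy labelled tweets standing in for the older training data (1 = risk-relevant).
train_tweets = [
    ("Regulator warns of loan default wave", 1),
    ("Data breach exposes fraud at lender", 1),
    ("Market crash fear grows", 1),
    ("Great coffee at the branch today", 0),
    ("New mobile app looks nice", 0),
    ("Happy with my savings account", 0),
]
X = [features(text) for text, _ in train_tweets]
y = [label for _, label in train_tweets]
weights = train_logreg(X, y)

# Apply the fitted model to an unseen tweet, as one would to newer data.
new_tweet = "Fraud and breach reported at bank"
print(categorize(new_tweet), sentiment(new_tweet),
      round(predict_proba(weights, features(new_tweet)), 3))
```

In practice a library classifier and a validated sentiment method would replace the hand-rolled pieces; the sketch only makes the flow concrete: fit on labelled older tweets, then score and categorize newer ones.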

How to Cite

[1]
Haile, I.M. and Qu, Y. 2022. Mitigating Risk in Financial Industry by Analyzing Social-Media with Machine Learning Technology. European Journal of Electrical Engineering and Computer Science. 6, 2 (Mar. 2022), 33–37. DOI:https://doi.org/10.24018/ejece.2022.6.2.428.
