Twitter Data Classification by Applying and Comparing Multiple Machine Learning Techniques

Authors

  • Ananya Sarker, Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
  • Shahid Uz Zaman, Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
  • Azmain Yakin Srizon, Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh

Keywords:

Classification, Machine Learning, Social Media, Twitter Data

Abstract

With an average of five hundred million tweets sent per day, Twitter has become one of the largest platforms for data analysis by researchers. Various studies have previously been conducted on Twitter data, for example on sentiment analysis. However, little research has been done on classifying tweets into categories so that they can be distributed according to user preferences. In this research, we started by creating four broad categories: politics, sports, crime and natural. We then applied different machine learning techniques (Random Forest, K-Nearest Neighbors, Naïve Bayes, Logistic Regression, Decision Tree and Support Vector Machine) to classify the Twitter data. Finally, we compared the results in terms of sensitivity, specificity, precision, false positive rate and accuracy. We found that Support Vector Machine (SVM) produced the best results on all five measures. Hence, we concluded that a machine learning approach, specifically SVM, can be used to classify Twitter data. The constructed dataset, all programs, figures and snippets can be found at https://github.com/ananyasarkertonu/Twitter-Dataset
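The five comparison measures named in the abstract can all be derived from a confusion matrix. The sketch below is illustrative only and is not taken from the authors' repository: it assumes the measures are computed per category in a one-vs-rest fashion (the abstract does not say how the four-class results were aggregated), and the label lists `y_true` and `y_pred` are invented toy data, not results from the paper.

```python
from collections import Counter

# The four categories named in the abstract.
CATEGORIES = ["politics", "sports", "crime", "natural"]


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def one_vs_rest_metrics(y_true, y_pred, labels):
    """Compute per-class measures by treating each label as 'positive'
    and all other labels as 'negative' (an assumed aggregation scheme)."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    total = len(y_true)
    results = {}
    for label in labels:
        tp = pairs[(label, label)]
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
        fp = sum(n for (t, p), n in pairs.items() if t != label and p == label)
        tn = total - tp - fn - fp
        results[label] = {
            "sensitivity": _ratio(tp, tp + fn),          # true positive rate
            "specificity": _ratio(tn, tn + fp),          # true negative rate
            "precision": _ratio(tp, tp + fp),
            "false_positive_rate": _ratio(fp, fp + tn),  # 1 - specificity
            "accuracy": _ratio(tp + tn, total),
        }
    return results


# Hypothetical toy labels, for illustration only.
y_true = ["politics", "politics", "sports", "sports",
          "crime", "crime", "natural", "natural"]
y_pred = ["politics", "sports", "sports", "sports",
          "crime", "politics", "natural", "natural"]

metrics = one_vs_rest_metrics(y_true, y_pred, CATEGORIES)
for category, values in metrics.items():
    print(category, {name: round(value, 3) for name, value in values.items()})
```

Running the same function on the predictions of each classifier would give a like-for-like table of the five measures. Note that false positive rate is always one minus specificity, so those two columns carry the same information.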

Published

2019-11-01

How to Cite

Twitter Data Classification by Applying and Comparing Multiple Machine Learning Techniques. (2019). International Journal of Innovative Research in Computer Science & Technology, 7(6), 147–152. Retrieved from https://acspublisher.com/journals/index.php/ijircst/article/view/13212