Fake News Detection

International Journal of Computer Science (IJCS), published by SK Research Group of Companies (SKRGC).

Abstract

The purpose of this thesis is to help automate the detection of Fake News by identifying which features are most useful for different classifiers. The effectiveness of different extracted features for Fake News detection is examined. When text is classified with machine learning algorithms, features must first be extracted from the articles so that the classifiers can be trained on them. In this thesis, several features are extracted to train the classifiers: word counts, n-gram counts, term frequency-inverse document frequency (TF-IDF), sentiment analysis, lemmatization, and named entity recognition. Two classifiers are used: a Random Forest classifier and a Naïve Bayes classifier. Training on different features combined with different machine learning algorithms yields different accuracies. By testing each feature on each classifier, it can be determined which features are best for Fake News detection. Classifying news articles as either Fake News or not Fake News is explored using three datasets, which together contain over 40,000 articles. One of the datasets is used partly to train the classifiers and partly to test them. The remaining two datasets are used purely for testing.
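As a rough illustration of the pipeline the abstract describes, the following minimal sketch extracts word and n-gram counts, derives TF-IDF weights from them, and trains a small multinomial Naïve Bayes classifier on the counts. It is not the authors' implementation: the function and class names (`ngram_counts`, `tfidf`, `NaiveBayes`) are hypothetical, the four training sentences are invented toy data rather than any of the three datasets, and the Random Forest classifier, sentiment analysis, lemmatization, and named entity recognition are left out to keep the example short.

```python
import math
from collections import Counter


def ngram_counts(text, n_max=2):
    """Count word n-grams of length 1..n_max in lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def tfidf(docs_counts):
    """Convert per-document counts to TF-IDF weights (tf = share of document, idf = log(N/df))."""
    n_docs = len(docs_counts)
    doc_freq = Counter()
    for counts in docs_counts:
        doc_freq.update(counts.keys())
    weighted = []
    for counts in docs_counts:
        total = sum(counts.values())
        weighted.append({term: (c / total) * math.log(n_docs / doc_freq[term])
                         for term, c in counts.items()})
    return weighted


class NaiveBayes:
    """Multinomial Naive Bayes over count features with add-one (Laplace) smoothing."""

    def fit(self, docs_counts, labels):
        self.classes = sorted(set(labels))
        self.vocab = set()
        self.term_counts = {c: Counter() for c in self.classes}
        for counts, label in zip(docs_counts, labels):
            self.term_counts[label].update(counts)
            self.vocab.update(counts)
        label_freq = Counter(labels)
        self.log_prior = {c: math.log(label_freq[c] / len(labels)) for c in self.classes}
        self.class_total = {c: sum(self.term_counts[c].values()) for c in self.classes}
        return self

    def predict(self, counts):
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for c in self.classes:
            score = self.log_prior[c]
            for term, k in counts.items():
                if term in self.vocab:  # terms never seen in training are ignored
                    p = (self.term_counts[c][term] + 1) / (self.class_total[c] + vocab_size)
                    score += k * math.log(p)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Invented toy data, for illustration only.
train_docs = [
    ("shocking secret cure doctors hate", "fake"),
    ("you will not believe this shocking miracle", "fake"),
    ("council approves budget for road repairs", "real"),
    ("central bank holds interest rates steady", "real"),
]
features = [ngram_counts(text) for text, _ in train_docs]
labels = [label for _, label in train_docs]
model = NaiveBayes().fit(features, labels)

print(model.predict(ngram_counts("shocking miracle cure")))
print(model.predict(ngram_counts("council holds budget vote")))
```

Swapping `features` for a different representation (for example unigrams only, via `n_max=1`) and comparing accuracy on held-out articles mirrors, on a very small scale, the feature-by-classifier comparison the thesis carries out.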

Keywords

Random Forest classifier, Naïve Bayes classifier, Datasets.

  • Format: Volume 10, Issue 2, No 1, 2022
  • Copyright: All Rights Reserved ©2022
  • Year of Publication: 2022
  • Author: Mohamed Abdullah, Mr. N. Ganapathiram
  • Reference: IJCS-423
  • Page No: 2838–2842
