Unveiling Criminal Activity: a Social Media Mining Approach to Crime Prediction

Sheeba Armoogum; Deshinta Arrova Dewi; Vinaye Armoogum; Nicolas Melanie; Tri Basuki Kurniawan

doi:10.47738/jads.v5i3.350

Abstract

Social media platforms have become breeding grounds for abusive comments, which makes machine learning a practical necessity for detecting harmful content. This study predicts abusive comments in a Mauritian context, focusing on comments written in Mauritian Kreol, a language with few natural language processing tools. Four machine learning models were built and evaluated for classifying comments as abusive or non-abusive: Decision Tree, Random Forest, Naïve Bayes, and Support Vector Machine (SVM). The models were trained and tested using k-fold cross-validation. The Decision Tree model outperformed the others with 100% precision and recall, followed by Random Forest with 99% accuracy. Naïve Bayes and SVM also achieved 100% precision but had much lower recall, 35% and 16% respectively, owing to imbalanced data in the training set. Pre-processing steps, including stop-word removal and a custom Kreol spell checker, were key to improving model performance. The study makes a novel contribution by applying machine learning in a Mauritian context, demonstrating the potential of AI to detect abusive language in underrepresented languages. Despite limitations such as the absence of a Kreol lemmatization tool and incomplete coverage of Kreol spelling variations, the models show promise for wider application in social media crime detection. Future research could extend this approach to other languages and other categories of social media crime.
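The gap between 100% precision and low recall reported for Naïve Bayes and SVM can be illustrated with a small worked calculation. The sketch below is not the study's code or data: the class sizes (900 non-abusive, 100 abusive), the predictions, and the helper name `precision_recall` are all hypothetical, chosen only to show how a classifier that rarely flags the minority class on imbalanced data can be perfectly precise yet miss most abusive comments.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical, illustrative labels (NOT the study's data):
# 900 non-abusive comments (0) and 100 abusive comments (1).
y_true = [0] * 900 + [1] * 100

# A cautious classifier that flags only 16 of the 100 abusive comments
# and never flags a non-abusive one.
y_pred = [0] * 900 + [1] * 16 + [0] * 84

precision, recall = precision_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.3f}")
```

With these assumed counts, every flagged comment is truly abusive, so precision is perfect, while 84 of the 100 abusive comments go undetected. Accuracy still looks high because the majority class dominates, which is why recall is the more informative figure on imbalanced data.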

Keywords

Abusive Comment Detection; Machine Learning in Social Media; Mauritian Kreol Natural Language Processing; Decision Tree Classification; Cybersecurity in Social Media; Process Innovation

DOI: https://doi.org/10.47738/jads.v5i3.350

Journal of Applied Data Sciences

ISSN	:	2723-6471 (Online)
Collaborated with	:	Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia.
Publisher	:	Bright Publisher
Website	:	http://bright-journal.org/JADS
Email	:	taqwa@amikompurwokerto.ac.id (principal contact)
		support@bright-journal.org (technical issues)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0
