DEA-RNN: A Hybrid Deep Learning Approach for Cyberbullying Detection in Twitter Social Media Platform Academic Project

Project

DEA-RNN: A Hybrid Deep Learning Approach for Cyberbullying Detection in Twitter Social Media Platform

Posted by Admin: System Admin

Beginner

Abstract

Cyberbullying (CB) has become increasingly prevalent on social media platforms. With the popularity and widespread use of social media by individuals of all ages, it is vital to make these platforms safer from cyberbullying. This paper presents a hybrid deep learning model, called DEA-RNN, to detect CB on the Twitter social media network. The proposed DEA-RNN model combines Elman-type Recurrent Neural Networks (RNN) with an optimized Dolphin Echolocation Algorithm (DEA) for fine-tuning the Elman RNN's parameters and reducing training time. We evaluated DEA-RNN thoroughly on a dataset of 10,000 tweets and compared its performance with that of state-of-the-art algorithms such as Bi-directional Long Short-Term Memory (Bi-LSTM), RNN, SVM, Multinomial Naive Bayes (MNB), and Random Forests (RF). The experimental results show that DEA-RNN was superior in all scenarios and outperformed the considered existing approaches in detecting CB on the Twitter platform. DEA-RNN was most efficient in scenario 3, where it achieved an average of 90.45% accuracy, 89.52% precision, 88.98% recall, 89.25% F1-score, and 90.94% specificity...
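The five evaluation metrics named in the abstract all follow from the counts of a binary confusion matrix. The sketch below is only an illustration of those standard definitions, not the authors' evaluation code; the function names and the toy labels are hypothetical, with 1 standing for a cyberbullying tweet and 0 for a normal one.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = CB, 0 = normal)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, F1-score, and specificity from the counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Toy labels, invented for illustration only (not taken from the 10,000-tweet dataset).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = binary_metrics(**confusion_counts(y_true, y_pred))
```

Specificity is reported alongside recall because it measures how well normal tweets are left unflagged, which accuracy alone can hide when the classes are imbalanced.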

Existing System & Flaws

Purnamasari et al. [26] utilized SVM and an Information Gain (IG) based feature selection method for detecting cyberbullying events in tweets. Muneer and Fati [11] used various classifiers, namely AdaBoost (ADB), Light Gradient Boosting Machine (LGBM), SVM, RF, Stochastic Gradient Descent (SGD), Logistic Regression (LR), and MNB, to identify cyberbullying events in tweets. This study extracted features using the Word2Vec and TF-IDF methods. Dalvi et al. [12] [27] used SVM and Random Forests (RF) models with TF-IDF feature extraction for detecting cyberbullying in tweets. Although SVM achieved high performance in these models, the model complexity increases as the number of class labels grows.

Al-garadi et al. [28] investigated cyberbullying identification using different ML classifiers such as RF, Naïve Bayes (NB), and SVM, based on various features extracted from Twitter (tweet content, activity, network, and user). Huang et al. [29] suggested an approach for identifying CB on social media that integrated social media features with textual content features. The features are ranked using the IG method, and well-known classifiers such as NB, J48, Bagging, and Dagging are utilized. The findings implied that social characteristics could help increase the accuracy of cyberbullying detection. Squicciarini et al. [30] utilized a decision tree (C4.5) classifier with social network, personal, and textual features to identify and predict cyberbullying on social networks such as spring.me and MySpace. Balakrishnan et al. [31] utilized different ML algorithms such as RF, NB, and J48 to detect cyberbullying events in tweets and to classify tweets into different cyberbullying classes such as aggressor, spammer, bully, and normal. The study concluded that the emotional feature does not impact the detection rate. Despite its efficiency, this model is limited to a small dataset with few class labels.

Alam et al. [32] proposed an ensemble-based classification approach using single and double ensemble-based voting models. These voting models utilized decision tree, LR, and Bagging ensemble classifiers for classification, with mutual information bigrams and unigram TF-IDF as feature extraction models. In an analysis over the Twitter dataset, the Bagging ensemble model provided the best precision, though other parameters were also considered. Although these ensemble models reduced the training and execution time for classification, their major limitation appears with sarcastic tweets and acronyms that have multiple meanings. Chia et al. [8] also utilized different ML and feature engineering-based approaches to classify irony and sarcasm in cyberbullying tweets. Many classifiers and feature selection methods were tested; while this approach can detect sarcastic and ironic terms in cyberbullying tweets, the detection rate is still very low [33].

Disadvantages

- The existing systems do not detect cyberbullying effectively because they lack effective ML classifiers.
- The existing systems do not use the DEA-RNN technique, which leads to lower prediction performance.
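Several of the works surveyed above rely on TF-IDF features. As a rough illustration of what that feature extraction computes, the sketch below uses the textbook weighting (term frequency times log of inverse document frequency) in plain Python. It is an assumption-laden toy: the whitespace tokenizer, the function names, and the example sentences are hypothetical, and library implementations typically apply smoothing and normalization that this version omits.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Invented example sentences, for illustration only.
docs = [
    "you are a loser",
    "you are a star",
    "have a nice day",
]
vectors = tf_idf(docs)
```

In this toy corpus the word "a" occurs in every document, so its weight is zero, while "loser" occurs in only one and receives the largest weight in the first vector. That is the property a downstream classifier such as SVM or RF exploits.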

Proposed System & Advantages

In this article, we propose a hybrid deep learning-based approach, called DEA-RNN, which automatically detects bullying in tweets. The DEA-RNN approach combines Elman-type Recurrent Neural Networks (RNN) with an improved Dolphin Echolocation Algorithm (DEA) for fine-tuning the Elman RNN's parameters. DEA-RNN can handle the dynamic nature of short texts and can cope with topic models for the effective extraction of trending topics. DEA-RNN outperformed the considered existing approaches in detecting cyberbullying on the Twitter platform in all scenarios and across various evaluation metrics. The contributions of this article can be summarized as follows:

- Develop an improved optimization model of DEA to automatically tune the RNN parameters and enhance performance;
- Propose DEA-RNN by combining the Elman-type RNN and the improved DEA for optimal classification of tweets;
- Collect a new Twitter dataset based on cyberbullying keywords for evaluating the performance of DEA-RNN and the existing methods; and
- Assess the efficiency of DEA-RNN in recognizing and classifying cyberbullying tweets using Twitter datasets.

The thorough experimental results reveal that DEA-RNN outperforms other competing models in terms of recall, precision, accuracy, F1-score, and specificity.

Advantages

- The proposed system effectively identifies the trending topics in tweets and extracts them for further processing. An effective model helps leverage bidirectional processing to extract meaningful topics.
- An effective system that is mainly trained and tested with SVM, Multinomial Naive Bayes (MNB), and Random Forests (RF) classifiers.
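To make the pairing of an Elman-type RNN with a parameter optimizer more concrete, the sketch below shows the standard Elman recurrence, h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h), followed by a sigmoid output, and a tuning loop around it. The tuning loop is plain random search, used here only as a stand-in: it is not the Dolphin Echolocation Algorithm, whose update rules are not described on this page. All names, sizes, and the toy data are hypothetical.

```python
import math
import random


def elman_forward(inputs, w_xh, w_hh, b_h, w_hy, b_y):
    """Run an Elman RNN over a sequence and return a probability in (0, 1)."""
    hidden = [0.0] * len(b_h)  # context units start at zero
    for x in inputs:
        # The new hidden state depends on the current input and the previous state.
        hidden = [
            math.tanh(
                sum(w_xh[j][i] * x[i] for i in range(len(x)))
                + sum(w_hh[j][k] * hidden[k] for k in range(len(hidden)))
                + b_h[j]
            )
            for j in range(len(b_h))
        ]
    logit = sum(w * h for w, h in zip(w_hy, hidden)) + b_y
    return 1.0 / (1.0 + math.exp(-logit))


def init_params(rng, n_in, n_hidden, scale=0.5):
    """Sample one candidate parameter set uniformly from [-scale, scale]."""
    def u():
        return rng.uniform(-scale, scale)
    return {
        "w_xh": [[u() for _ in range(n_in)] for _ in range(n_hidden)],
        "w_hh": [[u() for _ in range(n_hidden)] for _ in range(n_hidden)],
        "b_h": [0.0] * n_hidden,
        "w_hy": [u() for _ in range(n_hidden)],
        "b_y": 0.0,
    }


def mean_squared_error(params, dataset):
    """Average squared gap between predicted probability and the 0/1 label."""
    errors = [(elman_forward(seq, **params) - label) ** 2 for seq, label in dataset]
    return sum(errors) / len(errors)


def random_search(dataset, n_in, n_hidden, n_candidates=50, seed=0):
    """Stand-in optimizer (NOT DEA): keep the best of n randomly sampled candidates."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_candidates):
        params = init_params(rng, n_in, n_hidden)
        loss = mean_squared_error(params, dataset)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Toy sequences of 2-dimensional "token vectors" with 0/1 labels (invented data).
toy_data = [
    ([[1.0, 0.0], [0.0, 1.0]], 1),
    ([[0.0, 1.0], [0.0, 1.0]], 0),
    ([[1.0, 0.0], [1.0, 0.0]], 1),
    ([[0.0, 1.0], [1.0, 0.0]], 0),
]
best_params, best_loss = random_search(toy_data, n_in=2, n_hidden=3)
probability = elman_forward(toy_data[0][0], **best_params)
```

A metaheuristic such as DEA would replace `random_search` with a guided exploration of the same search space; the interface (candidate parameters in, a fitness score out) would stay roughly the same under this sketch's assumptions.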

Software Requirements

- Operating System: Windows 7 Ultimate
- Coding Language: Python
- Front-End: Python
- Back-End: Django-ORM
- Designing: HTML, CSS, JavaScript
- Database: MySQL (WAMP Server)
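For the Django-ORM and MySQL pairing listed above, the database connection is normally declared in the project's settings module. The fragment below is a hedged sketch of such a setting: the engine path is Django's built-in MySQL backend, while the database name, user, password, host, and port are placeholders that must be replaced with the values of the local WAMP installation. Django's MySQL backend also needs a MySQL driver package such as mysqlclient installed.

```python
# settings.py fragment (sketch): every value except ENGINE is a placeholder.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # Django's built-in MySQL backend
        "NAME": "cyberbullying_db",            # hypothetical database name
        "USER": "root",                        # placeholder; use your MySQL user
        "PASSWORD": "",                        # placeholder; use your MySQL password
        "HOST": "127.0.0.1",                   # local MySQL server
        "PORT": "3306",                        # MySQL's default port
    }
}
```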

Hardware Requirements

- Processor: Pentium IV
- RAM: 4 GB (min)
- Hard Disk: 20 GB
- Keyboard: Standard Windows Keyboard
- Mouse: Two or Three Button Mouse
- Monitor: SVGA
