Hate Detection in COVID-19 Tweets in the Arab Region Using Deep Learning and Topic Modeling

Wednesday 28 October 2020, 1:30pm to 2:30pm

Speaker(s): Hend Al-Khalifa

The massive scale of social media platforms requires an automatic solution for detecting hate speech. Such solutions help reduce the manual analysis of content. Most of the past literature has cast hate speech detection as a supervised text classification task, using either classical machine learning methods or, more recently, deep learning methods. However, previous work investigating this problem in Arabic cyberspace is still limited compared with the published work in English.

This study aims to identify hate speech posted by Twitter users in the Arab region related to the COVID-19 pandemic and to discover the main issues discussed among them. We used the ArCOV-19 dataset, an ongoing collection of Arabic tweets related to the novel coronavirus (COVID-19), starting from January 27, 2020. Tweets were analyzed for hate speech using a pretrained convolutional neural network (CNN) model, which assigned each tweet a score between 0 and 1, with 1 being the most hateful. We also used Non-negative Matrix Factorization (NMF) to discover the main issues and topics in hate tweets.
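As a rough illustration of how NMF can surface topics, the sketch below factors a tiny document-term count matrix V into a document-topic matrix W and a topic-term matrix H using the standard multiplicative update rules. It is not the study's implementation: the vocabulary, the counts, the number of topics, and all function names are hypothetical placeholders, and a real pipeline would typically use TF-IDF weights over Arabic text and a library implementation.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - WH, the quantity the updates try to reduce."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh)) ** 0.5


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor non-negative V (docs x terms) into W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so no entry is stuck at zero initially.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(Wt, matmul(W, H))
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(matmul(W, H), Ht)
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return W, H


def top_terms(H, vocab, n=3):
    """For each topic (row of H), list the n highest-weighted vocabulary terms."""
    return [
        [vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
        for row in H
    ]


# Hypothetical toy data: 4 documents over 6 placeholder terms (not study data).
VOCAB = ["term_a1", "term_a2", "term_a3", "term_b1", "term_b2", "term_b3"]
V = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 1, 3, 2],
    [0, 0, 0, 2, 2, 3],
]

if __name__ == "__main__":
    W, H = nmf(V, k=2)
    for topic_id, terms in enumerate(top_terms(H, VOCAB)):
        print(topic_id, terms)
```

Each row of H can be read as one topic, described by its highest-weighted terms, while each row of W indicates how strongly a document is associated with each topic.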

Analysis of hate speech in Twitter data from the Arab region found that non-hate tweets far outnumbered hate tweets: hate tweets made up 3.2% of COVID-19-related tweets. It also revealed that the majority of hate tweets (71.4%) fall in the low level of hate. The study identified Saudi Arabia as the Arab country with the most COVID-19 hate tweets during the pandemic. Furthermore, it showed that the second time period (Mar 1 - Mar 30) had the largest number of hate tweets, accounting for 51.9% of all hate tweets. Contrary to expectations, the spread of COVID-19 hate speech on Twitter in the Arab region was found to be only weakly correlated with the spread of the pandemic, based on the Pearson correlation coefficient (r = 0.1982). The study also identified the topics discussed in hate tweets during the pandemic. Of the 7 extracted topics, 6 were related to hate against China and Iran. Arab users also discussed topics related to political conflicts in the Arab region during the COVID-19 pandemic.
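For readers unfamiliar with the Pearson coefficient, the sketch below shows how such a value could be computed from two equal-length series. The abstract does not say which series were compared, so pairing per-period hate tweet counts with reported case counts is an assumption here, and the numbers are invented placeholders rather than study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (the common 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


# Hypothetical placeholder series (NOT data from the study).
hate_tweet_counts = [12, 30, 25, 18, 22, 15]
reported_case_counts = [5, 9, 40, 120, 300, 650]

if __name__ == "__main__":
    r = pearson_r(hate_tweet_counts, reported_case_counts)
    print(f"r = {r:.4f}")
```

Values of r near 0 indicate little linear relationship between the two series, while values near 1 or -1 indicate a strong positive or negative one.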

Biography

Hend S. Al-Khalifa is a professor in the Information Technology Department, CCIS, King Saud University, Riyadh, Saudi Arabia, and head of the iWAN research group. She received her PhD in Computer Science from Southampton University, UK.

Professor Hend has contributed more than 150 research papers to symposiums, workshops and conferences and has published many journal articles. She won the Outstanding Paper Award 2012 from Emerald Publishing and has received several research grants, such as Google CS4HS, GDRG and NPST. She has also translated into Arabic a seminal book, "Introduction to Natural Language Processing" by Professor Nizar Habash. The translation recently won the King Abdullah International Prize for Translation.

Currently, Professor Hend serves as an associate editor at the Journal of King Saud University - Computer and Information Sciences, Elsevier. She has also served as a program committee member for many national and international conferences and as a reviewer for several conferences and journals, such as SOMET, IEEE ICALT, EMNLP, ACM SIGITE, IEEE TALE, International Journal of Electronic Governance (IJEG), Language Resources and Evaluation Journal, and ACM Transactions on Asian Language Information Processing. She has also chaired and organized several international workshops and sessions, including the International Workshop on Free/Open-Source Arabic Corpora and Corpora Processing Tools at the 9th, 10th, 11th and 12th editions of the Language Resources and Evaluation Conference (LREC).

Professor Hend's areas of interest include Arabic natural language processing, web technologies (e.g. Semantic Web/Web 2.0), Technology Enhanced Learning, HCI and computers for people with special needs.

Join on Google Meet

Location: Online