Machine Learning based Detection of Twitter Spam: A Cyber Security Seminar
Robustifying Machine Learning (ML) based Detection Systems of Twitter Spam
Investigating the vulnerability of Machine Learning (ML) to adversarial examples has attracted researchers' attention in recent years, particularly where these automated systems are used in security applications. Although detecting malicious messages (e.g., spam or offensive tweets) on Twitter has been widely investigated in the literature, the robustness of these models to adversarial examples is often overlooked. Such adversarial examples can change the data distribution (i.e., adversarial concept drift) and render ML-based detectors useless.
The main goal of this work is to robustify ML-based detection systems for Twitter spam in three steps: identifying possible adversarial attacks, improving robustness to adversarial examples, and handling adversarial drift. A case study of ongoing spam campaigns that spread untrustworthy healthcare advertisements in Twitter trending hashtags was used to identify possible adversarial attacks on Twitter. The analysis of these spam campaigns helped us identify three adversarial attacks and develop three adversary-aware spam detectors. The key novelty of the first detector is that it builds on the observation that the targeted campaigns use unique hijacked accounts (i.e., inactive accounts) as adversarial examples to fool deployed spam detectors. We designed a new feature that is faster to compute than features used in the literature and that improves the accuracy with which hijacked accounts can be identified to 73%. Additionally, we improve the robustness of spam image detectors to the identified images with embedded spam content. The proposed OCR-based detector outperforms two state-of-the-art OCR systems in recognizing Arabic and English text embedded in images uploaded to Twitter. Its key novelty is that it uses a blacklist/whitelist with a human-in-the-loop approach to ensure robustness and adaptability. We further propose an OCR post-correction algorithm, which improves the robustness of OCR-based detectors by at least 10% against the identified adversarial text images.
Meeting ID: 960 3167 2642, Passcode: 048900
Niddal Imam, University of York
Niddal Imam is a PhD candidate at the University of York. His research interests include security problems in Machine Learning (ML) based security applications, Twitter spam detection, and network security.