ICWSM 2014 Tutorial: Social Media Threats and Countermeasures
Tutorial slides are here.
Presenters
Kyumin Lee, Assistant Professor, Department of Computer Science, Utah State University, kyumin.lee [at] usu.edu
James Caverlee, Associate Professor, Department of Computer Science and Engineering, Texas A&M University, caverlee [at] cse.tamu.edu
Calton Pu, Professor, School of Computer Science, Georgia Institute of Technology, calton [at] cc.gatech.edu
Time and Location
Sunday, June 1, 9:00am - 12:00pm
Room 2255, North Quad Complex, University of Michigan, 105 South State Street, Ann Arbor
Tutorial Summary
The past few years have seen the rapid rise of many successful social systems – from Web-based social networks (e.g., Facebook, LinkedIn) to online social media sites (e.g., Twitter, YouTube) to large-scale information sharing communities (e.g., reddit, Yahoo! Answers) to crowd-based funding services (e.g., Kickstarter, IndieGoGo) to Web-scale crowdsourcing systems (e.g., Amazon MTurk, Crowdflower).
However, with this success has come a commensurate wave of new threats, including bot-controlled accounts in social media systems that disseminate malware and commercial spam messages, adversarial propaganda campaigns designed to sway public opinion, collective attention spam targeting popular topics and memes, and the propagation of manipulated content.
This tutorial will introduce peer-reviewed research on social media threats and countermeasures. Specifically, we will address new threats such as social spam, campaigns, misinformation and crowdturfing, and overview countermeasures that mitigate and resolve these threats by revealing and detecting malicious participants (e.g., social spammers, content polluters and crowdturfers) and low-quality content. This tutorial will also overview available tools for detecting these participants.
Outline
1. Introduction to Social Media Threats
o Overview of this tutorial
o What kinds of social media threats exist in social systems? Introduce social media threats such as social spam, campaigns, misinformation and crowdturfing, and show examples.
o How are these social media threats different from traditional threats? Examples are:
o Openness. Anyone can create a social account, and it is easy to contact other users.
o URL blacklists are too slow at identifying new threats, allowing more than 90% of visitors to view a page before it becomes blacklisted [1].
o URL shortening services are used for obfuscation (see the sketch after this list).
o Bots can be controlled automatically through APIs.
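To make the blacklist and URL-shortening points concrete, the minimal sketch below (our illustration, not part of the tutorial materials) expands a shortened link by following its redirects and only then checks the landing domain against a local blacklist. The blacklist entries and helper names are hypothetical, and the use of the requests library is an assumption; a real system would typically query a service such as Google Safe Browsing instead of a hand-maintained set.

```python
# Hedged sketch: expand a shortened URL, then check the landing domain
# against a (hypothetical) local blacklist. Requires the requests library.
from urllib.parse import urlparse

import requests

BLACKLISTED_DOMAINS = {"malware.example", "phish.example"}  # hypothetical entries


def expand_url(short_url, timeout=5.0):
    """Follow HTTP redirects (e.g., bit.ly -> landing page) and return the final URL."""
    resp = requests.head(short_url, allow_redirects=True, timeout=timeout)
    return resp.url


def is_blacklisted(url):
    """Return True if the final landing domain appears in the local blacklist."""
    domain = urlparse(expand_url(url)).netloc.lower()
    return domain in BLACKLISTED_DOMAINS


if __name__ == "__main__":
    print(is_blacklisted("https://bit.ly/example-shortened-link"))
```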
2. State-of-the-Art in Research on the Threats and Defenses
2.1. Social Spam
In this session, we will overview various social spam detection approaches:
o How to detect suspicious URLs [2].
o Social capitalists have helped spammers become well-established users and accumulate social signals even as they spread spam over social networks. We will explain why this happens and how to penalize not only the spammers but also these social capitalists [3].
o Supervised classification is the most popular approach for detecting social spammers and spam messages; examples include YouTube video spam [4], Twitter spam [5], Foursquare spam tips [6] and collective attention spam [10] (a minimal classifier sketch follows this list).
o Social honeypots have been proposed to monitor spammers' behavior and harvest their information [7].
o Using crowd wisdom to identify social spammers [8].
o Unsupervised social spam detection approaches [9].
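As a deliberately simplified illustration of the supervised approach above, the sketch below trains a text classifier on a handful of hand-labeled messages. The toy data and the choice of TF-IDF features with logistic regression are our assumptions for illustration; the cited papers [4][5][6][10] use their own feature sets and learners.

```python
# Hedged sketch: supervised spam classification over labeled messages.
# Requires scikit-learn.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled messages: 1 = spam, 0 = legitimate.
messages = [
    "Win a FREE iPhone now!!! click http://spam.example",
    "Lose weight fast, guaranteed results http://spam.example/diet",
    "Great meeting you at ICWSM, slides are on my homepage",
    "Lab meeting moved to 3pm tomorrow, see the updated agenda",
]
labels = [1, 1, 0, 0]

# Word unigram/bigram TF-IDF features are a common baseline for short, noisy social text.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LogisticRegression(max_iter=1000),
)
model.fit(messages, labels)

print(model.predict(["FREE followers, click http://spam.example/followers"]))
```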
2.2 Campaigns
We will introduce how these malicious participants form groups and run campaigns to target social systems more effectively, and overview campaign detection approaches (a small graph-clustering sketch follows this list):
o Graph-based social spam campaign detection [11].
o Content-driven campaign detection [12][13].
o Detect and track political campaigns in social media by using a classification approach [14].
o Frequent itemset mining method with behavioral models to detect fake reviewer groups [15].
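A minimal sketch of the graph-based intuition behind campaign detection: link messages whose content is nearly identical (or that share the same landing URL), then treat connected components above a size threshold as candidate campaigns. The toy messages, the SequenceMatcher similarity test and the 0.6 threshold are assumptions for illustration; [11][12][13] rely on more scalable similarity and graph techniques.

```python
# Hedged sketch: cluster near-duplicate messages into candidate campaigns.
# Requires the networkx library.
from difflib import SequenceMatcher
from itertools import combinations

import networkx as nx

messages = {  # hypothetical message IDs and texts
    1: "Buy cheap meds at http://pill.example",
    2: "Buy cheap meds now at http://pill.example",
    3: "Cheap meds, trusted pharmacy http://pill.example",
    4: "Congrats to the ICWSM organizers on a great program",
}


def similar(a, b, threshold=0.6):
    """Toy near-duplicate test; real systems use shingling/MinHash at scale."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


G = nx.Graph()
G.add_nodes_from(messages)
for i, j in combinations(messages, 2):
    if similar(messages[i], messages[j]):
        G.add_edge(i, j)

# Connected components with more than one message are candidate campaigns.
campaigns = [c for c in nx.connected_components(G) if len(c) > 1]
print(campaigns)  # the three pharmacy messages should cluster together
```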
2.3. Misinformation
Can we trust information generated on social systems? This session will introduce what kinds of misinformation exist on social systems and survey approaches to detecting it:
o Measure information credibility on social media using classification approaches with crowdsourced labels [16] (a toy classifier sketch follows this list).
o Automatic rumor detection approach on Sina Weibo, China's leading micro-blogging service provider [17].
o Identify fake images on Twitter during Hurricane Sandy [18].
o Methods for assessing information credibility in emergency situations, combining an unsupervised approach and a supervised approach to detect message credibility [19].
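As a toy illustration of the classification idea in [16], the sketch below trains a decision tree over a few simple user and message features. The specific features and values are made up for the example; [16] uses a much richer set of message, user, topic and propagation features, with labels collected from crowd workers.

```python
# Hedged sketch: credibility classification from simple hand-crafted features.
# Requires scikit-learn.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature vectors:
# [has_url, num_question_marks, author_follower_count, author_account_age_days]
X = [
    [1, 0, 12000, 1500],   # established account sharing a link
    [1, 0,  8000,  900],
    [0, 3,    15,    4],   # new account, no source, many question marks
    [0, 2,    40,   10],
]
y = [1, 1, 0, 0]  # 1 = credible, 0 = not credible

clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X, y)

print(clf.predict([[0, 4, 25, 7]]))  # likely predicted not credible
```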
2.4. Crowdturfing
Recently, malicious participants have started to take advantage of the power of the crowd to spread manipulated information over social systems. This session will overview real examples of weaponized crowdsourcing, along with techniques to identify manipulated content and the crowd workers who spread it on behalf of requesters:
o Introduce real examples reported by the news media.
o Understand what kinds of crowdturfing tasks are available on crowdsourcing sites [20][21].
o Understand the size of the crowdturfing market on both Eastern and Western crowdsourcing sites [21][22] (a back-of-the-envelope sketch follows this list).
o Track and reveal crowdsourced manipulation of social media, focusing in particular on Western crowdsourcing sites, and overview how to detect crowdturfers on social media [21].
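To illustrate the market-size point, the following back-of-the-envelope sketch aggregates payments and assignment counts over hypothetical crawled task listings. The task descriptions and numbers are invented for the example; the actual estimates in [21][22] come from longitudinal crawls of crowdsourcing sites.

```python
# Hedged sketch: rough market-size estimate from (hypothetical) task listings.
tasks = [
    # (description, payment_per_assignment_usd, number_of_assignments)
    ("Follow @brandX and retweet the pinned post", 0.10, 500),
    ("Write a 5-star review for product Y",        0.50, 200),
    ("Upvote this link and leave a comment",       0.25, 300),
]

total_assignments = sum(n for _, _, n in tasks)
total_spend = sum(pay * n for _, pay, n in tasks)

print(f"{len(tasks)} campaigns, {total_assignments} assignments, "
      f"estimated spend ${total_spend:.2f}")
```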
3. Challenges, Opportunities and Tools
o Review of open research challenges: the need for large, accurate, up-to-date data sets, and the integration of multiple techniques and research areas.
o Data management challenges: protecting users' privacy and addressing ethical concerns around public data sets.
o Introduce useful tools for conducting research in the area (a small example follows this list):
o Big data analysis (e.g., MapReduce, Pig, Hive)
o Machine learning (e.g., Weka, Mallet)
o Visualization (e.g., Matplotlib, Graphviz)
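As a small example of the visualization tools listed above, the sketch below plots a made-up distribution of posts per account with Matplotlib; posting volume is one of the behavioral signals commonly examined in the detection work discussed earlier. The data values are purely illustrative.

```python
# Hedged sketch: visualize a toy distribution of posts per account with Matplotlib.
import matplotlib.pyplot as plt

posts_per_account = [1, 2, 2, 3, 3, 3, 5, 8, 13, 40, 120, 450]  # made-up counts

plt.hist(posts_per_account, bins=20)
plt.xlabel("Posts per account")
plt.ylabel("Number of accounts")
plt.title("Posting volume distribution (toy data)")
plt.savefig("posting_volume.png")
```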
Prerequisites and Outcomes
The tutorial is intended for a technically knowledgeable and interested audience that may or may not have experience with social spam, campaigns, misinformation and crowdturfing. At the base level, the audience will receive a general and broad introduction to a range of social media threats across a variety of social systems, including online social networks, social media sites and crowdsourcing sites, with emphasis on the technical challenges in the area. At the level of protecting social systems from malicious participants, in-depth analyses of social spam, campaigns, misinformation and crowdturfing will illustrate and compare effective techniques. At the research level, representative research challenges will be outlined, along with an introduction to tools for conducting research in the area.
References
[1] Grier, C., Thomas, K., Paxson, V., and Zhang, M. @spam: the underground on 140 characters or less. In CCS, 2010.
[2] Lee, S., and Kim, J. WarningBird: Detecting suspicious URLs in Twitter stream. In NDSS, 2012.
[6] Aggarwal, A., Almeida, J., and Kumaraguru, P. Detection of spam tipping behaviour on foursquare. In WWW Companion, 2013.
[16] Castillo, C., Mendoza, M., and Poblete, B. Information credibility on twitter. In WWW, 2011.
[17] Yang, F., Liu, Y., Yu, X., and Yang, M. Automatic detection of rumor on Sina Weibo. In SIGKDD Workshop on Mining Data Semantics, 2012.