ICWSM 2014 Tutorial: Social Media Threats and Countermeasures

 

Tutorial slides are here.

Presenters

Kyumin Lee, Assistant Professor, Department of Computer Science, Utah State University, kyumin.lee [at] usu.edu

James Caverlee, Associate Professor, Department of Computer Science and Engineering, Texas A&M University, caverlee [at] cse.tamu.edu

Calton Pu, Professor, School of Computer Science, Georgia Institute of Technology, calton [at] cc.gatech.edu

Time and Location

Sunday, June 1, 9:00am - 12:00pm

Room 2255, North Quad Complex, University of Michigan, 105 South State Street, Ann Arbor

Tutorial Summary

The past few years have seen the rapid rise of many successful social systems – from Web-based social networks (e.g., Facebook, LinkedIn) to online social media sites (e.g., Twitter, YouTube) to large-scale information sharing communities (e.g., reddit, Yahoo! Answers) to crowd-based funding services (e.g., Kickstarter, IndieGoGo) to Web-scale crowdsourcing systems (e.g., Amazon MTurk, Crowdflower).  

However, with this success has come a commensurate wave of new threats, including bot-controlled accounts in social media systems that disseminate malware and commercial spam messages, adversarial propaganda campaigns designed to sway public opinion, collective attention spam targeting popular topics and memes, and crowdturfing campaigns that propagate manipulated content.

This tutorial will introduce peer-reviewed research on social media threats and countermeasures. Specifically, we will address emerging threats such as social spam, campaigns, misinformation and crowdturfing, and overview countermeasures that mitigate and resolve these threats by revealing and detecting malicious participants (e.g., social spammers, content polluters and crowdturfers) and low-quality content. This tutorial will also overview available tools for detecting these participants.

Outline

1. Introduction to Social Media Threats

o   Overview of this tutorial

o   What kinds of social media threats exist in social systems? We introduce social media threats such as social spam, campaigns, misinformation and crowdturfing, and show examples.

o   How are these social media threats different from traditional threats? Examples include:

o   Openness. Anyone can create a social account and easily contact other users.

o   URL blacklists are too slow at identifying new threats, allowing more than 90% of visitors to view a page before it becomes blacklisted [1].

o   URL shortening services can be used to obfuscate malicious links.

o   Bots can be controlled automatically via public APIs.
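The blacklist-lag and URL-shortening points above can be made concrete with a minimal Python sketch. All URLs and the redirect table here are hypothetical stand-ins for a shortener's HTTP redirects; the point is that a naive blacklist lookup misses a shortened link until the redirect chain is resolved:

```python
# Minimal sketch (hypothetical data): shortened URLs evade naive
# blacklist lookups unless the redirect chain is resolved first.

BLACKLIST = {"http://malware.example.com/payload"}  # known-bad landing page

# Stand-in for a URL shortener's redirect table (e.g., as resolved
# by following HTTP redirects in practice).
REDIRECTS = {"http://short.example/a1b2c3": "http://malware.example.com/payload"}

def resolve(url, redirects, max_hops=10):
    """Follow redirects until a final landing URL is reached."""
    for _ in range(max_hops):
        if url not in redirects:
            return url
        url = redirects[url]
    return url

def is_blacklisted(url, resolve_redirects=False):
    if resolve_redirects:
        url = resolve(url, REDIRECTS)
    return url in BLACKLIST

short_url = "http://short.example/a1b2c3"
print(is_blacklisted(short_url))                          # False: shortener hides the target
print(is_blacklisted(short_url, resolve_redirects=True))  # True: landing page is known-bad
```

Even with resolution, the 90%-of-visitors figure from [1] shows that the bottleneck is how quickly new URLs enter the blacklist at all.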

2. State-of-the-Art in Research on the Threats and Defenses

2.1. Social Spam

In this session, we will overview various social spam detection approaches:

o   How to detect suspicious URLs [2].

o   Social capitalists have helped spammers become well-established users and acquire social signals, even as those spammers spread spam over social networks. We will explain why this happens and how to penalize not only spammers but also these social capitalists [3].

o   Supervised classification is the most popular approach for detecting social spammers or spam messages. Examples of social spam detected by this approach include YouTube video spam [4], Twitter spam [5], Foursquare spam tips [6] and collective attention spam [10].

o   Social honeypots were proposed to monitor spammers' behavior and collect their information [7].

o   Using crowd wisdom to identify social spammers [8].

o   Unsupervised social spam detection approach [9].
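As a concrete illustration of the supervised approach above, the following sketch trains a tiny perceptron on toy, hand-labeled account features. The feature names, data and thresholds are invented for illustration; the published systems [4][5][6][10] use much richer feature sets and stronger classifiers:

```python
# Toy supervised spam-account classifier (illustrative only).
# Each account is described by two invented features:
#   [fraction of posts containing URLs, following/followers ratio (capped at 1)]
X = [[0.90, 0.80], [0.80, 0.90], [0.95, 0.70],   # labeled spammers
     [0.10, 0.20], [0.20, 0.10], [0.05, 0.30]]   # labeled legitimate users
y = [1, 1, 1, 0, 0, 0]                           # 1 = spammer

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Classic perceptron training loop: nudge weights on each mistake."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            err = yi - pred  # -1, 0, or +1
            w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
            b += lr * err
    return w, b

w, b = train_perceptron(X, y)

def classify(features):
    return 1 if sum(wj * xj for wj, xj in zip(w, features)) + b > 0 else 0

print(classify([0.85, 0.75]))  # 1: URL-heavy, aggressive following -> flagged
print(classify([0.10, 0.15]))  # 0: looks legitimate
```

The same train-then-classify pipeline underlies most of the classification-based detectors surveyed in this session; what varies is the feature engineering and the choice of learner.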

2.2 Campaigns

We will introduce how these malicious participants form groups and run campaigns to target social systems more effectively, and overview campaign detection approaches:

o   Graph-based social spam campaign detection [11].

o   Content-driven campaign detection [12][13].

o   Detect and track political campaigns in social media by using a classification approach [14].

o   Frequent itemset mining method with behavioral models to detect fake reviewer groups [15].
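The frequent-itemset idea behind fake-reviewer-group detection [15] can be sketched as follows: treat each product's reviewer set as a transaction and mine reviewer pairs that co-review suspiciously many products. The data below is a toy example, and the published method mines larger groups and combines them with behavioral models, so this covers only the candidate-group mining step:

```python
from collections import Counter
from itertools import combinations

# Toy data: each product maps to the set of reviewers who reviewed it.
transactions = {
    "p1": {"u1", "u2", "u3"},
    "p2": {"u1", "u2", "u4"},
    "p3": {"u1", "u2", "u5"},
    "p4": {"u3", "u6"},
}

def frequent_reviewer_pairs(transactions, min_support=3):
    """Count reviewer pairs per product; keep pairs co-reviewing
    at least min_support products (candidate colluding groups)."""
    counts = Counter()
    for reviewers in transactions.values():
        for pair in combinations(sorted(reviewers), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_support}

print(frequent_reviewer_pairs(transactions))  # {('u1', 'u2'): 3}
```

Here u1 and u2 co-review three of four products, which ordinary customers rarely do; such candidate groups are then scored with behavioral indicators before being flagged.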

2.3. Misinformation

Can we trust information generated on social systems? This session will introduce what kinds of misinformation exist on social systems, and survey possible approaches to detect them:

o   Measure information credibility on social media by using classification approaches combined with crowd-provided labels [16].

o   Automatic rumor detection approach on Sina Weibo, China's leading micro-blogging service provider [17].

o   Identify fake images on Twitter during Hurricane Sandy [18].

o   Methods for assessing information credibility in emergency situations, combining an unsupervised approach and a supervised approach to detect message credibility [19].

2.4. Crowdturfing

Recently, malicious participants have started to take advantage of the power of the crowd to spread manipulated information over social systems. This session will overview real examples of weaponized crowdsourcing, along with techniques to identify manipulated content and the crowd workers who spread it on behalf of requesters:

o   Introduce real examples reported by the news media.

o   Understand what kinds of crowdturfing tasks are available on crowdsourcing sites [20][21].

o   Understand the size of the crowdturfing market on both Eastern and Western crowdsourcing sites [21][22].

o   Track and reveal crowdsourced manipulation of social media. In particular, we focus on Western crowdsourcing sites and overview how to detect crowdturfers on social media [21].

3. Challenges, Opportunities and Tools

o   Review of open research challenges: the need for large, accurate, up-to-date data sets, and the integration of multiple techniques and research areas.

o   Data management challenges: protecting users' privacy and meeting ethical requirements when releasing public data sets.

o   Introduce useful tools for conducting research in the area:

o   Big data analysis (e.g., MapReduce, Pig, Hive)

o   Machine learning (e.g., Weka, Mallet)

o   Visualization (e.g., Matplotlib, Graphviz)

Prerequisites and Outcomes

The tutorial is intended for a technically knowledgeable and interested audience that may or may not have prior experience with social spam, campaigns, misinformation or crowdturfing. At the base level, the audience will receive a general and broad introduction to a range of social media threats across a variety of social systems, including online social networks, social media sites and crowdsourcing sites, with emphasis on the technical challenges in the area. At the protection level, in-depth analyses of social spam, campaigns, misinformation and crowdturfing will illustrate and compare effective defense techniques against these malicious participants. At the research level, representative research challenges will be outlined, along with an introduction to tools for conducting research in the area.

References

[1]     Grier, C., Thomas, K., Paxson, V., and Zhang, M. @spam: the underground on 140 characters or less. In CCS, 2010.

[2]     Lee, S., and Kim, J. WarningBird: Detecting suspicious URLs in Twitter stream. In NDSS, 2012.

[3]     Ghosh, S., Viswanath, B., Kooti, F., Sharma, N. K., Korlam, G., Benevenuto, F., Ganguly, N., and Gummadi, P. K. Understanding and combating link farming in the twitter social network. In WWW, 2012.

[4]     Benevenuto, F., Rodrigues T., Almeida V., Almeida, J., and Gonçalves, M. Detecting spammers and content promoters in online video social networks. In SIGIR, 2009.

[5]     Lee, K., Eoff, B., and Caverlee, J. Seven Months with the Devils: A Long-Term Study of Content Polluters on Twitter. In ICWSM, 2011.

[6]     Aggarwal, A., Almeida, J., and Kumaraguru, P. Detection of spam tipping behaviour on foursquare. In WWW Companion, 2013.

[7]     Lee, K., Caverlee, J., and Webb, S. Uncovering Social Spammers: Social Honeypots + Machine Learning. In SIGIR, 2010.

[8]     Wang, G., Mohanlal, M., Wilson, C., Wang, X., Metzger, M. J., Zheng, H., and Zhao, B. Y. Social Turing Tests: Crowdsourcing Sybil Detection. In NDSS, 2013.

[9]     Tan, E., Guo, L., Chen, S., Zhang, X., and Zhao, Y. UNIK: Unsupervised Social Network Spam Detection. In CIKM, 2013.

[10] Lee, K., Kamath, K., and Caverlee, J. Combating Threats to Collective Attention in Social Media: An Evaluation. In ICWSM, 2013.

[11] Gao, H., Hu J., Wilson, C., Li, Z., Chen, Y., and Zhao, B. Detecting and characterizing social spam campaigns. In IMC, 2010.

[12] Lee, K., Caverlee, J., Cheng, Z., and Sui, D. Content-Driven Detection of Campaigns in Social Media. In CIKM, 2011.

[13] Lee, K., Caverlee, J., Cheng, Z., and Sui, D. Campaign Extraction from Social Media. In ACM TIST, Vol. 5, No. 1, January 2014.

[14] Ratkiewicz, J., Conover, M., Meiss, M., Gonçalves, B., Flammini, A., and Menczer, F. Detecting and Tracking Political Abuse in Social Media. In ICWSM, 2011.

[15] Mukherjee, A., Liu, B., and Glance, N. Spotting fake reviewer groups in consumer reviews. In WWW, 2012.

[16] Castillo, C., Mendoza, M., and Poblete, B. Information credibility on twitter. In WWW, 2011.

[17] Yang, F., Liu, Y., Yu, X., and Yang, M. Automatic detection of rumor on Sina Weibo. In SIGKDD Workshop on Mining Data Semantics, 2012.

[18] Gupta, A., Lamba, H., Kumaraguru, P., and Joshi, A. Faking Sandy: characterizing and identifying fake images on Twitter during Hurricane Sandy. In WWW Companion, 2013.

[19] Xia, X., Yang, X., Wu, C., Li, S., and Bao, L. Information credibility on twitter in emergency situation. In Proceedings of the 2012 Pacific Asia conference on Intelligence and Security Informatics (PAISI), 2012.

[20] Motoyama, M., McCoy, D., Levchenko, K., Savage, S., and Voelker, G. M. Dirty jobs: the role of freelance labor in web service abuse. In Proceedings of the 20th USENIX conference on Security (SEC), 2011.

[21] Lee, K., Tamilarasan, P., and Caverlee, J. Crowdturfers, Campaigns, and Social Media: Tracking and Revealing Crowdsourced Manipulation of Social Media. In ICWSM, 2013.

[22] Wang, G., Wilson, C., Zhao, X., Zhu, Y., Mohanlal, M., Zheng, H., and Zhao, B. Y. Serf and turf: crowdturfing for fun and profit. In WWW, 2012.