Researchers Develop Technique to Identify Spam Twitter Accounts
August 14, 2013
ICSI researchers and their collaborators at UC Berkeley, Twitter, and George Mason University have developed a technique to identify fraudulent Twitter accounts that are mass-created and then sold through an underground marketplace. Such accounts are used primarily to send spam but have also been employed to silence political protests. The approach could seriously undermine spam sent over social media; in the researchers' study, it correctly identified millions of fraudulent accounts and helped Twitter disable 95 percent of those registered by 27 merchants tracked for the study. The researchers estimate these merchants are responsible for between 10 and 20 percent of all accounts flagged as spamming by Twitter. The researchers are now working with Twitter to integrate these findings and other recommendations into the account creation process and Twitter's spam detection techniques.
The researchers' paper, "Trafficking Fraudulent Accounts: The Role of the Underground Market in Twitter Spam and Abuse," was presented today at the USENIX Security Symposium.
To build the classifier, the researchers purchased more than 100,000 fraudulent accounts over the course of 10 months. The merchants used a variety of online retail strategies, from operating online storefronts with automated purchase forms to advertising on black hat forums and freelance labor Web sites. The median price of a fraudulent account purchased by the researchers was just 4 cents. Most of these accounts were confirmed through unique email addresses and had at least minimally completed profiles.
The researchers worked closely with Twitter to analyze the process used to register the accounts, determine patterns reflected in the account names, and identify particular behaviors that occur at the point of registration. From this analysis, the researchers developed a classifier that was able to retroactively identify millions of accounts that had been flagged as spam and also to flag other accounts that were eventually disabled. Based on the number of legitimate requests for account reactivation that Twitter subsequently received, the researchers estimate that the precision of the classifier (the percentage of flagged accounts that are actually fraudulent) is 99.9942 percent.
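As a rough illustration of how a figure like that is derived, the following Python sketch treats accounts that were later legitimately reinstated as the classifier's false positives. The counts are placeholders chosen for the example, not the study's actual numbers.

    # Estimate classifier precision from reinstatement requests.
    # The counts below are illustrative placeholders, not data from the study.
    def estimate_precision(flagged_accounts, legitimate_reinstatements):
        """Precision = true positives / all flagged accounts."""
        true_positives = flagged_accounts - legitimate_reinstatements
        return true_positives / flagged_accounts

    # Example: 1,000,000 flagged accounts and 58 legitimate reinstatement
    # requests would yield a precision of 99.9942 percent.
    print(f"{estimate_precision(1_000_000, 58):.4%}")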
In addition to the classifier, the researchers' study of the market for social media accounts pointed to other techniques that might limit spam. For example, accounts confirmed by email cost significantly more than those that are unconfirmed, and because merchants often resell the email addresses used to confirm Twitter accounts, only 47 percent of the accounts the researchers purchased came with the email address and password used to confirm them. Email confirmation may increase the price of (and therefore limit the demand for) spam accounts, and re-confirmation may limit the ability of those who purchase accounts to use them.
The researchers also found that merchants registered accounts from thousands of unique IP addresses, suggesting they have access to large numbers of compromised machines. This makes traditional IP blacklisting difficult. However, a small fraction of those IP addresses is used to register thousands of accounts, a pattern that social media sites could exploit by generating IP blacklists in real time.
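One way to read "blacklists in real time" is a simple sliding-window counter over registration events: any IP address that registers more than a handful of accounts within a short window gets flagged. The Python sketch below is a hypothetical interpretation of that idea, not the mechanism proposed in the paper; the window size and threshold are arbitrary.

    # Hypothetical real-time IP blacklisting at account registration:
    # flag any IP that registers more than THRESHOLD accounts within
    # the last WINDOW_SECONDS. Both values are arbitrary examples.
    import time
    from collections import defaultdict, deque

    WINDOW_SECONDS = 3600   # consider the last hour of registrations
    THRESHOLD = 20          # registrations per IP before it is flagged

    recent = defaultdict(deque)  # ip -> timestamps of recent registrations
    blacklist = set()

    def record_registration(ip):
        """Record a registration from `ip` and return True if it is blacklisted."""
        now = time.time()
        timestamps = recent[ip]
        timestamps.append(now)
        # Drop registrations that have fallen outside the sliding window.
        while timestamps and now - timestamps[0] > WINDOW_SECONDS:
            timestamps.popleft()
        if len(timestamps) > THRESHOLD:
            blacklist.add(ip)
        return ip in blacklist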
The work is part of the group's ongoing study of the online underground economy, a vast marketplace that supports a complicated network of vendors specializing in a wide range of products and services, including custom malware, stolen accounts, toolkits, spam mailing lists, freelance hacking, and money laundering. Last year, Networking and Security researchers began working on the National Science Foundation project "Beyond Technical Security: Developing an Empirical Basis for Socio-Economic Perspectives," a collaboration among ICSI, UC San Diego, and George Mason University to broaden the scope of the team's work to include attacks on social media. Read more about the project on the Web site for the Center for Evidence-Based Security Research.
In addition to the NSF grant, funding for this work is also provided by the Office of Naval Research under MURI grant N000140911081 and by a gift from Microsoft Research. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors or originators and do not necessarily reflect the views of the sponsors.