Spam Filtering in Twitter using Sender-Receiver Relationship

Jonghyuk Song; Sangho Lee; Jong Kim

14th International Symposium on Recent Advances in Intrusion Detection (RAID 2011) | September 2011

Twitter is one of the most visited sites today, but Twitter spam is constantly increasing. Because Twitter spam differs from traditional spam such as email and blog spam, conventional spam filtering methods are ill-suited to detecting it. Many researchers have therefore proposed schemes to detect spammers in Twitter. These schemes are based on features of spam accounts, such as content similarity, account age, and the ratio of URLs. However, using account features to detect spam has two significant problems. First, spammers can easily fabricate account features. Second, account features cannot be collected until spammers have carried out a number of malicious activities, which means that spammers are detected only after they have sent a number of spam messages. In this paper, we propose a novel spam filtering system that detects spam messages in Twitter. Instead of account features, we use relation features, such as the distance and connectivity between a message sender and a message receiver, to decide whether the current message is spam. Unlike account features, relation features are difficult for spammers to manipulate and can be collected immediately. We collected a large number of spam and non-spam Twitter messages, then built and compared several classifiers. Our analysis shows that most spam comes from accounts that have only a weak relationship with the receiver. We also show that our scheme is better suited to detecting Twitter spam than previous schemes.
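To make the idea of relation features concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical in-memory follow graph stored as a dict that maps each account to the set of accounts it follows. Distance is taken here as the shortest directed path length from sender to receiver, found by breadth-first search. Connectivity is approximated by the number of two-hop paths between them. That proxy, the function names, and the `max_depth` cutoff are assumptions made for illustration; the abstract does not specify how either feature is computed.

```python
from collections import deque


def follow_distance(graph, sender, receiver, max_depth=4):
    """Shortest directed path length from sender to receiver.

    Returns None when no path of at most max_depth hops exists.
    max_depth is an illustrative cutoff, not a value from the paper.
    """
    if sender == receiver:
        return 0
    seen = {sender}
    queue = deque([(sender, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the cutoff
        for nxt in graph.get(node, ()):
            if nxt == receiver:
                return depth + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None


def two_hop_paths(graph, sender, receiver):
    """Count intermediate accounts m with sender -> m -> receiver.

    An assumed stand-in for connectivity, chosen only for simplicity.
    """
    return sum(1 for m in graph.get(sender, ()) if receiver in graph.get(m, ()))


def relation_features(graph, sender, receiver):
    """Bundle both relation features for one (sender, receiver) pair."""
    return {
        "distance": follow_distance(graph, sender, receiver),
        "two_hop_paths": two_hop_paths(graph, sender, receiver),
    }


# Toy follow graph (hypothetical accounts): account -> accounts it follows.
example_graph = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave"},
    "dave": set(),
    "stranger": set(),
}

print(relation_features(example_graph, "alice", "dave"))
print(relation_features(example_graph, "stranger", "dave"))
```

In this toy graph, the pair (alice, dave) is two hops apart with two connecting paths, while (stranger, dave) has no path at all. Feature vectors of this kind are what a classifier would consume, and both values can be computed the moment a message arrives without waiting for any account history to accumulate.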