Biterm for spam filtering in short message service text
Due to rapid growth in mobile phone usage and the falling cost of sending text messages across mobile networks, short message service (SMS) has become the most popular mode of communication. This growth has attracted spammers to mobile networks. Although several machine learning methods have been developed to filter SMS spam out of mobile phone users' inboxes, SMS has characteristics that pose challenges to conventional document models that rely on the proportion of word distribution. For instance, SMS messages suffer from severely sparse context information, which hampers classification of content based on the proportion of word distribution. This paper proposes an algorithm that uses the biterm topic model (BTM) to model SMS text messages. The biterm topic model directly models the generation of word co-occurrence patterns (i.e. biterms) in the whole document. Finally, a support vector machine (SVM) was used for classification. The results show that the algorithm can effectively model SMS messages for classification using SVM.
Keywords: Support Vector Machine, Biterm Topic Model, Short Message Service, Spam Filtering
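To make the notion of a biterm concrete, the following is a minimal sketch of biterm extraction, assuming whitespace tokenization and lowercasing. The function names `extract_biterms` and `corpus_biterm_counts` are hypothetical and are not taken from the paper; the sketch covers only the extraction of unordered word pairs, not topic inference or the SVM stage.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(message):
    """Return every unordered pair of word tokens in one short message.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; the paper's actual preprocessing is not specified here.
    """
    tokens = message.lower().split()
    # Sort each pair so that (a, b) and (b, a) count as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(messages):
    """Aggregate biterm counts over a collection of messages."""
    counts = Counter()
    for message in messages:
        counts.update(extract_biterms(message))
    return counts


if __name__ == "__main__":
    sample = ["win free prize", "free prize now"]
    for biterm, count in sorted(corpus_biterm_counts(sample).items()):
        print(biterm, count)
```

Because a three-word message already yields three biterms, pooling the pairs across messages gives the model more co-occurrence evidence than the word counts of any single SMS would.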
ABOUT THE AUTHORS
Richard Omolo Midigo
R. O. Midigo was born in 1972 in Nyanza Province, Kenya, and is currently pursuing an MSc degree in Computer Systems at Jomo Kenyatta University of Agriculture and Technology (JKUAT). He received a Bachelor of Science degree in Information Science (IT option) from Moi University, Eldoret (Kenya), in 2008. In 1996, he joined the JKUAT Library Department as an employee. He has risen through the ranks in the library and serves as Systems Librarian.
Waweru Mwangi
Prof. Waweru Mwangi is an associate professor and a Senior Lecturer in the School of Computing and Information Technology, Jomo Kenyatta University of Agriculture and Technology. He holds a PhD in Information Systems Engineering from Hokkaido University (Japan), 2004; an MSc in Operations Research and Cybernetics from Shanghai University (China), 1995; and a BEd in Mathematics from Kenyatta University (Kenya), 1989.
George Onyango Okeyo
Dr. G. O. Okeyo holds a PhD in Activity Recognition in Smart Environments from the University of Ulster (UK), 2013, and a Master of Science in Information Technology from the University of Nairobi (Kenya), 2007. He is a lecturer in the School of Computing and Information Technology, Jomo Kenyatta University of Agriculture and Technology.