Spam Filtering by Quantitative Profiles


Marian Grendar, Jana Skutova and Vladimir Spitalsky

Instead of the 'bag-of-words' representation, the quantitative profile approach to spam filtering and email categorization represents an email by an m-dimensional vector of numbers, with m fixed in advance. Inspired by the email shape analysis recently proposed by Sroufe et al., two instances of quantitative profiles are considered: the line profile and the character profile. Their performance is studied on the TREC 2007 and CEAS 2008 corpora and on a private corpus. At low computational cost, the two quantitative profiles achieve performance at least comparable to that of heuristic rules and naive Bayes.

Keywords: Email Categorization, Spam Filtering, Quantitative Profile, Character Profile, Line Profile, Random Forest
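
The abstract does not spell out how the two profiles are computed, so the following Python sketch is only an illustration under stated assumptions: the line profile is taken to be the lengths of the first m lines of the message (zero-padded), and the character profile is taken to be occurrence counts over a fixed set of m characters. The function names, the default m, and the ALPHABET constant are hypothetical and are not taken from the paper.

```python
from collections import Counter
import string

# Hypothetical character set to track; the paper's actual choice is not given here.
ALPHABET = string.ascii_lowercase + string.digits + " !$%@"


def line_profile(email_text: str, m: int = 8) -> list[int]:
    """Assumed line profile: lengths of the first m lines, zero-padded to m entries."""
    lengths = [len(line) for line in email_text.splitlines()[:m]]
    return lengths + [0] * (m - len(lengths))


def character_profile(email_text: str, alphabet: str = ALPHABET) -> list[int]:
    """Assumed character profile: count of each tracked character (case-folded)."""
    counts = Counter(email_text.lower())
    return [counts[ch] for ch in alphabet]


if __name__ == "__main__":
    message = "Hi\nBuy now!!!\n"
    print(line_profile(message, m=4))
    print(character_profile(message))
```

Whatever the length of the message, each function returns a vector whose dimension is fixed in advance, which is the property the abstract emphasizes. Such fixed-length vectors could then be passed to a standard classifier; the keywords mention Random Forest.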



ABOUT THE AUTHORS

Marian Grendar
Slovanet, a.s., Zahradnicka 151, 821 08 Bratislava, Slovakia

Jana Skutova
Slovanet, a.s., Zahradnicka 151, 821 08 Bratislava, Slovakia

Vladimir Spitalsky
Slovanet, a.s., Zahradnicka 151, 821 08 Bratislava, Slovakia


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
