2nd February 2004 — Spam Filtering
Iain and Jaz are going to lead a look at Spam Filtering. This is intended as an informal look at an interesting adversarial classification problem. Please bring your ideas so that we can look not only at where spam filtering is now but also at where, in our humble opinions, it should go.
Here is one overview of some of the filtering techniques out in the wild:
This might be of interest: Conference on Email and Anti-Spam (CEAS)
Look at a list of tricks employed by spammers. While the problem isn't necessarily insurmountable, its adversarial aspect cannot be ignored!
So-called "Bayesian Spam Filtering" has received a lot of attention. In fact the number-one hit on Google for "Bayesian" is currently about spam filtering. It's sad but true that many people have only heard of "Bayesian" in this context and are unaware of what Bayesian statistics is actually about.
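For anyone who hasn't seen one, filters sold under the "Bayesian" label are usually some variant of a naive Bayes word-count classifier. The sketch below is a minimal illustration under that assumption, not any particular filter's implementation: the function names `train` and `spam_probability`, the whitespace tokenisation, and the add-one smoothing are all choices made here for the example. It assumes the training set contains at least one spam and one non-spam message.

```python
import math
from collections import Counter

def train(messages):
    """Count words per class from (text, is_spam) pairs (hypothetical helper)."""
    word_counts = {True: Counter(), False: Counter()}
    class_counts = Counter()
    for text, is_spam in messages:
        class_counts[is_spam] += 1
        word_counts[is_spam].update(text.lower().split())
    return word_counts, class_counts

def spam_probability(text, word_counts, class_counts):
    """Return P(spam | text) under a naive Bayes model with add-one smoothing."""
    vocab_size = len(set(word_counts[True]) | set(word_counts[False]))
    n_messages = sum(class_counts.values())
    log_score = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        # Log prior: fraction of training messages with this label.
        score = math.log(class_counts[label] / n_messages)
        for word in text.lower().split():
            # Add-one smoothing; the extra +1 leaves a slot for unseen words.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + vocab_size + 1))
        log_score[label] = score
    # Normalise in log space to avoid underflow on long messages.
    top = max(log_score.values())
    weight = {label: math.exp(s - top) for label, s in log_score.items()}
    return weight[True] / (weight[True] + weight[False])
```

Note that this applies Bayes' rule to point estimates of word frequencies; there is no prior over parameters and no posterior over models, which is part of why the label can mislead. The independence assumption between words is also exactly what many spammer tricks try to exploit.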