<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Smart SPAM &#038; Fighting it</title>
	<atom:link href="http://sbutler.com/blog/2006/05/smart-spam/feed/" rel="self" type="application/rss+xml" />
	<link>http://sbutler.com/blog/2006/05/smart-spam/</link>
	<description>data mining and things i find interesting</description>
	<pubDate>Fri, 21 Nov 2008 22:10:29 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
		<item>
		<title>By: John Aitchison</title>
		<link>http://sbutler.com/blog/2006/05/smart-spam/#comment-3905</link>
		<dc:creator>John Aitchison</dc:creator>
		<pubDate>Thu, 22 Mar 2007 00:10:41 +0000</pubDate>
		<guid isPermaLink="false">http://sbutler.com/blog/2006/05/smart-spam/#comment-3905</guid>
		<description>Interesting ideas. 

I guess I am not as hopeful as you that a centralized solution like Gmail would be smart enough quickly enough to defeat spammers. 

For example, I get a lot of spam emails with near-duplicate but natural-looking content. The problem is that the content (which has been harvested from the web) is not the real message (which is often for male performance-enhancing drugs), and it swamps the Bayesian filters. I am not sure that "Bayesian-like" classification tools will ever be smart enough for this.

I think I have a blog entry on this .. no, I just checked, it is not up, so I will put it up now. Here it is http://dsanalytics.com/dsblog/why-bayesian-spam-filters-are-doomed_92


btw, the medium of choice for people under 20 seems to be texting. Email is seen as old hat, and when it is used it is usually through e.g. Hotmail, which does seem to do a good job of spam suppression.

Email is still, however, the medium of business communication, although I am intrigued by the Gmail idea of conversations (not universally loved, though; see http://philwilson.org/blog/2005/06/gmail-conversations.html)</description>
		<content:encoded><![CDATA[<p>Interesting ideas. </p>
<p>I guess I am not as hopeful as you that a centralized solution like Gmail would be smart enough quickly enough to defeat spammers. </p>
<p>For example, I get a lot of spam emails with near-duplicate but natural-looking content. The problem is that the content (which has been harvested from the web) is not the real message (which is often for male performance-enhancing drugs), and it swamps the Bayesian filters. I am not sure that &#8220;Bayesian-like&#8221; classification tools will ever be smart enough for this.</p>
<p>I think I have a blog entry on this .. no, I just checked, it is not up, so I will put it up now. Here it is <a href="http://dsanalytics.com/dsblog/why-bayesian-spam-filters-are-doomed_92" rel="nofollow">http://dsanalytics.com/dsblog/why-bayesian-spam-filters-are-doomed_92</a></p>
<p>btw, the medium of choice for people under 20 seems to be texting. Email is seen as old hat, and when it is used it is usually through e.g. Hotmail, which does seem to do a good job of spam suppression.</p>
<p>Email is still, however, the medium of business communication, although I am intrigued by the Gmail idea of conversations (not universally loved, though; see <a href="http://philwilson.org/blog/2005/06/gmail-conversations.html" rel="nofollow">http://philwilson.org/blog/2005/06/gmail-conversations.html</a>)</p>
]]></content:encoded>
	</item>
</channel>
</rss>
