sbutler.com

Archive for May, 2006

Data Mining with Oracle

Tuesday, May 30th, 2006

If you are interested in data mining and haven’t already seen the Oracle Data Mining and Analytics blog, it is worth checking out. It has some great how-tos, including time series forecasting (parts 1, 2, 3) and real-time scoring & model management (parts 1, 2, 3).

Smart SPAM & Fighting it

Saturday, May 13th, 2006

For any machine-learning-based SPAM filter, such as the popular Bayesian methods, the key to success is the training data: the body of previously identified SPAM and HAM (valid emails). To trick the filter, a spammer must make their messages more HAM-like. The way to beat this is by giving […]
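
To make the role of the training data concrete, here is a minimal sketch of a Bayesian-style filter. It is a toy naive Bayes classifier with add-one smoothing, not the method of any particular filter; the class name, the whitespace tokenizer and the four training messages are all made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a real filter would do far more."""
    return text.lower().split()


class NaiveBayesFilter:
    """Toy two-class naive Bayes text classifier (hypothetical, for illustration)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one previously identified message to the training data."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        """Log prior plus summed log likelihoods of the words under `label`."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        # Log prior from the share of training messages with this label.
        # Assumes at least one training message per label.
        score = math.log(self.doc_counts[label] / total_docs)
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            count = self.word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        return score

    def classify(self, text):
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))


# Made-up training data: two SPAM and two HAM messages.
spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("win money now", "spam")
spam_filter.train("meeting notes for the thesis", "ham")
spam_filter.train("lunch on friday", "ham")

print(spam_filter.classify("buy cheap pills"))
print(spam_filter.classify("thesis meeting on friday"))
```

The score for a message depends only on how often its words appeared in each pile of training mail, which is why a spammer's best move is to use words that show up mostly in HAM.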

Data Mining Cup 2006

Friday, May 5th, 2006

The Data Mining Cup (DMC2006) has launched for 2006. This year the competition focuses on eBay auctions. The target is to predict, for each new auction, whether the actual sales revenue is higher than the average sales revenue of the product category.
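
The target described above is a binary label, and a rough sketch of how it could be derived from past auctions looks like this. The (category, revenue) layout, the function name and the sample numbers are assumptions for illustration; they are not the competition's actual data format.

```python
from collections import defaultdict


def label_auctions(auctions):
    """Return 1 for auctions whose revenue exceeds their category mean, else 0.

    `auctions` is a list of (category, revenue) pairs; this layout is a
    hypothetical stand-in, not the competition's real data format.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for category, revenue in auctions:
        totals[category] += revenue
        counts[category] += 1
    # Average sales revenue per product category.
    means = {category: totals[category] / counts[category] for category in totals}
    return [1 if revenue > means[category] else 0 for category, revenue in auctions]


# Made-up sample: two camera auctions and two phone auctions.
sample = [("camera", 100.0), ("camera", 200.0), ("phone", 50.0), ("phone", 50.0)]
labels = label_auctions(sample)
print(labels)
```

An auction that sells for exactly the category average is labelled 0 here, since the stated target is "higher than" the average.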

DARPA Grand Challenge

Thursday, May 4th, 2006

Start your engines: the DARPA Grand Challenge is on again, only this time it’s an urban challenge! The last two competitions were to race an autonomous vehicle through a desert, with the 2005 winner, Stanford, taking home a US$2 million prize.
Stanford’s software in action: Input from GPS and many sensors […]

Using Gmail for Backups

Wednesday, May 3rd, 2006

While writing a thesis it is obviously imperative to have foolproof backups in place. So why not back up to that free 2.7 GB Gmail account? Here’s what you have to do:

Install “email” (Gentoo users: emerge net-mail/email)
Edit /etc/email/email.conf (Gentoo users: as a minimum you must set REPLY_TO)
Test the commands. They are:
cd /path/to/your/thesis/
tar -czf /tmp/thesis.tar.gz *.*
email --blank-mail --smtp-server […]
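
For anyone who would rather not install the “email” tool, the same idea can be sketched with the Python standard library. This is an alternative under stated assumptions, not a reconstruction of the truncated command above: the function names, subject line and addresses are made up, and unlike `*.*` it archives every regular file directly inside the directory, whether or not the name contains a dot.

```python
import tarfile
from email.message import EmailMessage
from pathlib import Path


def make_archive(source_dir, archive_path):
    """Write every regular file directly inside source_dir to a .tar.gz archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(Path(source_dir).iterdir()):
            if path.is_file():
                # Store files by bare name so the archive has no absolute paths.
                tar.add(path, arcname=path.name)
    return archive_path


def build_backup_message(archive_path, sender, recipient):
    """Build (but do not send) an email with the archive attached."""
    archive = Path(archive_path)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Thesis backup: " + archive.name
    msg.set_content("Automated thesis backup attached.")
    msg.add_attachment(
        archive.read_bytes(),
        maintype="application",
        subtype="gzip",
        filename=archive.name,
    )
    return msg
```

Sending is left out on purpose because it depends on your own SMTP server and credentials; once those are known, passing the message to `smtplib.SMTP(...).send_message(msg)` is the usual route.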

Visualising Digg

Wednesday, May 3rd, 2006

Digg, The Blog has info on a nice visualisation of activity on digg.com. Kevin mentions that the zip-line effect in the videos is probably caused by bots. Pretty cool!