sbutler.com

Archive for the 'Data Mining' Category

Data Mining the Financial Markets

Friday, April 25th, 2008

Thomas A. Rathburn has written a series of three articles on data mining the financial markets. Rathburn takes a detailed look at the successes and failures of his efforts in the markets, and with 10-year US bonds in particular. You can check it out here: part 1, part 2, and part 3. […]

Article: HCF gets a helping hand from predictive analytics

Tuesday, June 13th, 2006

From the ComputerWorld article:
Private health insurer HCF has implemented a predictive analytics suite to help weed out fraudulent claims, target individual members and streamline the monotonous labour of data analysis.

Data Mining with Oracle

Tuesday, May 30th, 2006

If you are interested in data mining and haven’t already seen the Oracle Data Mining and Analytics blog, it is worth checking out. It has some great how-tos, including time series forecasting (parts 1, 2, 3) and real-time scoring & model management (parts 1, 2, 3).

Smart SPAM & Fighting it

Saturday, May 13th, 2006

For any machine-learning-based SPAM filter, such as the popular Bayesian methods, the key to success is the training data: the body of previously identified SPAM and HAM (valid emails). To trick the filter, a spammer must make their messages look more HAM-like. The way to beat this is by giving […]
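As a rough illustration of why the training data matters, here is a minimal naive Bayes sketch in Python. It is not the code of any particular filter: the function names, the whitespace tokenisation, the add-one smoothing and the tiny training set are all assumptions made for the example.

```python
import math
from collections import Counter


def train(messages):
    """Count words per class from (text, label) pairs; label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts


def spam_log_odds(text, word_counts, doc_counts):
    """Log-odds that text is SPAM; positive means more SPAM-like than HAM-like."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    totals = {label: sum(counts.values()) for label, counts in word_counts.items()}
    # Start from the prior odds, then add one term per word.
    score = math.log(doc_counts["spam"] / doc_counts["ham"])
    for word in text.lower().split():
        # Add-one (Laplace) smoothing so unseen words do not zero out a class.
        p_spam = (word_counts["spam"][word] + 1) / (totals["spam"] + len(vocab))
        p_ham = (word_counts["ham"][word] + 1) / (totals["ham"] + len(vocab))
        score += math.log(p_spam / p_ham)
    return score


# Toy training data, invented for illustration only.
training = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
word_counts, doc_counts = train(training)

plain = spam_log_odds("buy cheap pills", word_counts, doc_counts)
padded = spam_log_odds("buy cheap pills monday meeting lunch", word_counts, doc_counts)
```

In this toy setup `padded` comes out lower than `plain`: appending HAM-like words drags the score towards HAM, which is the trick described above.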

Data Mining Cup 2006

Friday, May 5th, 2006

The Data Mining Cup (DMC2006) has launched for 2006. This year the competition focuses on eBay auctions. The target is to predict, for each new auction, whether the actual sales revenue is higher than the average sales revenue of the product category.
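To make the target concrete, here is a small Python sketch of how such a binary label could be derived. The `(category, revenue)` layout and the function name are hypothetical; the actual DMC2006 data format is not described here.

```python
from collections import defaultdict


def label_auctions(auctions):
    """Return 1 where an auction's revenue exceeds its category average, else 0.

    auctions: list of (category, revenue) pairs (a hypothetical layout).
    """
    by_category = defaultdict(list)
    for category, revenue in auctions:
        by_category[category].append(revenue)
    # Average sales revenue per product category.
    averages = {c: sum(values) / len(values) for c, values in by_category.items()}
    return [1 if revenue > averages[category] else 0
            for category, revenue in auctions]


# Invented example data.
auctions = [("camera", 120.0), ("camera", 80.0), ("phone", 50.0), ("phone", 50.0)]
labels = label_auctions(auctions)
```

A classifier would then be trained to predict these labels for new auctions from whatever auction attributes are available.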

DARPA Grand Challenge

Thursday, May 4th, 2006

Start your engines: the DARPA Grand Challenge is on again, only this time it’s an urban challenge! The last two competitions were to race an autonomous vehicle through a desert, with the 2005 winner, Stanford, taking home a US$2 million prize.
Stanford’s software in action: Input from GPS and many sensors […]

Getting to know R Graphs

Friday, April 7th, 2006

Check out the R Graph Gallery which includes not only detailed descriptions of graphs you can produce in R, but also R source! Props to Martin for the link.

Future of Radio

Wednesday, March 29th, 2006

You may have listened to Internet radio before, but Pandora is a station of a different kind - totally personalised. It’s a Flash-based player (sorry Andy!) that sits inside your browser, so no problems with firewalls. But the real innovation is that when you start it up, you tell it the artists you like, […]

YALE Data Mining Environment

Friday, March 24th, 2006

YALE is a data mining and machine learning environment that integrates WEKA and some other SVM-related tools into one GUI tool. Looks pretty spiffy - the GUI looks much better than WEKA’s, and it’s Java-based and cross-platform too. Screenshots here.

WEKA in Jython or even C#

Wednesday, March 8th, 2006

I was very excited to find out that Python scripts can access Java APIs if you run them on the Jython interpreter. Jython is a Python interpreter written in Java, which some people have put to good use for fast prototyping of WEKA applications. I built a simple classifier using Jython and WEKA classes and everything […]