sbutler.com

Visualising Digg

May 3rd, 2006

Digg, The Blog has info on a nice visualisation of activity on digg.com. Kevin mentions that the zip-line effect in the videos is probably caused by bots. Pretty cool!

Google Scholar

April 30th, 2006

I must say, having been a long-time CiteSeer user, Google Scholar is a real breath of fresh air. It is yet another academic search interface, although this time it's done right (unlike Rexa, which is waaaay too inaccurate). It's a great interface and you can actually find whatever you're looking for. It's quite amazing!

Library support is on the way, too. At the moment though, I could only find the National Library of Australia and Deakin University, but the level of integration is very promising.

Tip: Maintaining your BibTeX reference database can be a pain sometimes. When using Google Scholar, make sure you enable BibTeX export in the preferences; it will save you heaps of time. Even when you already have the PDF, it is easy to do a quick search, click “Import into BibTeX”, then copy & paste the entry into your .bib file. There is a slight bug where the field is named “authors” instead of “author”, but that is easy to fix on the fly.
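If you paste a lot of entries, the “authors” fix can be scripted. Here is a minimal Python sketch, assuming each field sits on its own line as in a typical export. The function name fix_authors_field and the sample entry are made up for illustration, not anything Google Scholar provides.

```python
import re

# Matches a field literally named "authors" at the start of a line,
# followed by "=". Field values (e.g. a title containing the word
# "authors") are not touched because they never start the line.
_AUTHORS_FIELD = re.compile(r"^(\s*)authors(\s*=)", re.IGNORECASE | re.MULTILINE)


def fix_authors_field(bibtex: str) -> str:
    """Rename the field 'authors' to 'author' in BibTeX source text."""
    return _AUTHORS_FIELD.sub(r"\1author\2", bibtex)


# Hypothetical entry, for illustration only.
sample = """@article{doe2006example,
  title={Notes for authors},
  authors={Doe, Jane},
  year={2006}
}"""

if __name__ == "__main__":
    print(fix_authors_field(sample))
```

Run it over the pasted text before saving it to your .bib file. Entries that already use “author” pass through unchanged.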

LaTeX Presentations

April 18th, 2006

Now you can design presentation templates in Inkscape and use them as a LaTeX style. From the LaTeX Presentation Designer website:

The package contains a documentclass called “presentation” which takes as an argument a slide style. The package also provides a simple python program that can interpret SVG files generated by Inkscape and build a slide style, usable by the “presentation” document class, directly from it. This means that creating new custom slide designs is as simple as drawing what you want your slides to look like in Inkscape.

LaTeX presentation designer screenshot

LaTeX-based presentations are handy when you want nice looking equations on your slides. Other good alternatives are Beamer and Prosper.

Getting to know R Graphs

April 7th, 2006

Check out the R Graph Gallery which includes not only detailed descriptions of graphs you can produce in R, but also R source! Props to Martin for the link.

What’s in a name?

April 5th, 2006

Dennis Forbes gives a fantastic analysis of one of the biggest databases on the Internet - the DNS records. His analysis includes insights into domain name length, personal and family name usage and other characteristics. For example, did you know that all 2- and 3-letter domains are taken? Dennis is planning a second part so keep a look out for that too.

Future of Radio

March 29th, 2006

You may have listened to Internet radio before, but Pandora is a station of a different kind - totally personalised. It's a Flash-based player (sorry Andy!) that sits inside your browser, so no problems with firewalls. But the real innovation is that when you start it up, you tell it the artists you like, and it will attempt to determine what other songs you will like too, and play those in your personal audio stream. As time progresses you can give each song played the thumbs-up or thumbs-down, which will further refine what music is played to you! It's not bad, but they should use some more advanced techniques like MusicMiner to better adapt to user tastes.

MusicMiner uses a technique based on Self-Organising Maps (“Emergent SOM”) to determine and visualise music similarity:

MusicMiner preview

The major advantage of MusicMiner is obviously that you can use it on your own music collection and choose to play a particular song, whereas with Pandora you can only define your interests and listen to whatever is played. There is no guarantee Pandora will actually play a given artist, although usually it will eventually.

Photoshop Plugins and Gtalk for Linux

March 24th, 2006

One common complaint from Windows enthusiasts is that the image editor on Linux, The GIMP, is somehow lacking compared with Photoshop. Those people will be happy to know you can now use Photoshop plugins in The GIMP on Linux. The GIMP really is a fantastic tool for both Windows and Linux, so give it a go now!

Another piece of good news for Linux users is the announcement of the Tapioca Google Talk-compatible client. While Google has made their IM network available to people of all OSes by using the Jabber IM protocol and open sourcing their VoIP backend, Tapioca is the first application to provide both for Linux. Awesome!

YALE Data Mining Environment

March 24th, 2006

YALE is a data mining and machine learning environment that integrates WEKA and some other SVM-related tools into one GUI tool. Looks pretty spiffy - the GUI looks much better than WEKA’s, and it's Java-based and cross-platform too. Screenshots here.

A Snapshot of Web Development

March 20th, 2006

The Google Web Authoring Statistics site provides real insight into the way web developers are using the web. Of particular interest to me was the most commonly used elements page. The graphs on this site use SVG, although interestingly there is no comparison of image types people are using in their webpages. Anyway you’ll need something like Firefox 1.5 to look at it.

WEKA in Jython or even C#

March 8th, 2006

I was very excited to find out that Python scripts can access Java APIs if you run them on the Jython interpreter. Jython is a Python interpreter written in Java, which some people have put to good use for fast prototyping of WEKA applications. I built a simple classifier using Jython and the WEKA classes and everything seemed to be going fine. However, complications arose when trying to use databases in Jython. My intention was to use a sqlite database, but while databases work really well in standard Python (or ‘CPython’) via the DB-API2, this is lost in the move to Jython. The web points to zxJDBC, which has now been integrated into Jython, but it suffers from being a wrapper for JDBC that feels like a DB-API2 object… meaning you have to install JDBC drivers and all sorts.
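For reference, this is roughly what the DB-API2 style looks like on the CPython side. It is only a sketch, assuming a CPython version that bundles the sqlite3 module. The table name samples, its columns and the helper store_and_fetch are made up for illustration and are not the code from my classifier.

```python
import sqlite3


def store_and_fetch(rows):
    """Insert (label, value) pairs into a throwaway in-memory SQLite
    database and return them sorted by value, using plain DB-API2 calls."""
    conn = sqlite3.connect(":memory:")  # no file or external driver needed
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE samples (label TEXT, value REAL)")
        # '?' placeholders use the qmark parameter style sqlite3 supports.
        cur.executemany(
            "INSERT INTO samples (label, value) VALUES (?, ?)", rows
        )
        conn.commit()
        cur.execute("SELECT label, value FROM samples ORDER BY value")
        return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for label, value in store_and_fetch([("b", 2.0), ("a", 1.0)]):
        print(label, value)
```

The connect / cursor / execute / fetchall pattern is the part that zxJDBC imitates, except that there the connection has to go through a JDBC driver.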

On the .NET side of things, apparently IKVM (part of the Mono project) allows programmers to use Java APIs in their .NET applications, so maybe there is some hope for using WEKA and .NET. BTW don’t forget there are heaps of great free development tools around if you are taking the .NET path.

Update: OK, it turns out that although zxJDBC is a part of Jython now, it is still not included in the Gentoo package :(