Data clustering is an unsupervised learning technique that attempts to find structure in data. Given a dataset, we may want to see whether it contains natural clusters. This is where the classic k-means algorithm kicks in. Given a dataset and a desired number of clusters, the algorithm returns the center coordinates of each cluster.
To my knowledge, there are plenty of implementations of the k-means algorithm [WIKI], small and big, simple and complex. They are either useful only for small instructive datasets or buried deep in some framework. All we want is to throw our data into MATLAB and get the results as fast as possible.
I managed to create a small MEX interface for MATLAB [WWW] that calls the k-means implementation in OpenCV [WWW]. That implementation is well tuned and uses all your processors. If you have a recent Intel processor and have compiled OpenCV with the Intel compiler [WWW] and friends (IPP [WWW] and TBB [WWW]), you already own a powerful rocket, full of fuel, pointing at the outer regions of known space! Why not use it?
You can download the source code [HERE] along with MEX files pre-compiled with the Intel compiler for Mac OS X Lion and 64-bit MATLAB R2012a. Use the supplied script from the terminal to compile your own blob.
If you have installed OpenCV under the /usr/local directory, you can try compiling the MEX file from MATLAB’s command prompt:
mex -I/usr/local/include -L/usr/local/lib -lopencv_core mexKmeans.cpp