von Mises-Fisher Clustering

On this page, we provide the MATLAB code implementing the mixture model clustering and consistency analysis introduced in Lashkari et al., NeuroImage, 2010 and used for parts of the analysis in Yeo et al., Journal of Neurophysiology, 2011. You can download the code here.

Notes About the Code

Group fMRI data are passed to the code in a cell structure called dat, whose matrices represent the response of each voxel to the different stimuli. The diagram below provides a schematic description of the functions involved in the processing pipeline. More generally, the same code and analysis may apply to other grouped data whenever we wish to find clusters of vectors that appear consistently across the group.
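To make the data layout concrete, here is a minimal sketch in Python (not the MATLAB package itself). It assumes one matrix per subject with rows as voxels and columns as stimuli, and it assumes the group data are formed by pooling voxels from all subjects; both are illustrative assumptions, as are the names unit_normalize and group. Because von Mises-Fisher distributions are defined on the unit sphere, each response vector is scaled to unit length.

```python
import math

# Hypothetical stand-in for the MATLAB cell structure `dat`:
# one matrix per subject, rows = voxels, columns = stimuli (assumed orientation).
dat = [
    [[1.0, 2.0, 2.0], [0.0, 3.0, 4.0]],                    # subject 1: 2 voxels x 3 stimuli
    [[2.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 5.0]],   # subject 2: 3 voxels x 3 stimuli
]

def unit_normalize(vec):
    """Scale a response vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero response vector")
    return [x / norm for x in vec]

# Normalize every voxel, keeping the per-subject grouping.
normalized = [[unit_normalize(v) for v in subject] for subject in dat]

# Assumed group data: the normalized voxel vectors of all subjects pooled together.
group = [v for subject in normalized for v in subject]
```

Subjects may contribute different numbers of voxels, but in this sketch every matrix must have the same number of stimulus columns so that vectors from different subjects are comparable.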

Please note that the code implements a permutation test that differs slightly from the one presented in (Lashkari et al., 2010). In this test, we assume that under the null distribution there is no correspondence between stimulus labels across subjects, while the structure is consistent within subjects. This test is both more conservative and faster than the test presented in the paper, which assumes there is no stimulus structure in the data and permutes trial labels.
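Whichever null distribution is chosen, a permutation test compares the observed consistency score with the scores obtained from permuted data. The Python sketch below shows one common way to turn such scores into a p-value. The one-sided comparison, the +1 correction, the function name permutation_p_value, and the example scores are all assumptions made for illustration; the MATLAB code may compute significance differently.

```python
def permutation_p_value(observed, null_scores):
    """One-sided p-value: share of null scores at least as large as the observed one.

    The +1 terms count the observed score as one draw from the null,
    so the returned value is never exactly zero.
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)

# Made-up consistency scores from nine null samples.
null_scores = [0.10, 0.25, 0.05, 0.30, 0.15, 0.20, 0.12, 0.08, 0.22]
p = permutation_p_value(0.28, null_scores)
```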

To draw a sample from the null distribution, we randomly permute the stimulus labels separately in each subject. Since this permutation is consistent within each subject, the clustering results for the individual data remain the same as without permutation. Therefore, we only need to perform a clustering of the group data for each sample from the null distribution.
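The sampling step can be sketched in Python as follows (an illustration, not the MATLAB implementation). It assumes each subject is stored as a list of voxel vectors with one entry per stimulus; the names permute_stimuli and gram are hypothetical. Each subject receives its own random permutation, applied identically to all of that subject's voxels. The inner products between voxel vectors within a subject are unchanged, which gives the intuition for why the individual clustering does not need to be repeated.

```python
import random

def permute_stimuli(subject, rng):
    """Apply ONE random permutation of the stimulus labels (columns) to every voxel."""
    n_stimuli = len(subject[0])
    perm = list(range(n_stimuli))
    rng.shuffle(perm)
    return [[voxel[j] for j in perm] for voxel in subject]

def gram(subject):
    """Inner products between all pairs of voxel response vectors in one subject."""
    return [[sum(a * b for a, b in zip(u, v)) for v in subject] for u in subject]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Toy data: 2 subjects, rows = voxels, columns = 4 stimuli (assumed layout).
dat = [
    [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], [1.0, 0.0, 1.0, 0.0]],
    [[2.0, 2.0, 0.0, 1.0], [0.0, 1.0, 5.0, 3.0]],
]

# One sample from the null: an independent permutation for each subject.
null_sample = [permute_stimuli(subject, rng) for subject in dat]

# Within-subject geometry is preserved by the permutation.
for original, permuted in zip(dat, null_sample):
    g0, g1 = gram(original), gram(permuted)
    assert all(abs(x - y) < 1e-9 for r0, r1 in zip(g0, g1) for x, y in zip(r0, r1))
```

Only the pooled group data change from one null sample to the next, because the correspondence of stimulus columns across subjects is what the permutation breaks.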

The code allows two different ways of defining the consistency scores: 1) based on correlation coefficients between mean cluster vectors (Lashkari et al., 2010), and 2) based on agreement between cluster memberships. Accordingly, the permutation test can derive both statistics from the null distribution.
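The two kinds of score can be illustrated with a short Python sketch (again, not the MATLAB code). It assumes the clusters being compared have already been put into correspondence, and the function names pearson and membership_agreement, along with the example values, are hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two mean cluster vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    denom = math.sqrt(sum(x * x for x in du) * sum(y * y for y in dv))
    if denom == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return sum(x * y for x, y in zip(du, dv)) / denom

def membership_agreement(labels_a, labels_b):
    """Fraction of items that receive the same cluster label in both clusterings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)

# 1) Correlation between a group-level and a subject-level mean cluster vector.
score_corr = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# 2) Agreement between two assignments of the same five voxels.
score_member = membership_agreement([0, 0, 1, 1, 2], [0, 0, 1, 2, 2])
```

The first score looks only at the shape of the cluster profiles, while the second looks only at which items were grouped together, so the two can disagree on the same data.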

For the analysis, I use a MATLAB implementation of the Hungarian algorithm for bipartite graph matching that is available here.
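Cluster labels are arbitrary, so two clusterings presumably need to be put into one-to-one correspondence before they are compared; this is the kind of bipartite matching problem the Hungarian algorithm solves. The Python sketch below finds the same optimal matching by brute force over all permutations, which is a stand-in for the Hungarian algorithm that is practical only for a small number of clusters. The function name match_clusters and the use of label overlap as the matching criterion are assumptions made for illustration.

```python
from itertools import permutations

def match_clusters(labels_a, labels_b, k):
    """Return mapping[i] = label in B matched to label i in A, maximizing total overlap.

    Brute force over all k! permutations; the Hungarian algorithm reaches the
    same optimum in polynomial time.
    """
    # overlap[i][j] = number of items labeled i in A and j in B
    overlap = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        overlap[a][b] += 1
    best = max(
        permutations(range(k)),
        key=lambda p: sum(overlap[i][p[i]] for i in range(k)),
    )
    return list(best)

labels_a = [0, 0, 1, 1, 2, 2]
labels_b = [2, 2, 0, 0, 1, 0]

mapping = match_clusters(labels_a, labels_b, 3)

# Relabel A with the matched labels, then measure membership agreement.
relabeled_a = [mapping[a] for a in labels_a]
agreement = sum(x == y for x, y in zip(relabeled_a, labels_b)) / len(labels_b)
```

The same matching could be driven by correlations between mean cluster vectors instead of label overlap, by filling the table with correlation values.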


The code comes with a script, demo.m, that shows how everything works. The script generates a simple synthetic data set with 4 subjects. Since the ground truth is known in this case, you can use it to check that the code works properly on your computer. At the end of the script, when you generate the figures, you should expect to find something that looks like this:



This research is supported in part by the NSF IIS/CRCNS grant 0904625, "Finding Structure in the Space of Activation Profiles in fMRI" and the MIT McGovern Institute Neurotechnology Program.