After reading the ensembles of cluster article recently, I just say another ariticle in IEEE PAMI entitled "Rotation Forest: A New Classifier Ensemble Method". The approach is interesting: much like Random Forests (where the diversity of trees used in the ensemble are developed by both bootstrap sampling and random variable selection), there is a random selection of variables to use in trees. But the twist here (and the "rotation") is by using PCA on the random subset of candidate variables.
I'm sure there are near infinite ways to tweak the ideas of random record/variable selection, but once again the keys to the success of ensembles here and always are:
1) Diversity in information (i.e., data) the modeling algorithm sees.
2) Algorithms that are weak learners benefit from ensembles
Random Forests, it seems to me, works well because not only are trees coarse, blunt (and unstable) predictors, but they are greedy searches that can be fooled into going down a sub-optimal path. By constraining the splits to contain only some of the variables, the tree is forced out of it's greedy perspective to consider other ways to achieve the solution. This new algorithm does the same thing, with the twist of using PCA to develop linear projections of the original data (subsets to be more precise).
I think we'll be seing more and more variations on the same theme in the coming years.