Machine learning: Demo: clustering

Does dimension reduction matter?

Reducing the dimension of a data set discards information, so it might seem that clustering can only be more accurate on the original, unreduced data. The real answer is more subtle. The following notebook investigates what happens to the accuracy of k-means clustering when extraneous variables are present in the data.
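As a rough illustration of the kind of experiment described above, here is a minimal, dependency-free sketch. It is not the notebook's own code, and everything in it is an assumption made for illustration: the synthetic data (two clusters that differ only in column 0, plus 30 pure-noise columns), the helper names `make_data`, `two_means`, `accuracy` and `top_variance_columns`, the deterministic farthest-point initialisation, and the use of "keep the highest-variance column" as a crude stand-in for a real dimension-reduction method.

```python
import random


def make_data(n_per_cluster=50, n_noise=30, noise_sd=2.0, seed=0):
    """Two clusters that differ only in column 0; all other columns are pure noise."""
    rng = random.Random(seed)
    rows, truth = [], []
    for label, centre in enumerate((-4.0, 4.0)):
        for _ in range(n_per_cluster):
            informative = rng.gauss(centre, 1.0)
            noise = [rng.gauss(0.0, noise_sd) for _ in range(n_noise)]
            rows.append([informative] + noise)
            truth.append(label)
    return rows, truth


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_means(rows, n_iter=100):
    """Lloyd's algorithm for k = 2 with a deterministic farthest-point start."""
    first = rows[0]
    second = max(rows, key=lambda r: sq_dist(r, first))
    centres = [list(first), list(second)]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest centre.
        new_labels = [
            0 if sq_dist(r, centres[0]) <= sq_dist(r, centres[1]) else 1
            for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for j in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def accuracy(labels, truth):
    """Fraction of correctly grouped rows, ignoring which cluster is called 0 or 1."""
    agree = sum(lab == t for lab, t in zip(labels, truth)) / len(truth)
    return max(agree, 1.0 - agree)


def top_variance_columns(rows, k=1):
    """Indices of the k columns with the largest sample variance."""
    def variance(col):
        mean = sum(col) / len(col)
        return sum((x - mean) ** 2 for x in col) / len(col)

    variances = [variance(col) for col in zip(*rows)]
    order = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    return order[:k]


rows, truth = make_data()
kept = top_variance_columns(rows, k=1)
reduced = [[r[j] for j in kept] for r in rows]

acc_full = accuracy(two_means(rows), truth)
acc_reduced = accuracy(two_means(reduced), truth)

print(f"columns kept after reduction: {kept}")
print(f"accuracy on all {len(rows[0])} variables: {acc_full:.2f}")
print(f"accuracy on the reduced data: {acc_reduced:.2f}")
```

How the two accuracies compare depends on the noise scale, the number of extraneous columns, the seed and the initialisation, so it is worth varying `n_noise` and `noise_sd` rather than reading much into a single run. Note also that a variance-based reduction only helps while the informative variable has the largest variance. Once the noise columns vary more than it does, the reduction would keep noise instead.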