Data mining user preferences

Using the API of a music service, a fellow student and I traversed the profiles of more than 10,000 users for a project in a Data Mining course. We then performed principal component analysis and clustering on their musical tastes, based on highly noisy music tag data.
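To make the principal component step concrete, here is a minimal sketch rather than the project's actual code. It assumes each user is reduced to a row of tag weights; the tag names, the toy numbers in USERS, and all function names are hypothetical. It uses power iteration with deflation so that only the Python standard library is needed, which is a stand-in for whatever tooling the project really used. The clustering step is not shown.

```python
import math

# Hypothetical toy data: one row per user, one column per tag weight
# (for example, the share of a user's listening carrying that tag).
TAGS = ["electronic", "indie", "rock", "ambient"]
USERS = [
    [0.9, 0.1, 0.2, 0.7],
    [0.8, 0.2, 0.1, 0.9],
    [0.1, 0.9, 0.8, 0.1],
    [0.2, 0.8, 0.9, 0.0],
    [0.5, 0.5, 0.5, 0.4],
]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract each column's mean so every tag has zero mean."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered columns."""
    cols = list(zip(*centered))
    n = len(centered)
    return [[dot(ci, cj) / (n - 1) for cj in cols] for ci in cols]


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def top_component(cov, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of cov."""
    v = normalize([1.0 / (i + 1) for i in range(len(cov))])
    for _ in range(iterations):
        v = normalize([dot(row, v) for row in cov])
    eigenvalue = dot(v, [dot(row, v) for row in cov])
    return eigenvalue, v


def principal_components(rows, k=2):
    """First k principal axes, found one at a time with deflation."""
    cov = covariance(center(rows))
    size = len(cov)
    components = []
    for _ in range(k):
        eigenvalue, v = top_component(cov)
        components.append(v)
        # Deflate: remove the variance already explained by v.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(size)]
               for i in range(size)]
    return components


def project(rows, components):
    """Coordinates of each centered row along the given axes."""
    return [[dot(row, comp) for comp in components] for row in center(rows)]


components = principal_components(USERS, k=2)
coords = project(USERS, components)

for name, loadings in zip(["PC1", "PC2"], components):
    print(name, {tag: round(w, 2) for tag, w in zip(TAGS, loadings)})
for row in coords:
    print([round(x, 3) for x in row])
```

The tag loadings of each component show which tags vary together across users, and the projected coordinates are what a two-axis scatter plot would display. The sign of each axis is arbitrary, so only relative positions are meaningful.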

We found that tag preference parameters can be used to guess the sex and age of a user with some success, and that specific tags are related to each other through users. The project gave me interesting experience in data mining. The screenshot below is a plot of artists along two of the principal component axes, namely those indicating how "electric" and how "indie" the music is.

Plot of artists by two principal components