Principal component analysis is a mathematical process for finding the underlying structure of data. If you have n dimensions of data -- say you have all the genetic markers in a population -- you can extract the "principal components," which are the ways in which the data tend to vary. That is, for example, if you take genetic markers in a population with European and African ancestry, how curly your hair is will tend to vary in the same way that your skin color varies, which may also vary with how likely you are to have sickle cell genes (an African trait). The "first principal component" would be the weighted combination of all the traits that vary with how much African ancestry you have. I bring up ancestry because of the fascinating studies of genetic markers by Luigi Luca Cavalli-Sforza, who found that the first principal component in European populations seems to indicate how much of your ancestry comes from European hunter-gatherers, and how much of it comes from the spreading population of Near Eastern farmers.
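For anyone who wants to see the idea in miniature, here is a rough sketch in Python for the simplest possible case: two traits that tend to vary together. The numbers are invented for illustration, and the function name `first_component_2d` is my own, not part of any library. With only two dimensions, the direction of greatest variation can be read straight off the 2x2 covariance matrix, so no linear algebra package is needed.

```python
import math

def first_component_2d(points):
    """Return (direction, explained_fraction) for a list of (x, y) pairs.

    direction is a unit vector along the axis of greatest variance;
    explained_fraction is the share of total variance lying along it.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Entries of the 2x2 sample covariance matrix.
    cxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    cyy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

    # Angle of the major axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    direction = (math.cos(theta), math.sin(theta))

    # Variance of the data projected onto that direction.
    var_along = (cxx * direction[0] ** 2
                 + 2 * cxy * direction[0] * direction[1]
                 + cyy * direction[1] ** 2)
    return direction, var_along / (cxx + cyy)

# Hypothetical measurements of two traits that rise and fall together.
traits = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
direction, explained = first_component_2d(traits)
print(direction, explained)
```

Because the two made-up traits move almost in lockstep, the first component points roughly along the diagonal and accounts for nearly all the variation. That is the sense in which one component can stand in for several correlated traits.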
It occurs to me that you could extract principal components from the huge database of song titles and their listening patterns on a site like Last.fm. That ought to give you a sense of what the real genres of music are -- much like DNA analysis can tell us what the real tree of descent of the animal kingdom is, while observed traits sometimes get you faux classes like "pachyderms."
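A toy version of that experiment might look like the sketch below. Everything in it is an assumption made for illustration: the play counts are invented, the song names are placeholders, and nothing here touches Last.fm's actual data or API. It finds the first principal component of a small listeners-by-songs table by power iteration (repeatedly multiplying a vector by the covariance matrix), then splits the songs by the sign of their loading on that component.

```python
def first_component(rows, iterations=200):
    """Approximate the first principal component of rows (a list of
    equal-length numeric lists) by power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance between every pair of columns (songs).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Arbitrary starting vector; repeated multiplication by cov pulls it
    # toward the direction of greatest variance.
    v = [1.0 + 0.1 * j for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Hypothetical data: each row is one listener, each column one song.
songs = ["song_a", "song_b", "song_c", "song_d"]
plays = [
    [9, 8, 1, 0],
    [10, 7, 0, 1],
    [8, 9, 1, 1],
    [1, 0, 9, 10],
    [0, 1, 8, 9],
    [1, 1, 10, 8],
]

loadings = first_component(plays)

# Songs whose loadings share a sign tend to be played by the same listeners.
groups = {"positive": [], "negative": []}
for name, weight in zip(songs, loadings):
    groups["positive" if weight > 0 else "negative"].append(name)
print(groups)
```

In this made-up table, the first two songs land in one group and the last two in the other, without anyone having labeled them in advance. Which group gets called "positive" is arbitrary, since a component's overall sign carries no meaning. A real listening database would be far larger and noisier, and would call for a proper linear algebra library and more than one component.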
If you know what the real musical genres are -- the real streams of musical thought -- you might get a little further figuring out what musical genres are and why they exist, on the esthetic and the biological level.
And now, back to our regularly scheduled programming...