Since I already have some data retrieved from the Spotify Metadata API relating to my playlists, I thought I would have a play about with it. I’ve mainly been collecting release years from the albums of the tracks that appear on a given playlist. So it’s interesting to plot the years as a percentage of tracks in a playlist.
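As a rough illustration of the calculation behind these plots, here is a minimal sketch in Python. It assumes the release years have already been pulled out into a plain list; the function name `year_percentages` and the sample years are made up for the example and are not taken from my actual playlists.

```python
from collections import Counter


def year_percentages(release_years):
    """Return {year: percentage of tracks}, ordered by year.

    `release_years` is assumed to be a list of integer album release
    years, one entry per track in the playlist.
    """
    total = len(release_years)
    if total == 0:
        return {}
    counts = Counter(release_years)
    return {year: 100.0 * n / total for year, n in sorted(counts.items())}


# Hypothetical sample data, purely for illustration.
sample_years = [2012, 2012, 2011, 2012, 1984, 2010, 2012, 2011]

for year, pct in year_percentages(sample_years).items():
    print(f"{year}: {pct:.1f}%")
```

The resulting year-to-percentage mapping can then be handed to whatever charting tool you prefer.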
For instance, take my New Music Playlist:
Running this data we get:
As you can see, a large percentage of the new music I am discovering (which is what that playlist is used for) comes mostly from things released over the past few years.
So what happens if we compare this to a playlist for a particular decade? Let's go for Spotify’s 80’s All Gold: Sophisticated Pop playlist:
Running those tracks we get:
Graph not what you were expecting? Me neither! The issue? Oh yes, it is known. It has been an ongoing debate on Spotify for years: the original release date debate. The year recorded against an album can be that of a re-release rather than the original recording.
And this in itself presents an issue: if all of the tracks in the above playlist (and I haven’t checked) were actually released in the 1980s, then this “noise” in the results is going to cause problems when attempting to train a classifier to estimate someone’s age. That said, if everyone sees the same amount of noise, the problem becomes easier, but from being on the community for so long I suspect some people go hunting for the original recordings rather than re-releases.
For example, let's take this user-made Ultimate 90s Playlist:
What do you notice compared to the Spotify 80’s playlist above?
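One rough way to put a number on that comparison is to measure what share of a playlist's release years actually fall inside the decade it claims to cover. The sketch below is only an illustration: the helper name `share_in_decade` and the sample years are hypothetical, not real data from either playlist.

```python
def share_in_decade(release_years, decade_start):
    """Percentage of tracks whose release year lies in the given decade.

    `decade_start` is the first year of the decade, e.g. 1980 for the
    1980s. Returns 0.0 for an empty playlist.
    """
    if not release_years:
        return 0.0
    inside = sum(
        1 for year in release_years
        if decade_start <= year < decade_start + 10
    )
    return 100.0 * inside / len(release_years)


# Hypothetical years: two genuine 1980s dates and two later re-releases.
sample_years = [1983, 1985, 2004, 2009]
print(f"{share_in_decade(sample_years, 1980):.1f}% of tracks dated in the 1980s")
```

A playlist built from original recordings should score close to 100%, while one padded with re-releases and compilations will score noticeably lower.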
UPDATE: Shortly after writing this, Spotify have announced they are making changes to release dates and have already started the transition!