How Data is Used Within the Music Industry

How is data used to predict the next big hit?

Veronica Leong
5 min read · May 13, 2021

Many of us may think that much of the popular music on the charts today sounds quite similar, whether it’s pop, hip hop, or EDM. One reason is that people want to hear what they already know: familiar music works like comfort food, letting listeners simply relax and “turn their brains off”. So, what characteristics are taken into account when determining a popular song, and how is the next big hit predicted?

There are many data science projects that analyze the popularity of music. One exploratory data analysis project in particular used the Spotify API to explore songs from the 1960s through the 2010s. It evaluated, for example, the most common genres and words in popular songs, as well as each song’s energy, danceability, and tempo. Energy measures the intensity of the music; songs with higher energy are generally louder and faster. Tempo is the speed of a song in beats per minute (BPM). Danceability describes how suitable a song is for dancing, based on its tempo, beat strength, rhythm stability, and overall regularity. These are just a few of the many factors considered when determining a song’s popularity. The project concluded that most popular songs contain the words “baby” and “love”, and that no specific genre dominates, although the songs tend to revolve around the theme of romance. In terms of energy, tempo, and danceability, most people preferred music with higher energy and danceability, and a happier sound.

Source: https://www.shazam.com/company

Shazam, a music recognition app that was founded in 1999 and quickly grew popular in the 2000s, is an example of an app that uses its data to determine the next big hit on Billboard’s Hot 100. Let’s first understand how the app works. In short, a user records a snippet of a song anywhere, whether it’s from the radio, TV, or a public area, and the app creates a fingerprint based on the recorded sample’s frequency domain, searches its database, and returns the matching song title and artist. (This article details just how Shazam measures a song’s frequencies, in hertz, and other components to return the song’s information.) Since users “shazam” almost 15 million songs a day, the app uses that data, as well as song reviews, to predict which artists may receive attention in the near future. In addition, the app released a real-time interactive map that shows the popular songs users around the world have “shazam’d”. Not only does this contribute to the app’s big data, but it also allows talent agencies to identify up-and-coming unsigned artists who may soon top the charts. Some concert promoters even look at Shazam data to identify the cities with the most fans of an artist when planning tour stops.
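To make the fingerprint-and-lookup idea concrete, here is a minimal sketch in Python. It is not Shazam’s actual algorithm: it assumes each “song” is a sequence of pure tones, takes the strongest frequency bin in each short frame, and treats consecutive pairs of those peaks as the fingerprint. The function names, the toy `database`, and the frame size are all made up for illustration, and the snippet is assumed to line up with the frame boundaries, which a real recording would not.

```python
import cmath
import math

FRAME_SIZE = 64  # samples per analysis frame (an arbitrary choice for this toy example)


def dominant_frequency_bins(samples, frame_size=FRAME_SIZE):
    """Return the strongest frequency bin of each non-overlapping frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        best_bin, best_mag = 0, -1.0
        # Naive discrete Fourier transform over the positive-frequency bins
        # (bin 0, the constant offset, is skipped).
        for k in range(1, frame_size // 2):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            if abs(coeff) > best_mag:
                best_bin, best_mag = k, abs(coeff)
        peaks.append(best_bin)
    return peaks


def fingerprint(samples, frame_size=FRAME_SIZE):
    """Turn consecutive peak pairs into a set of (bin_a, bin_b) tokens."""
    peaks = dominant_frequency_bins(samples, frame_size)
    return set(zip(peaks, peaks[1:]))


def tone_sequence(bins, frame_size=FRAME_SIZE):
    """Build a toy 'song': one pure tone per frame, given as frequency bin numbers."""
    samples = []
    for k in bins:
        samples.extend(
            math.sin(2 * math.pi * k * n / frame_size) for n in range(frame_size)
        )
    return samples


def best_match(snippet, database, frame_size=FRAME_SIZE):
    """Return the title whose stored fingerprint shares the most tokens with the snippet."""
    query = fingerprint(snippet, frame_size)
    return max(database, key=lambda title: len(query & database[title]))


# Hypothetical two-song database of precomputed fingerprints.
database = {
    "Song A": fingerprint(tone_sequence([3, 7, 12, 5, 9, 14])),
    "Song B": fingerprint(tone_sequence([4, 10, 6, 11, 8, 13])),
}

# A short "recording" taken from the middle of Song A.
snippet = tone_sequence([12, 5, 9])
print(best_match(snippet, database))  # Song A
```

The design point is that the lookup compares small sets of tokens rather than raw audio, which is what makes searching a large catalog practical.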

Before Shazam and other fancy apps, Billboard had to rely on radio stations and record store owners to report the most popular songs played or bought. Now, this information helps music labels identify which songs are likely to get radio play before anyone has even heard them. In a way, it captures audience reactions in the form of numbers. As a result, music producers and artists nowadays rely on song rankings, since songs stay on the top charts for months and people generally listen to the same songs over and over again.

This brings us to the question: how do music apps use data to determine what songs to recommend to users?

Do you ever find yourself looking for new songs to listen to? Why does it seem that some of the recommended songs end up getting added to your playlist? The most obvious signals music apps use to suggest new music are the user’s song interactions (skips, repeats, plays), the behavior of similar users, and, as mentioned briefly above, the audio characteristics of songs in the user’s listening history. On the backend, numerous music apps rely on music recommender systems.

Source: https://www.linkedin.com/pulse/how-spotify-recommender-system-works-daniel-roy-cfa/?trk=read_related_article-card_title

Spotify uses deep learning to model how users interact with songs in order to identify and recommend similar artists or genres of music. One recommendation model used is collaborative filtering, which analyzes the behavior of similar users using a nearest-neighbors model. For example, if User A likes one list of songs, User B likes another list, and the two lists share a song, then it’s likely that each user would enjoy the other songs on the other person’s list. This process is done on a significantly larger scale, across other Spotify users, to create a user’s Discover Weekly playlist, which is a core example of collaborative filtering.
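The User A and User B example can be sketched in a few lines of Python. This is a toy version of nearest-neighbor collaborative filtering, not Spotify’s implementation: the `likes` data, the user and song names, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Overlap between two sets of liked songs, from 0.0 (none) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, likes, k=2):
    """Suggest songs liked by the k most similar users that the target has not liked yet."""
    mine = likes[target]
    # Rank every other user by how much their liked songs overlap with the target's.
    neighbors = sorted(
        (user for user in likes if user != target),
        key=lambda user: jaccard(mine, likes[user]),
        reverse=True,
    )[:k]
    scores = {}
    for user in neighbors:
        weight = jaccard(mine, likes[user])
        for song in likes[user] - mine:
            # Songs liked by closer neighbors accumulate a higher score.
            scores[song] = scores.get(song, 0.0) + weight
    return sorted(scores, key=lambda song: (-scores[song], song))


# Hypothetical listening data.
likes = {
    "user_a": {"Song 1", "Song 2", "Song 3"},
    "user_b": {"Song 2", "Song 3", "Song 4"},
    "user_c": {"Song 3", "Song 5"},
    "user_d": {"Song 6", "Song 7"},
}

print(recommend("user_a", likes))  # ['Song 4', 'Song 5']
```

Here `user_b` shares two songs with `user_a`, so “Song 4” ranks first; `user_d` shares nothing and contributes no recommendations.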

Source: https://medium.datadriveninvestor.com/behind-spotify-recommendation-engine-a9b5a27a935

Another classic recommendation model used is Natural Language Processing (NLP), which looks at a user’s playlists and at web reviews of a song or artist. Every song and artist is associated with a variety of words and phrases that describe them, each with a score indicating how strongly the term correlates with the song or artist. When the model looks at a user’s playlist, the playlist itself is treated as the document and the songs within it as the words, which lets the model identify other potentially similar songs. Songs are then recommended based on the songs in the playlist and the terms most strongly correlated with them.
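As a rough illustration of the “playlist as document, songs as words” idea, the sketch below counts how often two songs appear in the same playlist and treats frequent co-occurrence as similarity. Production systems would more likely learn embeddings from this kind of data; raw counts are used here only to keep the example small. The playlists, track names, and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(playlists):
    """Count how many playlists each pair of songs shares, like words sharing a document."""
    counts = {}
    for playlist in playlists:
        for a, b in combinations(sorted(set(playlist)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def similar_songs(song, counts, top_n=2):
    """Return the songs that most often appear alongside the given song."""
    ranked = sorted(counts.get(song, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]


# Hypothetical playlists.
playlists = [
    ["Track A", "Track B", "Track C"],
    ["Track A", "Track B", "Track D"],
    ["Track B", "Track C"],
    ["Track D", "Track E"],
]

counts = cooccurrence_counts(playlists)
print(similar_songs("Track A", counts, top_n=1))  # ['Track B']
```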

Lastly, raw audio modeling uses neural networks to understand and compare how similar songs are to one another. Much like the analysis of what makes up the next big hit, the model processes a song’s raw audio and analyzes features such as length, loudness, beat, and tempo to group songs with similar characteristics into the same category. For example, users who enjoy songs with a stronger beat and faster tempo might enjoy songs in the disco genre. This modeling helps brand new songs added to Spotify get placed on other playlists (and maybe even garner a new fan for that artist!).
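The grouping step can be illustrated without a neural network. The sketch below assumes the audio features have already been extracted into numbers, rescales them so no single feature dominates, and finds the existing song closest to a new release. It stands in for the learned comparison described above and is not how Spotify does it; the feature values and track names are invented.

```python
import math


def normalize(features):
    """Min-max scale each feature column to the 0-1 range so units do not dominate."""
    columns = list(zip(*features.values()))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return {
        title: tuple(
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(values, lows, highs)
        )
        for title, values in features.items()
    }


def closest_song(new_song, features):
    """Return the existing song whose scaled features are nearest to the new song's."""
    scaled = normalize(features)
    target = scaled[new_song]
    others = (title for title in scaled if title != new_song)
    return min(others, key=lambda title: math.dist(target, scaled[title]))


# Hypothetical features: (tempo in BPM, loudness in dB, length in seconds).
features = {
    "Disco Track": (120, -6.0, 210),
    "Ballad": (70, -14.0, 260),
    "Dance Track": (126, -5.0, 200),
    "New Release": (122, -6.5, 205),
}

print(closest_song("New Release", features))  # Disco Track
```

A brand new song with no listening history can still be matched this way, because the comparison needs only the audio features and not user behavior.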

How does this affect users moving forward?

Overall, we see how data plays a huge role in the music industry, from determining what makes a song popular and how to create the next chart-topping song, to how music apps serve users songs to their liking through data science models and algorithms. Despite the advanced automation done on the backend, there is a potential downside: more data gathered means more data sold for advertising, potentially riddling music apps with even more ads. We see that over time, data usage has evolved into complex models, and it’ll only become more advanced from here.
