Album sales and streams
Here we combine album sales from the RIAA Artists By Certified Album Units Sold dataset with stream counts from the Spotify Weekly Top 200 Songs Streaming Data, and look for a correlation between the two. The scatter plot suggests there may be a roughly logarithmic trend in which artists with more streams tend to have fewer certified album sales. The Pearson correlation printed below the code is close to zero (about -0.01), so any such relationship is not a linear one. A negative relationship would not be surprising: album sales declined sharply with the rise of streaming services like Spotify. Artists who were popular before Spotify therefore have more album sales, while artists who are popular today have fewer, because listeners can simply stream their albums on Spotify.
import pandas as pd
import plotly.express as px

# Stream counts per artist
streams = pd.read_csv('../dataset/streams_p_artists.csv')
streams = streams.rename(columns={'artist_individual': 'Artist'})
# Upper-case the names, presumably to match the spelling used in the RIAA data
streams['Artist'] = streams['Artist'].str.upper()
# Certified album units per artist
album_sales = pd.read_csv('../cleaned/riaakaggle.csv')
# Inner join: only artists present in both datasets are kept
df = pd.merge(streams, album_sales, on='Artist')
px.scatter(df, x='streams', y='Certified Units', hover_data=['Artist'], title='Album sales and streams per artist').show()
# Pearson correlation between streams and certified units
print(df['streams'].corr(df['Certified Units']))
-0.011337900102258495
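
The coefficient above is a Pearson correlation, which only measures linear association, so on its own it can neither confirm nor rule out the logarithmic trend described in the text. As a minimal sketch, one way to probe this would be to also correlate the sales with the logarithm of the streams and to compute a rank-based (Spearman) correlation. The helper name `correlation_summary` and the example numbers are hypothetical and only serve as illustration; they are not taken from the datasets used here.

```python
import numpy as np
import pandas as pd


def correlation_summary(df, x='streams', y='Certified Units'):
    """Return three correlation measures between columns x and y."""
    # Drop missing values and non-positive x so the logarithm is defined
    valid = df[df[x] > 0].dropna(subset=[x, y])
    return {
        # Linear association on the raw values
        'pearson': valid[x].corr(valid[y]),
        # Linear association after log-transforming x
        'pearson_log_x': np.log10(valid[x]).corr(valid[y]),
        # Spearman correlation, computed as Pearson on the ranks
        'spearman': valid[x].rank().corr(valid[y].rank()),
    }


# Made-up numbers for illustration only (not real artist data)
example = pd.DataFrame({
    'streams': [10, 100, 1000, 10000, 100000],
    'Certified Units': [50, 40, 30, 20, 10],
})
summary = correlation_summary(example)
print(summary)
```

With the merged data, calling `correlation_summary(df)` would show whether the log-transformed or rank-based coefficients are noticeably further from zero than the plain Pearson value. If they are not, the apparent trend in the scatter plot is probably not strong enough to support the interpretation.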