Indian Journal of Science and Technology
DOI: 10.17485/IJST/v15i36.2332
Year: 2022, Volume: 15, Issue: 36, Pages: 1786-1799
Original Article
Yap Kah Yee1*, Mafas Raheem1
1School of Computing, Asia Pacific University of Technology & Innovation, Kuala Lumpur, Malaysia
*Corresponding Author
Email: [email protected]
Received Date: 13 December 2022, Accepted Date: 23 August 2022, Published Date: 21 September 2022
Objectives: To examine whether combining social media features from YouTube videos with Spotify audio features can effectively predict music popularity.
Methods: A dataset was constructed from tracks newly released between May and August 2021. Audio features were acquired from Spotify, while social media features were obtained from the tracks' official videos on YouTube. Music popularity was defined using five metrics derived from Spotify Top 200 daily chart performance (Length, Max, Sum, Mean, and Debut), each measuring a different aspect of a song's success. Each metric was treated as a prediction target with three classes: Low, Medium, and High popularity. Four machine learning models were trained on the dataset in two stages: first on audio features only, then on audio and social media features together.
Findings: In the second stage, random forest outperformed the other three models on all four evaluation metrics. Averaged across the five popularity metrics, it achieved an accuracy of 79.6%, macro-precision of 74.5%, macro-recall of 73.2%, and macro F1-score of 73.1%. Moreover, comparing the two stages showed that incorporating social media variables significantly increased model performance relative to using audio features alone, with margins of improvement ranging from 10% to 60%. This demonstrates that YouTube-based social media features can help industry practitioners identify potential hits.
Novelty: This research appears to be the first study to date in the Hit Song Science domain that uses social media data from YouTube to predict hit songs. Furthermore, it promotes the joint use of audio features and social media data for predicting potential hits.
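The four evaluation metrics named in the Findings can be illustrated with a minimal sketch for a three-class target (Low, Medium, High). This is not the authors' code: the function name `macro_scores`, the toy labels, and the choice to compute macro F1 as the unweighted mean of per-class F1 scores are assumptions made here for illustration only.

```python
from collections import Counter

# Class names follow the abstract; everything else is a hypothetical sketch.
LABELS = ("Low", "Medium", "High")


def macro_scores(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        # Macro averaging: every class counts equally, whatever its size.
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels, invented purely to exercise the function.
    truth = ["Low", "Low", "Medium", "Medium", "High", "High"]
    preds = ["Low", "Medium", "Medium", "Medium", "High", "Low"]
    for name, value in macro_scores(truth, preds).items():
        print(f"{name}: {value:.3f}")
```

Macro averaging weights each class equally, so a model that ignores a small class is penalized even when overall accuracy stays high. That is a plausible reason to report macro scores alongside accuracy when the Low, Medium, and High classes are not equally sized.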
Keywords: hit song science; machine learning; audio features; social media features; Spotify; YouTube
© 2022 Yee & Raheem. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Published By Indian Society for Education and Environment (iSee)