Analysis of Music Features and Song Popularity Trends on Spotify Using K-Means and CRISP-DM

Sari Marlia, Kiki Setiawan, Christina Juliane


Spotify, a leading music streaming platform, has played an important role in changing how listeners access, enjoy, and interact with music. With millions of songs and extensive user data, Spotify offers an opportunity to understand listener behavior and the factors that contribute to a song's success and popularity. This research examines the relationship between music features and song popularity on Spotify by analyzing the sum of squared errors (SSE), Euclidean distances, and cluster centers for three dataset attributes: loudness, danceability, and energy. The research follows the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework. The K-Means clustering algorithm, run in the Weka data mining application, is used to identify the features that influence the success and popularity of songs on Spotify. The results show that clusters 1, 2, and 3 contain songs with high, medium, and low loudness, danceability, and energy, respectively. Popular songs on Spotify increasingly emphasize these three features: songs with high loudness, danceability, and energy are becoming more popular, while songs with low loudness, danceability, and energy are becoming less popular.
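
The clustering step summarized above can be illustrated with a minimal sketch. It is not the authors' Weka workflow: it uses plain NumPy, synthetic songs instead of the Kaggle dataset, and an assumed min-max scaling step. The function names (`kmeans`, `make_demo_data`, `min_max_scale`) and all numeric values are hypothetical and chosen only for illustration. The sketch shows how Euclidean distances drive cluster assignment, how cluster centers are recomputed, and how SSE is obtained.

```python
import numpy as np


def _lloyd(X, k, n_iter, rng):
    """One run of Lloyd's k-means from a random start."""
    # Assumption: start from k distinct songs picked at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Euclidean distance from every song to every cluster center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    # Final assignment and SSE: squared distance to the nearest center.
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    sse = float((dists[np.arange(len(X)), labels] ** 2).sum())
    return centers, labels, sse


def kmeans(X, k, n_init=10, n_iter=100, seed=0):
    """Run several random restarts and keep the result with the lowest SSE."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        result = _lloyd(X, k, n_iter, rng)
        if best is None or result[2] < best[2]:
            best = result
    return best


def make_demo_data(seed=0, n_per_group=50):
    """Synthetic songs (hypothetical values, not the study's data).

    Columns: loudness in dB, danceability, energy.
    """
    rng = np.random.default_rng(seed)
    means = np.array([[-20.0, 0.30, 0.20],
                      [-10.0, 0.55, 0.55],
                      [-4.0, 0.80, 0.85]])
    spread = np.array([1.0, 0.03, 0.03])
    groups = [m + spread * rng.standard_normal((n_per_group, 3)) for m in means]
    return np.vstack(groups)


def min_max_scale(X):
    """Rescale each column to [0, 1] so loudness (dB) does not dominate."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)


if __name__ == "__main__":
    X = min_max_scale(make_demo_data())
    centers, labels, sse = kmeans(X, k=3)
    order = np.argsort(centers[:, 0])  # sort clusters from low to high loudness
    print("cluster centers (loudness, danceability, energy):")
    print(centers[order])
    print("cluster sizes:", np.bincount(labels, minlength=3)[order])
    print("SSE:", sse)
```

Scaling matters here because loudness is measured in decibels while danceability and energy lie between 0 and 1; without it, the Euclidean distance would be driven almost entirely by loudness. The restarts are a common safeguard against a poor random initialization and are an assumption of this sketch, not a detail reported in the abstract.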



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.