Analysis of Higher Education Alumni Careers using LinkedIn Web Scraping and K-Means Clustering

Qurrotul Aini (SCOPUS ID: 54974128700), Eri Rustamaji, Denina Nastiti Putri Amani, Elvi Fetrina

Abstract


The use of alumni data to support curriculum evaluation continues to face challenges due to the limitations of conventional data collection methods, such as manual surveys, which often result in low response rates. Meanwhile, LinkedIn provides relatively comprehensive and up-to-date alumni career information; however, its potential for supporting tracer studies remains underutilized. This study aims to analyze the career patterns of higher education alumni using LinkedIn data collected through web scraping and analyzed with the K-Means clustering algorithm within the Knowledge Discovery in Databases (KDD) framework. The proposed approach applies the KDD process to generate a data-driven mapping of alumni career patterns as a complement to conventional tracer studies. The dataset consisted of 133 alumni profiles, which were processed through the stages of data selection, preprocessing, transformation, clustering, and evaluation. The results indicate that the majority of alumni are employed in the technology sector and occupy mid-level or specialist positions. The K-Means algorithm identified three distinct career clusters, representing career tendencies in business process and operations, systems and technology development, and data utilization and software quality assurance. These findings reveal the distribution of alumni competencies across business, data, and technology domains. However, the clustering quality was relatively low, as indicated by a Silhouette Score of 0.0321 and a Davies-Bouldin Index of 3.0487, suggesting limited separation among the identified clusters. Therefore, the clustering results should be interpreted as an initial mapping of alumni career patterns rather than definitive classifications. Overall, this study demonstrates the potential of professional social media data as a valuable resource for supporting data-driven alumni career analysis and complementing traditional tracer study practices.
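The two evaluation metrics reported above can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline and does not use their data: the toy points, the label lists, and the helper names (`group_by_label`, `silhouette_score`, `davies_bouldin_index`) are assumptions made for illustration only. The sketch shows why a Silhouette Score near 0 and a Davies-Bouldin Index well above 1 both point to overlapping clusters.

```python
from math import dist
from statistics import mean


def group_by_label(points, labels):
    """Collect points into a dict keyed by cluster label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_score(points, labels):
    """Mean silhouette coefficient (range -1 to 1); needs >= 2 clusters."""
    clusters = group_by_label(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(point, q) for q in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


def davies_bouldin_index(points, labels):
    """Mean worst-case ratio of within-cluster scatter to centroid distance.

    Lower is better. Assumes no two clusters share the same centroid.
    """
    clusters = group_by_label(points, labels)
    centroids = {
        k: tuple(mean(axis) for axis in zip(*members))
        for k, members in clusters.items()
    }
    scatter = {
        k: mean(dist(p, centroids[k]) for p in members)
        for k, members in clusters.items()
    }
    ratios = [
        max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return mean(ratios)


# Toy data (hypothetical): two tight groups far apart on the x-axis.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
separated = [0, 0, 1, 1]  # labels that follow the two groups
mixed = [0, 1, 0, 1]      # labels that cut across the two groups

for name, labels in (("separated", separated), ("mixed", mixed)):
    print(name, silhouette_score(points, labels), davies_bouldin_index(points, labels))
```

With labels that follow the natural groups, the silhouette value is close to 1 and the Davies-Bouldin value is close to 0. With labels that cut across the groups, the silhouette value drops and the Davies-Bouldin value rises, which is the direction of the values reported in the abstract.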

Keywords


career patterns; alumni data; K-Means clustering; LinkedIn; web scraping

DOI: https://doi.org/10.32520/stmsi.v15i6.6437

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.