A Web Scraper for Data Mining Purposes
DOI: https://doi.org/10.32520/stmsi.v13i3.4107
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.