Exploring Web Scraping Techniques in Data Mining: A Python-Based Data Search Approach
(1) Information Technology Study Program, Universitas Timor, Indonesia
(2) Mathematics Education Study Program, Universitas Timor, Indonesia
(*) Corresponding Author
References
S. Munzert, C. Rubba, P. Meißner, and D. Nyhuis, Automated data collection with R: A practical guide to web scraping and text mining, no. 1. John Wiley & Sons, 2014.
V. Krotov, L. Johnson, and L. Silva, “Tutorial: Legality and Ethics of Web Scraping,” Communications of the Association for Information Systems, vol. 47, 2020, doi: 10.17705/1CAIS.04724.
V. Franzoni and A. Milani, “A Semantic Comparison of Clustering Algorithms for the Evaluation of Web-Based Similarity Measures,” in Computational Science and Its Applications – ICCSA 2016, O. Gervasi et al., Eds., Lecture Notes in Computer Science, vol. 9790. Cham: Springer, 2016. doi: 10.1007/978-3-319-42092-9_34.
D. Chrisinta, I. M. Sumertajaya, and I. Indahwati, “Evaluasi Kinerja Metode Cluster Ensemble Dan Latent Class Clustering Pada Peubah Campuran,” Indonesian Journal of Statistics and Its Applications, vol. 4, no. 3, pp. 448–461, 2020, doi: 10.29244/ijsa.v4i3.630.
D. Chrisinta, “Identifikasi Karakteristik Desa di Provinsi Bengkulu Tahun 2018 Berdasarkan Latent Class Cluster (LCC),” in Seminar Nasional Official Statistics, 2022, pp. 927–936. doi: 10.34123/semnasoffstat.v2022i1.1287.
A. Mabrouk, R. P. D. Redondo, and M. Kayed, “Seopinion: summarization and exploration of opinion from e-commerce websites,” Sensors, vol. 21, no. 2, p. 636, 2021, doi: 10.3390/s21020636.
Z. Zuo, “Sentiment analysis of steam review datasets using naive bayes and decision tree classifier,” no. 3, 2018. doi: 10.30598/barekengvol14iss3pp343-356.
I. N. Husada, E. H. Fernando, H. Sagala, A. E. Budiman, and H. Toba, “Ekstraksi dan Analisis Produk di Marketplace Secara Otomatis dengan Memanfaatkan Teknologi Web Crawling,” Jurnal Teknik Informatika dan Sistem Informasi, vol. 5, no. 3, pp. 2443–2229, 2019, doi: 10.28932/jutisi.v5i3.1977.
A. Priadana and A. W. Murdiyanto, “Analisis Waktu Terbaik untuk Menerbitkan Konten di Instagram untuk Menjangkau Audiens,” Jurnal Penelitian Pers dan Komunikasi Pembangunan, vol. 24, no. 1, pp. 59–70, 2020, doi: 10.46426/jp2kp.v24i1.118.
E. Yuniar, D. Safiroh, D. Wahyuningsih, S. Informasi, S. Ppkia, and P. Paramita, “Implementasi Scrapping Data Untuk Sentiment Analysis Pengguna Dompet Digital dengan Menggunakan Algoritma Machine Learning,” Jurnal Janitra Informatika dan Sistem Informasi, vol. 2, no. 1, pp. 35–42, 2022, doi: 10.25008/janitra.v2i1.145.
T. M. Fahrudin, P. A. Riyantoko, and K. M. Hindrayani, “Implementation of Web Scraping on Google Search Engine for Text Collection Into Structured 2D List,” Telematika: Jurnal Informatika dan Teknologi Informasi, vol. 20, no. 2, pp. 139–152, 2023, doi: 10.31315/telematika.v20i2.9575.
S. C. M. de S. Sirisuriya, “A comparative study on web scraping,” in Proceedings of 8th International Research Conference, 2015, pp. 135–140.
J. W. Seifert, Data mining: An overview. National security issues, 2004.
M. F. Sanner, “Python: a programming language for software integration and development,” J Mol Graph Model, vol. 17, no. 1, pp. 57–61, 1999, doi: 10.1016/j.str.2005.01.010.
D. Chrisinta and J. E. Simarmata, “Analisis Sentimen Penilaian Masyarakat Terhadap Pejabat Publik Menggunakan Algoritma Naïve Bayes Classifier,” Komputika: Jurnal Sistem Komputer, vol. 12, no. 1, pp. 93–101, 2023, doi: 10.34010/KOMPUTIKA.V12I1.9638.
DOI: http://dx.doi.org/10.30998/faktorexacta.v17i1.22393
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.