Comparison of the Davies Bouldin, Elbow, and Silhouette Evaluation Methods for Clustering Models Using the K-Means Algorithm

Muhammad Sholeh(1*), Khurotul Aeni(2)

(1) 
(2) Universitas Peradaban
(*) Corresponding Author

Abstract


One of the most frequently used data mining models is the clustering model, which groups the records of a dataset so that the data within it can be distinguished. The best number of groups can be determined by evaluating the clustering model, and several evaluation methods are available for this purpose. This study clusters the results of a survey on reviews of tourist destinations covering 10 categories. The method used is a data mining method, namely Knowledge Discovery in Databases (KDD). The stages of KDD include data selection, data cleaning, transformation, the data mining process, and evaluation of the model results. The clustering model is built from a public dataset, tripadvisor_review.csv, and the data are clustered with the K-means algorithm. The clustering results are then tested by comparing evaluation methods, which are used to select the best number of clusters. To find the best clustering result, cluster counts from 2 to 15 are tested. The evaluation uses the Davies Bouldin, Elbow, and Silhouette methods. The results show that all three evaluation methods recommend grouping the dataset into 2 clusters.
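The evaluation loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn is used, and it replaces tripadvisor_review.csv with a synthetic 10-column array (two well-separated groups) so that it runs on its own. The names `X`, `results`, `best_db_k`, and `best_silhouette_k` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score, silhouette_score

# Assumption: synthetic stand-in for the survey data, 200 rows x 10 numeric
# columns drawn from two well-separated groups. The real study uses
# tripadvisor_review.csv, which is not loaded here.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 10)),
    rng.normal(loc=10.0, scale=1.0, size=(100, 10)),
])

results = []
for k in range(2, 16):  # cluster counts 2 to 15, as in the abstract
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    results.append({
        "k": k,
        # Elbow method: within-cluster sum of squares, usually read off a plot
        "inertia": model.inertia_,
        # Davies Bouldin index: lower is better
        "davies_bouldin": davies_bouldin_score(X, model.labels_),
        # Silhouette coefficient: higher is better
        "silhouette": silhouette_score(X, model.labels_),
    })

best_db_k = min(results, key=lambda r: r["davies_bouldin"])["k"]
best_silhouette_k = max(results, key=lambda r: r["silhouette"])["k"]

for r in results:
    print(f"k={r['k']:2d}  inertia={r['inertia']:10.2f}  "
          f"DB={r['davies_bouldin']:.3f}  silhouette={r['silhouette']:.3f}")
print("Best k by Davies Bouldin:", best_db_k)
print("Best k by Silhouette:", best_silhouette_k)
```

The Elbow method has no single agreed numeric criterion, so the sketch only records the inertia for each k; the bend in that curve is typically judged by plotting inertia against k. On real survey data the three methods need not agree, which is why the study compares them.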


Keywords


Clustering; evaluation; Davies Bouldin; Elbow; Silhouette


DOI: http://dx.doi.org/10.30998/string.v8i1.16388


Copyright (c) 2023 Muhammad Sholeh, Khurotul Aeni

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

 

STRING (Satuan Tulisan Riset dan Inovasi Teknologi)


