B2B Customer Churn Prediction with Segmentation Using Bisecting K-Means and Long Short-Term Memory

Khairina Wardina(1*), Ade Rahmat(2), Mohammad Syafrullah(3)

(1) Master of Computer Science Program, Universitas Budi Luhur
(2) Universitas Budi Luhur
(3) Universitas Budi Luhur
(*) Corresponding Author

Abstract


In the competitive B2B sector, customer churn is a key challenge. This study examines a Fast-Moving Consumer Goods (FMCG) distribution company in North Sulawesi whose churn rate rose steadily from 2019 to 2022 despite efforts to reduce it. Because the customer relationships are non-contractual, churn is not marked by an explicit cancellation and is harder to identify. The study addresses this by combining Bisecting K-Means for customer segmentation with Long Short-Term Memory (LSTM) for churn prediction based on monthly revenue. The Bisecting K-Means algorithm produced three clusters with a Davies-Bouldin Index (DBI) of 0.46, indicating compact, well-separated clusters (lower DBI values are better). The LSTM model achieved a validation accuracy of 96%, a test accuracy of 95%, and an AUC-ROC of 97%. Cluster 1 has the highest churn rate at 100%, followed by cluster 2 at 16.31%, while cluster 0 has the lowest at 3.21%. Of the 5,163 customers in the test data, 920 were identified as churned.
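As a rough illustration of the cluster-quality metric reported above, the Davies-Bouldin Index can be computed as the average, over clusters, of the worst-case ratio between within-cluster scatter and between-centroid distance. The sketch below is not the authors' implementation: the function names `centroid` and `davies_bouldin` and the four toy points are hypothetical, chosen only to make the calculation easy to follow by hand.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(points, labels):
    """Davies-Bouldin Index; lower values mean tighter, better-separated clusters."""
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    ids = sorted(clusters)
    cents = {c: centroid(clusters[c]) for c in ids}
    # Scatter of a cluster: mean Euclidean distance of its members to the centroid.
    scatter = {
        c: sum(math.dist(p, cents[c]) for p in clusters[c]) / len(clusters[c])
        for c in ids
    }
    # For each cluster, take its worst (largest) similarity ratio to any other cluster.
    total = 0.0
    for i in ids:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in ids
            if j != i
        )
    return total / len(ids)


# Hypothetical toy data (not from the study): two clusters of two points each.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
dbi = davies_bouldin(points, labels)
```

In this toy case each cluster has a scatter of 1 and the centroids are 10 apart, so the index is (1 + 1) / 10 = 0.2. A value such as the reported 0.46 is read the same way: the closer to zero, the more compact and separated the clusters.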






DOI: http://dx.doi.org/10.30998/faktorexacta.v18i3.27087





Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
