Optimizing a Semi-Supervised Learning Model with SVM and Naïve Bayes

Muhammad Siddik Hasibuan(1*), Yusuf Ramadhan Nasution(2)

(1) Universitas Islam Negeri Sumatera Utara
(2) Universitas Islam Negeri Sumatera Utara
(*) Corresponding Author

Abstract


Label determination is a central step in text mining. Besides lexicon-based methods, labels can be assigned by human annotators who interpret each sentence as positive, negative, or neutral. Sentiment analysis measures the opinions expressed in tweets by analyzing their text. This research consists of a training process and a testing process, and its focus is determining the polarity of text (positive, negative, or neutral) using a semi-supervised machine learning model. In the training stage, the data were labeled by human judgment as positive, negative, or neutral. In the testing stage, the data are unlabeled, and the role of the learning model is to assign a label to each test item. The semi-supervised learning (SSL) technique is used to label the unlabeled data, with Support Vector Machine (SVM) and Naïve Bayes (NB) as the algorithms trained on the labeled data. Cross-validation is used for statistical evaluation, and a confusion matrix is used to measure the accuracy of the two algorithms. In this study SVM achieved higher accuracy than NB: 88.97% for SVM versus 83.02% for NB.
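The workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not say which SSL variant was used, so the sketch assumes self-training (pseudo-labeling): a classifier fitted on the human-labeled tweets labels the unlabeled ones, and confident predictions are added to the training set. A small pure-Python multinomial Naïve Bayes stands in for the classifier; an SVM could be substituted at the same point. The function names, the 0.6 confidence threshold, and the toy tweets are all hypothetical, and cross-validation is omitted for brevity.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model (word counts per class)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return (best label, posterior probability) with add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # words unseen in training are ignored
                count = word_counts[label][tok] + 1
                score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    top = max(log_scores.values())
    probs = {l: math.exp(s - top) for l, s in log_scores.items()}
    best = max(probs, key=probs.get)
    return best, probs[best] / sum(probs.values())

def self_train(docs, labels, unlabeled, threshold=0.6, max_rounds=5):
    """Assumed SSL loop: move confident pseudo-labels into the training set."""
    docs, labels, pool = list(docs), list(labels), list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(docs, labels)
        remaining = []
        for text in pool:
            label, confidence = predict_nb(model, text)
            if confidence >= threshold:
                docs.append(text)
                labels.append(label)
            else:
                remaining.append(text)
        if not remaining or len(remaining) == len(pool):
            break  # everything labeled, or no progress this round
        pool = remaining
    return train_nb(docs, labels), labels

def confusion_matrix(true, pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true, pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Share of items on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Hypothetical toy data, not taken from the study.
CLASSES = ["positive", "negative", "neutral"]
labeled_x = ["good great service", "bad awful delay", "flight at noon"]
labeled_y = ["positive", "negative", "neutral"]
unlabeled_x = ["great service today", "awful delay again", "noon flight schedule"]

model, final_labels = self_train(labeled_x, labeled_y, unlabeled_x)
pseudo_labels = final_labels[len(labeled_y):]

test_x = ["good service", "bad delay", "flight noon"]
test_y = ["positive", "negative", "neutral"]
predicted = [predict_nb(model, text)[0] for text in test_x]
matrix = confusion_matrix(test_y, predicted, CLASSES)
print(pseudo_labels, matrix, accuracy(matrix))
```

On real tweets the threshold controls the trade-off between how many pseudo-labels are accepted and how much label noise enters the training set, so it would need tuning rather than the fixed value assumed here.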


Keywords


semi-supervised learning; SVM; NB



DOI: http://dx.doi.org/10.30998/string.v9i2.24556

Copyright (c) 2024 Muhammad Siddik Hasibuan, Yusuf Ramadhan Nasution

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

 

STRING (Satuan Tulisan Riset dan Inovasi Teknologi)


