A new AI-based method for clustering survey responses

Special issue 5/2023, vol. 54

Jan Franciszek Laskowski ¹, Paweł Tomiło ¹

¹ Lublin University of Technology

Received: 25-07-2023

Accepted: 01-12-2023

Published: 18-12-2023

Corresponding author: Jan Franciszek Laskowski, Lublin University of Technology

JoMS 2023;54(Special issue 5):355-377

DOI: https://doi.org/10.13166/jms/176171

KEYWORDS

Survey data analysis

artificial intelligence

variational autoencoder (VAE)

machine learning

pattern discovery

exploratory data analysis

FIELDS

Management sciences

ABSTRACT

Objectives:
Many research projects, particularly in the social sciences, depend on clustering survey responses, yet traditional clustering algorithms have several drawbacks when applied to survey data. Recent developments in artificial intelligence (AI) and machine learning (ML) make it possible to analyze survey data more effectively. The aim of this article is to present a new AI-based method of clustering survey responses using a Variational Autoencoder (VAE).

Material and methods:
To assess clustering effectiveness, the new VAE clustering method was compared with K-means, PCA combined with K-means, and Agglomerative Hierarchical Clustering, using the Silhouette score, the Calinski-Harabasz score, and the Davies-Bouldin score as metrics.
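The three validation indices named above can be sketched in plain Python. This is an illustrative sketch only, not the authors' implementation: the toy 2-D points, the hand-assigned labelings `compact` and `mixed`, and all function names are assumptions introduced here, and Euclidean distance is assumed throughout.

```python
import math
from collections import defaultdict


def group_by_label(X, labels):
    """Map each cluster label to the list of points assigned to it."""
    groups = defaultdict(list)
    for point, label in zip(X, labels):
        groups[label].append(point)
    return groups


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def silhouette(X, labels):
    """Mean of (b - a) / max(a, b) over all points; needs at least 2 clusters."""
    groups = group_by_label(X, labels)
    total = 0.0
    for point, label in zip(X, labels):
        own = groups[label]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in pts) / len(pts)
            for other, pts in groups.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(X)


def calinski_harabasz(X, labels):
    """Ratio of between-cluster to within-cluster dispersion."""
    groups = group_by_label(X, labels)
    n, k = len(X), len(groups)
    overall = centroid(X)
    centers = {lab: centroid(pts) for lab, pts in groups.items()}
    between = sum(
        len(pts) * math.dist(centers[lab], overall) ** 2
        for lab, pts in groups.items()
    )
    within = sum(
        math.dist(p, centers[lab]) ** 2
        for lab, pts in groups.items()
        for p in pts
    )
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(X, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / separation."""
    groups = group_by_label(X, labels)
    centers = {lab: centroid(pts) for lab, pts in groups.items()}
    scatter = {
        lab: sum(math.dist(p, centers[lab]) for p in pts) / len(pts)
        for lab, pts in groups.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in groups
            if j != i
        )
        for i in groups
    ]
    return sum(worst) / len(worst)


# Hypothetical toy data: two well-separated groups of "responses".
X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
compact = [0, 0, 0, 1, 1, 1]  # labeling that matches the two groups
mixed = [0, 1, 0, 1, 0, 1]    # labeling that cuts across the groups

for name, labels in (("compact", compact), ("mixed", mixed)):
    print(name, silhouette(X, labels), calinski_harabasz(X, labels),
          davies_bouldin(X, labels))
```

Higher Silhouette and Calinski-Harabasz values indicate better-separated clusters, while a lower Davies-Bouldin value is better, so the three scores should not be compared on the same direction of scale. In practice, scikit-learn provides `silhouette_score`, `calinski_harabasz_score`, and `davies_bouldin_score` in `sklearn.metrics`, which are preferable to hand-written versions for real survey data.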

Results:
On the Silhouette score, the developed VAE method achieved a 69% higher average clustering effectiveness than the other methods. On the Calinski-Harabasz score and the Davies-Bouldin score, the VAE method outperformed the other methods by 164% and 111%, respectively.

Conclusions:
The VAE method grouped respondents' answers most effectively and made it possible to capture complex relationships and patterns in the data. In addition, the method is suitable for analyzing different types of survey data (continuous, categorical, and mixed) and is robust to noise and missing data.

License

This work is available under the Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA) license.

eISSN:	2391-789X
ISSN:	1734-2031
