A Fast Clustering Method for Large Data Sets

Omar Kettani


This paper proposes a deterministic clustering method based on the Katsavounidis, Kuo & Zhang (KKZ) seed procedure. The approach has a lower computational complexity than the widely used k-means algorithm. We compare it with a related deterministic clustering method, KKZ_k-means (k-means initialized by KKZ). Performance evaluation on various benchmark datasets demonstrates its effectiveness in terms of average Silhouette index.
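For orientation, the KKZ seed procedure as it is commonly described can be sketched as follows: the first seed is the point with the largest Euclidean norm, and each later seed is the point farthest from its nearest already-chosen seed. The function names `kkz_seeds` and `assign_to_nearest` are hypothetical, and the single nearest-seed assignment pass is only an illustrative assumption. The abstract does not spell out the proposed method, so this should not be read as the author's algorithm.

```python
import math

def kkz_seeds(points, k):
    """Pick k seeds from a list of equal-length numeric tuples using the KKZ rule."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    origin = (0.0,) * len(points[0])
    # First seed: the point with the largest Euclidean norm.
    first = max(range(len(points)), key=lambda i: math.dist(points[i], origin))
    seeds = [first]
    # min_dist[i] holds the distance from point i to its nearest seed so far.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(seeds) < k:
        # Next seed: the point farthest from its nearest existing seed.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        seeds.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return [points[i] for i in seeds]

def assign_to_nearest(points, seeds):
    """Label each point with the index of its nearest seed (assumed step, no iteration)."""
    return [min(range(len(seeds)), key=lambda j: math.dist(p, seeds[j]))
            for p in points]

if __name__ == "__main__":
    data = [(0, 0), (1, 0), (0, 2), (10, 10), (10, 11), (-8, -8)]
    seeds = kkz_seeds(data, 3)
    print(seeds)
    print(assign_to_nearest(data, seeds))
```

Seeding in this way costs on the order of n·k distance evaluations and involves no random choices, which is what makes a KKZ-based procedure deterministic.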


Ankerst, M., M. Breunig, H.P. Kriegel and J. Sander, 1999. OPTICS: Ordering points to identify the clustering structure. Proceedings of the ACM SIGMOD International Conference on Management of Data, May 31-June 3, ACM Press, Philadelphia, United States, pp: 49-60.

Aloise, D., A. Deshpande, P. Hansen and P. Popat, 2009. NP-hardness of Euclidean sum-of-squares clustering. Machine Learning, 75: 245-249. doi:10.1007/s10994-009-5103-0.

Lloyd, S.P., 1982. Least squares quantization in PCM. IEEE Trans. Inform. Theory, 28: 129-136.

MacQueen, J.B., 1967. Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, (MSP’67), Berkeley, University of California Press, pp: 281-297.

Katsavounidis, I., C.C.J. Kuo and Z. Zhang, 1994. A new initialization technique for generalized Lloyd iteration. IEEE Sig. Process. Lett., 1: 144-146.

Asuncion, A. and D.J. Newman, 2007. UCI Machine Learning Repository [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, School of Information and Computer Science.

Kaufman, L. and P.J. Rousseeuw, 1990. Finding Groups in Data: An Introduction to Cluster Analysis. Wiley.



Copyright (c) 2021 Journal of Electrical Engineering, Electronics, Control and Computer Science

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.