Electronic Journal of Statistics

Statistical properties of convex clustering

Kean Ming Tan and Daniela Witten

Abstract

In this manuscript, we study the statistical properties of convex clustering. We establish that convex clustering is closely related to single linkage hierarchical clustering and $k$-means clustering. In addition, we derive the range of the tuning parameter for convex clustering that yields a non-trivial solution. We also provide an unbiased estimator of the degrees of freedom, and provide a finite sample bound for the prediction error for convex clustering. We compare convex clustering to some traditional clustering methods in simulation studies.
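For orientation, convex clustering is commonly posed as minimizing a squared-error fit term plus a fusion penalty on pairwise differences between per-observation centroids, scaled by a tuning parameter. The sketch below evaluates one such objective. The uniform pairwise weights, the $\ell_2$ norm in the penalty, the toy data, and all names (`convex_clustering_objective`, `X`, `U`, `lam`) are assumptions made for illustration and are not taken from the abstract. It shows the two trivial extremes that bracket the "non-trivial" range of the tuning parameter: no fusion (each centroid equals its observation) and complete fusion (every centroid equals the overall mean).

```python
import math
from itertools import combinations


def convex_clustering_objective(X, U, lam):
    """Evaluate 0.5 * sum_i ||x_i - u_i||^2 + lam * sum_{i<j} ||u_i - u_j||_2.

    X and U are lists of equal-length coordinate lists (one row per observation).
    Uniform weights and the l2 fusion penalty are illustrative assumptions.
    """
    # Squared-error fit between each observation and its own centroid.
    fit = 0.5 * sum(
        (x - u) ** 2 for xi, ui in zip(X, U) for x, u in zip(xi, ui)
    )
    # Fusion penalty: sum of Euclidean distances over all centroid pairs.
    penalty = sum(
        math.dist(U[i], U[j]) for i, j in combinations(range(len(U)), 2)
    )
    return fit + lam * penalty


# Toy data (hypothetical): three observations in two dimensions.
X = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0]]

# Extreme 1: no fusion. With lam = 0 the choice U = X gives objective 0.
obj_no_fusion = convex_clustering_objective(X, X, 0.0)

# Extreme 2: complete fusion. All centroids equal the column-wise mean,
# so the penalty vanishes and only the fit term remains, whatever lam is.
mean = [sum(col) / len(X) for col in zip(*X)]
U_fused = [mean[:] for _ in X]
obj_fused = convex_clustering_objective(X, U_fused, 10.0)
```

Under these assumptions, tuning-parameter values between the two extremes are the ones for which the minimizer groups some, but not all, centroids together.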

Article information

Source
Electron. J. Statist., Volume 9, Number 2 (2015), 2324-2347.

Dates
Received: March 2015
First available in Project Euclid: 14 October 2015

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1444828331

Digital Object Identifier
doi:10.1214/15-EJS1074

Mathematical Reviews number (MathSciNet)
MR3411231

Zentralblatt MATH identifier
1336.62193

Keywords
Degrees of freedom; fusion penalty; hierarchical clustering; $k$-means; prediction error; single linkage

Citation

Tan, Kean Ming; Witten, Daniela. Statistical properties of convex clustering. Electron. J. Statist. 9 (2015), no. 2, 2324–2347. doi:10.1214/15-EJS1074. https://projecteuclid.org/euclid.ejs/1444828331
