## Electronic Journal of Statistics

### Finite mixture regression: A sparse variable selection by model selection for clustering

Emilie Devijver

#### Abstract

We consider a finite mixture of Gaussian regression models for high-dimensional data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by a maximum likelihood estimator restricted to the relevant variables selected by an $\ell_{1}$-penalized maximum likelihood estimator. We obtain an oracle inequality satisfied by this estimator for a Jensen-Kullback-Leibler type loss. This oracle inequality is deduced from a general model selection theorem for maximum likelihood estimators on a random model subcollection. From it we derive the shape of the penalty in the criterion, which depends on the complexity of the random model collection.
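
The procedure described in the abstract can be sketched in formulas. The notation below is an assumption made for illustration and is not taken from the paper: $K$ is the number of mixture components, $\pi_k$, $\beta_k$ and $\Sigma_k$ are the proportion, regression coefficients and covariance of component $k$, $\varphi$ is the Gaussian density, $\lambda$ is a regularization parameter, and $J$ is a set of relevant variables. The paper may use a different parametrization. The conditional mixture density is taken to be

$$
s_{\xi}(y \mid x) = \sum_{k=1}^{K} \pi_k \, \varphi\left(y;\ \beta_k x,\ \Sigma_k\right), \qquad \xi = (\pi_k, \beta_k, \Sigma_k)_{1 \le k \le K}.
$$

For each $\lambda$ on a grid, an $\ell_{1}$-penalized maximum likelihood estimator is computed, and $J_{\lambda}$ collects the variables whose coefficient is nonzero in at least one component:

$$
\hat{\xi}^{\mathrm{Lasso}}(\lambda) \in \arg\min_{\xi} \left\{ -\frac{1}{n} \sum_{i=1}^{n} \log s_{\xi}(y_i \mid x_i) + \lambda \sum_{k=1}^{K} \lVert \beta_k \rVert_{1} \right\}.
$$

The maximum likelihood estimator $\hat{s}_{(K,J)}$ is then refitted on each model $S_{(K,J)}$, in which the coefficients of variables outside $J$ are set to zero, and a model is chosen from the resulting random collection $\mathcal{M}$ by a penalized criterion:

$$
(\hat{K}, \hat{J}) \in \arg\min_{(K,J) \in \mathcal{M}} \left\{ -\frac{1}{n} \sum_{i=1}^{n} \log \hat{s}_{(K,J)}(y_i \mid x_i) + \operatorname{pen}(K,J) \right\}.
$$

The form of $\operatorname{pen}(K,J)$ is deliberately left unspecified here. Its shape is what the oracle inequality provides.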

Electron. J. Statist., Volume 9, Number 2 (2015), 2642-2674.

First available in Project Euclid: 8 December 2015

https://projecteuclid.org/euclid.ejs/1449582158

doi:10.1214/15-EJS1082

MR3432429

Zbl 1329.62279

Devijver, Emilie. Finite mixture regression: A sparse variable selection by model selection for clustering. Electron. J. Statist. 9 (2015), no. 2, 2642-2674. doi:10.1214/15-EJS1082. https://projecteuclid.org/euclid.ejs/1449582158