African Journal of Applied Statistics

A comparative study on the regularized versions of discriminant analysis: An application to gene expression data

Olusola Samuel MAKINDE

Abstract

Discriminant analysis has been used in many applications for classification and dimension reduction when the ratio of sample size to dimension diverges. However, the method is almost impossible to apply whenever the sample size is smaller than the dimension of the data, because the sample covariance matrices are then singular. Efforts have been made to circumvent this problem by either regularising or penalising the sample covariance matrices of the competing classes of observations. However, the presence of redundant features in the data raises the misclassification rate of the discriminant rule. In this paper, we explore shrunken centroid regularised discriminant analysis for gene selection, and regularised discriminant analysis as a classification method based on various versions of regularised covariance matrices of the competing classes of gene expression levels. The performance of regularised linear and quadratic discriminant analysis, in comparison with some other classification methods, is illustrated using several gene expression data sets as well as simulated data.
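
To make the idea of a regularised covariance matrix concrete, the following is a minimal sketch and not the paper's implementation. It assumes one common form of regularisation, shrinking the pooled covariance towards a scaled identity matrix, and plugs the result into the usual linear discriminant score. The function names, the shrinkage weight `alpha`, and the simulated data are all hypothetical; the gene selection step (shrunken centroids) is not shown.

```python
import numpy as np


def pooled_covariance(X, y):
    """Pooled within-class sample covariance, plus class labels and means."""
    classes, idx = np.unique(y, return_inverse=True)
    means = np.array([X[idx == k].mean(axis=0) for k in range(len(classes))])
    centred = X - means[idx]  # subtract each row's own class mean
    S = centred.T @ centred / (len(y) - len(classes))
    return classes, means, S


def shrink_covariance(S, alpha):
    """Assumed regulariser: convex combination of S and a scaled identity."""
    p = S.shape[0]
    target = (np.trace(S) / p) * np.eye(p)
    return (1.0 - alpha) * S + alpha * target


def predict_lda(X, classes, means, S_reg, priors):
    """Linear discriminant rule: pick the class with the largest score."""
    A = np.linalg.solve(S_reg, means.T)  # columns are S_reg^{-1} mu_k
    scores = X @ A - 0.5 * np.sum(means.T * A, axis=0) + np.log(priors)
    return classes[np.argmax(scores, axis=1)]


# Simulated "gene expression" data with far fewer samples than features.
rng = np.random.default_rng(0)
n_per_class, p = 10, 50
shift = np.zeros(p)
shift[:5] = 3.0  # only the first five features separate the classes
X = np.vstack([
    rng.normal(size=(n_per_class, p)),
    rng.normal(size=(n_per_class, p)) + shift,
])
y = np.repeat([0, 1], n_per_class)

classes, means, S_raw = pooled_covariance(X, y)
S_reg = shrink_covariance(S_raw, alpha=0.5)  # alpha chosen arbitrarily here
priors = np.array([np.mean(y == c) for c in classes])
pred = predict_lda(X, classes, means, S_reg, priors)

print("rank of raw pooled covariance:", np.linalg.matrix_rank(S_raw), "of", p)
print("smallest eigenvalue after shrinkage:", np.linalg.eigvalsh(S_reg).min())
print("training accuracy:", np.mean(pred == y))
```

With 20 samples and 50 features the raw pooled covariance is rank-deficient and cannot be inverted, whereas the shrunken version is positive definite for any `alpha` greater than zero. In practice `alpha` would be tuned, for example by cross-validation, rather than fixed.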

Résumé (translated from French)

Discriminant analysis has been used in many applications for classification and dimension reduction when the ratio of sample size to dimension diverges. However, applying the method is problematic when the sample size is smaller than the dimension of the data. Efforts have been made to resolve this difficulty by either regularising or penalising the empirical variance-covariance matrices of the competing classes of observations. However, the presence of redundant features increases the misclassification rate of the discriminant rule. In this paper, we explore the method known as shrunken centroid regularized discriminant analysis for gene expression, and regularised discriminant analysis as a classification tool based on several versions of regularised covariance matrices of the competing classes of gene expression levels. The performance of regularised linear and quadratic discriminant analysis, in comparison with some other classification methods, is illustrated with data sets and a simulation study.

Article information

Source
Afr. J. Appl. Stat., Volume 4, Number 1 (2017), 273-287.

Dates
First available in Project Euclid: 16 May 2019

Permanent link to this document
https://projecteuclid.org/euclid.ajas/1557972223

Digital Object Identifier
doi:10.16929/ajas/2017.273.215

Subjects
Primary: 62H30: Classification and discrimination; cluster analysis [See also 68T10, 91C20]
62H10: Distribution of statistics
60E05: Distributions: general theory

Keywords
regularised covariance matrices; discriminant analysis; high dimensional data; shrunken centroid; gene expression data

Citation

MAKINDE, Olusola Samuel. A comparative study on the regularized versions of discriminant analysis: An application to gene expression data. Afr. J. Appl. Stat. 4 (2017), no. 1, 273--287. doi:10.16929/ajas/2017.273.215. https://projecteuclid.org/euclid.ajas/1557972223

