## The Annals of Applied Statistics

- Ann. Appl. Stat.
- Volume 8, Number 3 (2014), 1416-1442.

### Functional clustering in nested designs: Modeling variability in reproductive epidemiology studies

Abel Rodriguez and David B. Dunson

#### Abstract

We discuss functional clustering procedures for nested designs, where multiple curves are collected for each subject in the study. We start by considering the application of standard functional clustering tools to this problem, which leads to groupings based on the average profile for each subject. After discussing some of the shortcomings of this approach, we present a mixture model based on a generalization of the nested Dirichlet process that clusters subjects based on the distribution of their curves. By using mixtures of generalized Dirichlet processes, the model induces a much more flexible prior on the partition structure than other popular model-based clustering methods, allowing for different rates of introduction of new clusters as the number of observations increases. The methods are illustrated using hormone profiles from multiple menstrual cycles collected for women in the Early Pregnancy Study.

#### Article information

**Source**

Ann. Appl. Stat. Volume 8, Number 3 (2014), 1416-1442.

**Dates**

First available in Project Euclid: 23 October 2014

**Permanent link to this document**

https://projecteuclid.org/euclid.aoas/1414091219

**Digital Object Identifier**

doi:10.1214/14-AOAS751

**Mathematical Reviews number (MathSciNet)**

MR3271338

**Zentralblatt MATH identifier**

1303.62040

**Keywords**

Nonparametric Bayes; nested Dirichlet process; functional clustering; hierarchical functional data; hormone profile

#### Citation

Rodriguez, Abel; Dunson, David B. Functional clustering in nested designs: Modeling variability in reproductive epidemiology studies. Ann. Appl. Stat. 8 (2014), no. 3, 1416-1442. doi:10.1214/14-AOAS751. https://projecteuclid.org/euclid.aoas/1414091219

#### Supplemental materials

- Supplementary material: Supplement to “Functional clustering in nested designs: Modeling variability in reproductive epidemiology studies”. The supplementary materials contain the details of the Markov chain Monte Carlo algorithm used to fit the models introduced in the paper. Digital Object Identifier: doi:10.1214/14-AOAS751SUPP