Adaptive estimation for Hawkes processes; application to genome analysis

Patricia Reynaud-Bouret; Sophie Schbath

doi:10.1214/10-AOS806

Ann. Statist. 38(5): 2781-2822 (October 2010). DOI: 10.1214/10-AOS806

Abstract

The aim of this paper is to provide a new method for detecting either favored or avoided distances between genomic events along DNA sequences. These events are modeled by a Hawkes process. The biological problem is complex enough to require a nonasymptotic penalized model selection approach. We provide a theoretical penalty that satisfies an oracle inequality even for quite complex families of models. The resulting theoretical estimator is shown to be adaptive minimax for Hölderian functions with regularity in (1/2, 1]; these aspects have not yet been studied for the Hawkes process. Moreover, we introduce an efficient strategy, named Islands, which is not classically used in model selection but happens to be particularly relevant to the biological question we want to answer. Since a multiplicative constant in the theoretical penalty is not computable in practice, we provide extensive simulations to find a data-driven calibration of this constant. The results obtained on real genomic data are consistent with biological knowledge and possibly refine it.
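
For readers unfamiliar with the model named in the abstract, the following is a minimal sketch of a linear Hawkes intensity, assuming the common form lambda(t) = nu + sum over past events T_i < t of h(t - T_i), where nu is a spontaneous rate and h is a reproduction function with bounded support. The function names, the piecewise-constant choice of h, and all numeric values are illustrative assumptions and are not taken from the paper. Under this form, a large value of h at distance d raises the chance of a new event at distance d after an earlier one, which is one way to read a "favored distance".

```python
from bisect import bisect_left


def piecewise_constant(edges, heights):
    """Build h(d) equal to heights[k] on (edges[k], edges[k+1]] and 0 elsewhere.

    Hypothetical helper: `edges` must be increasing, len(edges) == len(heights) + 1.
    """
    def h(d):
        if d <= edges[0] or d > edges[-1]:
            return 0.0  # outside the assumed bounded support
        k = bisect_left(edges, d) - 1  # index of the bin containing d
        return heights[k]
    return h


def intensity(t, events, nu, h):
    """Assumed linear Hawkes intensity: nu plus the contribution of each past event."""
    return nu + sum(h(t - s) for s in events if s < t)


# Illustrative values only: distances between 10 and 20 positions are "favored".
h = piecewise_constant(edges=[0, 10, 20], heights=[0.0, 0.05])
events = [100, 115]   # hypothetical event positions along a sequence
nu = 0.01             # hypothetical spontaneous rate

for t in (50, 112, 120, 130):
    print(t, intensity(t, events, nu, h))
```

At position 50 no event has occurred yet, so only the spontaneous rate contributes; at 112 the event at 100 lies 12 positions back and adds its 0.05 contribution.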

Citation

Patricia Reynaud-Bouret, Sophie Schbath. "Adaptive estimation for Hawkes processes; application to genome analysis." Ann. Statist. 38 (5): 2781-2822, October 2010. https://doi.org/10.1214/10-AOS806

Information

Published: October 2010

First available in Project Euclid: 20 July 2010

zbMATH: 1200.62135

MathSciNet: MR2722456

Digital Object Identifier: 10.1214/10-AOS806

Subjects:

Primary: 62G05, 62G20

Secondary: 46N60, 65C60

Keywords: adaptive estimation, data-driven penalty, genome analysis, Hawkes process, minimax risk, model selection, oracle inequalities, unknown support

JOURNAL ARTICLE
42 PAGES