Open Access
High-dimensional data segmentation in regression settings permitting temporal dependence and non-Gaussianity
Haeran Cho, Dom Owens
Electron. J. Statist. 18(1): 2620-2664 (2024). DOI: 10.1214/24-EJS2259

Abstract

We propose a data segmentation methodology for the high-dimensional linear regression problem where regression parameters are allowed to undergo multiple changes. The proposed methodology, MOSEG, proceeds in two stages: first, the data are scanned for multiple change points using a moving window-based procedure, which is followed by a location refinement stage. MOSEG enjoys computational efficiency thanks to the adoption of a coarse grid in the first stage, and achieves theoretical consistency in estimating both the total number and the locations of the change points, under general conditions permitting serial dependence and non-Gaussianity. We also propose MOSEG.MS, a multiscale extension of MOSEG which, while comparable to MOSEG in terms of computational complexity, achieves theoretical consistency for a broader parameter space where large parameter shifts over short intervals and small changes over long stretches of stationarity are simultaneously allowed. We demonstrate good performance of the proposed methods in comparative simulation studies and in an application to predicting the equity premium.
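
The two-stage idea described above (a moving-window scan on a coarse grid, followed by location refinement) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the estimator, the detector statistic, or the tuning parameters, so everything below is an assumption made for illustration. In particular, ridge regression stands in for a sparse high-dimensional estimator, the scaled difference of window-wise coefficient estimates is an assumed detector, and the names `ridge_fit`, `moving_window_scan`, `refine_location`, `G`, `step` and `threshold` are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam=1e-2):
    """Ridge estimate; an assumed stand-in for a sparse regression estimator."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def moving_window_scan(X, y, G, step, threshold):
    """Stage 1 (assumed form): on a coarse grid, contrast the coefficient
    estimates from the windows [k-G, k) and [k, k+G)."""
    n = len(y)
    grid = np.arange(G, n - G + 1, step)
    stats = np.array([
        np.sqrt(G / 2) * np.linalg.norm(
            ridge_fit(X[k:k + G], y[k:k + G]) - ridge_fit(X[k - G:k], y[k - G:k])
        )
        for k in grid
    ])
    # Keep grid points that exceed the threshold and are maximal among
    # all grid points closer than G.
    candidates = []
    for i, k in enumerate(grid):
        near = np.abs(grid - k) < G
        if stats[i] > threshold and stats[i] >= stats[near].max():
            candidates.append(int(k))
    return candidates, grid, stats

def refine_location(X, y, k, G):
    """Stage 2 (assumed form): search near the pre-estimate k for the split
    that minimises the sum of squared residuals under the two local fits."""
    a, b = k - G // 2, k + G // 2
    beta_l = ridge_fit(X[k - G:a], y[k - G:a])   # fitted away from k, on the left
    beta_r = ridge_fit(X[b:k + G], y[b:k + G])   # fitted away from k, on the right
    res_l = (y[a:b] - X[a:b] @ beta_l) ** 2
    res_r = (y[a:b] - X[a:b] @ beta_r) ** 2
    # cost[j] = left-fit residuals on [a, a+j) + right-fit residuals on [a+j, b)
    cost = (np.concatenate(([0.0], np.cumsum(res_l)))
            + np.concatenate((np.cumsum(res_r[::-1])[::-1], [0.0])))
    return a + int(np.argmin(cost))

# Toy data (simulated, not from the paper): one change at t = 200.
rng = np.random.default_rng(0)
n, p, cp = 400, 5, 200
X = rng.standard_normal((n, p))
beta = np.where(np.arange(n)[:, None] < cp, 1.0, -1.0) * np.ones(p)
y = np.sum(X * beta, axis=1) + 0.5 * rng.standard_normal(n)

candidates, grid, stats = moving_window_scan(X, y, G=60, step=10, threshold=5.0)
refined = [refine_location(X, y, k, G=60) for k in candidates]
print(candidates, refined)
```

In this sketch the coarse grid (`step=10`) means the detector is evaluated at roughly a tenth of the time points, which is the kind of saving the abstract attributes to the first stage; the refinement step then searches every time point, but only within a short interval around each candidate. The threshold value is chosen by hand for the toy data and carries no theoretical justification.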

Funding Statement

Haeran Cho was supported by the Leverhulme Trust (RPG-2019-390). Dom Owens was supported by EPSRC Centre for Doctoral Training in Computational Statistics and Data Science (EP/S023569/1).

Citation

Haeran Cho, Dom Owens. "High-dimensional data segmentation in regression settings permitting temporal dependence and non-Gaussianity." Electron. J. Statist. 18(1): 2620-2664, 2024. https://doi.org/10.1214/24-EJS2259

Information

Received: 1 May 2023; Published: 2024
First available in Project Euclid: 1 July 2024

Digital Object Identifier: 10.1214/24-EJS2259

Keywords: change point, data segmentation, high-dimensional regression, multiscale, time series analysis
