Efficient Change Point Detection and Estimation in High-Dimensional Correlation Matrices

This paper considers the problems of detecting a change point and estimating its location in the correlation matrices of a sequence of high-dimensional vectors, where the dimension is large enough to be comparable to the sample size or even much larger. A new break test based on signflip parallel analysis is proposed to detect the existence of change points. Furthermore, a two-step approach combining a signflip permutation dimension reduction step and a CUSUM statistic is proposed to estimate the change point's location and recover the support of the changes. The consistency of the estimator is established. Simulation examples and real data applications illustrate the superior empirical performance of the proposed methods. In particular, the proposed methods outperform existing ones for non-Gaussian data and for change points in the extreme tail of a sequence, and they become more accurate as the dimension p increases. Supplementary materials for this article are available online.


Introduction
Change point detection is a classical statistical problem aiming to detect whether there is a change in the mean, covariance structure, or distribution along a sequence of time-ordered observations. It has been an active field of research in many scientific areas, such as quality control [21], financial market analysis [28], genetics and medicine [7], psychopathology [22], signal processing [18] and machine learning [5].
Driven by a wide range of modern scientific applications, change point inference in high-dimensional data is of significant current interest. A vast part of the existing literature focuses on detecting changes in the mean [17,8,26] and in the covariance structure [1,11,2,10,27]. The literature on detecting structural breaks in the correlation matrix is relatively scarce. The correlation coefficient is a standard measure of linear dependency among variables, and correlation analysis is reliable only when the data exhibit no change in correlation during the observation period. Moreover, in a multivariate setting, aside from changes in univariate features, important events are often marked by abrupt correlation changes [4].
Change point analysis of correlation has been well studied in the conventional low-dimensional setting, i.e., when the dimension is fixed while the sample size becomes large. [28] constructed a bootstrap variance matrix estimator to detect a change in the correlation matrix. [4] proposed a Gaussian kernel-based change point detection method (KCP). Compared to CUSUM-type methods, KCP can locate multiple change points simultaneously. Change point analysis in time series has been studied by [19] and [12] and the references therein. However, in the high-dimensional setting, where the dimension can be much larger than the sample size, methods constructed under the low-dimensional setting either perform poorly or are not even well defined. For example, the method proposed by [28] is unstable when the dimension is not small relative to the sample size, due to the (near) singularity of the variance estimator. To avoid variance estimation, [9] proposed a break test based on the self-normalization method, which applies to high-dimensional data. However, no estimation method was proposed there to locate the change point. [10] proposed a two-stage approach based on the bootstrap to estimate a change point in the high-dimensional covariance structure. This method can be applied to high-dimensional correlation matrices, as the correlation matrix can be treated as a particular case of the covariance matrix. However, it is designed to estimate the location when the change point lies in the middle of a sequence of data, which is restrictive in real applications. This paper proposes a new test based on signflip parallel analysis to detect the existence of changes in the correlation structure. Moreover, we construct a two-step approach combining a signflip permutation dimension reduction step and a CUSUM statistic to estimate the location of the change point and simultaneously identify the corresponding changing components in the correlation matrix. A theoretical analysis is conducted, and the consistency of the proposed estimator is established. In addition, by combining the Synthetic Minority Oversampling Technique (SMOTE), the estimation accuracy is enhanced when the change point lies in the extreme tail of a data sequence. We examine the numerical performance of the proposed detection and estimation methods using both simulated and real datasets. The numerical results show that the proposed methods significantly outperform existing ones for non-Gaussian data and for change points in the extreme tail of a sequence. More importantly, as the dimension p increases, the proposed methods become more accurate, while existing methods may not.
The remaining sections are organized as follows. In Section 2, we present the first main result of the paper: a new break test based on signflip parallel analysis to detect change points in the correlation structure of high-dimensional data. In Section 3, we present the second main result: after a dimension reduction step based on signflip permutation, a CUSUM statistic is used to estimate the location of the change point, and at the same time, the components of the correlation matrix leading to the change point are identified. The consistency of this estimator is established. In Section 4, an algorithm combining the proposed estimation procedure and SMOTE is designed to locate a change point in the extreme tail of a data sequence. We evaluate the finite sample performance of the two proposed methods compared to several existing methods in a detailed simulation study in Section 5. Real data analyses are carried out in Section 6. Some discussions are offered in the concluding section. Due to limited space, we relegate the proofs and some numerical results to the supplementary materials.

Detection of Change Point
Given a sequence of data y_1, . . ., y_{t_0}, y_{t_0+1}, . . ., y_T ∈ R^p, as a first step one may be interested in testing for the existence of change points in the correlation structure. In this section, we aim to test for the constancy of the correlation matrices of the observations {y_t}_{1≤t≤T}:

H_0 : corr(y_1) = · · · = corr(y_T)  versus  H_1 : not H_0.  (1)

The null hypothesis H_0 indicates the constancy of the correlation matrices during the observation period. The alternative hypothesis allows for one or more change points in the correlation structure. When there exists a change point, the parameter t_0 defines the true change point; that is, the first t_0 samples y_1, . . ., y_{t_0} have a common correlation matrix R_1 = (ρ_1(i, j)), i, j = 1, . . ., p, while the last (T − t_0) samples y_{t_0+1}, . . ., y_T have another common correlation matrix R_2 = (ρ_2(i, j)), i, j = 1, . . ., p, and R_1 ≠ R_2.
For t = 1, . . ., T, denote by y_t = (y_{t1}, . . ., y_{tp})′ the t-th observation, let ȳ = T^{-1} Σ_{i=1}^T y_i = (ȳ_1, . . ., ȳ_p)′ be the sample mean, and let S_n = (T − 1)^{-1} Σ_{i=1}^T (y_i − ȳ)(y_i − ȳ)′ be the sample covariance matrix with diagonal matrix D = diag(S_n). Define x_t = D^{-1/2}(y_t − ȳ) as the vector of standardized observations. Given a time point t, define the sample correlation matrices R̂_1^t = (ρ̂_1^t(i, j))_{p×p} based on x_1, . . ., x_t and R̂_{t+1}^T = (ρ̂_{t+1}^T(i, j))_{p×p} based on x_{t+1}, . . ., x_T. In order to quantify the changes in correlations before and after t, a natural approach is to compare the sample correlations {ρ̂_1^t(i, j)} and {ρ̂_{t+1}^T(i, j)}, and we consider the following p(p − 1)/2-dimensional vector of the squared (componentwise) differences of the elements of the sample correlations. For large p, it can be calculated efficiently by

v_t := vecho{(R̂_1^t − R̂_{t+1}^T) ∘ (R̂_1^t − R̂_{t+1}^T)},  (2)

where vecho(·) denotes the half-vectorization of a symmetric matrix into a p(p − 1)/2 vector obtained by vectorizing only the lower triangular part without the diagonal, and "∘" is the Hadamard product. If a change point exists at t_0, then one can verify that v_{t_0} is an estimator of vecho{(R_1 − R_2) ∘ (R_1 − R_2)}, which measures the difference between the two population correlation matrices. In particular, the expectation of the component v_t(i, j) of the vector v_t corresponding to the position (i, j) in the matrices R_1 and R_2 satisfies

E v_t(i, j) = ((T − t_0)/(T − t))^2 (ρ_1 − ρ_2)^2 + o(1), t < t_0;  (ρ_1 − ρ_2)^2 + o(1), t = t_0;  (t_0/t)^2 (ρ_1 − ρ_2)^2 + o(1), t > t_0,  (3)

(for simplicity, this expectation is calculated under the assumption that the original data {y_t} have mean zero and variance one; for notational convenience, write ρ_1 = ρ_1(i, j) and ρ_2 = ρ_2(i, j) here), where β_1 := E(x_{ki} x_{kj})^2 for k ≤ t and β_2 := E(x_{ki} x_{kj})^2 for k > t enter the remainder terms, and w.l.o.g. we can assume β_1 and β_2 are uniformly bounded in this paper (see Assumption 1, Lemma A.1, and Proposition 2.7.1 of [25]). Thus, in each of the three cases of (3), the first term is the main term with order O(1), while the other terms are all o(1). Therefore, the expectation of v_t(i, j) consistently achieves its largest value at the true change point position t_0, because the coefficients (T − t_0)^2/(T − t)^2 and t_0^2/t^2 before (ρ_1 − ρ_2)^2 are smaller than 1 when t ≠ t_0. Moreover, these coefficients are unrelated to the position (i, j). Therefore, for any fixed t, larger values of v_t(i, j) indicate a significant difference between ρ_1(i, j) and ρ_2(i, j). In contrast, if there is no change point in the sequence of correlation matrices, then all elements of the vectors v_t (1 ≤ t ≤ T) are considerably small, as E v_t(i, j) = o(1) for all t.
In addition, instead of investigating all values of t separately, we use the weighted summation

w := Σ_t [t(T − t)/T] v_t  (4)

to identify the largest components among the p(p − 1)/2 entries. The weights t(T − t)/T are introduced to account for the different variances of v_t across different values of t. By examining the largest entries of the vector w, we can detect whether change points exist in the correlation structure.
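To make the construction concrete, here is a minimal NumPy sketch of the vector v_t in (2) and the weighted sum w in (4). It computes segment-wise sample correlations with `np.corrcoef` (a simplification of the paper's global standardization step), and the function names and the summation range 2 ≤ t ≤ T − 2 (so each segment has at least two observations) are our assumptions, not the authors' implementation.

```python
import numpy as np

def vecho(M):
    """Half-vectorize the strict lower triangle of a symmetric matrix."""
    i, j = np.tril_indices(M.shape[0], k=-1)
    return M[i, j]

def weighted_change_vector(Y):
    """Compute w = sum_t t(T-t)/T * v_t, where v_t holds the squared
    componentwise differences between the pre-t and post-t sample
    correlations.  Y is p x T (variables by time)."""
    p, T = Y.shape
    w = np.zeros(p * (p - 1) // 2)
    for t in range(2, T - 1):                  # >= 2 samples on each side
        diff = np.corrcoef(Y[:, :t]) - np.corrcoef(Y[:, t:])
        w += t * (T - t) / T * vecho(diff * diff)
    return w
```

Under a correlation break, the entries of w corresponding to changed pairs grow with the magnitude of the break, while under homogeneity all entries stay small.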
We propose a threshold based on signflip parallel analysis; the algorithm is as follows.

Algorithm 1 : Threshold via Signflip Permutation
Input : Data matrix Y ∈ R^{p×T} (p variables and T observations), number of trials q.
Output : Threshold τ1.
1. for m = 1, . . ., q do
2.   Generate a random sign matrix R_m ∈ {−1, +1}^{p×T} with i.i.d. entries.
3.   Compute the vectors {ṽ_t} from the signflipped data R_m ∘ Y.
4.   Calculate the weighted sum w̃(m) in (4) from {ṽ_t}.
5. Collect W = (w̃(1), . . ., w̃(q)).
6. Set τ1 to be the maximum of all elements of W.

Let w(i, j) denote the element of the vector w corresponding to the position (i, j) in the matrices R1 and R2. We identify all components that are larger than the threshold τ1; to be precise, define

ŵ_{τ1} := {(i, j) : w(i, j) > τ1}  (5)

as the index set of all corresponding components. Then, for the testing problem in (1), the rejection region is

C := {card(ŵ_{τ1}) ≥ 1},  (6)

where card(·) is the cardinality of a set. The change point detection approach based on the rejection region C in Equation (6) will be referred to as the signflip parallel analysis-based change point detection (SPAD) procedure.
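The thresholding step can be sketched as follows, under our reading that Line 6 of Algorithm 1 takes τ1 to be the maximum entry over all q signflipped weighted sums; all function names are hypothetical, and the helper duplicates the weighted sum of squared correlation differences from (4).

```python
import numpy as np

def _vecho(M):
    """Strict lower-triangle half-vectorization."""
    i, j = np.tril_indices(M.shape[0], k=-1)
    return M[i, j]

def weighted_sum(Y):
    """Weighted sum w = sum_t t(T-t)/T * v_t of squared correlation
    differences, computed segment-wise with np.corrcoef."""
    p, T = Y.shape
    w = np.zeros(p * (p - 1) // 2)
    for t in range(2, T - 1):
        diff = np.corrcoef(Y[:, :t]) - np.corrcoef(Y[:, t:])
        w += t * (T - t) / T * _vecho(diff * diff)
    return w

def signflip_threshold(Y, q=30, seed=None):
    """Threshold tau_1: maximum entry of the weighted sums computed
    from q independently signflipped copies R_m * Y of the data."""
    rng = np.random.default_rng(seed)
    tau = -np.inf
    for _ in range(q):
        signs = rng.choice([-1.0, 1.0], size=Y.shape)
        tau = max(tau, weighted_sum(signs * Y).max())
    return tau
```

Because signflipping destroys the cross-sectional correlations, the signflipped weighted sums mimic the no-change behavior, so entries of w exceeding τ1 signal a break.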
Remark 2.1. To illustrate why the threshold τ1 selected by signflip permutation is valid, we examine the expectation of the component ṽ_t(i, j) of the vector ṽ_t for the signflipped data R_m ∘ Y (under the same assumptions as in (3)). Compared to the expectation of v_t(i, j) in (3), the elements of the vectors ṽ_t are considerably small for all t, both under the null and the alternative hypothesis, as the expectation is not related to the difference between ρ1(i, j) and ρ2(i, j), and E ṽ_t(i, j) = o(1). In addition, we illustrate this phenomenon in Figure 1 through a simulation study. Independent samples are drawn from a Gaussian distribution with p = 100 and T = 100. Under H0, the correlation matrices are R1 = R2 = Ip; under H1, R1 = Ip and R2(i, j) = 0.5 for all 1 ≤ i ≠ j ≤ p. In increasing order, we plot the distributions of the elements of w for the original data and of w̃ for the signflipped data. In Figure 1(a), it is obvious that the two distributions almost coincide under H0, except that the maximum value of w̃ may be larger than that of w. In contrast, under H1, the distribution of the elements of w is always above that of w̃, as shown in Figure 1(b). More importantly, the distributions of the elements of w̃ are very similar under both H0 and H1. Thus, the signflip step breaks the changing structure in the original data. See [14] for more properties of signflip parallel analysis.

Estimation of Change Point Location
In many applications, if the null hypothesis H0 : corr(y_1) = · · · = corr(y_T) is rejected, it is often of significant interest to further estimate the change point's location and investigate at which components the correlation matrices R1 and R2 differ from each other, that is, to recover the support of R1 − R2. Suppose the change point location in the sequence is t0 = ⌊βT⌋, 0 < β < 1; under the alternative with one change point, we aim to estimate the change point fraction β.
When the null hypothesis H0 in (1) is rejected, the cardinality of ŵ_{τ1} is nonzero, and its elements w(i, j), (i, j) ∈ ŵ_{τ1}, indicate a significant difference between ρ1(i, j) and ρ2(i, j). Thus, based on the components corresponding to ŵ_{τ1}, we can estimate the location of the change point. In order to retain more of the sufficiently significant changes, we use a new threshold τ2 that is slightly smaller than τ1. τ2 is generated in the same way as τ1 by Algorithm 1, except that in Line 6, τ2 is taken to be the α-quantile (e.g., 95% or 90%) of all elements of W, that is,

τ2 := quantile_α{vec(W)},  (8)

where vec(·) is the vectorization of a matrix. Then we reduce the p(p − 1)/2-dimensional vector w to a vector of dimension d by keeping only the components in the index set

ŵ_{τ2} := {(i, j) : w(i, j) > τ2},  (9)

and correspondingly we obtain z_i, a d-dimensional subvector of vecho(x_i x_i′), by keeping the d components of the index set ŵ_{τ2}.
For the estimation of the change point, we consider the CUSUM statistic

D(t) := [t(T − t)/T] ||ρ̂_1^t − ρ̂_{t+1}^T||_2,  (10)

where ||·||_2 denotes the Euclidean norm, and ρ̂_1^t and ρ̂_{t+1}^T represent the vectors containing the elements of the correlation matrix estimators R̂_1^t and R̂_{t+1}^T corresponding to the positions identified in the dimension reduction index set ŵ_{τ2}, respectively. Then the estimator of the change point fraction β is defined by

β̂ := (1/T) argmax_{1 < t < T} D(t).  (11)

This estimator β̂ will be referred to as the signflip parallel analysis-based CUSUM estimator (SPACE). The dimension reduction index set ŵ_{τ2} estimates the support of R1 − R2, that is, the positions at which the two correlation matrices differ.
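The location step can be sketched as a weighted CUSUM scan over the retained correlation entries. The weight t(T − t)/T is our assumption, chosen for consistency with the detection-stage weighting, and the index pairs passed in stand in for the set ŵ_{τ2}; this is an illustrative sketch, not the authors' implementation.

```python
import numpy as np

def space_cusum(Y, pairs):
    """CUSUM scan for the change point fraction over retained
    correlation entries.  Y is p x T; `pairs` is a (rows, cols) index
    pair selecting the entries kept by the dimension-reduction step."""
    p, T = Y.shape
    vals = []
    for t in range(2, T - 1):                  # >= 2 samples per segment
        r1 = np.corrcoef(Y[:, :t])[pairs]      # pre-t correlations
        r2 = np.corrcoef(Y[:, t:])[pairs]      # post-t correlations
        vals.append(t * (T - t) / T * np.linalg.norm(r1 - r2))
    t_hat = 2 + int(np.argmax(vals))
    return t_hat / T                           # estimated fraction beta-hat
```

For example, with `pairs = np.tril_indices(p, k=-1)` (keeping every off-diagonal entry), the scan returns the fraction at which the pre/post correlation discrepancy peaks.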
To illustrate the effectiveness of the dimension reduction in (9) and derive the asymptotic consistency of the proposed estimator β̂ in (11), we need the following assumptions on the observations.

Assumption 1. Denote x_t = (x_{t1}, . . ., x_{tp})′, t = 1, . . ., T. For any 1 ≤ i ≤ p, x_{ti} is a sub-Gaussian random variable; that is, there are positive constants C1, C2 (independent of the indices t and i) such that for every ε > 0, P(|x_{ti}| > ε) ≤ C1 exp(−C2 ε²).

Assumption 2. The smallest nonzero entry of the matrix R1 − R2 exceeds a threshold λ determined by T and the quantity τ2 given in (8).
Assumption 3. For some positive constant c, a growth condition linking p and T holds.

In the following theorem, we prove that all entries with no difference are excluded from ŵ_{τ2}.
Theorem 3.1. Define N = {(i, j) : 1 ≤ i < j ≤ p; ρ1(i, j) = ρ2(i, j)} as the set of indices corresponding to identical elements of the correlation matrices R1 and R2. Then under Assumption 1, the probability that any index in N enters ŵ_{τ2} admits an exponential bound, where c1, c2 and c3 are some constants. In addition, under Assumption 3, this probability tends to zero.

In the next theorem, we prove that all entries with a difference larger than λ are kept in ŵ_{τ2}.
Theorem 3.2. Define P = {(i, j) : 1 ≤ i < j ≤ p; |ρ1(i, j) − ρ2(i, j)| > λ} as the set of components which differ by more than λ, where λ is the threshold given in Assumption 2. Then under Assumptions 1 and 2, every element of P is retained in ŵ_{τ2} with high probability, where c4 and c5 are some constants appearing in the probability bound. In addition, under Assumption 3, this probability tends to one. Theorem 3.2 shows that under Assumptions 2 and 3, the index set ŵ_{τ2} recovers the true support of R1 − R2 exactly with probability tending to 1.
In the next result, we establish the asymptotic consistency of the estimator β̂. Here the symbol →_p denotes convergence in probability.
Theorem 3.3. Under Assumptions 1 and 2, a deviation bound for β̂ holds, where c6 and c7 are some constants. In addition, under Assumption 3 we have β̂ →_p β.

Remark 3.1. The number of trials q in the Algorithm. In a parallel analysis, the number of trials should be as large as possible to retain a stable and accurate result, such as 50 or 100 trials [15,16,24]. However, many repetitions lead to time-consuming computation, especially for high-dimensional data. We investigate a suitable number of parallel trials by conducting simulations on the effect of the number of trials q in both change point detection and estimation.
Figure 2 presents the success rate of the proposed detection method, SPAD, against the number of parallel trials q under H0 without a change point. The success rate grows quickly while q is still small. Therefore, we recommend around 30 trials in the change point detection procedure.
Figure 3 presents the mean values of SPACE's estimated change point fraction against the number of parallel trials q for the four cases in Section 5.2 and different combinations of p and T. Each mean value is calculated from 200 repetitions. The mean values of the estimates are all close to the true value and vary only slightly with q. Therefore, around 10 to 20 trials are enough to estimate the change point location.

Change Point in the Extreme Tail
Most existing literature aims to detect and estimate a change point in the middle of a data sequence. However, the change point need not arise near the center and may instead appear in the tail. Therefore, in this section, we propose combining SMOTE with SPACE to deal with the challenging situation where the change point arises in the extreme tail of a data sequence.
SMOTE is a popular oversampling method proposed by [6], designed to deal with imbalanced data in classification problems by generating synthetic samples from the minority class. Specifically, for each sample y in the minority class, the five nearest neighbors with the smallest Euclidean distance are identified. One of them is randomly chosen as y*, based on which a new synthetic SMOTE sample is produced as y + u(y* − y), where u is randomly drawn from the standard uniform distribution [3]. To estimate a change point in the extreme tail effectively, the proposed SPACE can be enhanced by combining it with SMOTE, and the algorithm is as follows.
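The SMOTE step described above can be sketched as follows; y, y*, and u match the notation in the text, while the function name and the vectorized distance computation are our own illustrative choices.

```python
import numpy as np

def smote_samples(Y_min, n_new, k=5, rng=None):
    """Generate n_new synthetic SMOTE samples from the minority block
    Y_min (p x n, samples as columns): pick a minority sample y, one of
    its k nearest neighbours y*, and return y + u (y* - y), u ~ U(0,1)."""
    rng = np.random.default_rng(rng)
    p, n = Y_min.shape
    # pairwise squared Euclidean distances between minority samples
    d2 = ((Y_min[:, :, None] - Y_min[:, None, :]) ** 2).sum(axis=0)
    np.fill_diagonal(d2, np.inf)               # exclude self-matches
    nbrs = np.argsort(d2, axis=1)[:, :k]       # k nearest per sample
    out = np.empty((p, n_new))
    for m in range(n_new):
        i = rng.integers(n)                    # random minority sample
        j = nbrs[i, rng.integers(min(k, n - 1))]
        u = rng.uniform()                      # interpolation weight
        out[:, m] = Y_min[:, i] + u * (Y_min[:, j] - Y_min[:, i])
    return out
```

Each synthetic column lies on a segment between two existing minority samples, so the oversampled block stays inside the range of the observed tail data.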

Algorithm 2 : SMOTE + SPACE
Input : Data matrix Y ∈ R^{p×T} (p variables and T observations), small positive number ϵ.
Output : the estimator of the change point fraction.
1 β̂0 ← SPACE result from the original data.
2 Let [γT + 1, T] be the minority class and generate (1 − γ)T SMOTE samples from it, γ ≥ 0.9.

Similarly, by combining SMOTE, the proposed SPAD method can also be enhanced to detect the existence of change points when they lie in the extreme tail of the data sequence; see Algorithm 3 in the Supplementary Materials and the corresponding simulations.

Finite Sample Properties
In this section, we conduct extensive simulations to examine the success rate of the proposed SPAD method and the accuracy of the proposed SPACE, and compare them with several existing methods. We consider dimensions p ranging from 5 to 500 and sample sizes T = 100 and T = 200. All numerical results below are calculated from 200 replications.

Detection of Change Point
We first illustrate the performance of the SPAD method under both H0 and H1.
Under H0, independent samples are drawn from multivariate Gaussian distributions with correlation matrices R1 = R2 = Ip. The threshold τ1 is obtained from q = 30 parallel trials. We compare SPAD with the method proposed by [10], which is designed to detect a change point in a sequence of covariance matrices. For the reader's convenience, we recall the method of [10].
• The detection method of [10] is based on the following vector.
Overall, the SPAD method prominently outperforms Dette's method in all cases. Notably, as the dimension p becomes larger, the accuracy of SPAD increases while the accuracy of Dette's method decreases. When the change point exists in the middle of the sequence (Cases 1 & 2), a larger magnitude of change in the correlation matrices leads to higher detection accuracy, as expected. The detection accuracies of both methods are also affected by the location of the change point. Compared to Case 1, the change point location in Case 3 is closer to the tail, so the detection accuracies of both methods decrease. When there are two change points (Cases 4 & 5), the performance of both methods is influenced by the direction of the correlation changes. In Case 4, the correlation matrix changes from an identity matrix to an equal correlation matrix and then changes back to the identity matrix; it is hard for both methods to detect the change points. In Case 5, the correlations increase from 0 to 0.5 and then to 0.9; both methods can detect the existence of change points. Our SPAD method performs much better and reaches very high accuracies.

Estimation of Change Point
We investigate the finite sample properties of the proposed SPACE under different settings and compare it with two alternative methods proposed by [10] and [4], which are recalled as follows. For the threshold τ2 in SPACE, we set α = 95% in Equation (8).
• The estimator of [10] is defined as follows. First, a critical value τ is constructed from the bootstrap approach stated in Section 2.3 of [10], based on which an index set D̂_τ is obtained from the vector D in (17); ỹ_i is then the subvector of vech(ẏ_i ẏ_i′) obtained by keeping the components corresponding to the index set D̂_τ.
• The estimator of [4] is kernel-based. The similarities of all pairs (y_t, y_{t′}) are measured using a Gaussian kernel, where the bandwidth parameter h equals the median Euclidean distance between all observations. The KCP-raw estimator is then defined as the minimizer of a within-phase homogeneity criterion, where v_{1,t} measures the homogeneity of the phase after the time point t.
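The kernel similarity matrix with the median-distance bandwidth can be computed as below; this sketch reproduces only the similarity step of KCP, not the full KCP-raw minimization, and the function name is our own.

```python
import numpy as np

def gaussian_kernel_matrix(Y):
    """Gaussian kernel similarities k(y_t, y_t') = exp(-d(t,t')^2 / (2 h^2))
    for the columns of Y (p x T), with bandwidth h set to the median
    Euclidean distance between all observations (median heuristic)."""
    p, T = Y.shape
    # squared pairwise distances between observations (columns)
    d2 = ((Y[:, :, None] - Y[:, None, :]) ** 2).sum(axis=0)
    h = np.median(np.sqrt(d2[np.triu_indices(T, k=1)]))
    return np.exp(-d2 / (2 * h ** 2))
```

The resulting T × T matrix is symmetric with unit diagonal, and phases of homogeneous correlation show up as blocks of elevated similarity.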
Table 3 shows the mean change point fraction, SD, and MSE of the three estimators for all four cases when the data are drawn from a Gaussian distribution. The true change point is in the middle of the sequence. The SPACE and Dette's estimators perform better than the KCP-raw estimator in all cases. For Cases 6 and 8, Dette's estimator is slightly better than SPACE in terms of the mean, while SPACE is better in terms of SD and MSE, which implies that SPACE is more stable. Moreover, SPACE performs best in Case 7 and Dette's estimator performs best in Case 9. As expected, the KCP-raw estimator is close to the true value only when p = 5 and performs poorly when p is large. Meanwhile, the KCP-raw estimator has the largest SD and MSE.
Table 4 shows the mean change point fraction, SD, and MSE of the three estimators for all four cases when the data are drawn from a Student-t distribution. SPACE performs best in all cases. Together with the results in Table 3 (where the data are Gaussian), Table 4 shows the robustness of SPACE with respect to the population distribution. In contrast, Dette's estimator is not robust to non-Gaussian data, as its mean values are far from the true value except when p = 5.
Table 5 and Table 6 present the mean change point fraction, SD, and MSE of the three estimators under the four cases for normal and Student-t distributed samples, respectively, when the change point is close to the tail, β = 0.7. For normal samples, the proposed SPACE performs best in Cases 6, 7, and 8, and is very close to the true value with the smallest SD and MSE. Dette's estimator performs slightly better than SPACE only in Case 9. For Student-t samples, SPACE is the best, while Dette's estimator performs the worst, since its mean tends to 0 as the dimension increases.

Change Point in the Extreme Tail
To show the performance of the SPACE+SMOTE algorithm, we explore Case 6 when the change point occurs in the extreme right tail, β = 0.8, 0.9, 0.95. The mean change point fraction, SD, MSE, and average number of iterations of the algorithm are shown in Table 7. As the dimension p increases from 5 to 500, or the sample size T increases from 100 to 200, the mean values become very close to the true value, and the SD and MSE become smaller, as expected. The algorithm is time-efficient, as a small number of iterations is enough.
To compare the overall performance of SPACE+SMOTE, SPACE, Dette's estimator, and KCP-raw, Figure 4 shows the variation of the absolute error |β̂ − β| of these four methods as the true change point fraction β changes from 0.5 to 0.95. A smaller absolute error indicates a more accurate estimate. We consider Case 6 and Case 7 with p = 100 and T = 100, 200 for normal samples. The SPACE+SMOTE algorithm is the most effective at estimating the location of a change point in the extreme tail. SPACE is also better than Dette's and the KCP-raw methods.

Heteroscedasticity and Dependence
So far, homoscedasticity of the random variables has been assumed, which is, however, often violated in real data applications. Therefore, we consider the following heteroscedastic setting: D = diag(σ1, . . ., σp), Σ1 = D R1 D, Σ2 = D R2 D, where σi, i = 1, . . ., p are independently drawn from the uniform distribution U(1, 21). We compare SPACE with Dette's estimator and KCP-raw under Cases 6-9 for normal and Student-t distributed data, respectively. The results are similar to those for the homoscedastic setting in Section 5.2. The tables are in the supplementary materials.
The independence assumption on the sequence of data y_1, . . ., y_T is also often violated in real applications. We investigate the performance of SPACE for dependent data; moreover, heteroscedastic scenarios are also considered. Following [9], we consider a vector autoregression of order 1, where the error vector e_t = (e_{t1}, . . ., e_{tp})′ has zero mean and covariance matrix Σ_t = D_t R_t D_t, with D_t = diag(σ_{t1}, . . ., σ_{tp}) and correlation matrix R_t at time t. Here we consider the setting of correlation matrices in Case 6, that is, R_t = R1 for 1 ≤ t ≤ t0 and R_t = R2 for t0 + 1 ≤ t ≤ T. The following four scenarios are considered.
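A minimal simulator for this dependent-data setting is sketched below: a VAR(1) recursion whose innovation correlation switches from R1 to R2 after t0. Only the homoscedastic case D_t = I is shown, and the coefficient matrix A and the zero initial state are our own choices.

```python
import numpy as np

def var1_with_break(A, R1, R2, t0, T, rng=None):
    """Simulate y_t = A y_{t-1} + e_t with e_t ~ N(0, R_t), where the
    error correlation is R1 for t <= t0 and R2 afterwards."""
    rng = np.random.default_rng(rng)
    p = R1.shape[0]
    L1, L2 = np.linalg.cholesky(R1), np.linalg.cholesky(R2)
    Y = np.zeros((p, T))
    y = np.zeros(p)                       # zero initial state
    for t in range(T):
        L = L1 if t < t0 else L2          # regime switch at t0
        y = A @ y + L @ rng.standard_normal(p)
        Y[:, t] = y
    return Y
```

With a stable A (spectral radius below one), the simulated series is serially dependent while its cross-sectional correlation still shifts at t0, which is the structure the dependence experiments probe.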
• Heteroscedasticity and serial dependence.

The simulation results are presented in Table 8. Under both the homoscedastic serial dependence and conditional heteroscedasticity scenarios, SPACE and Dette's estimator perform better than the KCP estimator. Dette's estimator is better than SPACE in terms of the mean, while SPACE is better than Dette's estimator in terms of SD and MSE. Under the unconditional heteroscedasticity scenario, the KCP estimator performs the best, and SPACE performs similarly to Dette's estimator.

Real Data Analysis
Though the proposed SPACE outperforms the existing methods in the simulation experiments, we now compare SPACE and Dette's estimator on two real data sets. The KCP-raw method is omitted here due to its poor performance in the simulations.

Gene Expression Profile
The first data set is a real gene expression profile studied by [23]. The data set investigates the relationship between peripheral blood gene expression patterns and coronary artery disease severity. Patients undergoing coronary angiography were selected according to their coronary artery disease index (CADi), a validated angiographic measure of the extent of coronary atherosclerosis. RNA was extracted from the blood of 110 patients with at least one stenosis greater than 50% (CADi ≥ 23) and from 112 controls without evidence of coronary stenosis (CADi = 0). Therefore, the total sample size is n = 222. We rearrange the data so that the true change point position is t0 = 110. The raw dataset is available at the Gene Expression Omnibus repository under accession number GSE12288.
Following [10], we first apply two-sample t-tests with significance level 0.05 to all genes, and 1410 genes are retained. As mentioned in [23], many genes are not correlated with CADi, so such a preprocessing step is suitable.
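This preprocessing step can be sketched as marginal two-sample t-screening. We use a Welch statistic with a normal cutoff z = 1.96 as a stand-in for the exact t-test at level 0.05, so the function name and threshold are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np

def screen_genes(X_case, X_ctrl, z=1.96):
    """Keep the genes (rows) whose two-sample Welch t statistic exceeds
    z in absolute value; X_case and X_ctrl are genes x samples."""
    n1, n2 = X_case.shape[1], X_ctrl.shape[1]
    m1, m2 = X_case.mean(axis=1), X_ctrl.mean(axis=1)
    v1 = X_case.var(axis=1, ddof=1)
    v2 = X_ctrl.var(axis=1, ddof=1)
    t = (m1 - m2) / np.sqrt(v1 / n1 + v2 / n2)   # Welch statistic
    return np.where(np.abs(t) > z)[0]            # indices of kept genes
```

Genes whose means genuinely differ between cases and controls produce large |t| and survive the screen, mimicking the reduction from all genes to the 1410 retained ones.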
To show the estimation performance of the two estimators, we vary the number of genes kept in the estimation procedures from p = 5 to p = 1410, corresponding to the smallest p-values in the two-sample t-tests. Table 9 records the estimated change point fractions for different p. As p becomes large, both estimators move closer to the true value. SPACE performs better and remains stable once p is large enough. In particular, SPACE achieves the true value when p ≥ 200, while Dette's estimator never reaches the true value.

FRED-MD Data Set
The second data set is the FRED-MD, a macroeconomic database of 134 monthly U.S. indicators from eight groups (output and income; labor market; consumption and orders; orders and inventories; money and credit; interest rates and exchange rates; prices; stock market) starting January 1959, aiming to provide a convenient starting point for empirical analysis that requires "big data" [20]. The data set is updated in a timely manner and can be downloaded for free from http://research.stlouisfed.org/econ/mccracken/sel/. [13] used this data set to predict U.S. bond risk premia. As changes in correlation are often accompanied by risk, we apply the proposed method to estimate the change point location in this data set.
After removing the missing data, 127 macroeconomic indicators are retained (p = 127). When T = 200, the period is from December 2005 to July 2022. The SPACE estimated change point month is September 2008. In this month, the financial crisis started to spiral out of control and led to the failure or government takeover of many fairly large financial institutions, triggering an economic recession.
By comparison, when T = 50, 100, 130, Dette's estimates are February 2021, January 2021, and October 2020, respectively. However, the implications of these changes are unclear. When T = 150, Dette's estimator gives the same estimate as SPACE, the time of the coronavirus outbreak. When T = 200, Dette's estimator also gives the timing of the coronavirus outbreak.

Conclusion
In this paper, we propose a break test and a change point location estimator for the correlation structure of high-dimensional data. The main appeal of our methods is that they are developed without any assumption on the ratio of p and T and are thus appropriate for a wide range of large-dimensional data sets. The larger p or T, the better the performance of the proposed methods. Moreover, they are also effective when the change point lies in the extreme tail of a data sequence. Monte Carlo experiments and real data analyses demonstrate the superiority of the proposed methods over several existing methods. In addition, when a change point exists, our methods automatically provide support recovery via the index set ŵ_{τ1} or ŵ_{τ2}. The proposed estimator is robust to non-Gaussian data, although the theory is developed under the normality assumption.
There are several possible extensions of our methods. A direct extension is to detect and estimate a change point in the covariance structure of a sequence of high-dimensional vectors. In addition, as we have shown, signflip permutation can break the change point structure and reproduce the behavior under the null without a change point. Thus, the signflip idea can be used in other change point problems, including online detection.

Appendix A: Appendix
A.1.More simulation results
Table 11 shows the success rate of detecting the existence of the change point over 200 repetitions under H1 when the change point occurs in the tail (β = 0.9, 0.95) and the correlation matrices before and after the change point are the same as in the Case 6 setting. We set ϵ = 0.05 in order to improve computing efficiency. Compared with the result of Case 3 in Table 2, where β = 0.75, when the position of the change point is closer to the tail, the proportion of successful detections of the change point is greatly increased, which also verifies the effectiveness of the iterative algorithm proposed in this paper. The following four tables, Tables 12, 13, 14 and 15, contain the simulation results under the heteroscedastic setting of Section 5.4.

Table 12
Heteroscedasticity setting. Normal distribution. Mean, standard deviation (SD), and mean squared error (MSE) of three estimators when the change point exists in the middle, β = 0.5.

A.2. Proofs of main results
We first introduce some auxiliary results that are frequently used in the proofs.
Table 13 Heteroscedasticity setting. Student-t distribution. Mean, standard deviation (SD), and mean squared error (MSE) of three estimators when the change point exists in the middle, β = 0.5.
The components of the vector w are indexed by the entries (i, j) (1 ≤ i, j ≤ p) of the correlation matrices R1 and R2, so it is sufficient to verify that P{w(i, j) > τ2} vanishes fast enough for each pair (i, j). As t varies from 2 to T − 2, the weight t(T − t)/T takes its larger values when t lies near the middle of the sequence; when T is big enough, the weight over the two tails is at most of order √T, which is comparatively small, and the two tails of t(T − t)/T are symmetric. Moreover, each of the two tails involves only ⌊√T⌋ − 1 terms, and thus the coefficient contributes an extra factor of order 1/√T in the calculations. We can therefore decompose w(i, j) in terms of t into three parts,

w(i, j) = w^(1)(i, j) + w^(2)(i, j) + w^(3)(i, j),

where w^(1) and w^(3) collect the two tail ranges t ≤ ⌊√T⌋ and t ≥ T − ⌊√T⌋, and w^(2) collects the middle range. Therefore, in order to prove (1/2)(p^2 − p) P{w(i, j) > τ2} → 0, it is sufficient to prove that P{w^(h)(i, j) > τ2/3} vanishes fast enough for h = 1, 2, 3. In this argument and hereafter in the proof, c and ci (i = 1, 2, ...) indicate positive constants that may change from line to line.

The cases h = 1, 3: for the indices h = 1 and h = 3 the arguments are quite similar, and for brevity we only consider the case h = 1. Since xti and xtj are sub-Gaussian random variables, the product xti xtj is a sub-exponential random variable by Lemma A.1. Applying Lemmas A.2 and A.3 to bound P{w^(1)(i, j) > τ2/3}, the resulting exponential bound is governed by the smaller of the two exponents, which is of order T^(1/4) √τ2; under Assumption 3 this bound still vanishes after multiplication by (p^2 − p)/2.

The case h = 2: for the statistic w^(2)(i, j), Lemma A.3 yields an analogous exponential tail bound, which again vanishes after multiplication by (p^2 − p)/2 under Assumption 3. This completes the proof of Theorem 3.1.
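For the reader's convenience, generic versions of the concentration facts invoked above can be stated as follows. These are the standard sub-Gaussian/sub-exponential results; the exact statements of Lemmas A.1–A.3 in the paper may differ in constants.

```latex
% If X and Y are sub-Gaussian, their product is sub-exponential
% (the fact used via Lemma A.1):
\[
  \|XY\|_{\psi_1} \;\le\; \|X\|_{\psi_2}\,\|Y\|_{\psi_2}.
\]
% Bernstein-type inequality for independent, centered sub-exponential
% variables (the type of bound supplied by Lemmas A.2--A.3):
\[
  \mathbb{P}\!\left\{ \Bigl|\sum_{t=1}^{n} X_t\Bigr| \ge s \right\}
  \;\le\; 2\exp\!\left[ -c\,\min\!\left( \frac{s^2}{nK^2},\; \frac{s}{K} \right) \right],
  \qquad K = \max_{t}\|X_t\|_{\psi_1}.
\]
```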

Fig 1. The distributions of increasing-order elements of w and w, (a) under H0, (b) under H1.

Fig 3. Mean value of the SPACE β vs. the number of trials q.

Step 3: β1 ← SPACE result from the inflated data.
nearly 20 years, that is, p = 127. The latest month is July 2022. To show the estimation performance of these two estimators, we vary the number of months T up to July 2022 from 50 to 200. Table 10 records the estimated change point location (the exact month) for different T, with the change point fraction in parentheses. When T = 50, the period runs from June 2018 to July 2022, and the SPACE-estimated change point month is October 2021. This coincided with a sharp drop in U.S. stocks at the end of the third quarter of 2021, when the S&P 500 posted its worst monthly drop since the coronavirus outbreak in March 2020 and the risk of stagflation loomed over the US economy, according to reports in The Wall Street Journal and the BBC. When T = 100, 130, 150, SPACE consistently estimates the change point month to be April 2020, which coincided with the coronavirus outbreak; the World Health Organization (WHO) declared a global pandemic on March 11, 2020.
Success rate vs. the number of trials q under H0.

Table 1
The success rate of detecting no change point under H0.

Table 1 presents the success rate of detecting no change point. Both methods have very high accuracy, and SPAD performs slightly better than Dette's method.

Table 2
The success rate of detecting a change point under H1.

Table 3
Normal. Mean, standard deviation (SD), and mean squared error (MSE) of the three estimators when the change point exists in the middle, β = 0.5, T = 100.

Table 4
Student-t. Mean, standard deviation (SD), and mean squared error (MSE) of the three estimators when the change point exists in the middle, β = 0.5, T = 100.

Table 5
Normal. Mean, standard deviation (SD), and mean squared error (MSE) of the three estimators when the change point fraction is β = 0.7, T = 100.

Table 6
Student-t. Mean, standard deviation (SD), and mean squared error (MSE) of the three estimators when the change point fraction is β = 0.7, T = 100.

Table 7
Normal. Mean, standard deviation (SD), mean squared error (MSE), and iteration time (ITER) of the SPACE+SMOTE algorithm for Case 6 when the change point exists in the extreme tail, ϵ = 10^-3.

Table 10
The estimated change point month in the FRED-MD data set.

The New York Times reported on April 2, 2020, that the United States was facing a dire situation, with more than 200,000 confirmed cases and the federal medical supplies reserve almost exhausted.

Table 11
The success rate of detecting a change point under H1 and iteration time (ITER) of the SPAD+SMOTE estimator under Case 6 when the change point exists in the extreme tail, ϵ = 0.05, T = 100.

Table 14
Heteroscedasticity setting. Normal distribution. Mean, standard deviation (SD), and mean squared error (MSE) of the three estimators when the change point exists in the tail, β = 0.7.

Table 15
Heteroscedasticity setting. Student-t distribution. Mean, standard deviation (SD), and mean squared error (MSE) of the three estimators when the change point exists in the tail, β = 0.7.