Outliers in dynamic factor models

Dynamic factor models have a wide range of applications in econometrics and applied economics. Their basic motivation lies in their ability to reduce a large set of time series to only a few indicators (factors). If the number of time series is large compared to the number of available observations, then most of the information may be conveyed to the factors. In this way low-dimensional models may be estimated for explaining and forecasting one or more time series of interest. It is desirable that outlier-free time series be available for estimation. In practice, outlying observations are likely to arise at unknown dates due, for instance, to unusual external events or gross data-entry errors. Several methods for outlier detection in time series are available. Most, however, apply to univariate time series, and even methods designed for the multivariate framework do not explicitly cover dynamic factor models. A method for detecting outlier occurrences in a dynamic factor model is introduced that is based on linear transforms of the observed data. Some strategies to separate outliers that add to the model from outliers within the common component are discussed. Applications to simulated and real data sets are presented to check the effectiveness of the proposed method.


Introduction
Dynamic factor models have been introduced to explain and forecast time series of interest in the presence of a large set of explanatory time series. In practice, the usefulness of dynamic factor models is apparent when the dimension is so large that vector autoregressive models cannot handle the multivariate time series efficiently. Reducing the available time series to a few factors allows efficient and interpretable models to be estimated. Factor extraction has to be accomplished in such a way that only a negligible amount of information is lost.
The study of the eigenvalues and eigenvectors of the parameter matrices was suggested early on by (18) to produce a simplified version of an autoregressive model. A canonical transformation of a vector autoregressive model based on the simultaneous relationships between variables was introduced by (2). The relationships between different time lags were considered by (9) and (19) in the frequency domain. Principal component analysis was extended to the frequency domain by (3). Identification of the number of factors in a multivariate time series process was addressed by (14).
Factor models are closely related to the diffusion index methodology (see, for instance, 20). As pointed out by (6), when the dimension is large, vector autoregressive (VAR) and vector autoregressive moving average (VARMA) models are difficult to estimate because the number of parameters grows quadratically with the number of time series. For dynamic factor models, on the contrary, the growth is linear.
Usually mutually uncorrelated factors are assumed, while each individual factor time series may be autocorrelated. The multivariate dynamic structure of the observed N-component time series y_t may be modeled through the factor matrix A, that is y_t = Ax_t + η_t, where x_t is a vector of K independent time series and η_t is the idiosyncratic disturbance. Each factor time series may follow a linear model, that is x_{i,t} = θ_i(B)ε_{i,t}, where B is the back-shift operator and ε_{i,t} is uncorrelated white noise. This leads to the dynamic factor model y_t = Aθ(B)ε_t + η_t, where θ(B) = diag(θ_1(B), ..., θ_K(B)). This model is a special case of y_t = ψ(B)ε_t + η_t as considered in (5), for instance. The equality ψ_{i,j}(B) = a_{i,j}θ_j(B) reduces to the assumption that the impact of any shock ε_{j,t} on the observed time series y_{i,t} decays over time in a similar way for any i. This assumption may also be justified on the ground of the asymptotic results by (6), p. 456. Outliers in time series were introduced by (7) according to two different models, the additive outlier (or aberrant observation) and the innovation outlier (or aberrant innovation). The latter impacts the observed time series for some time span after the occurrence date, while the former affects only one observation at the date of its occurrence. In spite of this, the additive outlier has serious effects on parameter estimates and forecasts, while the effects of the innovation outlier are often less serious. This motivates our choice of modeling outliers in the dynamic factor model as outlying observations of the additive type.
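As an illustration, the data-generating process just described can be sketched in a few lines. The dimensions, the AR(1) choice for θ_i(B), and the outlier size below are arbitrary assumptions made for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, T = 20, 2, 200                      # illustrative dimensions (assumptions)
t0 = 100                                  # outlier date (assumption)
omega = 8.0 * rng.standard_normal(N)      # outlier size vector omega

A = rng.standard_normal((N, K))           # factor loading matrix

# Each factor follows an AR(1), one simple choice of theta_i(B)
x = np.zeros((K, T))
for t in range(1, T):
    x[:, t] = 0.7 * x[:, t - 1] + rng.standard_normal(K)

eta = rng.standard_normal((N, T))         # idiosyncratic disturbance
delta = np.zeros(T)
delta[t0] = 1.0                           # binary outlier indicator Delta_t

# y_t = A x_t + omega Delta_t + eta_t: an additive outlier at t0
y = A @ x + np.outer(omega, delta) + eta
print(y.shape)                            # (20, 200)
```

The outlier perturbs the single observation y_{t0} by the fixed vector ω, leaving all other dates untouched, which is exactly the additive-outlier behavior described above.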
The plan of the paper is as follows. In Section 2 we introduce the outlier structure that may be present in a dynamic factor model. This structure and its implications are examined in detail in Section 3. A method for checking the adequacy of the dynamic factor model to fit the data is illustrated in Section 4. Methods for estimating the dynamic factor model parameters are discussed in Section 5, and the impact of outliers on the estimates is examined in Section 6. In Section 7 a procedure for identifying outliers and estimating their size is presented and illustrated by an example. A simulation experiment checking the effectiveness of the procedure, in comparison with a multivariate model-based method (23) and a projection pursuit-based procedure (8), is reported in Section 8. Our procedure is then applied to a set of real data, namely some quarterly economic data on asset prices, activity, wages, goods and commodity prices from the seven-country data set studied by (22). Results are reported in Section 9. Section 10 concludes. Proofs of theorems are found in Appendix A.

The dynamic factor model with outliers
Let y_t be an observed N-component vector time series with temporal index t = 1, ..., T. We may even allow the number of time series components N to be greater than the number T of dates at which observations were made. We assume further that, though N may be very large, the observed time series is actually explained by a much smaller number K of unobservable time series x_t = (x_{1t}, ..., x_{Kt})' and an idiosyncratic N-dimensional disturbance η_t. Then the dynamic factor model with outliers may be written

y_t = Ax_t + ω∆_t + η_t,     (2.1)

where A is an N × K matrix of rank K, K ≪ N. The outlier occurrence dates are modeled by the binary series {∆_t}, and the N × 1 vector ω represents the outlier size.
These assumptions are motivated by the idea that the dependence among the observed time series components is entirely explained by the factors. Therefore the idiosyncratic terms are also independent, since otherwise they would contribute to explaining the correlations between two observed components and should be included in the factor vector.
In general, model (2.1) is not identified unless some assumptions are made about either the matrix A or the vector time series x_t. In fact, let C be any non-singular K × K matrix. Then in model (2.1) we have

y_t = AC^{-1}Cx_t + ω∆_t + η_t.

By letting A* = AC^{-1} and x*_t = Cx_t, model (2.1) could be written as well as

y_t = A*x*_t + ω∆_t + η_t.

No restriction is made on the matrix A except that its rank is equal to K. Notice that Assumption 1 does not imply any loss of generality. In fact, if Γ_x(0) = cov(x_t, x_t') is not the identity matrix I_K, we could replace x_t with the transformed data Γ_x(0)^{-1/2}x_t. The variance-covariance matrix of the transformed data turns out to be the identity matrix.
We prove the following theorem in Appendix A.1.
Theorem 2.1. Model (2.1) under Assumptions 1 and 2 is identified up to factor sign changes.
It has to be noticed that model (2.1) is uniquely determined by Assumptions 1-5 up to a permutation matrix and changes of sign. In fact, the order of the factors may be chosen arbitrarily without affecting the model's structure. Moreover, Assumption 1 determines the factor sizes, but each factor may be multiplied by ±1 without affecting its variance.
We may write the relationships that link the variance-covariance matrices of the observed data with those of the factors and of the idiosyncratic component:

Γ_y(0) = AΓ_x(0)A' + Σ_η = AA' + Σ_η,
Γ_y(h) = AΓ_x(h)A',   h ≠ 0.

Let γ^y_{i,j}(h), i, j = 1, ..., N, denote the entry in row i and column j of the matrix Γ_y(h), and γ^x_i(h), i = 1, ..., K, denote the diagonal elements of the matrix Γ_x(h). Note that Assumption 1, which implies Γ_x(0) = I, is used in the first equality, while the second equality shows that the matrices {Γ_y(h)} are symmetric because the {Γ_x(h)} are diagonal.
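These two moment relationships can be checked numerically on simulated data. The dimensions, the AR(1) factor dynamics, and Σ_η = I below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)
N, K, T = 8, 2, 100_000                   # long sample so moments are accurate
A = rng.standard_normal((N, K))
phi = np.array([0.8, -0.5])               # AR(1) coefficients of the factors

x = np.zeros((K, T))
for t in range(1, T):
    x[:, t] = phi * x[:, t - 1] + rng.standard_normal(K)
x *= np.sqrt(1 - phi**2)[:, None]         # rescale so that Gamma_x(0) = I

y = A @ x + rng.standard_normal((N, T))   # Sigma_eta = I

G0 = (y @ y.T) / T                        # sample Gamma_y(0)
G1 = (y[:, 1:] @ y[:, :-1].T) / T         # sample Gamma_y(1)

print(np.max(np.abs(G0 - (A @ A.T + np.eye(N)))))   # Gamma_y(0) = AA' + Sigma_eta
print(np.max(np.abs(G1 - G1.T)))                    # Gamma_y(1) is symmetric
```

Both maxima are small sampling deviations: the sample Γ_y(0) matches AA' + Σ_η and the lagged covariance is (approximately) symmetric, as stated.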
It is sometimes assumed that the columns of A are orthogonal. This has the advantage that those columns are eigenvectors of all the covariance matrices of {y_t} at any lag (see, e.g., 14). However, we feel that such an assumption, together with Assumption 1, is somewhat unrealistic, and it will not be made here.

Outliers in factor models
The estimation of outliers in Equation (2.1) is greatly simplified if a linear transform of the data exists that may highlight the impact of outlying observations. If the parameters in Equation (2.1) are assumed known, then, by taking the projection matrix Z = I − A(A'A)^{-1}A', the following lemma is easily proved.

Lemma 1. Let the N × K matrix A be defined as in Equation (2.1). Then an N × N matrix Z exists such that ZA = 0 (0 is the N × K zero matrix).
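A minimal numerical check of Lemma 1, using an arbitrary full-rank A (sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 10, 3                              # illustrative sizes (assumptions)
A = rng.standard_normal((N, K))           # any full-rank N x K matrix

# Projection onto the orthogonal complement of the column space of A
Z = np.eye(N) - A @ np.linalg.inv(A.T @ A) @ A.T

print(np.max(np.abs(Z @ A)))              # numerically zero, as in Lemma 1
```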
Lemma 1 has interesting implications for outlier estimation in model (2.1). The matrix Z projects the vectors of R^N onto the space orthogonal to the space V_A spanned by the columns of A. Let V_A^⊥ denote this orthogonal space. Letting V be the space of the vectors in R^N, we have V = V_A ⊕ V_A^⊥, so any vector in V may be written as the sum of a vector in V_A and a vector in V_A^⊥. Then three cases may occur.

1. Zω = 0. In this case ω ∈ V_A, that is, there exist coefficients c_1, c_2, ..., c_K such that

ω = c_1 a_1 + c_2 a_2 + ... + c_K a_K,

where a_1, a_2, ..., a_K denote the columns of A. We may write ω = Ac, where c = (c_1, c_2, ..., c_K)'. Then Equation (2.1) becomes

y_t = A(x_t + c∆_t) + η_t.

The outliers are entirely within the factors, that is, the observed y_t are affected by outliers only through the factors.

2. Zω ≠ 0 and A(A'A)^{-1}A'ω = 0. In this case ω ∈ V_A^⊥. The outliers impact the observed y_t but the factors are actually outlier free.

3. Zω ≠ 0 and A(A'A)^{-1}A'ω ≠ 0. Then ω may be written as a linear combination of a basis of V, obtained by taking the columns of A as a basis of V_A together with a basis of V_A^⊥, that is

ω = Ac + ζ,     (3.1)

for some coefficient vector c and some ζ ∈ V_A^⊥. Model (2.1) becomes

y_t = A(x_t + c∆_t) + ζ∆_t + η_t.

The observed time series y_t is affected by an outlier of size ζ that adds to the whole structure and by an additive outlier c_r in each factor x_{r,t}.
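The three cases can be told apart numerically by applying the two projections to a candidate outlier size ω. The matrix A and the test vectors below are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 10, 3                              # illustrative sizes (assumptions)
A = rng.standard_normal((N, K))
P = A @ np.linalg.inv(A.T @ A) @ A.T      # projection onto V_A
Z = np.eye(N) - P                         # projection onto its orthogonal complement

def classify(omega, tol=1e-10):
    """Return which of the three cases the outlier size vector falls into."""
    if np.allclose(Z @ omega, 0, atol=tol):
        return 1                          # Z omega = 0: outlier within the factors
    if np.allclose(P @ omega, 0, atol=tol):
        return 2                          # omega in the orthogonal complement of V_A
    return 3                              # mixed case: omega = A c + zeta

print(classify(A @ np.ones(K)))           # 1
print(classify(Z @ np.ones(N)))           # 2
print(classify(np.arange(1.0, N + 1)))    # 3 (a generic vector)
```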
Cases (1) and (3) may be treated by estimating x_t^+ = x_t + c∆_t as if it were the actual factor vector. A univariate search may then be performed on the estimated factors x_t^+ to discover the outlier dates and estimate their sizes.
We underline that in case (1), when Zω = 0, the dynamic model pattern is not affected by any perturbation, which is only transmitted by the model from factors to observed data. In that case the projection method we propose here is unable to identify the outliers, and they can only be discovered by estimating the factors and employing a univariate outlier search.
In cases (2) and (3), detection and estimation of the outliers that impact the observed y_t directly may be performed on the transformed model

Zy_t = Zω∆_t + Zη_t,

which follows from ZA = 0. Note that in case (2) we have ω = ζ, while in case (3) Equation (3.1) holds, so that ω = Ac + ζ. In case (3) the outlier size ω has to be estimated partly in the dynamic factor model and partly in the factors.
A similar development applies if the following dynamic model, as proposed by (5), is assumed:

y_t = ψ_1 ε_{t-1} + ... + ψ_s ε_{t-s} + η_t,     (3.2)

where the ψ_u's are N × K matrices, ε_t is a K-dimensional completely white noise with variances equal to 1, sK < N, and Assumptions 3, 4, 5 above hold. Let V_ψ denote the space spanned by the columns of the matrices {ψ_u, u = 1, ..., s}, let R^N = V_ψ ⊕ V_ψ^⊥, and let Z be the projection matrix onto V_ψ^⊥. We have

Zy_t = Zω∆_t + Zη_t.

In this case also, ω may be written (but not necessarily in a unique way) as

ω = ψ_1 c_1 + ... + ψ_s c_s + ζ,

with ζ ∈ V_ψ^⊥, and the model may be expressed as follows:

y_t = Σ_{u=1}^{s} ψ_u (ε_{t-u} + c_u ∆_t) + ζ∆_t + η_t,

which decomposes the effect of an outlier into two parts, one that perturbs the dynamic structure of the model by altering the effect of the past values of the factors (the c_u) and one simply superimposed on the observation (ζ).
Usually the matrix A is unknown and we may apply the preceding procedure only by computing an estimate Â. The presence of the outlying observations themselves makes the estimation difficult and often unreliable. Under some additional assumptions, the following theorems allow an alternative procedure which does not require estimating A. In what follows, all eigenvectors are normalized, i.e. they are taken with modulus equal to one.

Theorem 3.1. Let y_t satisfy model (2.1) with Assumptions 1-5 and suppose that for each j = 1, 2, ..., K there exists a lag h_j ≠ 0 such that γ^x_j(h_j) ≠ 0. Then z'A = 0 if and only if z is an eigenvector associated with a zero eigenvalue of each Γ_y(h), h ≠ 0.
The proof is in Appendix A.2. Note that the assumptions of Theorem 3.1 are not satisfied if one of the factors is white noise.

Theorem 3.2. Let y_t satisfy model (3.2) and Assumptions 3-5, and suppose in addition that (i) rank(ψ_1) = K and (ii) there exists a lag k, 2 ≤ k ≤ s, such that rank(ψ_k) = K. Then z'ψ_u = 0, u = 1, 2, ..., s, if and only if z is an eigenvector associated with a zero eigenvalue of each Γ_y(h), h = 1, 2, ..., s − 1.
The proof is in Appendix A.3. We note that Theorem 3.1 does not hold in general for h = 0, since in that case Γ_y(0) = AA' + Σ_η. Nevertheless, if the idiosyncratic disturbances are homoscedastic, the following theorems hold.

Theorem 3.3. Let z be any eigenvector associated with the smallest eigenvalue of Γ_y(0). Let us assume, additionally, that Σ_η = σ²I. Then z ∈ V_A^⊥, that is z'A = 0. The converse is also true.
The proof is in Appendix A.4. A similar result holds for the dynamic model (3.2).

Theorem 3.4. Let y_t satisfy model (3.2) and suppose that Σ_η = σ²I. Then if z'ψ_u = 0, u = 1, ..., s, z is an eigenvector associated with the smallest eigenvalue of Γ_y(0), and the converse is also true.
The proof is in Appendix A.5. The preceding theorems suggest a procedure to compute a projection of the multivariate time series that allows potential outliers to be readily detected.
If the hypothesis of homoscedasticity is assumed, we may compute an estimate Γ̂_y(0) (possibly a robust estimate) of the variance-covariance matrix Γ_y(0). Then consider the eigenvectors associated with the smallest eigenvalue of Γ̂_y(0) (the smallest eigenvalue may have multiplicity greater than one). Let z be any such eigenvector; then, according to Theorems 3.3 and 3.4, for the univariate time series z'y_t we have

z'y_t = z'ω∆_t + z'η_t.

Any such projection of the multivariate time series may be analyzed by means of univariate methods to detect potential outlying observations. The outlier sizes may then be evaluated by taking the dates of occurrence from the univariate analysis and using estimation methods in the multivariate framework.
If the homoscedasticity hypothesis cannot be assumed, the same result is obtained using Theorems 3.1 and 3.2, by taking z equal to an eigenvector associated with a zero eigenvalue of Γ̂_y(h) for some h > 0.
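The detection procedure sketched above can be illustrated as follows. For robustness of the toy example we pool all eigenvectors associated with the N − K smallest eigenvalues (the theorems justify any single one of them); all dimensions, noise levels, and the outlier size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, T, t0 = 12, 2, 300, 150             # illustrative values (assumptions)
A = rng.standard_normal((N, K))

x = np.zeros((K, T))
for t in range(1, T):                     # AR(1) factors
    x[:, t] = 0.7 * x[:, t - 1] + rng.standard_normal(K)

y = A @ x + 0.5 * rng.standard_normal((N, T))   # homoscedastic idiosyncratic noise

omega = rng.standard_normal(N)
omega *= 10.0 / np.linalg.norm(omega)     # outlier size vector, norm 10
y[:, t0] += omega                         # additive outlier at t0

# Eigenvectors of the smallest eigenvalues of the sample Gamma_y(0)
G0 = np.cov(y)
vals, vecs = np.linalg.eigh(G0)           # eigenvalues in ascending order
Zhat = vecs[:, : N - K]                   # pooled "smallest" eigenvectors

w = np.linalg.norm(Zhat.T @ y, axis=0)    # series built from the projections z'y_t
t_hat = int(np.argmax(np.abs(w - w.mean())))
print(t_hat)                              # detected outlier date
```

The projections annihilate the common component, so the factor dynamics are filtered out and the outlier date dominates the projected series.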

Factor model adequacy
A crucial point is whether the simple factor model (2.1), together with our Assumptions, fits the data adequately. Increasing the number of factors K does not solve the problem, because not all processes may be represented by equation (2.1) for arbitrary K under Assumptions 1-5, since their autocovariance matrices have to be symmetric, as seen before. This suggests that a measure of adequacy of the factor model might be obtained by evaluating the differences between the elements (i, j) and (j, i) of Γ_y(h), or of the autocorrelation matrix. Let r_{ij}(h) = γ^y_{ij}(h){γ^y_{ii}(0)γ^y_{jj}(0)}^{-1/2}, and denote by r̂_{ij}(h) the corresponding estimate. If Γ_y(h) is symmetric, using classical results (see, e.g., 17) we obtain that the difference r̂_{ij}(h) − r̂_{ji}(h) is asymptotically normal with mean zero and a variance that depends on both the cross-correlation and the autocorrelation functions in a complicated way; furthermore, such differences are correlated for different indexes i and j. Therefore the differences r̂_{ij}(h) − r̂_{ji}(h) cannot be used in any simple way to test the hypothesis that Γ_y(h) is symmetric. However, a possible solution is found by turning to the frequency domain, in a way analogous to that proposed by (9) for estimating the parameters of factor models.
In the frequency domain, the symmetry of the covariance matrices Γ_y(h) for any h is equivalent to a real spectral density matrix at any frequency. Let F(λ) = (2π)^{-1} Σ_h Γ_y(h)e^{-iλh} denote the spectral density matrix of y_t. We prove the following theorem in Appendix A.6.
Theorem 4.1. If Γ_y(h) is symmetric for any h, then F(λ) is real for any λ, and vice versa.
We shall therefore test the hypothesis that F(λ) is real for any λ. Let us define the Fourier transforms as the N × 1 complex vectors

d_T(λ) = (2πT)^{-1/2} Σ_{t=1}^{T} y_t e^{-iλt}.

From Theorem 4.4.1 of (3) it follows that the real 2N × 1 vector [Re d_T(λ)', Im d_T(λ)']' converges as T → ∞ to a normal random vector with mean zero and variance-covariance matrix

(1/2) [ Re F(λ)   −Im F(λ)
         Im F(λ)    Re F(λ) ].

If the spectral matrix is real, Im F(λ) = 0; thus Re d_T(λ) and Im d_T(λ) are (as T → ∞) independently identically distributed normal vectors with zero means and variance-covariance matrix (1/2)F(λ). Therefore the hypothesis of a real spectral density is equivalent to the independence of two normal vectors and may be tested by means of a likelihood ratio. However, only one observation would be available for each fixed λ. To overcome this difficulty, and to test reality for any λ, we use a device similar to that of (9).
Let λ_j = 2πj/T denote the Fourier frequencies, and suppose that T is sufficiently large so that for a set of frequencies {λ_j, a < j ≤ b} we can assume F(λ_j) ≃ F. Let J = b − a and

X_Rj = Re d_T(λ_j),   X_Ij = Im d_T(λ_j),   j = a + 1, ..., b,

where we have dropped for convenience the dependence on T. For T large we may assume that the X_Rj and X_Ij are normal, while under the null hypothesis H_0: F real, the X_Rj and X_Ij are independently identically distributed normal vectors with zero means and variance-covariance matrix (1/2)F. Define the variance and cross-covariance estimates

S_R = (1/(2J)) Σ_j (X_Rj X_Rj' + X_Ij X_Ij'),   S_I = (1/J) Σ_j X_Rj X_Ij'.

The following theorem provides the likelihood ratio test.
Theorem 4.2. The likelihood ratio test statistic for the null hypothesis H_0: F real is given by

U = |I + (S_R^{-1} S_I)^2|,

and its distribution under H_0 is equal to that of the statistic U_{N,N,J−N−1} of (1, chap. 9).
The proof is in Appendix A.7. Some approximations are discussed in (1), which for our statistic imply approximating −m log U (m = J − N − 3/2) by a chi-square variable with N² degrees of freedom. The test of Theorem 4.2 may be employed repeatedly on non-overlapping frequency intervals, and the usual caveats for multiple testing apply (see, e.g., 11, chap. 5.4).
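A sketch of the computation on simulated white noise, whose spectral density is real, so H_0 holds. The pooled forms of S_R and S_I below are our reading of the definitions; the statistic itself follows Theorem 4.2, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 4, 512                             # illustrative sizes (assumptions)
y = rng.standard_normal((N, T))           # white noise: real spectral density

# Fourier transforms at the Fourier frequencies lambda_j = 2*pi*j/T
d = np.fft.fft(y, axis=1) / np.sqrt(2 * np.pi * T)

a, J = 64, 64                             # a band over which F is roughly constant
X = d[:, a + 1 : a + 1 + J]
XR, XI = X.real, X.imag

# Pooled variance and cross-covariance estimates (our reading of S_R and S_I)
SR = (XR @ XR.T + XI @ XI.T) / (2 * J)
SI = (XR @ XI.T) / J

M = np.linalg.solve(SR, SI)
U = np.linalg.det(np.eye(N) + M @ M)      # likelihood ratio statistic of Theorem 4.2
m = J - N - 1.5
stat = -m * np.log(U)                     # approx. chi-square with N^2 df under H0
print(U, stat)
```

Under H_0 the statistic should be an unremarkable draw from (approximately) a chi-square with N² degrees of freedom; large values would lead to rejecting the reality of F over the band.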

Estimation problems
Outlier identification is only concerned with detecting the time points where outlying observations occur. When this task is performed by examining a univariate projection series, as suggested in this paper, little may be said about the multivariate outlier size. The model parameter matrices, either A in formulation (2.1) or {ψ_1, ψ_2, ...} in formulation (3.2), have to be estimated from the available data {y_t, t = 1, ..., T}, along with the common components and the variance-covariance matrix Σ_η of the idiosyncratic component. This way the outlier size may be estimated, while the estimated model is available for studying the relations among the observed time series or for forecasting purposes.
In the estimation procedures we shall distinguish whether the idiosyncratic covariance matrix is constrained by the relationship Σ_η = σ²I, constrained to be diagonal, or left unconstrained. Also, we consider here the unperturbed case of absence of outliers. The distortion induced by the presence of an outlier will be considered in the next section.

Let us define the N × T matrix Y = (y_1, ..., y_T), where the y_t's are N × 1 arrays, and the K × T matrix X = (x_1, ..., x_T), where the x_t's are K × 1 arrays. Then the sum of squares in the log-likelihood may be written tr{(Y − AX)(Y − AX)'}, and its minimization is equivalent to maximizing the likelihood. Model (2.1) is considered by (20) and (21), who assume normality and Σ_η = σ²I, and treat the factors {x_t} as deterministic components. They show that, on maximizing the likelihood with respect to both {x_t} and A, the maximum likelihood estimate of A is given by the matrix formed by the K eigenvectors associated with the K largest eigenvalues of Γ̂_y(0). They assume A'A = I; if we want to dispense with this hypothesis, we may use instead the fact that Γ_x(0) = I (actually XX' = I is assumed for simplicity). This leads to the following different estimate.
Theorem 5.1. Let {y_t} satisfy model (2.1) with Assumption 4 and Σ_η = σ²I, and suppose that the {x_t} are constants with XX' = I. Let r be the rank of Y and K a known pre-specified integer, 0 < K ≤ r ≤ min(N, T). Let Λ_K be the diagonal matrix of the K largest eigenvalues of YY' and W_K the N × K matrix whose columns are the associated eigenvectors. Then the formula for Â reduces to

Â = W_K Λ_K^{1/2}.

The proof is in Appendix A.8. The estimate of the matrix A given by Theorem 5.1 is consistent, as the estimate YY'/T is known to be consistent and its eigenvalues and eigenvectors (which are continuous functions of the elements of the matrix YY'/T) are consistent as well. If the observed data are not standardized, and if the eigenvalues are all distinct and the true variance-covariance matrix is positive definite, then it may be shown that the estimated eigenvalues are asymptotically independently normally distributed. The difference between estimated eigenvalues and actual ones is of order 1/√T in probability. The estimates of the eigenvectors are asymptotically normally distributed but not independent (see, e.g., 16, p. 290). The rate of convergence to the actual eigenvectors is of order 1/√T in probability. If we assume that the factors {x_t} are random processes, the method of linear factor analysis may be employed. To overcome the problem that the factors are autocorrelated, (9) introduced a frequency domain extension of factor analysis which may be directly applied to model (2.1), assuming that Σ_η is diagonal but not necessarily homoscedastic.
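A quick numerical illustration of the estimate of Theorem 5.1. Since the scale of Â depends on the normalization adopted for X, we only check that the column space of Â recovers that of A; dimensions and dynamics are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
N, K, T = 15, 3, 400                      # illustrative sizes (assumptions)
A = rng.standard_normal((N, K))

x = np.zeros((K, T))
for t in range(1, T):                     # AR(1) factors
    x[:, t] = 0.6 * x[:, t - 1] + rng.standard_normal(K)

Y = A @ x + 0.01 * rng.standard_normal((N, T))   # weak idiosyncratic noise

vals, vecs = np.linalg.eigh(Y @ Y.T)      # ascending eigenvalues
WK = vecs[:, -K:]                         # eigenvectors of the K largest eigenvalues
LK = np.diag(vals[-K:])

A_hat = WK @ np.sqrt(LK)                  # A-hat = W_K Lambda_K^{1/2}

# Only the column space is identified: check it recovers the column space of A
P_perp = np.eye(N) - A @ np.linalg.inv(A.T @ A) @ A.T
rel = np.linalg.norm(P_perp @ A_hat) / np.linalg.norm(A_hat)
print(rel)                                # close to zero
```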
An alternative estimation method uses the Kalman filter in a state space formulation of the model, where (2.1) is the measurement equation and {x_t} is the state vector. In that case a transition equation has to be specified for the factors x_t, which is convenient if the process {x_t} is easily modeled in state space form (if, for example, a low order autoregression is assumed).
Finally, an estimation method which does not rely on any assumption on Σ_η may be obtained using a technique of temporal blind source separation, for instance the temporal decorrelation source separation method (25), which uses an algorithm for approximate simultaneous diagonalization of several covariance matrices. Under model (2.1) we have Γ_y(h) = AΓ_x(h)A' for h ≠ 0 and, taking the generalized inverse B = (A'A)^{-1}A',

BΓ_y(h)B' = Γ_x(h),   h ≠ 0.

Since Γ_x(h) is diagonal for any h ≠ 0, we may determine B in such a way that the off-diagonal elements of BΓ_y(h)B' are as small as possible. Formally, the matrix B is obtained from the approximation problem

min_B Σ_{h=1}^{H} off(BΓ̂_y(h)B'),

where off(·) denotes the sum of squares of the off-diagonal entries. Let B̂ denote the solution; then Â = B̂'(B̂B̂')^{-1}. A maximum lag H has to be chosen. It may be selected by estimating the covariance matrices of the data Γ̂_y(·) and taking H as the minimum lag such that all entries of Γ̂_y(h), for h > H, are not significantly different from zero. Moreover, in order to recover the matrix A it is necessary that (B̂B̂')^{-1} exists; therefore the solution matrix B̂ should have full rank. Though it is easily seen that under model (2.1) this method is consistent, its sampling properties appear very hard to derive.
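A simple special case of temporal decorrelation, sometimes called AMUSE, whitens with Γ̂_y(0) and diagonalizes a single whitened lagged covariance instead of jointly diagonalizing several. It serves only as a sketch of the idea; dimensions, AR coefficients, and noise level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
K, T, N = 3, 5000, 8                      # illustrative sizes (assumptions)
phi = np.array([0.9, 0.5, -0.7])          # distinct AR(1) coefficients, so the
                                          # lagged autocovariances of the factors differ
x = np.zeros((K, T))
for t in range(1, T):
    x[:, t] = phi * x[:, t - 1] + rng.standard_normal(K)
x /= x.std(axis=1, keepdims=True)         # roughly unit factor variances

A = rng.standard_normal((N, K))
y = A @ x + 0.05 * rng.standard_normal((N, T))

# AMUSE-style temporal decorrelation: whiten with Gamma_y(0), then
# diagonalize a single whitened lagged covariance matrix
G0 = (y @ y.T) / T
G1 = (y[:, 1:] @ y[:, :-1].T) / (T - 1)
G1 = (G1 + G1.T) / 2                      # symmetrize the lag-1 covariance

vals0, vecs0 = np.linalg.eigh(G0)
W = vecs0[:, -K:] / np.sqrt(vals0[-K:])   # whitening: W' G0 W = I_K
M = W.T @ G1 @ W
_, V = np.linalg.eigh(M)                  # diagonalizes the whitened lag-1 matrix
B = (W @ V).T                             # K x N separating matrix
x_hat = B @ y                             # recovered factors (up to sign and order)

C = np.corrcoef(np.vstack([x_hat, x]))[:K, K:]
print(np.max(np.abs(C), axis=1))          # each row has one correlation near 1
```

Joint diagonalization over several lags, as in the method cited above, is more robust when two factors happen to share similar autocovariances at the single lag used here.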
If model (3.2) seems more suitable to describe the data, the estimation methods proposed by (5) may be applied.

Bias induced by the outliers on the estimates
We turn now to consider the bias induced on the estimates of time-domain and frequency-domain quantities by the presence of an outlier.
Suppose that the observed time series {y_t, t = 1, ..., T} contains an outlier at time t_0 with size measured by the vector ω. Let z_t denote the unperturbed data, so that

y_t = z_t + ω∆_t,

where ∆_t = 1 if t = t_0 and zero otherwise. We compare the autocovariance and spectral density estimates computed on the actually observed series with the corresponding unbiased estimates that would have been obtained from the unperturbed time series z_t. Note first that for the average ȳ we have

ȳ = z̄ + ω/T.

If we denote by γ̃_{ij}(h) the unperturbed autocovariance estimates (computed on the unobserved time series z_t), a simple calculation shows that the difference between the unperturbed and the actual estimates is of order 1/T in probability. We prove in Appendix A.9 the following theorem, which states a more precise result.
Theorem 6.1. Let {y_t, t = 1, ..., T} be part of a realization of a second order stationary process with finite covariance matrices, and suppose that at time t_0 an outlier equal to ω is added. Then if the γ̂_{ij}(h)'s denote the covariance estimates and the γ̃_{ij}(h)'s denote the corresponding estimates computed on the unperturbed time series, the differences γ̂_{ij}(h) − γ̃_{ij}(h) are of order 1/T in probability.

Results in the frequency domain may also be obtained, with the usual asymptotic approximations. On denoting by d̃_T(λ) the Fourier transforms of the unperturbed data, we may write

d_T(λ) = d̃_T(λ) + (2πT)^{-1/2} ω e^{-iλt_0} − (2πT)^{-1/2} (ω/T) Σ_t e^{-iλt}.

If we consider only the Fourier frequencies λ_j = 2πj/T, j ≠ 0, the third term disappears, so that

d_T(λ_j) = d̃_T(λ_j) + (2πT)^{-1/2} ω e^{-iλ_j t_0},

and simple calculations show that for the actual and unperturbed periodograms

I(λ_j) = Ĩ(λ_j) + (2πT)^{-1/2} {d̃_T(λ_j)ω' e^{iλ_j t_0} + ω d̃_T(λ_j)*' e^{-iλ_j t_0}} + (2πT)^{-1} ωω'.

We prove in Appendix A.10 the following theorem, which gives more specific results for each entry of the matrix of differences between periodograms.

Theorem 6.2. Let {y_t, t = 1, ..., T} be part of a realization of a second order stationary process with spectral density matrix F(λ), and suppose that at time t_0 an outlier equal to ω is added to the time series. Then if I(λ) denotes the periodogram matrix and Ĩ(λ) denotes the periodogram matrix of the corresponding unperturbed series, each entry of the difference I(λ_j) − Ĩ(λ_j) is of order T^{-1/2} in probability. Let s(T) be a sequence such that λ(T) = 2πs(T)/T tends to λ as T → ∞. Then the same rate holds for I(λ(T)) − Ĩ(λ(T)) as T → ∞.

The preceding results also enable us to evaluate the bias induced by an outlier on spectral estimates. Let

F̂(λ) = Σ_j w_T(λ − λ_j) I(λ_j)

be a consistent spectral estimate, where λ_j = 2πj/T and w_T(·) is a spectral window (see, e.g., 17), and denote by F̃(λ) the same form computed on Ĩ(λ). We assume that for each T a truncation point M = M(T) has been selected so that the estimate is consistent; the difference F̂(λ) − F̃(λ) then also vanishes asymptotically. The proof is in Appendix A.11.
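The order-1/T effect of a single additive outlier on the sample variance is easy to verify numerically; the outlier size and sample lengths are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(7)
omega = 5.0                               # scalar outlier size (assumption)

def cov_bias(T):
    """Change in the sample variance caused by a single additive outlier."""
    z = rng.standard_normal(T)            # unperturbed series
    y = z.copy()
    y[T // 2] += omega                    # one additive outlier
    return abs(np.var(y) - np.var(z))

b1, b2 = cov_bias(500), cov_bias(5000)
print(b1, b2)                             # the bias shrinks roughly like 1/T
```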
The preceding discussion offers some guidance for choosing among estimation methods. The methods based on the variance-covariance matrix or on the spectral density of the observed data have the drawback that the estimates may be biased by the presence of the outlier. The influence of the outlier on the estimates is of order 1/T, which suggests that for large samples the two methods may perform equally well. Spectral methods, however, seem the natural device for checking dynamic factor model adequacy. The methods based on temporal decorrelation are, on average, not affected by the presence of the outlier, which suggests that they are likely to yield more reliable results. However, a treatment of their sampling properties is not available, and model adequacy has to be checked in any case to ensure that temporal decorrelation may be applied properly. Summing up, no estimation method can be declared uniformly best, and trying different methods seems advisable.

Outlier identification and size estimation procedure
Let Y be the observed data arranged as a matrix of N rows and T columns. The entry Y_{it} stands for the observation at time t of the i-th time series, t = 1, ..., T and i = 1, ..., N. We assume that a dynamic factor model may be tentatively fitted to the data, with an unknown number of factors and possibly outlying observations of unknown size at unknown dates. The idiosyncratic component is unknown as well. Given Assumptions 1-4 and the observed data set, model (2.1) is to be identified and estimated.
A procedure for estimating the unknown components of model (2.1) and performing outlier identification and estimation may be described as follows.
1. The optimal direction for locating the occurrence of outlying observations requires computing the variance-covariance matrix of the data and its smallest eigenvalue λ̂, i.e.

Γ̂_y(0) = (Y − Ȳ)(Y − Ȳ)'/T,

where Ȳ = ȳ1', the N × 1 array ȳ is the componentwise average computed over time, and 1 is the all-one T × 1 vector. Then the smallest eigenvalue λ̂ and the associated N × 1 eigenvector ẑ may be computed. The linear transform w = ẑ'Y yields the 1 × T time series that may be searched for the outlier occurrence dates. Let w̄ and σ_w be the mean and standard error of w. Then an outlier is identified at time t if |w_t − w̄| is the largest value such that |w_t − w̄| > ασ_w, where α is a suitable constant.
Since this procedure is based on Theorem 3.3, it strictly holds only in the homoscedastic case, and is only approximately appropriate when that hypothesis does not hold. An alternative is to compute the eigenvector associated with a zero eigenvalue of one of the matrices Γ̂_y(h) for h > 0, for example Γ̂_y(1), according to Theorems 3.1 and 3.2.

2. If an outlier is detected, the multivariate time series may be adjusted by interpolating (by a multivariate linear interpolator) or forecasting (by a vector autoregressive (VAR) model) its value at the outlier date. Another strategy consists in assuming that there is a missing value at the time of the possible outlier occurrence. Either way, the potential outlier is replaced by its conditional mean or a missing value is assumed, and outlier-free estimates of the variance-covariance and lagged covariance matrices may be computed up to a pre-specified maximum lag M. Robust estimation methods may constitute a valuable choice. Either outlier-free or robust estimates of the spectral density matrices have to be computed at the Fourier frequencies λ_j, where λ_j = 2πj/T and j = −T/2 + 1, −T/2 + 2, ..., T/2.

3. Conformity of the observed data to the dynamic factor model may be checked using the spectral density estimates. For instance, let the frequency interval 0 < λ ≤ π be divided into 4 sub-intervals and assume for simplicity that T is an integer multiple of 4. More than four sub-intervals may obviously be used, according also to the number of data. If there is no reason to privilege any frequency components, equally wide sub-intervals may be selected. Symmetry considerations allow us to consider only the interval (0, π) instead of the whole interval (−π, π). Then in each sub-interval (λ_{a+1}, λ_b) there are J = T/4 Fourier frequencies, i.e. {λ_j, j = a + 1, a + 2, ..., a + J = b}. In the ℓ-th sub-interval, a = (ℓ − 1)J and b = ℓJ. The likelihood ratio test statistic U is provided by Theorem 4.2. The null hypothesis H_0 that the covariance matrices are symmetric, i.e. that the dynamic factor model may not be rejected, is checked by using the approximate statistic −m log U (m = J − N − 3/2) which, under H_0, is distributed as a chi-square with N² degrees of freedom.

4. Once the model has been found appropriate, the number of factors K has to be estimated. A simple device is based on the eigenvalues of the variance-covariance matrix Γ̂_y(0), that is, Γ̂_y(0) corrected for potential outliers. Let λ̂_1 ≥ λ̂_2 ≥ ... ≥ λ̂_N be the eigenvalues of Γ̂_y(0) arranged in non-increasing order and consider the cumulated sums V_K = λ̂_1 + ... + λ̂_K. We take K as the number of factors if it is the smallest integer such that V_K/V_N > 1 − α, where α is a small real number, 0.05 for instance. That is, the number of factors is chosen so that the cumulated normalized eigenvalues of Γ̂_y(0) exceed a fraction 1 − α judged large enough. We may want to take into account that the smallest eigenvalues may be greater than zero. A better approximation to the correct number of factors may then be obtained by taking K as the smallest integer such that {V_K + (N − K)λ̂_min}/V_N > 1 − α. More information may be used for choosing K along the same lines by using the eigenvalues of the (corrected) spectral density matrix at frequency zero, which is essentially the sum of the covariance matrices at lags −M, −M + 1, ..., 0, 1, ..., M. The covariance matrices may also be used separately, as may the spectral matrices at non-zero frequencies. However, different matrices may lead to different estimates of K, though the same value is to be expected in most cases.

5.
Estimation of the matrix A and of the factors X allows the dynamic factor model to be completely specified. In addition, both A and X are needed to estimate the outlier size and to establish whether the factors, the model, or both are affected by the outlying observations. Several methods are available; for example, we may list the following.
(a) Temporal decorrelation: a matrix Â, not necessarily square nor orthogonal, may be computed by approximate joint diagonalization (24) of the matrices Γy(h), h = 1, ..., K. Let Â be the N × K matrix such that Γy(h) ≈ Â D_h Â′, where the D_h's are diagonal K × K matrices. Then we may let X̂ = (Â′Â)^{−1}Â′Y.
(b) Assuming, for the purpose of estimating A, that the factors are not random, as suggested by (21; 22), the method outlined in Theorem 5.1 may be used, which only requires the K eigenvectors associated with the K largest eigenvalues of the matrix YY′ to be computed. Denoting by W_K the N × K matrix whose columns are these eigenvectors and by Λ_K the diagonal matrix with the largest eigenvalues of YY′ on its diagonal, the estimate of A is given by the simple formula Â = W_K Λ_K^{1/2}.
(c) Methods developed in the factor analysis framework may be used. Consider again the log-likelihood of the dynamic factor model, which up to a constant equals −(T/2){log|AA′ + Σ_η| + tr[(AA′ + Σ_η)^{−1} Γy(0)]}. Let us assume at this stage that the matrix Σ_η is known. Maximizing the likelihood with respect to the matrix A only yields Â = Σ_η^{1/2} Q (Λ_K − I)^{1/2}, where Λ_K is the diagonal matrix with the K largest eigenvalues of the matrix Σ_η^{−1/2} Γy(0) Σ_η^{−1/2} on its diagonal and Q is the matrix whose columns are the corresponding eigenvectors. On the other hand, given A, the likelihood is maximized by letting Σ_η = diag(Γy(0) − AA′). By substituting Â for A we have the approximate formula Σ̂_η = diag(Γy(0) − ÂÂ′).
Given an appropriate starting value for Σ_η we may apply the formulas that yield Â and Σ̂_η iteratively until some convergence criterion is met. For instance, the algorithm may stop when the difference between two consecutive values of the maximized log-likelihood is less than a pre-specified tolerance. Proofs of the formulas are provided by (13), pp. 367-370, who warn that this method does not guarantee convergence. Nevertheless the algorithm is simple and effective in most cases, and the entries on the diagonal of Σ_η are not constrained to be equal.
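As an illustration, the alternation just described can be sketched in a few lines of NumPy. The function name, the starting value Σ_η = diag(Γy(0))/2 and the stopping rule are our own assumptions; the update of A follows the eigenvalue formula of item (c) above.

```python
import numpy as np

def fa_ml_iterate(gamma0, K, tol=1e-8, max_iter=500):
    """Alternate the two closed-form updates of item (c):
    given Sigma_eta, update A; given A, set
    Sigma_eta = diag(Gamma_y(0) - A A').
    Sketch only: starting value and stopping rule are illustrative."""
    sigma = np.diag(np.diag(gamma0)) * 0.5          # assumed starting value
    A = None
    for _ in range(max_iter):
        s_half = np.sqrt(np.diag(sigma))
        # Sigma_eta^{-1/2} Gamma_y(0) Sigma_eta^{-1/2}
        M = gamma0 / np.outer(s_half, s_half)
        lam, Q = np.linalg.eigh(M)                  # ascending eigenvalues
        idx = np.argsort(lam)[::-1][:K]             # K largest
        A = (Q[:, idx] * s_half[:, None]) @ np.diag(
            np.sqrt(np.maximum(lam[idx] - 1.0, 0.0)))
        new_sigma = np.diag(np.diag(gamma0 - A @ A.T))
        done = np.max(np.abs(new_sigma - sigma)) < tol
        sigma = new_sigma
        if done:
            break
    return A, sigma
```

As the text notes, convergence is not guaranteed; in practice a cap on the number of iterations, as above, is a sensible safeguard.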
6. Finally, the outlier size ω may be estimated. Note that x̂_{t_0} may possibly include a within-factor outlier α. For the static model, if an outlier has been detected at t = t_0 then, given Â and X̂, we may write the log-likelihood of η_{t_0}, which up to a constant equals −(1/2)(y_{t_0} − Âx̂_{t_0} − ζ)′ Σ̂_η^{−1} (y_{t_0} − Âx̂_{t_0} − ζ). Maximization of the likelihood with respect to ζ yields ζ̂ = y_{t_0} − Âx̂_{t_0}. To estimate the outlier size ω we then obtain an estimate α̂ so that ω̂ = Âα̂ + ζ̂. According to (7.1) the vector ω may be written as the sum of a vector Âα obtained as a linear combination of the columns of Â and a vector ζ which is orthogonal to the space spanned by the columns of Â. The outlier α impacts the factors while ζ impacts the model structure as a whole. We may estimate the size of α for each of the components x̂_1, ..., x̂_K essentially by building the linear interpolator of each x̂_{i t_0}, i = 1, ..., K, based on the values observed at t ≠ t_0, and identifying an outlier whenever the interpolator is too different from the estimated x̂_{i t_0} (see e.g. 4, for details).
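The split of an estimated outlier into the part Âα lying in the column space of Â and the part ζ orthogonal to it is an ordinary least-squares projection. A minimal sketch, with a hypothetical helper name:

```python
import numpy as np

def split_outlier(omega, A):
    """Decompose an outlier vector omega into a factor part
    A @ alpha (in the column space of A) and a model part zeta
    orthogonal to it, via least squares.  Illustrative sketch of
    the decomposition omega = A alpha + zeta in step 6."""
    alpha, *_ = np.linalg.lstsq(A, omega, rcond=None)
    zeta = omega - A @ alpha      # residual, orthogonal to columns of A
    return alpha, zeta
```

By construction A′ζ = 0, so the two parts are orthogonal, matching the interpretation given in the text.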
As an example, let us simulate T = 100 observations of the N × 1 multivariate time series {y_t}, N = 5, generated by the model y_t = Ax_t + η_t. We assume K = 4 factors, so that the matrix A has 5 rows and 4 columns. Let the factors be generated by x_t = Φx_{t−1} + ε_t, where Φ = diag(0.7, −0.7, 0.7, −0.7) and {ε_t} is normal white noise with zero mean and variance σ²_ε = 1 − 0.7². This ensures that the (theoretical) variance of each factor is unity. In addition, as Φ is diagonal and by the normality assumption, the (theoretical) factors are independent. The idiosyncratic component {η_t} is assumed to be a zero mean normal white noise sequence with variance σ²_η = 0.04. The outlier was located at t = 100, that is at the end of the series: most outlier detection methods are not able to discover potential outliers at the end (or beginning) of the observed time series.
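The generating mechanism of this example can be reproduced roughly as follows. Since the text does not report the loading matrix A, a random full-rank matrix is used here as a stand-in; the seed and variable names are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K = 100, 5, 4
phi = np.array([0.7, -0.7, 0.7, -0.7])
sig_eps = np.sqrt(1 - 0.7 ** 2)       # so each AR(1) factor has unit variance

# AR(1) factors x_t = Phi x_{t-1} + eps_t, with diagonal Phi
x = np.zeros((T, K))
for t in range(1, T):
    x[t] = phi * x[t - 1] + sig_eps * rng.standard_normal(K)

# The loading matrix of the example is not reported in the text;
# any 5 x 4 full-rank matrix serves for illustration.
A = rng.standard_normal((N, K))
eta = 0.2 * rng.standard_normal((T, N))   # sigma_eta^2 = 0.04
y = x @ A.T + eta

# Additive outlier at the last observation, as in the example
omega = np.array([1.5, -1.0, 0.0, -4.0, 5.0])
y[-1] += omega
```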
The outlier size was ω = (1.5, −1, 0, −4, 5)′. Each component of the generated time series y_t has (theoretical) variance equal to 1.29, except the first and the last, which have variances 1.04 and 0.29 respectively. The outlying observation is rather large only compared with component series 4 and 5; in the remaining cases the outlier size does not exceed twice the standard error of the component series. The standard errors of the simulated time series {y_t} with the outlier are (1.0970, 1.2858, 1.1967, 1.1906, 0.7415).
The number of factors was assessed by examining the eigenvalues of both the variance-covariance and the spectral density matrices. In Fig. 1 the cumulated eigenvalues of F(0) are plotted. Note that the smallest eigenvalue is greater than zero, so the threshold that serves as a decision rule about the number of factors was computed accordingly (see point 4 above in this Section). This way the correct number of factors K = 4 may be identified.
The univariate time series obtained as the linear combination {z′y_t} is displayed in Fig. 2. The outlier at t = 100 is clearly highlighted. The mean and standard error of {z′y_t} were computed as 0.0902 and 0.6463 respectively. The standardized value at t_0 = 100 equals 4.7738, larger than the Tchebychev bound 4.47 which corresponds to the 5% level.
A VAR model was estimated for the observed time series {y_t} and the one-step-ahead forecast was taken to replace the last observation. The corrected time series was then used to compute the variance-covariance matrix, the covariance matrices at lags 1-4 and the spectral density matrices at 100 Fourier frequencies from −π to π. To check that the estimated covariance matrices could be assumed symmetric, the interval (0, π) was divided into 4 non-overlapping intervals, each including 25 frequencies. The test statistic took the values 10.64, 8.93, 8.77 and 12.32, with 25 degrees of freedom; the critical value at the 5% level is 37.65. As this is greater than all the computed statistics, we may not reject the dynamic factor model hypothesis.
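The replacement of the suspect last observation by a VAR forecast can be illustrated with a simplified first-order VAR fitted by least squares; the text uses a higher-order model, so this is a sketch under that simplification, with an assumed function name.

```python
import numpy as np

def var1_replace_last(y):
    """Replace the last observation of y (T x N) by its one-step-ahead
    VAR(1) forecast, estimating the coefficient matrix by least squares
    on the earlier observations only.  Simplified sketch of the
    correction applied in the example."""
    Y0, Y1 = y[:-2], y[1:-1]      # lag/lead pairs, excluding the suspect point
    # least-squares solution of Y0 @ B = Y1; Phi = B' maps y_{t-1} -> y_t
    B = np.linalg.lstsq(Y0, Y1, rcond=None)[0]
    y_adj = y.copy()
    y_adj[-1] = B.T @ y[-2]       # one-step-ahead forecast replaces y_T
    return y_adj
```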
The estimates Â and X̂ were computed by using the three techniques described in this Section, namely approximate temporal decorrelation, the eigenvalue-based method, and iterative maximization of the likelihood function. The latter two methods are similar and indeed yield similar results. The first is an entirely different approach that explicitly takes into account the covariance matrices at higher lags.
Nevertheless, the estimated dynamic factor model fits the data quite well whatever method is used. We report only the plot of the simulated time series (blue line) along with the estimate yielded by the approximate temporal decorrelation algorithm (red line); the other methods yield estimates that overlap almost exactly. In Fig. 3 the observed (simulated) and fitted (estimated) series are plotted for each component. Note that the outlier is generally not apparent from visual inspection of the graphs.
The outlier size was then estimated as the sum of the two components, the first in the dynamic factor model and the second in the factors. The two components are orthogonal to each other: the first is orthogonal to the space spanned by the columns of Â, while the other impacts the dynamic factor model as a linear combination of the columns of Â. The estimates are displayed in Table 1. The three methods yield similar estimates of the total outlier size ω and of the component ζ that impacts the overall model. The sizes of the outlier α within the factors differ because the estimated factors themselves depend on the matrices Â estimated by each of the three methods. The differences are small, however, if we compare the arrays Âα̂.
The results reported in Table 1 seem reliable as regards the recovery of the outlier size ω. From the 'true' outlier size ω we may compute the array α = (1.161, −0.903, −0.903, −0.839)′ and the product Aα = (1.1613, −0.3226, −1.3548, −1.2903, −0.4194)′; the latter is close to its estimated counterpart Âα̂ in each of the three versions.

A simulation experiment
We performed a simulation experiment by replicating 1000 times a dynamic factor model and applying three methods for outlier detection and estimation. In this way we wanted to test the effectiveness of the method proposed in this paper (let us call it ODFM) and compare it with two methods available for detecting and estimating outliers in multivariate time series. The first was proposed by (23) to detect and estimate four types of outliers in multivariate time series modeled by a vector autoregressive integrated moving-average (VARIMA) model (let us call it OARMA). The second was proposed by (8) as a projection pursuit approach to detect and estimate four types of outliers in multivariate time series not necessarily generated by a VARIMA model (let us call it OPP). We confine our attention to the most common type of outlier, namely the additive outlier (AO). An AO impacts the series only at the time of its occurrence, while neighboring observations remain unaffected. Other outlier types have been defined in the literature: innovation outliers (IO), level shifts (LS) and temporary changes (TC). However, IO's are only defined when the data are assumed to follow a VARMA model, which is not our case. LS's arise when the mean levels of the component series change at once and then remain constant; they are equivalent to AO's in the differenced series, and may be identified by analyzing the differenced data. Finally, a TC in multivariate time series data is defined at t = t_0 if a constant ω which defines the outlier size is added to y_{t_0} and δ^k ω is added to y_{t_0+k}, k > 0, where 0 < δ < 1 is a scalar constant. We feel that a TC is a very unlikely behavior in real data; in any case it is easily identified by the exponentially decaying impulse at rate δ in the univariate projection series z′y_t.
The method of (23) assumes that the (N-dimensional) multivariate time series {y_t} may be modeled as y_t = x_t + α(B)ω ξ_t^{(h)}, where the unobservable multivariate time series {x_t} is generated by the VARIMA model Φ(B)x_t = c + Θ(B)ε_t. In the latter equality, Φ(B) and Θ(B) are N × N matrix polynomials of finite degrees p and q, c is an N-dimensional constant vector, and {ε_t} is a sequence of independent and identically distributed normal random vectors with zero mean and covariance matrix Σ_ε. Some assumptions are needed to ensure that Ψ(B) = Φ(B)^{−1}Θ(B) is a well defined moving average representation. Then, α(B) = Ψ(B) defines an IO and α(B) = I an AO. The date of the outlier is given by the binary variable ξ_t^{(h)}, which equals 1 if t = h (that is, the outlier occurs at t = h) and 0 otherwise.
The method of (8) aims at discovering the univariate projections of the multivariate time series that best highlight the presence of outliers. The most useful projection directions are those that either maximize or minimize the kurtosis; the orthogonal directions are to be taken into account as well. The number of projections to be examined for outlier detection is 2N, where N denotes the dimension of the multivariate time series. If IO's have to be detected, the method is applied to the multivariate residual time series computed from a suitable model fitted to the observed data.
The matrix A may be verified to have full rank. The (theoretical) factors are independent because both Φ and Θ are diagonal matrices and the {ε_t}'s are uncorrelated normal random variables. All idiosyncratic components {η_t} were assumed zero mean normal with variance σ²_η = 0.04. We checked two outlier configurations. The first was an isolated multivariate outlier at t = 100. The second was a patch, that is a sequence of neighboring outliers at t = 99, 100, 101. The size of each outlier was chosen equal to 0.6; this figure was chosen in comparison with the standard error σ_η = 0.2 of each of the idiosyncratic components. The total multivariate outlier size is the N-dimensional vector ω with entries all equal to 0.6. The outlier ω may be split into a term ζ orthogonal to the columns of A, that is A′ζ = 0, and a term Aα which is a linear combination of the columns of A with coefficients α = (A′A)^{−1}A′ω. The coefficients α may be thought of as the sizes of an outlier that impacts the factors.
All computations were performed using Matlab. For each of the 1000 replications we generated 300 independent identically normally distributed K-variate random vectors and 200 independent identically distributed N-dimensional random vectors, all with mean zero and variance-covariance equal to the unit matrix. From the 300 K-dimensional random vectors, 300 observations were generated from the ARMA model (8.1). Then the data were transformed as explained before, to obtain unit variance factors. The first 100 K-dimensional data points were discarded to remove the effect of the (random) initial values; as a result, a K-dimensional factor time series of length 200 was obtained. The N-dimensional white noise was pre-multiplied by the inverse of the square root of the matrix Σ_η; this was the artificial idiosyncratic component that was added to the factor data. Finally, the two outlier structures were superimposed on the artificial data generated from the dynamic factor model. In each replication, and for each of the two outlier structures, the methods ODFM, OARMA and OPP were applied for outlier detection and estimation. The usual Monte Carlo procedures were used to compute the percentages of both correct and false identifications and the averages and standard errors of the estimates. A synthetic measure of the distance between the estimated and true outlier was obtained by computing the norm of the vector difference between the estimated and true outlier sizes.
In the present context it seems of interest to report some results concerning the validity of the dynamic factor model as a fit to the data. We divided the frequency interval (0, π) into four sub-intervals of equal size, namely (0, π/4), (π/4, π/2), (π/2, 3π/4) and (3π/4, π). Each sub-interval included T/8 − 1 frequencies (here 24 frequencies, as T = 200), and the sub-interval centers were π/8, 3π/8, 5π/8 and 7π/8 respectively. The null hypothesis H_0 that the variance-covariance matrices are symmetric, tested using the LRT statistic of Theorem 4.2, was never rejected at the 5% significance level, neither in the presence of an isolated outlier nor of an outlier patch.
Table 2 shows that the number of factors (K = 4) was correctly estimated in almost all replications.
For method ODFM we computed the N − K univariate projections obtained as the linear combinations w_{i,t} = v_i′y_t, where v_i is the eigenvector of the variance-covariance matrix Γy(0) of the observed multivariate time series associated with the eigenvalue λ_i. The eigenvalues were arranged in ascending order, that is the eigenvectors v_1, ..., v_{N−K} belong to the smallest eigenvalues λ_1, ..., λ_{N−K} respectively. Then, the presence of an outlier in the multivariate time series {y_t} was detected at time t if |w_{i,t} − w̄_i| > k_α σ_{w_i} for some (that is, at least one) univariate time series {w_{i,t}}, where w̄_i and σ_{w_i} denote the sample mean and standard deviation of w_{i,t} respectively. The threshold parameter k_α may be computed according to the Tchebychev inequality; we chose the significance level α = 0.05, so that approximately k_α = 4.47.
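The detection rule may be sketched as follows, using the sample covariance of the observed series and the Tchebychev threshold k_α = 1/√α; the function name is an assumption.

```python
import numpy as np

def detect_outliers_projection(y, K, alpha=0.05):
    """Project the T x N series y on the N-K eigenvectors of the
    sample covariance belonging to the smallest eigenvalues, and
    flag dates whose standardized projection exceeds the Tchebychev
    threshold k_alpha = 1/sqrt(alpha).  Illustrative sketch of the
    ODFM detection step."""
    T, N = y.shape
    k_alpha = 1.0 / np.sqrt(alpha)           # ~4.47 for alpha = 0.05
    yc = y - y.mean(axis=0)
    lam, V = np.linalg.eigh(yc.T @ yc / T)   # ascending eigenvalues
    flags = set()
    for i in range(N - K):                   # N-K smallest directions
        w = y @ V[:, i]
        z = np.abs(w - w.mean()) / w.std()
        flags |= set(np.flatnonzero(z > k_alpha))
    return sorted(flags)
```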
The parameters of the dynamic factor model were estimated along the guidelines given by Theorem 5.1 in Section 5 (see also step 5(b) in Section 7). The difference between the observed time series and the estimated dynamic factor model values was taken as the estimate of the outlier size ζ. Then outliers α were estimated in each factor by using Formula (2.2b), p. 194, of (4). The total outlier size was obtained as ω̂ = ζ̂ + Âα̂.
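The loading estimate of step 5(b), Â = W_K Λ_K^{1/2}, can be sketched as follows; the function name is an assumption.

```python
import numpy as np

def estimate_loadings_eig(Y, K):
    """Sketch of the eigenvalue-based estimate of step 5(b):
    A_hat = W_K Lambda_K^{1/2}, with W_K the eigenvectors of Y Y'
    belonging to the K largest eigenvalues (Y is N x T)."""
    lam, W = np.linalg.eigh(Y @ Y.T)      # ascending eigenvalues
    idx = np.argsort(lam)[::-1][:K]       # K largest
    W_K = W[:, idx]
    Lam_K = np.diag(lam[idx])
    return W_K @ np.sqrt(Lam_K)
```

For a rank-K data matrix the product ÂÂ′ reproduces YY′ exactly, which is the sense in which the factors capture all the common variation.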
The OARMA method was implemented along the guidelines given by (23). A VAR model of order M = 4 was fitted to the multivariate time series {y_t}. We assumed that only outliers of either additive or innovation type could be present. The Mahalanobis-type statistic J_{i,h} = ω̂′_{i,h} Σ̂_{i,h}^{−1} ω̂_{i,h} was computed for each time t = h, with i = I for IO and i = A for AO, where Σ̂_{i,h} denotes the covariance matrix of the estimator ω̂_{i,h}. If the maximum across time of J_{A,h} or J_{I,h} exceeded its respective 95-th percentile, then either an AO or an IO was assumed at h = h_max, according to which statistic J_max(i, h_i) = max_h J_{i,h}, i = I, A, was the greater. The outlier size was then estimated and its effect removed from the multivariate time series, and the procedure was iterated until no more outliers were found. Tables of percentiles of the statistics J_max(I, h_I) and J_max(A, h_A) are available only up to dimension 10 (see (23), Table 1, p. 797, and (8), Table 4, p. 664), so we computed empirical percentiles from 10000 artificial multivariate time series generated by the dynamic factor model with N = 20 and T = 200. We obtained the 5% critical values 47.7911 for J_max(A, h_A) and 46.6493 for J_max(I, h_I).
The estimates of the outlier size were computed by using the formulas for ω̂_{I,h} (innovation outliers) and ω̂_{A,h} (additive outliers) provided by (23), p. 794.
As far as the OPP method is concerned, the direction that maximizes the kurtosis of the linear projection of the multivariate time series {y_t} and all the orthogonal directions had to be computed, and likewise the direction that minimizes the kurtosis and its orthogonal directions. To compute these 2N projections we used the Matlab routines by (15) available on the web (http://halweb.uc3m.es/fjp/download.html). In this case too we confined ourselves to AO's and IO's. In the latter case, the procedure was applied to the multivariate residual time series {a_t} computed by fitting a VAR of order M = 4 to {y_t}. Then each projection was searched for outliers by using the univariate counterpart of the statistic in the OARMA method. The maximum across time and across projections was computed; let Λ_A denote the maximum found on the projections of the multivariate time series {y_t} and Λ_I the maximum found on the projections of the multivariate residual time series {a_t}. Both Λ_A and Λ_I were compared with their appropriate thresholds, and either an AO or an IO was detected according to whether the greater of Λ_A and Λ_I exceeded its threshold. As tables of percentiles of Λ_A and Λ_I are available only up to N = 10 (see (8), Table 2, p. 663), we computed the 95-th percentiles by simulation from the dynamic factor model with N = 20 and T = 200; after 10000 replications we obtained the critical values Λ_A = 5.9772 and Λ_I = 6.0392. Once the date and type of the outliers were determined, their size was estimated by fitting a VAR model of order M = 4 to the multivariate time series {y_t} and computing ω̂_{I,h} for innovation outliers and ω̂_{A,h} for additive outliers as in (23), p. 794.
The results obtained by applying the three procedures to the artificial data set with an isolated outlier are displayed in Table 3. The Monte Carlo statistics were computed on 1000 replications. In order to compute the standard errors of the estimated detection percentages, we divided the replications into 40 groups of 25 each; the percentages were computed within each group, and the standard error of each percentage was then computed from the 40 values. The percentage of correct detection of the isolated outlier at t = 100 was greater than 95% for all methods. The standard errors are comparable for methods ODFM and OARMA, and slightly larger for the OPP method. Both the OARMA and OPP methods identified in most cases an outlier of innovation type, possibly because the IO allows greater flexibility in fitting the observed data. On the other hand, if we constrained the OARMA and OPP methods to search for AO's only, the percentages of correct detection dropped dramatically. We recorded false outlier detections as well. The number of replications (percentage) where one or more wrong detections occurred was low for methods ODFM and OPP, while the OARMA method wrongly detected outlying observations in every replication. Overall, the observations detected as outlying over the 1000 replications (series length 200) were 17, 80, and 9003 for the ODFM, OPP, and OARMA methods respectively.
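The group-based standard errors of the detection percentages can be computed as in this sketch; the helper name and the 0/1 encoding of detections are assumptions.

```python
import numpy as np

def batch_percentage_se(hits, n_groups=40):
    """Standard error of a detection percentage computed by splitting
    the replications into groups, as described for Table 3.
    'hits' is a 0/1 array over replications (1 = outlier detected)."""
    groups = np.array_split(np.asarray(hits, float), n_groups)
    pct = np.array([100.0 * g.mean() for g in groups])
    # mean percentage and its standard error across the groups
    return pct.mean(), pct.std(ddof=1) / np.sqrt(n_groups)
```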
The results for the data set with a patch at times t = 99, 100, 101 are displayed in Table 4. Considering the outliers separately, the ODFM method detected each of the three outliers in about 99% of the replications, while for the OARMA and OPP methods high percentages were recorded only for the detection of the first outlier in the patch. Again, these latter methods almost always identified the outlying observations as IO's; as a consequence, the outliers at t = 100 and t = 101 could be well explained by the innovation outlier structure. The percentages of false identifications were rather low for the ODFM and OPP methods, while the OARMA method wrongly identified outliers in all 1000 multivariate time series. The average number of false outliers was in this case about 7, that is less than in the case of the isolated outlier. Considering the patch as a whole, correct identification was achieved in 97.5% of the replications by ODFM, in only about 58% by OARMA and never by OPP. At least one outlier in the patch was detected by ODFM, OARMA and OPP in 100%, 85.09% and 85.4% of the replications respectively.
Outlier size estimates are displayed in Table 5 for the isolated outlier and in Table 6 for the outlier patch. The adequacy of the estimates was evaluated by the norm of the difference between the estimated and true outlier size vectors. This figure yields a kind of distance measure which accounts for both the bias and the variability of the estimates. Recall that the outlier size is equal to 0.6 for all component time series. For the ODFM method we distinguish the total outlier from its orthogonal part. This is of interest because the dynamic factor model structure is directly impacted by the array orthogonal to the columns of the matrix A, while the rest impacts the factors. For the other methods, OARMA and OPP, we distinguish between AO and IO identification. As far as the ODFM method is concerned, the orthogonal part is close to its true counterpart while the estimated total outlier is more variable; the distance is about 7 in all cases. For OARMA and OPP the distance between the estimated and true outlier size is smaller, approximately between 4 and 5; in the case of OPP this figure is even smaller (about 2.5) if the outlier is identified as an AO. Such distances depend on both bias and variability. To distinguish between these two sources, we present the bias and standard errors of the estimates of ω_i in Table 7 for the isolated outlier case. The figures are the bias and standard errors of the estimated outlier sizes in each component, computed over the correctly identified replications and averaged over the 20 components. It may be seen that the proposed ODFM method provides less biased estimates of the outlier size, but with larger variability than the projection pursuit method. Since the distances of the estimated ζ from the true values are generally small (see Tables 5 and 6), we conclude that the estimation of the univariate outliers in the (estimated) factors, or the interaction between them and the estimate of the factor matrix A, is responsible for this larger variability. This may be caused by the method employed in Theorem 5.1 for estimating {x_t}; a more efficient method is currently under investigation.

Application to real data
To illustrate the ODFM method we used a quarterly economic data set covering three countries, France, Germany and Italy, from 1960 to 1999. The data are taken from the seven-country data set used by (22) to compute combination forecasts of output growth, and the preliminary transformations suggested in that paper were applied. This yielded 154 transformed observations for each time series, from the third quarter of 1960 to the fourth quarter of 1999. We considered only the time series marked 'a' in Table Ib of (22): 7 series for France, 9 for Germany and 11 for Italy. The time series labels, descriptions, transformations and countries are displayed in Table 8.
The data set was thus composed of 27 time series of length T = 154. The number of factors was determined from the eigenvalues of the variance-covariance matrix, computed and arranged in descending order; the cumulated sum is plotted in Fig. 4. The smallest integer such that the cumulated sum exceeded 0.95 was 6, so we assumed K = 6 factors.
The symmetry of the variance-covariance matrices computed at lags 1, 2, 3 and 4 was checked along the guidelines given in Section 4. Two test statistics were computed by averaging the periodogram in two frequency bands, centered at λ = π/4 and λ = 3π/4 respectively. We obtained the values 174.81 and 143.88, which are not significant, so the null hypothesis was not rejected.
The identification stage required the computation of N − K = 14 projections, which were searched for outlying observations along the guidelines given in Section 8. The projections that exceeded the thresholds computed by the Tchebychev inequality suggested the presence of outliers at times t = 23, 24, 37, 63, 124. The graphical display of these projections in Fig. 5 shows that each projection disclosed a single outlier, except the third projection (outliers at t = 23 and t = 24).
Examination of the estimated outlier sizes allowed the time series that most contributed to the multivariate outlier to be discovered. The outlier at t = 23 (I/1966) was apparent in the time series rbndl (g) and pgdp, rbndl, rstock, unemp (i). The outlier at t = 24 (II/1966) was mainly determined by roil (f), roil and rstock (g), and pgdp, rbndl, rstock, unemp (i). The outlier at t = 37 (III/1969) was mainly determined by rbndl (f), rbndl (g), and rbndl and unemp (i). The main influence on the outlier at t = 63 (I/1976) came from rbndl and rgold (f), rbndl and rgold (g), and rbndl, rbndm, rcommod, roil (i). The outlier at t = 124 (II/1991) was apparent only in the time series rbndl (f). The interest rate of long-term government bonds (rbndl) was present in all outliers, that is this time series produced most of the unexpected values observed in the multivariate time series. Its effect was common to the three countries at t = 37 and t = 63, while it was limited to Germany and Italy at t = 23, to Italy at t = 24, and to France at t = 124. Some time series were available for Italy only, so it seemed of interest to apply the procedure to the data set including only the quarterly Italian data. We could fit a dynamic factor model with 3 factors, and 4 outliers were found: at t = 23 (pgdp, rbndl, rbndm, rstock), t = 24 (pgdp, rbndl), t = 55 (rbndm, roil, rgold, rstock, unemp), and t = 103 (rbndl, rbndm, roil, rstock, unemp). Observation t = 55 corresponds to I/1974 and t = 103 to I/1986. By comparing the results for Italy with those for the three countries together we could argue that some outliers were 'international' while others regarded only one or two countries. The outliers at t = 23 and t = 24 should be considered 'international', while those at t = 55 and t = 103 are 'national'; some of the time series that originated these two outliers, namely rbndm and unemp, were present in the Italian data set only. Similarly, the outlier at t = 124 could not be considered 'international' but 'national', limited to the quarterly French data. Note that the outlier at t = 23 was not detected in the French data, but the dates t = 23 and t = 24 are close, so the outlying observations were possibly related to some common circumstance.

Conclusions
We presented a method to discover outliers in multivariate time series generated by a dynamic factor model. The method was found to yield the best results compared with two other methods aimed at discovering outliers in multivariate time series in a different framework. Our method relies on the assumption that the multivariate time series is generated by a dynamic factor model; therefore the procedure to check the adequacy of the dynamic factor model for fitting the data should be applied carefully, to ensure that genuine outliers can be discovered. If the assumptions are carefully checked and the requirements met, both the simulation experiment and the application to real data showed that the method presented in this article is effective for outlier detection and estimation, cautious against false outlier identification and, in addition, simple to implement. The estimates of the total outlier size were found to be less biased, but more variable, than those obtained by the other two methods. On the contrary, the estimates of the size of the part of the outlier that impacts the dynamic factor model without affecting the factors were found to be accurate and close to their true values. Improvement of the estimation method is currently the subject of further research.

A.9. Proof of Theorem 6.1

Proof. The result for the mean is obvious. For the variance we have, for example, var{T(γ̂_rs(0) − γ̃_rs(0))} = E{(z_{r t_0} − z̄_r)ω_s + ω_r(z_{s t_0} − z̄_s)}² = ω²_s E(z_{r t_0} − z̄_r)² + 2ω_s ω_r E{(z_{r t_0} − z̄_r)(z_{s t_0} − z̄_s)} + ω²_r E(z_{s t_0} − z̄_s)². Then, by recalling that E(γ̂_rs(0)) = γ_rs(0) + O(T^{−1}) (e.g. 17, vol. 2, p. 693, Formula 9.5.4), the result follows. The proof for var{T(γ̂_rs(h) − γ̃_rs(h))} is similar.

A.10. Proof of Theorem 6.2
Proof. We have Re{I_rs(λ_j) − Ĩ_rs(λ_j)} = ω_r ω_s, and it may be easily seen that cov{A_rs(λ_j), A_rs(λ_k)} and cov{B_rs(λ_j), B_rs(λ_k)} tend to zero as T → ∞ if λ_j ≠ λ_k, while the results for λ_j = λ_k are given in the previous theorem. As T → ∞ the window bandwidth tends to zero, (2π/T) ∑ w_T(λ − λ_j)² tends to c_0 M, and therefore var{Re(F̂_rs(λ) − F̃_rs(λ))} has the same asymptotic behavior as (2πM/T²) var(√T A_rs(λ)), while var{Im(F̂_rs(λ) − F̃_rs(λ))} has the same asymptotic behavior as (2πM/T²) var(√T B_rs(λ)). On substituting the expressions for the variances the thesis follows.

FIG 1. Plot of eigenvalues of the observed time series spectral matrix at frequency zero

FIG 3. Observed time series (blue line) and estimated dynamic factor model (red line)

FIG 4. Cumulated eigenvalues of the variance-covariance matrix of 27 economic quarterly data time series

TABLE 1
Outlier size estimates yielded by the temporal decorrelation, eigenvalue analysis and maximum likelihood methods

TABLE 2
Percentages of estimated number of factors in the presence of outliers

TABLE 3
Isolated outlier percentage detection by the ODFM, OARMA and OPP methods

TABLE 4
Outlier patch percentage detection by the ODFM, OARMA and OPP methods

TABLE 5
Deviation from true values of isolated outlier size estimates yielded by the ODFM, OARMA and OPP methods

TABLE 6
Deviation from true values of outlier patch size estimates yielded by the ODFM, OARMA and OPP methods