Corrections to the probability distribution

In principle, it is possible to compute a value of the statistic for a single frequency and to test its consistency with a random signal ($H_0$). The common procedure of inspecting the whole periodogram for a detected signal corresponds to the N-fold repetition of the single test for a set of N trial frequencies. If p is the probability that a given value of the statistic is consistent with $H_0$ in a single test, the probability of the whole periodogram being consistent with $H_0$ is $1 - (1 - p)^N \approx Np$ for $p \to 0$. The factor N means that there is an increased probability of accepting a given value of the statistic as consistent with a random signal. Therefore, increasing the number of trial frequencies decreases the sensitivity for the detection of a significant signal, and accordingly N is called the penalty factor for multiple trials or for the frequency bandwidth used. The true number of independent frequencies remains generally unknown. It is usually less than the number of resolved frequencies (Sect. ) because of aliasing, and still less than the number of computed frequencies because of oversampling. For a practical and conservative estimate, we recommend to use as the number of trial frequencies, N.
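The penalty for multiple trials can be illustrated with a short sketch. It assumes that the N trials are statistically independent, so that a single-trial probability p turns into 1 - (1 - p)^N for the whole periodogram; the function name and its arguments are hypothetical and are not part of any TSA command.

```python
import math


def whole_periodogram_probability(p_single, n_trials):
    """Probability that at least one of n_trials independent trials
    reaches a value whose single-trial probability under H0 is p_single.

    Assumption: the trial frequencies are independent.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    if p_single == 1.0:
        return 1.0
    # 1 - (1 - p)^N, written with log1p/expm1 to stay accurate for tiny p
    return -math.expm1(n_trials * math.log1p(-p_single))


# A peak with single-trial probability 1e-4, searched over 1000 frequencies
p_whole = whole_periodogram_probability(1.0e-4, 1000)
print(p_whole)
```

For small p the result stays close to, and slightly below, the linear approximation Np, which is why N acts as a penalty factor on the significance of a detection.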

According to the standard null hypothesis, $H_0$, the noise is white noise. This is not so in many practical cases. For instance, the noise is often a stochastic process with a certain correlation length, so that on average $n_{corr}$ consecutive observations are correlated. Such noise corresponds to white noise passed through a low-pass filter which cuts off all frequencies above roughly the inverse of the correlation length. Such correlation is not usually taken into account by standard test statistics. Its effect is to reduce the effective number of observations by a factor $n_{corr}$ (Schwarzenberg-Czerny, 1989). This has to be accounted for by scaling both the statistic S and the number of its degrees of freedom by factors depending on $n_{corr}$.
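The link between low-pass filtering and correlated noise can be checked numerically. The sketch below is an illustration only: it assumes evenly spaced data, uses a running mean as a crude low-pass filter, and all names in it are hypothetical.

```python
import random


def moving_average(x, width):
    """Crude low-pass filter: running mean over `width` consecutive samples."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]


def autocorrelation(x, lag):
    """Sample autocorrelation coefficient of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return cov / var


rng = random.Random(1)  # fixed seed so the illustration is repeatable
white = [rng.gauss(0.0, 1.0) for _ in range(5000)]
smooth = moving_average(white, width=5)

r_white = autocorrelation(white, 1)    # close to zero for white noise
r_smooth = autocorrelation(smooth, 1)  # clearly positive after filtering
print(r_white, r_smooth)
```

Consecutive samples of the filtered series share most of the terms of the running mean, which is why they are correlated while the unfiltered samples are not.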

In the periodogram of the test statistic, a continuum level that is inconsistent with the expected value of the statistic may indicate the presence of such a correlation between consecutive data points. A practical recipe to measure the correlation is to compute the residual time series (e.g. with the SINEFIT/TSA command) and to look for its correlation length with the COVAR/TSA command. The effect of the correlation on the parameter estimation is an underestimation of the uncertainties of the parameters; the true variances of the parameters are larger than computed by a factor $n_{corr}$, the average number of correlated consecutive observations.
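The recipe above can be sketched outside MIDAS as follows. This is not the algorithm of COVAR/TSA: it assumes evenly spaced residuals, takes the first lag at which the sample autocorrelation drops below 1/e as the number of correlated consecutive observations (an assumed convention), and scales a standard deviation by the square root of that number, i.e. the variance by the number itself. All function names are hypothetical.

```python
import math
import random


def correlation_length(residuals, threshold=math.exp(-1.0)):
    """Smallest lag (in samples) at which the sample autocorrelation of
    evenly spaced residuals falls below `threshold` (assumed: 1/e)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    var = sum(d * d for d in dev)
    if var == 0.0:
        raise ValueError("residuals are constant")
    for lag in range(1, n):
        acf = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / var
        if acf < threshold:
            return lag
    return n


def corrected_sigma(sigma, n_corr):
    """Scale a computed standard deviation so that its variance grows
    by the factor n_corr."""
    return sigma * math.sqrt(n_corr)


# Synthetic residuals: white noise smoothed over 4 samples
rng = random.Random(7)
white = [rng.gauss(0.0, 1.0) for _ in range(4000)]
residuals = [sum(white[i:i + 4]) / 4 for i in range(len(white) - 3)]

n_corr = correlation_length(residuals)
print(n_corr, corrected_sigma(0.01, n_corr))
```

With real data the residuals would come from the fit, and unevenly sampled series would need a binned estimate of the correlation rather than this fixed-lag version.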

In the descriptions of the individual commands, we often refer to probability distributions of specific statistics. For the properties of these individual distributions see e.g. Eadie et al. (1971), Brandt (1970), and Abramowitz & Stegun (1972). The latter two references contain tables. For computer code for the computation of the cumulative probabilities see Press et al. (1986).

Rein Warmels
Mon Jan 22 15:08:15 MET 1996