
4.3 Linear Propagation of Errors


Covariance Matrix

To simplify the notation, we introduce the covariance matrix C

$$
C = \begin{pmatrix}
\langle \Delta x_1 \Delta x_1 \rangle & \langle \Delta x_1 \Delta x_2 \rangle & \cdots & \langle \Delta x_1 \Delta x_n \rangle \\
\langle \Delta x_2 \Delta x_1 \rangle & \langle \Delta x_2 \Delta x_2 \rangle & \cdots & \langle \Delta x_2 \Delta x_n \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle \Delta x_n \Delta x_1 \rangle & \langle \Delta x_n \Delta x_2 \rangle & \cdots & \langle \Delta x_n \Delta x_n \rangle
\end{pmatrix},
\qquad C_{ij} = R_{ij}\,\delta x_i\,\delta x_j\,,
$$

which, in this context, is also called the error matrix. The covariance matrix is by definition symmetric and positive definite. The error $\delta y$ of the dependent variable $y$ is then given in linear approximation by

$$(\delta y)^2 = \sum_{i,j=1}^{N} \frac{\partial y}{\partial x_i}\,\frac{\partial y}{\partial x_j}\; C_{ij}\,,$$

which can also be written in matrix notation as

$$(\delta y)^2 = (\nabla y)^T\, C\, (\nabla y)\,.$$
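To make the recipe concrete, here is a minimal numerical sketch of this propagation for a single function of two correlated inputs; the function $y = x_1/x_2$ and all numbers are our own illustration:

```python
import numpy as np

# y = x1/x2 with correlated measurement errors
x = np.array([10.0, 2.0])                 # measured values x1, x2
C = np.array([[0.04, 0.01],               # covariance matrix of (x1, x2)
              [0.01, 0.01]])

grad = np.array([1.0 / x[1],              # dy/dx1
                 -x[0] / x[1] ** 2])      # dy/dx2
dy = np.sqrt(grad @ C @ grad)             # (delta y)^2 = (grad y)^T C (grad y)
print(x[0] / x[1], dy)                    # y = 5.0, delta y ~ 0.22
```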

For two variables with normally distributed errors following (3.49)

$$
N(\Delta x_1, \Delta x_2) = \frac{1}{2\pi\delta_1\delta_2\sqrt{1-\rho^2}}
\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\frac{(\Delta x_1)^2}{\delta_1^2}
- 2\rho\,\frac{\Delta x_1\,\Delta x_2}{\delta_1\delta_2}
+ \frac{(\Delta x_2)^2}{\delta_2^2}\right]\right\}
\qquad (4.6)
$$

we get

$$
C = \begin{pmatrix} \delta_1^2 & \rho\,\delta_1\delta_2 \\ \rho\,\delta_1\delta_2 & \delta_2^2 \end{pmatrix}.
$$

Error Ellipsoids

Two-dimensional Gaussian error distributions like (4.6) (see Sect. 3.6.5) have the property that the curves of constant probability density are ellipses. Instead of $n\sigma$ error intervals in one dimension, we define $n\sigma$ error ellipses: the curve of constant probability density along which the density is down by a factor $\exp(-n^2/2)$ relative to the maximal density is the $n\sigma$ error ellipse.

For the error distribution in the form of (4.6) the $n\sigma$ error ellipse is

$$\frac{1}{1-\rho^2}\left[\frac{(\Delta x_1)^2}{\delta_1^2} - 2\rho\,\frac{\Delta x_1\,\Delta x_2}{\delta_1\delta_2} + \frac{(\Delta x_2)^2}{\delta_2^2}\right] = n^2\,.$$

For uncorrelated errors the one standard deviation error ellipse is simply

$$\frac{(\Delta x_1)^2}{\delta_1^2} + \frac{(\Delta x_2)^2}{\delta_2^2} = 1\,.$$

In higher dimensions we obtain ellipsoids, which are best written in vector notation:

$$\Delta x^T\, C^{-1}\, \Delta x = n^2\,.$$


4.3.3 Averaging Uncorrelated Measurements

Important measurements are usually performed by several experiments in parallel, or are repeated several times. The results of the various measurements should be combined in such a way as to yield optimal accuracy. Under these conditions we can calculate a so-called weighted mean, with an error smaller than that of any of the contributing measurements. As usual, we assume that the individual measurements are independent.

As an example let us consider two measurements with measured values $x_1$, $x_2$ and errors $\delta_1$, $\delta_2$. With the relations given in Sect. 3.2.3, we find for the error squared $\delta^2$ of the weighted sum

$$\bar x = w_1 x_1 + w_2 x_2\,,$$

$$\delta^2 = w_1^2 \delta_1^2 + w_2^2 \delta_2^2\,.$$

Now we choose the weights in such a way that the error of the weighted sum is minimal, i.e. we seek the minimum of $\delta^2$ under the condition $w_1 + w_2 = 1$. The result is

$$w_i = \frac{1/\delta_i^2}{1/\delta_1^2 + 1/\delta_2^2}\,,$$

and for the combined error we get

$$\frac{1}{\delta^2} = \frac{1}{\delta_1^2} + \frac{1}{\delta_2^2}\,.$$

Generally, for $N$ measurements we find

$$\bar x = \sum_{i=1}^{N} \frac{x_i}{\delta_i^2} \Big/ \sum_{i=1}^{N} \frac{1}{\delta_i^2}\,, \qquad (4.7)$$

$$\frac{1}{\delta^2} = \sum_{i=1}^{N} \frac{1}{\delta_i^2}\,. \qquad (4.8)$$

When all measurements have the same error, all weights are equal to $w_i = 1/N$, and we get the ordinary arithmetic mean, with the corresponding reduction of the error by the factor $1/\sqrt{N}$.
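A minimal sketch of (4.7) and (4.8) in code; the function name and the example values are our own:

```python
import numpy as np

def weighted_mean(x, delta):
    """Combine independent measurements x with errors delta
    according to (4.7) and (4.8)."""
    x = np.asarray(x, dtype=float)
    w = 1.0 / np.asarray(delta, dtype=float) ** 2   # weights 1/delta_i^2
    mean = np.sum(w * x) / np.sum(w)                # eq. (4.7)
    err = 1.0 / np.sqrt(np.sum(w))                  # eq. (4.8)
    return mean, err

# Two measurements of the same quantity:
print(weighted_mean([10.1, 9.8], [0.2, 0.1]))  # mean lies closer to the more precise value
```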

Remark: If the original raw data of the different experiments are available, we can improve the averaging compared to the simple use of relations (4.7) and (4.8). When, for example, in two rate measurements of 1 and 3 hours duration, 2 and 12 events are observed, respectively, then the combined rate is $(2 + 12)/(1\,\mathrm{h} + 3\,\mathrm{h}) = 3.5\,\mathrm{h}^{-1}$, with an error of $\pm 0.9\,\mathrm{h}^{-1}$. Averaging according to (4.7) would lead to the too low value of $(3.2 \pm 1.2)\,\mathrm{h}^{-1}$, due to the above mentioned problem of small rates and asymmetric errors. The optimal procedure is in any case the addition of the log-likelihoods, which will be discussed in Chap. 8; it corresponds to the addition of the original data, as done here.
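The numbers of the remark can be checked directly; this sketch assumes Poisson errors $\sqrt{n}$ for the counts:

```python
import numpy as np

counts = np.array([2.0, 12.0])   # observed events
times = np.array([1.0, 3.0])     # measurement durations in hours

# Optimal: add the raw data (equivalent to adding log-likelihoods)
rate = counts.sum() / times.sum()
rate_err = np.sqrt(counts.sum()) / times.sum()
print(rate, rate_err)            # 3.5 and ~0.9

# Naive: weighted average of the individual rates, errors taken from the data
rates = counts / times
errs = np.sqrt(counts) / times
w = 1.0 / errs**2
print(np.sum(w * rates) / np.sum(w))   # 3.2, biased low
```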

4.3.4 Averaging Correlated Measurements

In Sect. 4.3.3 we derived the expression for the weighted mean of independent measurements of one and the same quantity. This is a special case of a more general result for a sample of $N$ measurements of the same quantity which differ not only in their variances, but are also correlated and therefore not statistically independent. Consequently, they have to be described by a complete $N \times N$ covariance or error matrix $C$.

We choose the weights for the weighted mean such that the variance of the combined value is minimal, in much the same way as in Sect. 4.3.3 for uncorrelated measurements. For simplicity, we restrict ourselves to two measurements $x_{1,2}$. The weighted sum $\bar x$ is

$$\bar x = w_1 x_1 + w_2 x_2\,, \qquad \text{with } w_1 + w_2 = 1\,.$$

To calculate $\mathrm{var}(\bar x)$ we have to take into account the correlation terms:

$$\delta_{\bar x}^2 \equiv \mathrm{var}(\bar x) = w_1^2 C_{11} + w_2^2 C_{22} + 2 w_1 w_2 C_{12}\,.$$

The minimum of $\delta_{\bar x}^2$ is achieved for

$$w_1 = \frac{C_{22} - C_{12}}{C_{11} + C_{22} - 2C_{12}}\,, \qquad w_2 = \frac{C_{11} - C_{12}}{C_{11} + C_{22} - 2C_{12}}\,. \qquad (4.9)$$

The uncorrelated weighted mean corresponds to $C_{12} = 0$. In contrast to that case, where the expression for the minimal value of $\delta_{\bar x}^2$ is particularly simple, the correlated case is less transparent.
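A direct transcription of (4.9), as a sketch; the function name is our own. With $C_{12} = 0$ it reduces to the uncorrelated weighted mean:

```python
import numpy as np

def correlated_mean(x1, x2, c11, c22, c12):
    """Weighted mean of two correlated measurements, following (4.9)."""
    denom = c11 + c22 - 2.0 * c12
    w1 = (c22 - c12) / denom
    w2 = (c11 - c12) / denom
    mean = w1 * x1 + w2 * x2
    var = w1**2 * c11 + w2**2 * c22 + 2.0 * w1 * w2 * c12
    return mean, np.sqrt(var)

# With C12 = 0 this reproduces the uncorrelated result:
print(correlated_mean(10.1, 9.8, 0.2**2, 0.1**2, 0.0))
```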

The case of $N$ correlated measurements leads to the following expression for the weights:

$$w_i = \sum_{j=1}^{N} V_{ij} \Big/ \sum_{i,j=1}^{N} V_{ij}\,,$$

where $V$ is the inverse matrix of $C$, which we called the weight matrix in Sect. 3.6.5.

Example 50. Average of measurements with common offset error

Several experiments ($i$) determine the energy $E_i$ of an excited nuclear state by measuring the transition energy $\varepsilon_i$ to the ground state with the uncertainty $\delta_i$. They take the ground state energy $E_0$ from the same table, which quotes an uncertainty $\delta_0$. Thus the results $E_i = \varepsilon_i + E_0$ are correlated. The covariance matrix is

$$C_{ij} = \langle (\Delta\varepsilon_i + \Delta E_0)(\Delta\varepsilon_j + \Delta E_0) \rangle = \delta_i^2\,\delta_{ij} + \delta_0^2\,.$$

$C$ is the sum of a diagonal matrix and a matrix where all elements are identical, namely equal to $\delta_0^2$. In this special situation the variance $\mathrm{var}(\bar x) \equiv \delta_{\bar x}^2$ of the combined result $\bar x = \sum w_i x_i$ is

$$\delta_{\bar x}^2 = \sum_i w_i^2 C_{ii} + \sum_{i \neq j} w_i w_j C_{ij} = \sum_i w_i^2 \delta_i^2 + \Big( \sum_i w_i \Big)^2 \delta_0^2\,.$$

Since the second sum is unity, the second term does not matter when we minimize $\delta_{\bar x}^2$, and we get the same result (4.7) for the weighted mean $\bar E$ as in the uncorrelated case. For its error we find, as could have been expected,

$$\delta^2 = \Big( \sum_i \frac{1}{\delta_i^2} \Big)^{-1} + \delta_0^2\,.$$
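A short sketch of this combination rule; the energy values are invented for illustration:

```python
import numpy as np

E = np.array([1332.5, 1332.9, 1332.1])   # hypothetical results E_i (keV)
d = np.array([0.4, 0.3, 0.5])            # individual uncertainties delta_i
d0 = 0.2                                 # common offset uncertainty delta_0

w = (1.0 / d**2) / np.sum(1.0 / d**2)    # weights as in the uncorrelated case
mean = np.sum(w * E)
err = np.sqrt(1.0 / np.sum(1.0 / d**2) + d0**2)  # offset added in quadrature
print(mean, err)
```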


It is interesting that in some rare cases the weighted mean of two correlated measurements $x_1$ and $x_2$ is not located between the individual measurements: the so-called "mean value" is not contained in the interval $[x_1, x_2]$.

Example 51. Average outside the range defined by the individual measurements

The matrix

$$C = \begin{pmatrix} 1 & 2 \\ 2 & 5 \end{pmatrix}$$

with eigenvalues

$$\lambda_{1,2} = 3 \pm \sqrt{8} > 0$$

is symmetric and positive definite and thus a possible covariance matrix. But following (4.9) it leads to the weights $w_1 = 3/2$, $w_2 = -1/2$. Thus the weighted mean $\bar x = \frac{3}{2} x_1 - \frac{1}{2} x_2$ with $x_1 = 0$, $x_2 = 1$ becomes $\bar x = -1/2$, which is less than both input values. The reason for this sensible but at first sight unexpected result can be understood intuitively in the following way: due to the strong correlation, $x_1$ and $x_2$ will usually both be either too large or too low. An indication that $x_2$ is too large is the fact that it is larger than $x_1$, which is the more precise measurement. The true value $x$ is then expected to lie below both $x_1$ and $x_2$.
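A quick numerical check of this example with the general weight formula $w_i = \sum_j V_{ij} / \sum_{i,j} V_{ij}$, using numpy's matrix inversion:

```python
import numpy as np

C = np.array([[1.0, 2.0],
              [2.0, 5.0]])
V = np.linalg.inv(C)                    # weight matrix V = C^{-1}
w = V.sum(axis=1) / V.sum()             # w_i = sum_j V_ij / sum_ij V_ij
print(w)                                # [ 1.5 -0.5]

x = np.array([0.0, 1.0])
print(w @ x)                            # -0.5, outside the interval [x1, x2]
print(1.0 / V.sum())                    # variance of the weighted mean
```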

4.3.5 Several Functions of Several Measured Quantities

When we fix a straight line by two measured points in the plane, we are normally interested in its slope and its intercept with a given axis. The errors of these two quantities are usually correlated. The correlations often have to be known in subsequent calculations, e.g. of the crossing point with a second straight line.

In the general case we are dealing with $K$ functions $y_k(x_1, \ldots, x_N)$ of $N$ variables with given measurement values $x_i$ and error matrix $C$. The symmetric error matrix $E$ related to the values $y_k$ is

$$E_{kl} = \langle \Delta y_k \Delta y_l \rangle = \sum_{i,j=1}^{N} \frac{\partial y_k}{\partial x_i} \frac{\partial y_l}{\partial x_j}\, \langle \Delta x_i \Delta x_j \rangle = \sum_{i,j=1}^{N} \frac{\partial y_k}{\partial x_i} \frac{\partial y_l}{\partial x_j}\, C_{ij}\,. \qquad (4.10)$$

Defining a matrix

$$D_{ki} = \frac{\partial y_k}{\partial x_i}\,,$$

we can write this more compactly as

$$E_{kl} = \sum_{i,j=1}^{N} D_{ki} D_{lj}\, C_{ij}\,.$$
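In matrix form this reads $E = D\,C\,D^T$, which is a one-liner; a sketch with a hypothetical pair of functions $y_1 = x_1 x_2$, $y_2 = x_1/x_2$ and invented numbers:

```python
import numpy as np

x = np.array([2.0, 4.0])              # measured values
C = np.array([[0.01, 0.002],          # their covariance matrix
              [0.002, 0.04]])

# Jacobian D_ki = dy_k/dx_i for y1 = x1*x2, y2 = x1/x2
D = np.array([[x[1], x[0]],
              [1.0 / x[1], -x[0] / x[1] ** 2]])

E = D @ C @ D.T                       # error matrix of (y1, y2)
print(E)
print(np.sqrt(np.diag(E)))            # the errors delta_y1, delta_y2
```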


4.3.6 Examples

The following examples represent some standard cases of error propagation.

Example 52. Error propagation: velocity of a sprinter

Given are $s = (100.0 \pm 0.1)\,\mathrm{m}$, $t = (10.00 \pm 0.02)\,\mathrm{s}$; searched for is $\delta v$ of $v = s/t$:

$$\left(\frac{\delta v}{v}\right)^2 = \left(\frac{\delta t}{t}\right)^2 + \left(\frac{\delta s}{s}\right)^2\,,$$

$$\frac{\delta v}{v} = \sqrt{\left(\frac{0.02}{10}\right)^2 + \left(\frac{0.1}{100}\right)^2} = 2.2 \times 10^{-3}\,.$$

Example 53. Error propagation: area of a rectangular table

Given are the sides $a$, $b$ with a reading error $\delta_1$ and a relative scaling error $\delta_2$, caused by a possible extension or shrinkage of the measuring tape. We want to calculate the error $\delta F$ of the area $F = ab$. We find

$$(\delta a)^2 = (\delta_1)^2 + (a\delta_2)^2\,, \qquad (\delta b)^2 = (\delta_1)^2 + (b\delta_2)^2\,, \qquad C_{ab} = ab\,(\delta_2)^2\,,$$

$$(\delta F)^2 = b^2 (\delta a)^2 + a^2 (\delta b)^2 + 2ab\, C_{ab}\,,$$

$$\left(\frac{\delta F}{F}\right)^2 = (\delta_1)^2 \left(\frac{1}{a^2} + \frac{1}{b^2}\right) + 4(\delta_2)^2\,.$$

For large areas, the contribution of the reading error is negligible compared to that of the scaling error.

Example 54. Straight line through two measured points

Given are two measured points $(x_1, y_1 \pm \delta y_1)$, $(x_2, y_2 \pm \delta y_2)$ of the straight line $y = mx + b$, where only the ordinate $y$ possesses an error. We want to find the error matrix for the intercept

$$b = (x_2 y_1 - x_1 y_2)/(x_2 - x_1)$$

and the slope

$$m = (y_2 - y_1)/(x_2 - x_1)\,.$$

According to (4.10) we calculate the errors

$$(\delta m)^2 = \frac{(\delta y_2)^2 + (\delta y_1)^2}{(x_2 - x_1)^2}\,,$$

$$(\delta b)^2 = \frac{x_2^2 (\delta y_1)^2 + x_1^2 (\delta y_2)^2}{(x_2 - x_1)^2}\,,$$

$$E_{12} = \langle \Delta m\, \Delta b \rangle = -\,\frac{x_2 (\delta y_1)^2 + x_1 (\delta y_2)^2}{(x_2 - x_1)^2}\,.$$

The error matrix E for m and b is therefore


$$E = \frac{1}{(x_2 - x_1)^2} \begin{pmatrix} (\delta y_1)^2 + (\delta y_2)^2 & -x_2 (\delta y_1)^2 - x_1 (\delta y_2)^2 \\ -x_2 (\delta y_1)^2 - x_1 (\delta y_2)^2 & x_2^2 (\delta y_1)^2 + x_1^2 (\delta y_2)^2 \end{pmatrix}.$$

The correlation matrix element $R_{12}$ is

$$R_{12} = \frac{E_{12}}{\delta m\,\delta b} = -\,\frac{x_2 (\delta y_1)^2 + x_1 (\delta y_2)^2}{\left\{\left[(\delta y_2)^2 + (\delta y_1)^2\right]\left[x_2^2 (\delta y_1)^2 + x_1^2 (\delta y_2)^2\right]\right\}^{1/2}}\,. \qquad (4.11)$$

For the special case $\delta y_1 = \delta y_2 = \delta y$ the results simplify to

$$(\delta m)^2 = \frac{2}{(x_1 - x_2)^2}\,(\delta y)^2\,,$$

$$(\delta b)^2 = \frac{x_1^2 + x_2^2}{(x_1 - x_2)^2}\,(\delta y)^2\,,$$

$$E_{12} = -\,\frac{x_1 + x_2}{(x_1 - x_2)^2}\,(\delta y)^2\,,$$

$$R_{12} = -\,\frac{x_1 + x_2}{\sqrt{2\,(x_1^2 + x_2^2)}}\,.$$

Remark: As seen from (4.11), for a suitable choice of the abscissa the correlation disappears. To achieve this, we take as the origin the "center of gravity" $x_s$ of the $x$-values $x_i$, weighted with the inverse squared errors of the ordinates, $1/(\delta y_i)^2$:

$$x_s = \sum \frac{x_i}{(\delta y_i)^2} \Big/ \sum \frac{1}{(\delta y_i)^2}\,.$$
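A sketch of this error matrix for concrete numbers; the point coordinates and errors are invented:

```python
import numpy as np

x1, x2 = 1.0, 3.0
dy1, dy2 = 0.1, 0.2

d = (x2 - x1) ** 2
E = np.array([
    [(dy1**2 + dy2**2) / d,            -(x2 * dy1**2 + x1 * dy2**2) / d],
    [-(x2 * dy1**2 + x1 * dy2**2) / d, (x2**2 * dy1**2 + x1**2 * dy2**2) / d],
])  # error matrix of (m, b)

dm, db = np.sqrt(np.diag(E))
print(dm, db, E[0, 1] / (dm * db))   # delta_m, delta_b, correlation R12
```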

Example 55. Error of a sum of weighted measurements

In the evaluation of event numbers, the events are often counted with different weights, in order to take into account, for instance, a varying acceptance of the detector. Weighting is also important in Monte Carlo simulations (see Sect. 5.2.6), especially when combined with parameter estimation (Sect. 6.5.9). An event with weight 10 stands for 10 events with weight 1. For $N$ events with weights $w_1, \ldots, w_N$ the weighted number of events is

$$s = \sum_{i=1}^{N} w_i\,.$$

 

This sum $s$ stands for $N$ measurements with result $x_i = 1$ in each of them, where the results are added after weighting with $w_i$: $s = \sum w_i x_i$. The error of this sum is $\delta s^2 = \sum w_i^2 \delta_i^2$. Since each individual result corresponds to a Poisson distribution with mean equal to 1, it also has variance 1, thus $\delta_i^2 = 1$. We obtain

$$\delta s^2 = \sum w_i^2\,,$$

which corresponds to the variance of the sum of weighted events as derived in Sect. 3.6.3.
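A two-line sketch of this rule with invented weights:

```python
import numpy as np

w = np.array([1.0, 1.0, 10.0, 2.5])   # event weights
s = w.sum()                            # weighted number of events
ds = np.sqrt(np.sum(w**2))             # its error, delta_s = sqrt(sum w_i^2)
print(s, ds)                           # note how one heavy event dominates the error
```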


4.4 Biased Measurements

We have required that our measurement values $x_i$ are undistorted (unbiased), and we have used this property in the discussion of error propagation. It is in any case rather plausible that we should avoid biased measurements, because averaging measurements with a common bias would produce a result with the same bias: the average of infinitely many measurements would differ from the true parameter value, while the associated error would be infinitely small. However, a closer look at the problem reveals that the requirement of unbiasedness also has its problems. When we average measurements, the measurements $x_i$ are weighted with $1/\delta_i^2$, their inverse squared errors, as we have seen above. To be consistent, it is therefore required that the quantities $x_i/\delta_i^2$ are unbiased! Of course, we explicitly excluded the possibility of errors which depend on the measurement values; but since this requirement is violated so often in practice, and since a bias which is small compared to the uncertainty of an individual experiment can become important in the average, we stress this point here and present an example.

Example 56. Bias in averaging measurements

Let us assume that several measurements of a constant $x_0$ produce unbiased results $x_i$ with errors $\delta_i \propto x_i$ proportional to the measured values. This could be, for instance, measurements of particle lifetimes, where the relative error is determined by the number of recorded decays and thus the absolute error is set proportional to the observed mean life. When we compute the weighted mean $\bar x$ over many such measurements,

$$\bar x = \sum \frac{x_i}{\delta_i^2} \Big/ \sum \frac{1}{\delta_i^2} = \sum \frac{1}{x_i} \Big/ \sum \frac{1}{x_i^2} \approx \left\langle \frac{1}{x} \right\rangle \Big/ \left\langle \frac{1}{x^2} \right\rangle\,,$$

the expected value is shifted systematically to lower values. This is easily seen from a Taylor expansion of the expectation values:

$$\langle \bar x - x_0 \rangle = \frac{\langle 1/x \rangle}{\langle 1/x^2 \rangle} - x_0\,,$$

$$\left\langle \frac{1}{x} \right\rangle = \left\langle \frac{1}{x_0 + \Delta x} \right\rangle \approx \frac{1}{x_0} \left( 1 - \left\langle \frac{\Delta x}{x_0} \right\rangle + \left\langle \frac{\Delta x^2}{x_0^2} \right\rangle \right) = \frac{1}{x_0} \left( 1 + \frac{\delta^2}{x_0^2} \right)\,,$$

$$\left\langle \frac{1}{x^2} \right\rangle \approx \frac{1}{x_0^2} \left( 1 - 2 \left\langle \frac{\Delta x}{x_0} \right\rangle + 3 \left\langle \frac{\Delta x^2}{x_0^2} \right\rangle \right) = \frac{1}{x_0^2} \left( 1 + \frac{3\delta^2}{x_0^2} \right)\,,$$

$$\langle \bar x - x_0 \rangle \approx x_0\, \frac{1 + \delta^2/x_0^2}{1 + 3\delta^2/x_0^2} - x_0 \approx x_0 \left( 1 - \frac{2\delta^2}{x_0^2} \right) - x_0 = -\,\frac{2\delta^2}{x_0}\,.$$


Here $\delta^2$ is the expectation of the squared error of an individual measurement. For a relative measurement error $\delta/x_0$ of 20 % we obtain a sizable bias of $2\delta^2/x_0^2 = 8\,\%$ in the asymptotic result of infinitely many contributions.
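The bias is easy to reproduce in a toy simulation; a sketch under the example's assumptions, with the errors set proportional to the measured values:

```python
import numpy as np

rng = np.random.default_rng(1)
x0, rel = 1.0, 0.2                       # true value, 20% relative error

x = rng.normal(x0, rel * x0, 1_000_000)  # unbiased measurements
d = rel * x                              # errors estimated from the measured values
w = 1.0 / d**2

print(np.sum(w * x) / np.sum(w))         # ~0.9: biased low by roughly 2*rel**2 = 8%
```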

The revised requirement of unbiasedness of measurements divided by the error squared does not alter the other results which we have derived for the general error propagation in the linear approximation.

4.5 Confidence Intervals

Under the condition that the error distribution is a one-dimensional Gaussian, with a width independent of the expected value, the error intervals of many repeated measurements will cover the true parameter value in 68.3 % of the cases, because for any true value µ the probability to observe x inside one standard deviation interval

is

$$\frac{1}{\sqrt{2\pi}\,\delta} \int_{\mu-\delta}^{\mu+\delta} \exp\left( -\frac{(x - \mu)^2}{2\delta^2} \right) dx \approx 0.683\,.$$

The region $[x - \delta,\, x + \delta]$ is called a confidence interval^7 with the confidence level (CL) of 68.3 %, or, in physicists' jargon, a $1\sigma$ confidence interval. Thus in about one third of the cases our standard error intervals, under the above assumption of normality, will not contain the true value. Often a higher safety is desired, for instance 90 %, 95 %, or even 99 %. The respective limits can be calculated, provided the probability distribution is known with sufficient accuracy. For the normal distribution we present some limits in units of the standard deviation in Table 4.2. The numerical values can be taken from tables of the $\chi^2$-distribution function.
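The entries of Table 4.2 can be reproduced from the $\chi^2$-distribution function; a sketch using scipy:

```python
import numpy as np
from scipy.stats import chi2

# Probability content of the n-sigma ellipsoid in dim dimensions
for n in (1, 2, 3):
    print([round(chi2.cdf(n**2, dim), 3) for dim in (1, 2, 3, 4)])

# Inverse: error limit in units of sigma for a given confidence level
for cl in (0.50, 0.90, 0.95, 0.99):
    print([round(np.sqrt(chi2.ppf(cl, dim)), 2) for dim in (1, 2, 3, 4)])
```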

Example 57. Confidence level for the mean of normally distributed variates

Let us consider a sample of $N$ measurements $x_1, \ldots, x_N$ which are supposed to be normally distributed with unknown mean $\mu$ but known variance $\sigma^2$. The sample mean $\bar x$ is also normally distributed, with standard deviation $\delta_N = \sigma/\sqrt{N}$. The $1\sigma$ confidence interval $[\bar x - \delta_N,\, \bar x + \delta_N]$ covers, as we have discussed above, the true value $\mu$ in 68.3 % of the cases. With the help of Table 4.2 we can also find a 99 % confidence interval, i.e. $[\bar x - 2.58\,\delta_N,\, \bar x + 2.58\,\delta_N]$.

We have to keep in mind that the Gaussian confidence limits apply only approximately, if at all, to other distributions. Error distributions often have tails which are not well understood; then it is impossible to derive reliable confidence limits with high confidence levels. The same is true when systematic errors play a role, for example due to background and acceptance, which usually are not known with great accuracy. For a given confidence level, much wider intervals than in the above case are then required.

We come back to our previous example, but now we assume that the error has to be estimated from the sample itself, according to (4.1), (4.3):

$$\delta_N^2 = \sum_{i=1}^{N} (x_i - \bar x)^2 / [N(N-1)]\,.$$

^7 We will discuss confidence intervals in more detail in Chap. 8 and in Appendix 13.5.


Table 4.2. Left hand side: confidence levels for several error limits and dimensions, right hand side: error limits in units of σ for several confidence levels and dimensions.

Deviation   Dimension
            1      2      3      4
1 σ         0.683  0.393  0.199  0.090
2 σ         0.954  0.865  0.739  0.594
3 σ         0.997  0.989  0.971  0.939
4 σ         1.     1.     0.999  0.997

Confidence  Dimension
level       1      2      3      4
0.50        0.67   1.18   1.54   1.83
0.90        1.65   2.14   2.50   2.79
0.95        1.96   2.45   2.79   3.08
0.99        2.58   3.03   3.37   3.64

Table 4.3. Values of the factor k for the Student’s t-distribution as a function of the confidence levels CL and sample size N.

N    68.3%   99%
3    1.32    3.85
10   1.06    1.26
20   1.03    1.11
∞    1.00    1.00

To compute the confidence level for a given interval in units of the standard deviation, we now have to switch to Student's distribution (see Sect. 3.6.11). The variate $t = (\bar x - \mu)/\delta_N$ can be shown to follow the distribution $h_f(t)$ with $f = N - 1$ degrees of freedom. The confidence level for a given number of standard deviations will now be lower, because of the tails of Student's distribution. Instead of quoting this number, we give in Table 4.3 the factor $k$ by which we have to lengthen the interval to reach the same confidence level as in the Gaussian case. To clarify its meaning, let us look at two special cases: for 68.3 % confidence and $N = 3$ we require a 1.32 standard deviation interval, and for 99 % confidence and $N = 10$ a $1.26 \times 2.58 = 3.25$ standard deviation interval. As expected, the discrepancies are largest for small samples and high confidence levels. In the limit where $N$ approaches infinity, the factor $k$ becomes equal to one.
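The factors of Table 4.3 can be reproduced from Student's distribution; a sketch using scipy, where the two-sided confidence level is converted to a one-sided quantile:

```python
from scipy.stats import norm, t

# Factor k = (Student's t quantile) / (Gaussian quantile) for CL and sample size N
for N in (3, 10, 20):
    for cl in (0.683, 0.99):
        p = 0.5 * (1.0 + cl)                   # two-sided CL -> one-sided quantile
        k = t.ppf(p, N - 1) / norm.ppf(p)
        print(N, cl, round(k, 2))
```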

Often it is overlooked that for distributions of several variates, the probability to find all variables inside their error limits decreases strongly with the number of variables. Some probabilities for Gaussian errors are given in Table 4.2. In three dimensions, only 20 % of the observations are found inside the $1\sigma$ ellipsoid. Fig. 4.2 shows confidence ellipses and probabilities for two variables.


[Fig. 4.2: Confidence ellipses for 1, 2 and 3 standard deviations in the $(x_1, x_2)$ plane, with the corresponding probabilities 0.393, 0.865 and 0.989.]
