How do confidence intervals make sense at all?

Confidence intervals for the population mean

If you are interested in a particular property of a (large) population, you could of course go ahead and actually measure it for every member of the population. For example, you could test for each weld seam the force at which it really tears, or ask all voters every week whom they want to vote for, or ...

As these examples show, what you want to know about everyone cannot always be measured on everyone in practice.

Perhaps the measurement method is destructive, or too expensive, or one is simply too lazy. In such cases a (small) sample is drawn from the population and the measurements are made only on this sample. The big question is of course: what can the results from the sample tell us about the population?

The sample

How do we choose the members of the sample? Do we only test all weld seams that seem suspicious to us, or only those from colleague S., the bungler? Are we just asking the first 100 people in the phone book?

The most important prerequisite for everything that follows is that our sample is representative of the entire population. This means that all elements of the population should have the same probability of being included in the sample. Unfortunately, this cannot always be guaranteed in practice. In any case, we should avoid selection methods that are guaranteed not to produce representative samples.
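A minimal sketch of how such a selection could look in R, assuming we have a numbered list of all members of the population. The population size of 5000 and the sample size of 50 are made-up values for illustration only:

```r
# Hypothetical population: ID numbers of 5000 weld seams (assumed values)
population_ids <- 1:5000

set.seed(1)  # only to make this sketch reproducible

# Draw 50 distinct IDs; every seam has the same chance of being selected
sample_ids <- sample(population_ids, size = 50, replace = FALSE)
sample_ids
```

Only the seams with these IDs would then be tested, regardless of who welded them or how suspicious they look.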

The measurand

The details below depend on the kind of quantity you are trying to determine. If you imagine actually measuring the quantity $x$ for all members of the population, the measured values will have a certain distribution. Here we are interested in their mean $\mu$.

For example, if we have determined the filling quantity of ten bottles, what can we say about the average filling quantity of all bottles (from the same production)?

The model

In order to be able to draw any meaningful conclusions from the sample about the population at all, we need to have an idea of how the possible values of our measured variable $x$ are distributed. This idea could come from the fact that one has already measured similar quantities often, or from the fact that one understands, for example, the processes that lead to different values of $x$.

Then we model our measurand $x$ by a random variable $X$ whose distribution should match the distribution of $x$ as closely as possible.

If we can assume that our measured variable is approximately normally distributed, we can model each measurement in the sample as a draw from the random variable

$$X \sim \mathcal{N}(\mu, \sigma^2).$$

In this case the sample mean is modeled by the random variable

$$\bar{X} \sim \mathcal{N}\left(\mu, \frac{\sigma^2}{n}\right),$$

where $n$ is the sample size. We have already seen that averaging is a slimming diet for the distribution.

If we do not know the distribution of $x$, or know it only very imprecisely, we can still assume, based on the central limit theorem, that the mean of a (sufficiently large) sample is approximately normally distributed. That is the main advantage of averaging!
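The effect of the central limit theorem can be tried out with a small simulation. The following is only a sketch under assumed values: the individual measurements are drawn from a uniform distribution on [0, 1] (clearly not normal), and the sample size of 30 is chosen arbitrarily. For a normal distribution, about 95% of the values lie within 1.96 standard deviations of the mean, so we check whether the sample means behave that way:

```r
set.seed(2024)                # only for reproducibility

n <- 30                       # sample size (assumed)
mu_unif <- 0.5                # mean of the uniform distribution on [0, 1]
sigma_unif <- 1 / sqrt(12)    # its standard deviation

# 10000 sample means, each computed from n uniformly distributed values
means <- replicate(10000, mean(runif(n)))

# Fraction of means within 1.96 standard deviations (of the mean) around mu_unif
inside <- abs(means - mu_unif) <= 1.96 * sigma_unif / sqrt(n)
frac_inside <- mean(inside)
frac_inside                   # close to 0.95 if the means are roughly normal
```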

In each case, the expected value and the standard deviation of $\bar{X}$ are

$$\mathrm{E}(\bar{X}) = \mu \quad\text{and}\quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.$$
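These two relations can be checked numerically. The sketch below draws many samples from a normal distribution and looks at the scatter of their means; the values `mu <- 500`, `sigma <- 2` and `n <- 10` are assumptions chosen only for this illustration:

```r
set.seed(42)                  # only for reproducibility

mu <- 500                     # assumed population mean
sigma <- 2                    # assumed population standard deviation
n <- 10                       # sample size

# 10000 samples of size n, one mean per sample
means <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))

mean(means)                   # should be close to mu
sd(means)                     # should be close to sigma / sqrt(n)
sigma / sqrt(n)
```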

The point estimate

Unsurprisingly, we estimate the expected value $\mu$ by the sample mean

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

Because this gives only a single value as an estimate, it is called a point estimate.

Example: For $n = 10$ bottles, the filling quantities are measured in mL and the following values are obtained: 500.1, 500.5, 501.5, 502.7, 499.6, 501.2, 498.2, 501.9, 503.8, 497.8. Then the mean (in mL) is

$$\bar{x} = 500.73.$$

In the statistics software R, we could have calculated it like this, for example:

    x <- c(500.1, 500.5, 501.5, 502.7, 499.6, 501.2, 498.2, 501.9, 503.8, 497.8)
    mean(x)  # sample mean in mL

Where this estimate $\bar{x}$ lies relative to our unknown $\mu$, we don't know. What we do know is this: if $X$ is a good model for $x$, then the means of several samples of the same size should scatter around our unknown $\mu$ according to the probability density of $\bar{X}$ (see Fig. 1).

In order to be able to say more about the $\mu$ we are looking for, we now want to find intervals that contain the value of $\mu$ with a certain probability.

The z-confidence interval

For now we assume that we know the standard deviation $\sigma$. This is seldom the case in practice, but it makes things easier to begin with.

The measured mean $\bar{x}$ of a random sample is itself random. As we saw above, it can take arbitrary values, but it is likely to land "near" the $\mu$ we are looking for. If we knew $\mu$, we could now calculate the symmetric interval around $\mu$ in which the measured mean $\bar{x}$ will lie with probability $\gamma$ (see Fig. 2). This $\gamma$ is called the confidence level. Typical values for it are $\gamma = 95\%$ or $\gamma = 99\%$. It's up to us to decide how sure we want to be. If $b$ is the total width of the interval, then

$$P\left(\mu - \frac{b}{2} \le \bar{X} \le \mu + \frac{b}{2}\right) = \gamma$$

must hold. Unfortunately we cannot calculate the interval

$$\left[\mu - \frac{b}{2},\ \mu + \frac{b}{2}\right]$$

because, as already mentioned, we do not know $\mu$.

But what we do know is the $\bar{x}$ we have measured. If we imagine the probability density shifted from $\mu$ to $\bar{x}$, the width of the distribution does not change, and neither does the width of the hatched area (see Fig. 3). So if the above interval around $\mu$ covers the measured value $\bar{x}$ with probability $\gamma$, then the shifted confidence interval of the same total width $b$,

$$\left[\bar{x} - \frac{b}{2},\ \bar{x} + \frac{b}{2}\right],$$

must cover the unknown value $\mu$ with the same probability.

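So instead of the random variable $\bar{X} \sim \mathcal{N}(\mu, \sigma^2/n)$ we can consider the shifted random variable $M \sim \mathcal{N}(\bar{x}, \sigma^2/n)$. This means we can actually calculate our desired confidence interval. The following applies to the lower limit $\mu_\text{down}$ and the upper limit $\mu_\text{up}$:

$$P(M \le \mu_\text{down}) = \frac{1 - \gamma}{2}, \qquad P(M \le \mu_\text{up}) = 1 - \frac{1 - \gamma}{2}.$$

By inverting the cumulative distribution function $F_M$ of $M$ we get the two limits

$$\mu_\text{down} = F_M^{-1}\left(\frac{1 - \gamma}{2}\right) \quad\text{and}\quad \mu_\text{up} = F_M^{-1}\left(1 - \frac{1 - \gamma}{2}\right).$$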

The inverse cumulative distribution function (quantile function) of a normal distribution is called qnorm in R. In our example, we could have calculated the confidence interval as follows:

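    n <- 10            # sample size
    x_quer <- 500.73   # sample mean in mL
    sigma <- 2.0       # known standard deviation in mL
    gamma <- 0.95      # confidence level
    # lower and upper limit as quantiles of N(x_quer, sigma^2 / n)
    mu_down <- qnorm((1.0 - gamma) / 2.0, x_quer, sigma / sqrt(n))
    mu_up <- qnorm(1.0 - (1.0 - gamma) / 2.0, x_quer, sigma / sqrt(n))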

With that we get the 95% confidence interval $[499.5, 502.0]$ for $\mu$. So we are 95% sure that the mean filling quantity of all bottles (in mL) lies in the range $[499.5, 502.0]$. In fact, the "measured values" in the example were "rolled" by the computer from a normal distribution and rounded to one decimal place.
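What "95% sure" means can be illustrated with a simulation, which is possible here precisely because the computer can roll as many samples as we like. The sketch below assumes a true mean of `mu <- 500` (a made-up value, not the one used for the example data) and counts how often the interval computed as above covers it:

```r
set.seed(123)                 # only for reproducibility

mu <- 500                     # assumed true mean, normally unknown
sigma <- 2.0                  # known standard deviation
n <- 10                       # sample size
gamma <- 0.95                 # confidence level

# For each of 10000 simulated samples: does the interval cover mu?
covers <- replicate(10000, {
  x_quer <- mean(rnorm(n, mu, sigma))
  mu_down <- qnorm((1.0 - gamma) / 2.0, x_quer, sigma / sqrt(n))
  mu_up <- qnorm(1.0 - (1.0 - gamma) / 2.0, x_quer, sigma / sqrt(n))
  (mu_down <= mu) && (mu <= mu_up)
})

coverage <- mean(covers)
coverage                      # should be close to gamma
```

Each individual interval either covers `mu` or it does not; the confidence level describes the long-run fraction of intervals that do.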

The above calculation somewhat hides what the width of the confidence interval depends on. So we will now make the calculation a little more explicit. First we standardize the random variable $\bar{X}$ (the sample mean) by means of

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

and obtain the random variable $Z \sim \mathcal{N}(0, 1)$, which is also normally distributed (standard normal distribution).

As Fig. 4 shows, our standardized interval $[-z, z]$ is now symmetric about the origin. The cumulative distribution function $\Phi$ of the standard normal distribution gives the area from $-\infty$ up to a given point. So we need to solve, for example, the equation

$$\Phi(-z) = \frac{1 - \gamma}{2}$$

by means of the inverse cumulative distribution function:

$$z = -\Phi^{-1}\left(\frac{1 - \gamma}{2}\right).$$

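In the second step we undo the standardization using

$$\bar{X} = \mu + Z \, \frac{\sigma}{\sqrt{n}}$$

and get the $\gamma$-confidence interval for $\mu$:

$$\left[\bar{x} - z \, \frac{\sigma}{\sqrt{n}},\ \bar{x} + z \, \frac{\sigma}{\sqrt{n}}\right].$$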

So the width of the interval is

$$b = 2 z \, \frac{\sigma}{\sqrt{n}}.$$

If we want to increase our confidence level $\gamma$, the limits $\pm z$ move outwards, and our interval becomes wider. With 100% certainty, only the interval $(-\infty, \infty)$ covers the value $\mu$ we are looking for.

Furthermore, the width depends on the factor $1/\sqrt{n}$. So if we want to be more precise and, for example, halve our interval width, then we have to quadruple the sample size $n$!
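The dependence on the sample size can be turned around to plan a measurement. The following sketch uses the width of the z-interval, 2 z sigma / sqrt(n), with the values of our example; the helper `width()` and the target width `b_target` of 1 mL are assumptions introduced only for this illustration:

```r
sigma <- 2.0                          # known standard deviation in mL
gamma <- 0.95                         # confidence level
z <- -qnorm((1.0 - gamma) / 2.0)

# Total width of the z-confidence interval for sample size n
width <- function(n) 2 * z * sigma / sqrt(n)

width(10)                             # width for our ten bottles
width(40)                             # four times the sample: half the width

# Smallest sample size for which the interval is at most b_target wide
b_target <- 1.0                       # assumed target width in mL
n_needed <- ceiling((2 * z * sigma / b_target)^2)
n_needed
```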

To stick with our example:

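    n <- 10            # sample size
    x_quer <- 500.73   # sample mean in mL
    sigma <- 2.0       # known standard deviation in mL
    gamma <- 0.95      # confidence level
    z <- -qnorm((1.0 - gamma) / 2.0)   # quantile of the standard normal distribution
    mu_down <- x_quer - z * sigma / sqrt(n)
    mu_up <- x_quer + z * sigma / sqrt(n)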

We get $z \approx 1.96$, and with it the same confidence interval as above.

If we were given the individual values instead of the mean, we could write:

    x <- c(500.1, 500.5, 501.5, 502.7, 499.6, 501.2, 498.2, 501.9, 503.8, 497.8)
    n <- length(x)
    x_quer <- mean(x)
    sigma <- 2.0
    # ...


The t-confidence interval

If, as is usually the case, we do not know the standard deviation $\sigma$, then $X$ and $\bar{X}$ are still normally distributed, but we can no longer use $\sigma$ in our calculation.

One idea is to estimate the standard deviation $\sigma$ by the empirical standard deviation

$$s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

of the sample. For our example we get $s \approx 1.89$ (in mL). In R we do this with the command:

    x <- c(500.1, 500.5, 501.5, 502.7, 499.6, 501.2, 498.2, 501.9, 503.8, 497.8)
    s <- sd(x)  # empirical standard deviation (divides by n - 1)

So we could, at least approximately, use $s$ in place of $\sigma$. If $n$ is large enough, this approximation can indeed be used.
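How well this approximation works for a small sample can be tried out by simulation. The sketch below uses the z-interval but plugs in the empirical standard deviation of each sample instead of the true one; the true mean `mu <- 500` is again an assumed value, and `coverage_z_with_s` is a name made up for this illustration:

```r
set.seed(7)                   # only for reproducibility

mu <- 500                     # assumed true mean
sigma <- 2.0                  # true standard deviation, used only for simulating
n <- 10                       # small sample
gamma <- 0.95
z <- -qnorm((1.0 - gamma) / 2.0)

# z-interval with s in place of sigma: does it cover mu?
covers <- replicate(20000, {
  x <- rnorm(n, mu, sigma)
  abs(mean(x) - mu) <= z * sd(x) / sqrt(n)
})

coverage_z_with_s <- mean(covers)
coverage_z_with_s             # noticeably below gamma for small n
```

For a sample this small, the intervals cover the true mean less often than the nominal confidence level promises, so simply swapping in the empirical standard deviation is too optimistic.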

For small samples one can show that the "standardization"

$$T = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$

leads to a random variable $T$ that follows Student's t-distribution with $n - 1$ degrees of freedom:

$$T \sim t_{n-1}.$$

Fig. 5 compares two t-distributions with 1 and 5 degrees of freedom with a standard normal distribution. You can see that as the sample size $n$ grows, the t-distribution becomes more and more similar to the normal distribution; the central limit theorem still applies. For smaller $n$ the t-distribution has heavier tails than the normal distribution. For small samples, our confidence intervals therefore become wider.

Apart from using the t-distribution (qt) instead of the standard normal distribution (qnorm), the calculation is identical to the detailed variant above. For our example we get with

    x <- c(500.1, 500.5, 501.5, 502.7, 499.6, 501.2, 498.2, 501.9, 503.8, 497.8)
    n <- length(x)
    x_quer <- mean(x)
    s <- sd(x)         # empirical standard deviation
    gamma <- 0.95      # confidence level
    t <- -qt((1.0 - gamma) / 2.0, n - 1)   # t quantile with n - 1 degrees of freedom
    mu_down <- x_quer - t * s / sqrt(n)
    mu_up <- x_quer + t * s / sqrt(n)

as the 95% confidence interval for $\mu$ (in mL): $[499.4, 502.1]$. This is 0.1 mL larger in both directions than with known $\sigma$. The reason the change is so small is that our measured $s \approx 1.89$ slightly underestimates $\sigma = 2.0$, while $t \approx 2.26$ is a little bigger than the $z \approx 1.96$ from above.
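As a cross-check, the same interval can be obtained from R's built-in `t.test()` function, which returns the t-confidence interval for the mean in its `conf.int` component. This is only an alternative route to the same numbers, not the calculation used above:

```r
x <- c(500.1, 500.5, 501.5, 502.7, 499.6, 501.2, 498.2, 501.9, 503.8, 497.8)

# One-sample t-test; we only use the confidence interval it reports
ci <- t.test(x, conf.level = 0.95)$conf.int

round(as.numeric(ci), 1)      # lower and upper limit in mL
```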

To come back to the question of what a small sample is: the following table shows by which factor the t-interval is wider than the z-interval (other things being equal; the width also depends on the empirical standard deviation $s$).

Usually you hear that samples with $n < \ldots$ are small. Personally, I would rather put the value at $n < \ldots$.
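The factor in question is simply the ratio of the two quantiles and can be recomputed for any sample size. In the sketch below, the list of sample sizes and the names `t_n` and `factor_tab` are chosen freely for illustration:

```r
gamma <- 0.95
n <- c(2, 3, 5, 10, 20, 30, 50, 100)      # sample sizes chosen for illustration

z <- -qnorm((1.0 - gamma) / 2.0)          # quantile of the standard normal distribution
t_n <- -qt((1.0 - gamma) / 2.0, n - 1)    # t quantiles with n - 1 degrees of freedom

# Factor by which the t-interval is wider than the z-interval (for s equal to sigma)
factor_tab <- data.frame(n = n, t = round(t_n, 3), factor = round(t_n / z, 3))
print(factor_tab)
```

The factor is always greater than 1 and shrinks towards 1 as the sample grows, so where to draw the line for "small" depends on how much extra width one is willing to ignore.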

Measurement accuracy

So far we have assumed that all numbers are exact, i.e. that there is no measurement uncertainty. If we can measure filling quantities with sub-mL accuracy, we get a certain confidence interval. If we pour the contents of each bottle into a jar with a 1 L mark, the best we can do is say that each bottle held about 0.5 L, and we get the same confidence interval again. Obviously absurd! The more imprecisely we measure, the wider our confidence interval has to be.

If we can measure with an accuracy of $\pm\sigma_u$ ($\sigma_u$ is therefore our measurement uncertainty), then we can generally model the measurement error by the random variable

$$U \sim \mathcal{N}(0, \sigma_u^2).$$

In our example, $\sigma$ describes the inaccuracy of the filling line, while the parameter $\sigma_u$ describes the inaccuracy of the level measurement of a bottle. These two things are obviously independent of each other. So when we measure the level of a bottle, we're not just drawing from the distribution $\mathcal{N}(\mu, \sigma^2)$, but from the distribution

$$\mathcal{N}(\mu, \sigma^2 + \sigma_u^2),$$

because the sum of two independent, normally distributed random variables is normally distributed again, but with a greater width (which corresponds to our greater uncertainty).
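That the widths add "in quadrature" can be checked with a quick simulation. The values below are assumptions for illustration only (a true mean of 500 mL and a deliberately large measurement uncertainty of 1.5 mL, so that the effect is clearly visible):

```r
set.seed(99)                  # only for reproducibility

mu <- 500                     # assumed mean filling quantity in mL
sigma <- 2.0                  # spread of the filling line
sigma_u <- 1.5                # assumed measurement uncertainty

fill <- rnorm(100000, mu, sigma)      # true filling quantities
error <- rnorm(100000, 0, sigma_u)    # independent measurement errors
reading <- fill + error               # what we actually read off

sd(reading)                           # empirical spread of the readings
sqrt(sigma^2 + sigma_u^2)             # theoretical value: 2.5
```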

For the mean of samples of size $n$ we then have:

$$\bar{X} \sim \mathcal{N}\left(\mu, \frac{\sigma^2 + \sigma_u^2}{n}\right).$$

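Analogous to the above, we then get the $\gamma$-confidence interval for $\mu$:

$$\left[\bar{x} - z \sqrt{\frac{\sigma^2 + \sigma_u^2}{n}},\ \bar{x} + z \sqrt{\frac{\sigma^2 + \sigma_u^2}{n}}\right].$$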

So the R code for our example will be:

    x <- c(500.1, 500.5, 501.5, 502.7, 499.6, 501.2, 498.2, 501.9, 503.8, 497.8)
    n <- length(x)
    x_quer <- mean(x)
    sigma <- 2.0       # spread of the filling line in mL
    sigma_u <- 0.05    # measurement uncertainty in mL
    sigma_ges <- sqrt(sigma^2 + sigma_u^2)   # combined (total) standard deviation
    gamma <- 0.95      # confidence level
    z <- -qnorm((1.0 - gamma) / 2.0)
    mu_down <- x_quer - z * sigma_ges / sqrt(n)
    mu_up <- x_quer + z * sigma_ges / sqrt(n)

The following table shows how the 95% confidence intervals change with decreasing measurement accuracy (all values in mL):

If the measurement uncertainty $\sigma_u$ is smaller than the (empirical) spread $\sigma$ (or $s$) of our measured variable, practically nothing happens. But as soon as $\sigma_u$ is of the same order of magnitude as $\sigma$ or even greater, the confidence interval becomes significantly wider.
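This behaviour can be reproduced by repeating the calculation for a range of measurement uncertainties. In the sketch below, the list of `sigma_u` values and the name `ci_tab` are chosen freely for illustration; the other numbers are those of our example:

```r
x_quer <- 500.73              # sample mean in mL
n <- 10                       # sample size
sigma <- 2.0                  # spread of the filling line in mL
gamma <- 0.95                 # confidence level
z <- -qnorm((1.0 - gamma) / 2.0)

sigma_u <- c(0, 0.05, 0.5, 1, 2, 5)       # assumed measurement uncertainties in mL
sigma_ges <- sqrt(sigma^2 + sigma_u^2)    # combined standard deviations
half_width <- z * sigma_ges / sqrt(n)

ci_tab <- data.frame(
  sigma_u = sigma_u,
  mu_down = round(x_quer - half_width, 2),
  mu_up = round(x_quer + half_width, 2)
)
print(ci_tab)
```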


Even after calculating the confidence interval, we still do not know the mean $\mu$ of our measured variable. But we can sensibly narrow down its range of values. However, our confidence interval covers the value of $\mu$ only with a certain probability. So it may well be that the value lies outside of our interval.

Since we had to make some assumptions above that cannot really be checked in practice, the confidence level $\gamma$ is more of an upper bound for our interval.

Author: Herr Fessa | Categories: mathematics, stochastics | Tags: expectation value, confidence interval, mean value, normal distribution, t-distribution, probability