Reference: Conditions for inference on a proportion

When we want to carry out inference on one proportion (build a confidence interval or do a significance test), the accuracy of our methods depends on a few conditions. Before doing the actual computations of the interval or test, it's important to check whether these conditions have been met. Otherwise, the calculations and conclusions that follow aren't actually valid.

The conditions we need for inference on one proportion are:

Random: The data needs to come from a random sample or randomized experiment.
Normal: The sampling distribution of $\hat{p}$ needs to be approximately normal, which requires at least $10$ expected successes and $10$ expected failures.
Independent: Individual observations need to be independent. If sampling without replacement, our sample size shouldn't be more than $10\%$ of the population.

Let's look at each of these conditions a little more in-depth.

The random condition

Random samples give us unbiased data from a population. When samples aren't randomly selected, the data usually has some form of bias, so using data that wasn't randomly selected to make inferences about its population can be risky.

More specifically, sample proportions are unbiased estimators of their population proportion. For example, if we have a bag of candy where $50\%$ of the candies are orange and we take random samples from the bag, some will have more than $50\%$ orange and some will have less. But on average, the proportion of orange candies in each sample will equal $50\%$. We write this property as $\mu_{\hat{p}} = p$, which holds true as long as our sample is random.
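The candy example can be checked with a small simulation. This is only a sketch: the function name `mean_sample_proportion`, the sample size of 40, and the 5,000 repetitions are illustrative choices, and each candy is assumed to be orange with probability $p$ independently of the others (as if the bag were very large).

```python
import random

def mean_sample_proportion(p, n, num_samples, seed=0):
    """Average the sample proportion p-hat over many simulated random samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(num_samples):
        # Assumption: each candy is orange with probability p, independently.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        total += successes / n  # p-hat for this one sample
    return total / num_samples

# Bag with 50% orange candies, samples of 40 candies, 5,000 samples.
average_p_hat = mean_sample_proportion(p=0.5, n=40, num_samples=5000)
print(round(average_p_hat, 3))
```

Individual samples land above or below $0.5$, but the average of the sample proportions should come out very close to $0.5$.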

This property won't necessarily hold if our sample isn't randomly selected. Biased samples lead to inaccurate results, so they shouldn't be used to create confidence intervals or carry out significance tests.

The normal condition

The sampling distribution of $\hat{p}$ is approximately normal as long as the expected numbers of successes and failures are both at least $10$. This happens when our sample size $n$ is reasonably large. The proof of this is beyond the scope of AP Statistics, but our tutorial on sampling distributions can provide some intuition and verification that this condition indeed works.

So we need:

$$\begin{aligned} \text{expected successes: } & np \geq 10 \\ \text{expected failures: } & n(1 - p) \geq 10 \end{aligned}$$

If we are building a confidence interval, we don't have a value of $p$ to plug in, so we instead count the observed number of successes and failures in the sample data to make sure they are both at least $10$. If we are doing a significance test, we use our sample size $n$ and the hypothesized value of $p$ to calculate our expected numbers of successes and failures.
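The two versions of the normal condition check could be sketched as follows. The function names and the example numbers are hypothetical, chosen only to illustrate the rule described above; `p0` stands for the hypothesized value of $p$.

```python
def large_counts_for_interval(successes, failures, minimum=10):
    """Confidence interval: check the observed counts from the sample."""
    return successes >= minimum and failures >= minimum

def large_counts_for_test(n, p0, minimum=10):
    """Significance test: check expected counts under the hypothesized p0."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return expected_successes >= minimum and expected_failures >= minimum

# Interval: a sample with 12 successes and 38 failures.
print(large_counts_for_interval(successes=12, failures=38))  # True

# Test: n = 50 with hypothesized p0 = 0.1 gives only 5 expected successes.
print(large_counts_for_test(n=50, p0=0.1))  # False
```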

The independence condition

To use the formula for the standard deviation of $\hat{p}$, we need individual observations to be independent. When we are sampling without replacement, individual observations aren't technically independent since removing each item changes the population.

But the $10\%$ condition says that if we sample $10\%$ or less of the population, we can treat individual observations as independent since removing each observation doesn't significantly change the population as we sample. For instance, if our sample size is $n = 150$, there should be at least $N = 1500$ members in the population.
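As a quick sketch, the $10\%$ condition amounts to a one-line comparison. The function name `meets_ten_percent_condition` is made up for illustration.

```python
def meets_ten_percent_condition(n, population_size):
    """Return True when the sample is at most 10% of the population."""
    # Integer form of n <= 0.10 * N, which avoids floating-point rounding.
    return 10 * n <= population_size

print(meets_ten_percent_condition(n=150, population_size=1500))  # True
print(meets_ten_percent_condition(n=150, population_size=1000))  # False
```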

This allows us to use the formula for the standard deviation of $\hat{p}$:

$$\sigma_{\hat{p}} = \sqrt{\frac{p (1 - p)}{n}}$$

In a significance test, we use the sample size $n$ and the hypothesized value of $p$.

If we are building a confidence interval for $p$, we don't actually know what $p$ is, so we substitute $\hat{p}$ as an estimate for $p$. When we do this, we call it the standard error of $\hat{p}$ to distinguish it from the standard deviation.

So our formula for the standard error of $\hat{p}$ is:

$$\sigma_{\hat{p}} \approx \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}$$
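To make the difference between the two formulas concrete, here is a minimal sketch that computes both. The function names and the numbers (a hypothesized $p = 0.5$, and a sample of $100$ with $40$ successes) are assumptions made up for illustration.

```python
import math

def standard_deviation_p_hat(p, n):
    """Standard deviation of p-hat, using a known or hypothesized p."""
    return math.sqrt(p * (1 - p) / n)

def standard_error_p_hat(successes, n):
    """Standard error of p-hat, substituting the sample proportion for p."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Significance test: hypothesized p = 0.5 with n = 100.
sd = standard_deviation_p_hat(p=0.5, n=100)

# Confidence interval: 40 successes observed in a sample of 100.
se = standard_error_p_hat(successes=40, n=100)

print(round(sd, 4), round(se, 4))
```

The only difference between the two functions is which proportion goes under the square root: the hypothesized $p$ for a test, or the observed $\hat{p}$ for an interval.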
