Tests of Significance
Once sample data has been gathered through an observational study or experiment, statistical inference allows analysts to assess evidence in favor of some claim about the population from which the sample has been drawn. The methods of inference used to support or reject claims based on sample data are known as tests of significance.
Every test of significance begins with a null hypothesis H0. H0 represents a theory that has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug. We would write H0: there is no difference between the two drugs on average.
The alternative hypothesis, Ha, is a statement of what a statistical hypothesis test is set up to establish. For example, in a clinical trial of a new drug, the alternative hypothesis might be that the new drug has a different effect, on average, compared to that of the current drug. We would write Ha: the two drugs have different effects, on average. The alternative hypothesis might also be that the new drug is better, on average, than the current drug. In this case we would write Ha: the new drug is better than the current drug, on average.
The final conclusion once the test has been carried out is always given in terms of the null hypothesis. We either “reject H0 in favor of Ha” or “do not reject H0”; we never conclude “reject Ha”, or even “accept Ha”.
If we conclude “do not reject H0”, this does not necessarily mean that the null hypothesis is true; it only suggests that there is not sufficient evidence against H0 in favor of Ha. Rejecting the null hypothesis, in turn, suggests that the alternative hypothesis may be true.
Hypotheses are always stated in terms of a population parameter, such as the mean µ. An alternative hypothesis may be one-sided or two-sided. A one-sided hypothesis claims that a parameter is either larger or smaller than the value given by the null hypothesis. A two-sided hypothesis claims that a parameter is simply not equal to the value given by the null hypothesis; the direction does not matter.
Hypotheses for a one-sided test for a population mean take the following form:
H0: µ = k
Ha: µ > k
or
H0: µ = k
Ha: µ < k.
Hypotheses for a two-sided test for a population mean take the following form:
H0: µ = k
Ha: µ ≠ k.
The small-sample approach to testing a proportion described in this lesson is appropriate as long as the sample includes at least one success and one failure. The key steps are:
- Formulate the hypotheses to be tested. This means stating the null hypothesis and the alternative hypothesis.
- Determine the sampling distribution of the proportion. If the sample proportion is the outcome of a binomial experiment, the sampling distribution will be binomial. If it is the outcome of a hypergeometric experiment, the sampling distribution will be hypergeometric.
- Specify the significance level. (Researchers often set the significance level equal to 0.05 or 0.01, although other values may be used.)
- Based on the hypotheses, the sampling distribution, and the significance level, define the region of acceptance.
- Test the null hypothesis. If the sample proportion falls within the region of acceptance, do not reject the null hypothesis; otherwise, reject the null hypothesis.
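The steps above can be sketched in Python for a lower-tailed test. This is a minimal illustration rather than a definitive implementation: the function name `lower_tail_rejection_region` and the coin-toss distribution are hypothetical, chosen only for illustration. Given the sampling distribution under the null hypothesis, the function adds outcomes to the region of rejection, starting from the smallest, for as long as their total probability stays at or below the significance level.

```python
def lower_tail_rejection_region(pmf, alpha):
    """Return (rejected_outcomes, attained_level) for a lower-tailed test.

    pmf   -- dict mapping each possible count to its probability under H0
    alpha -- the nominal significance level
    """
    rejected = []
    cumulative = 0.0
    for outcome in sorted(pmf):
        # Stop as soon as adding the next outcome would exceed alpha.
        if cumulative + pmf[outcome] > alpha:
            break
        cumulative += pmf[outcome]
        rejected.append(outcome)
    return rejected, cumulative


# Hypothetical null distribution: number of heads in 3 tosses of a fair coin.
coin_pmf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
region, level = lower_tail_rejection_region(coin_pmf, alpha=0.15)
print("Region of rejection:", region)
print("Attained significance level:", level)
```

Every outcome that is not in the returned list belongs to the region of acceptance.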
The following examples illustrate how to test hypotheses with small samples. The first example involves a binomial experiment, and the second a hypergeometric experiment.
Example 1: Sampling With Replacement
Suppose an urn contains 30 marbles. Some marbles are red, and the rest are green. A researcher hypothesizes that the urn contains 15 or more red marbles. The researcher randomly samples five marbles, with replacement, from the urn. Two of the selected marbles are red, and three are green. Based on the sample results, should the researcher reject the null hypothesis? Use a significance level of 0.20.
Solution: There are five steps in conducting a hypothesis test, as described in the previous section. We work through each of the five steps below:
(I) Formulate hypotheses: The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: P >= 0.50
Alternative hypothesis: P < 0.50
Note that these hypotheses constitute a one-tailed test. The null hypothesis will be rejected only if the sample proportion is too small.
(II) Determine sampling distribution: Since we sampled with replacement, the sample proportion can be considered an outcome of a binomial experiment. And based on the null hypothesis, we assume that at least 15 of 30 marbles are red. Thus, the true population proportion is assumed to be 15/30 or 0.50.
Given those inputs (a binomial distribution where the true population proportion is equal to 0.50), the sampling distribution of the proportion can be determined. It appears in the table below. (Previously, we showed how to compute binomial probabilities that form the body of the table.)
Number of red marbles in sample | Sample proportion | Binomial probability | Cumulative probability |
0 | 0.0 | 0.03125 | 0.03125 |
1 | 0.2 | 0.15625 | 0.1875 |
2 | 0.4 | 0.3125 | 0.5 |
3 | 0.6 | 0.3125 | 0.8125 |
4 | 0.8 | 0.15625 | 0.96875 |
5 | 1.0 | 0.03125 | 1.00 |
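As a rough check, the table can be reproduced with a short Python sketch that uses only the standard library. The values n = 5 and P = 0.50 come from the example; the variable names are illustrative.

```python
from math import comb

n = 5      # sample size
p = 0.50   # population proportion assumed under the null hypothesis

rows = []
cumulative = 0.0
for k in range(n + 1):
    # Binomial probability of exactly k red marbles in n draws with replacement.
    prob = comb(n, k) * p**k * (1 - p) ** (n - k)
    cumulative += prob
    rows.append((k, k / n, prob, cumulative))
    print(f"{k} | {k / n:.1f} | {prob:.5f} | {cumulative:.5f} |")
```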
(III) Specify significance level: The significance level was set at 0.20. (This means that the probability of making a Type I error is 0.20, assuming that the null hypothesis is true.)
(IV) Define the region of acceptance: From the sampling distribution (see above table), we see that it is not possible to define a region of acceptance for which the significance level is exactly 0.20.
However, we can define a region of acceptance for which the significance level would be no more than 0.20. From the table, we see that if the true population proportion is equal to 0.50, we would be unlikely to pick only 0 or 1 red marbles in our sample of 5 marbles. The probability of selecting 0 or 1 red marbles would be 0.1875. Therefore, if we let the significance level equal 0.1875, we can define the region of rejection as any sampled outcome that includes only 0 or 1 red marbles (i.e., a sampled proportion equal to 0 or 0.20). We can define the region of acceptance as any sampled outcome that includes at least 2 red marbles. This is equivalent to a sampled proportion that is greater than or equal to 0.40.
(V) Test the null hypothesis: Since the sample proportion (0.40) is within the region of acceptance, we cannot reject the null hypothesis.
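Steps IV and V of this example can be checked with a short Python sketch. It is an illustration under the example's inputs (n = 5, P = 0.50, significance level 0.20, 2 red marbles observed); the names `critical` and `attained` are assumptions introduced here, not terms from the lesson.

```python
from math import comb

n, p0, alpha = 5, 0.50, 0.20
observed_red = 2

# Sampling distribution of the number of red marbles under H0 (binomial).
pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]

# Grow the region of rejection from 0 red marbles upward while the
# total probability stays at or below alpha.
critical = -1      # largest count in the region of rejection (-1 means empty)
attained = 0.0     # significance level actually achieved
for k in range(n + 1):
    if attained + pmf[k] > alpha:
        break
    attained += pmf[k]
    critical = k

reject_h0 = observed_red <= critical
print(f"Reject H0 when the number of red marbles is <= {critical}")
print(f"Attained significance level: {attained}")
print("Decision:", "reject H0" if reject_h0 else "do not reject H0")
```

With these inputs the region of rejection is 0 or 1 red marbles, so the observed count of 2 falls in the region of acceptance, in agreement with the conclusion above.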
Example 2: Sampling Without Replacement
The Acme Advertising company has 25 clients. Account executives at Acme claim that 80 percent of these clients are very satisfied with the service they receive. To test that claim, Acme’s CEO commissions a survey of 10 clients. Survey participants are randomly sampled, without replacement, from the client population. Six of the ten sampled clients (i.e., 60 percent) say that they are very satisfied. Based on the sample results, should the CEO reject the hypothesis that 80 percent of Acme’s clients are very satisfied? Use a significance level of 0.10.
Solution: There are five steps in conducting a hypothesis test, as described in the previous section. We work through each of the five steps below:
(I) Formulate hypotheses: The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: P >= 0.80
Alternative hypothesis: P < 0.80
Note that these hypotheses constitute a one-tailed test. The null hypothesis will be rejected only if the sample proportion is too small.
(II) Determine sampling distribution: Since we sampled without replacement, the sample proportion can be considered an outcome of a hypergeometric experiment. And based on the null hypothesis, we assume that at least 80 percent of the 25 clients (i.e. 20 clients) are very satisfied.
Given those inputs (a hypergeometric distribution where 20 of 25 clients are very satisfied), the sampling distribution of the proportion can be determined. It appears in the table below. (Previously, we showed how to compute hypergeometric probabilities that form the body of the table.)
Number of satisfied clients in sample | Sample proportion | Probability | Cumulative probability |
4 or fewer | 0.4 or less | 0.00 | 0.00 |
5 | 0.5 | 0.00474 | 0.00474 |
6 | 0.6 | 0.05929 | 0.06403 |
7 | 0.7 | 0.23715 | 0.30119 |
8 | 0.8 | 0.38538 | 0.68656 |
9 | 0.9 | 0.25692 | 0.94348 |
10 | 1.0 | 0.05652 | 1.00 |
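The hypergeometric probabilities in the table can be reproduced with a short Python sketch using only the standard library. The inputs (25 clients, 20 very satisfied under the null hypothesis, a sample of 10) come from the example; the variable names are illustrative.

```python
from math import comb

N = 25   # clients in the population
K = 20   # very satisfied clients assumed under the null hypothesis
n = 10   # sample size, drawn without replacement

# Only 5 clients are not very satisfied, so a sample of 10 must
# contain at least 5 very satisfied clients.
lowest = max(0, n - (N - K))
highest = min(n, K)

rows = []
cumulative = 0.0
for k in range(lowest, highest + 1):
    # Hypergeometric probability of exactly k very satisfied clients.
    prob = comb(K, k) * comb(N - K, n - k) / comb(N, n)
    cumulative += prob
    rows.append((k, k / n, prob, cumulative))
    print(f"{k} | {k / n:.1f} | {prob:.5f} | {cumulative:.5f} |")
```

Counts of 4 or fewer are impossible with these inputs, which is why their probability in the table is 0.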
(III) Specify significance level: The significance level was set at 0.10. (This means that the probability of making a Type I error is 0.10, assuming that the null hypothesis is true.)
(IV) Define the region of acceptance: From the sampling distribution (see above table), we see that it is not possible to define a region of acceptance for which the significance level is exactly 0.10.
However, we can define a region of acceptance for which the significance level would be no more than 0.10. From the table, we see that if the true proportion of very satisfied clients is equal to 0.80, we would be very unlikely to have fewer than 7 very satisfied clients in our sample. The probability of having 6 or fewer very satisfied clients in the sample would be 0.064. Therefore, if we let the significance level equal 0.064, we can define the region of rejection as any sampled outcome that includes 6 or fewer very satisfied clients. We can define the region of acceptance as any sampled outcome that includes 7 or more very satisfied clients. This is equivalent to a sample proportion that is greater than or equal to 0.70.
(V) Test the null hypothesis: Since the sample proportion (0.60) is outside the region of acceptance, we reject the null hypothesis at the 0.064 level of significance.
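The decision in this example can be verified with a short Python sketch. It is a hedged illustration under the example's inputs (25 clients, 20 very satisfied under the null hypothesis, a sample of 10, significance level 0.10, 6 very satisfied clients observed); the names `rejection` and `attained` are assumptions introduced here.

```python
from itertools import accumulate
from math import comb

N, K, n = 25, 20, 10   # population, satisfied clients under H0, sample size
alpha = 0.10
observed = 6

# Possible counts of very satisfied clients and their probabilities under H0.
counts = list(range(max(0, n - (N - K)), min(n, K) + 1))
pmf = [comb(K, k) * comb(N - K, n - k) / comb(N, n) for k in counts]
cdf = list(accumulate(pmf))

# Lower-tail region of rejection: counts whose cumulative probability
# does not exceed alpha. The cdf is increasing, so this is a prefix.
rejection = [k for k, c in zip(counts, cdf) if c <= alpha]
attained = cdf[len(rejection) - 1] if rejection else 0.0

reject_h0 = observed in rejection
print("Region of rejection:", rejection)
print(f"Attained significance level: {attained:.5f}")
print("Decision:", "reject H0" if reject_h0 else "do not reject H0")
```

Counts of 4 or fewer cannot occur, so the list `[5, 6]` describes the same region of rejection as "6 or fewer very satisfied clients".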