8.1 Statistical Testing
In this chapter we will introduce hypothesis testing to enable us to answer questions such as the following:
- A company labels its product as weighing, on average, 10 oz. However, the last time we bought that product it only weighed 8.5 oz, so we suspect the company is cheating and puts less product in the package than it states on the label. We want to determine whether our suspicion is true or not.
- A new medical drug is supposed to work better in lowering a person's cholesterol level than currently existing drugs. From past experiments we know that the existing drugs lower cholesterol levels by 10 units, on average (I made up the numbers -:). We want to determine whether the new drug really is more effective than the existing ones.
We are interested in testing a particular hypothesis and we want to decide whether it is true or not. Moreover, we want to associate a probability with our decision so that we know how certain (or uncertain) we are that our decision is correct.
We will approach this problem like a trial. Recall that in a standard trial in front of a judge or jury there are two mutually exclusive hypotheses:
The defendant is either guilty or not guilty
During the trial evidence is collected and weighed either in favor of the defendant being guilty (the job of the DA) or in favor of the defendant being not guilty (the job of the Defense Lawyer). At the end of the trial the judge (or jury) decides between the two alternatives and either convicts the defendant (if guilt was proven beyond a reasonable doubt) or lets the defendant go (if there was sufficient doubt about the defendant's guilt).
Note that a defendant is "innocent until proven guilty". If the judge (or jury) decides a defendant is not guilty, that does not necessarily mean he/she is innocent. It simply means there was not enough evidence for a conviction.
In general, a statistical test involves four elements:
- Null Hypothesis (written as H0): The "tried and true situation", or "the status quo", or "innocent until proven guilty"
- Alternative Hypothesis (written as Ha): This is what you suspect (or hope) is really true, the new situation, "guilty" - in general it is the opposite of the null
- Test Statistic: Collecting evidence - in our case we usually select a random sample and compute some number based on the sample data
- Rejection Region: Do we reject the null hypothesis (and therefore accept the alternative), or do we declare our test inconclusive? And if we do decide to reject the null hypothesis, what is the probability that our decision is incorrect?
- Rejecting the null hypothesis when in fact it is true is called a Type I Error. The probability of exactly this error is what we will be computing in the procedure above when we reject the null hypothesis. It should, of course, be small so that we can be confident in our decision to reject the null hypothesis.
- Accepting the null hypothesis when in fact it is false is called a Type II Error. The probability of this type of error is not covered by our procedure, which is why we will never accept the null hypothesis; instead we declare our test inconclusive if necessary.
As an example, suppose a new drug is supposed to lower blood pressure by more than other drugs, which lower it by 10 mmHg on average, and a random sample of people given the new drug shows a mean decrease of 11.3 mmHg. Since the sample mean is 11.3, which is more than for other drugs, it looks like this sample mean supports the claim (because the mean from our sample is indeed bigger than 10). But, knowing that we can never be 100% certain, we must compute a probability and associate that with our conclusion, if indeed we want to make that conclusion.
In other words, we need to set up the four components of a statistical test. The population consists of the decreases in blood pressure in people who have been given the new drug.
- The Null Hypothesis is the "tried and true" assumption that all drugs are about the same and the new drug has about the same effect as all other drugs. Thus, the null hypothesis is that the average decrease in blood pressure (the population mean) is 10 mmHg, just as for all other drugs.
- The Alternative Hypothesis is what we hope to be true, i.e. that the new drug results in a larger decrease than the traditional drugs. Thus, the alternative hypothesis is that the average decrease in blood pressure (the population mean) is more than 10 mmHg.
- For our Test Statistic we collect evidence in the form of our random sample. We found that for this random sample the sample mean is 11.3 mmHg, the sample standard deviation is 5.1 mmHg, and the sample size N is 62. These figures are converted into a single number (as described in the next chapter). In this case the test statistic will turn out to be
z = 2.01
- Rejection Region: Finally we use the test statistic z = 2.01 to compute the probability p of committing an error in rejecting the null hypothesis, i.e. of rejecting it when it is in fact true (the Type I error). If that probability is small, we do indeed decide to reject the null hypothesis; otherwise we will declare the test to be inconclusive. In this case the probability will turn out to be (see next chapter):
p = 2*P(z > 2.01) = 0.044 or 4.4%
So, how do we compute the above numbers to arrive at this decision ... read the next section -:)
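As a small preview, the following Python sketch reproduces the two numbers quoted in the example. It assumes the usual large-sample formula z = (sample mean - null mean) / (sample standard deviation / sqrt(N)), which this section does not state, and the function names z_statistic and two_tailed_p are made up for illustration. The probability is computed from z rounded to two decimals, as one would do when reading a normal table, since that appears to be what gives the 0.044 quoted above.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, null_mean, sample_sd, n):
    """Distance between sample mean and null-hypothesis mean, in standard errors.

    Assumed formula (not given in this section): (mean - mu0) / (sd / sqrt(n)).
    """
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - null_mean) / standard_error


def two_tailed_p(z):
    """Return 2 * P(Z > |z|) for a standard normal variable Z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Figures from the blood pressure example above.
z = z_statistic(sample_mean=11.3, null_mean=10.0, sample_sd=5.1, n=62)

# Round z to two decimals first, mimicking a lookup in a normal table.
p = two_tailed_p(round(z, 2))

print(f"z = {z:.2f}")  # z = 2.01
print(f"p = {p:.3f}")  # p = 0.044
```

The factor 2 follows the formula p = 2*P(z > 2.01) used in the example; without the rounding step the probability comes out marginally larger.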