1. Concept Behind Inferential Statistics
To begin with, there are certain fundamental concepts that span this section and are used throughout statistics. A sample is a subgroup of a larger group, the population, and from the sample we wish to learn about the population. For example, suppose a survey is taken of 200 people living in Bangkok asking their opinion of the underground train, and the results are published and commented on. Is anyone truly concerned with the specific 200 people in the survey? If those 200 people do not like the underground train, does it really matter? No. In fact, well over 200 people in Bangkok have never even taken the underground train. What people really want to learn from the survey is the general opinion within Bangkok about the underground train. If all 200 people surveyed disliked the train, this is of concern only because it leads us to believe that the general populace of Bangkok dislikes the train, and that perhaps only a small minority likes it. In our minds the sample is almost immediately extrapolated to the population at large, in this case the people living in Bangkok. Inferential statistics are used to learn about the population from the sample.
Two common techniques that use a sample to learn about a population, going beyond descriptive statistics, are hypothesis testing and confidence intervals. Hypothesis testing is used to test a theory. A confidence interval is used to obtain a range of values within which the population mean, $\mu$, might plausibly lie. Technically, from a frequentist viewpoint, the population mean is either within the interval or it is not.
1.1. Hypothesis Testing
In general, within hypothesis testing we wish to test a theory, a belief, or simply something of interest. It is desired to test whether a quantity concerning the population, called a parameter, is not equal to, greater than, or less than some value.
Typically, the population mean, $\mu$, or proportion, $\pi$, is the parameter, but not always.
In hypothesis testing the theory is turned into what is called a null hypothesis, denoted $H_0$, and an alternative hypothesis, denoted $H_1$ or $H_A$. One may want to compare a single group/sample to a specific value, say $\mu_0$. Often one may instead want to compare two groups/samples to each other, such as comparing the average salary of men, say $\mu_1$, to the average salary of women, say $\mu_2$.
The alternative hypothesis is what is desired to be proven or shown true, and the null hypothesis is its opposite. Examples: if it is desired to show that the …
- average income in Bangkok is greater than 30,000 Baht/month:
- $H_0:$ $\mu \leq 30,000$ and $H_A:$ $\mu > 30,000$.
- average income in Bangkok of men is greater than that of women:
- $H_0:$ $\mu_{men} \leq \mu_{women}$ and $H_A:$ $\mu_{men} > \mu_{women}$.
- percent of women in Hong Kong is less than 50%:
- $H_0:$ $\pi \geq 50\%$ and $H_A:$ $\pi < 50\%$.
- etc.
The common one- and two-sample tests, their null and alternative hypotheses, and the corresponding p-values based on a standard normal test statistic $z$ are summarized below:
\[
\begin{array}{l|c|c|c}
 & H_0 & H_A & \text{p-value} \\\hline
\mu \text{ from one} & \mu = \mu_0 & \mu \neq \mu_0 & 2\times P(Z > |z|) \\
\text{group/sample} & \mu \geq \mu_0 & \mu < \mu_0 & P(Z < z) \\
 & \mu \leq \mu_0 & \mu > \mu_0 & P(Z > z) \\\hline
\pi \text{ from one} & \pi = \pi_0 & \pi \neq \pi_0 & 2\times P(Z > |z|) \\
\text{group/sample} & \pi \geq \pi_0 & \pi < \pi_0 & P(Z < z) \\
 & \pi \leq \pi_0 & \pi > \pi_0 & P(Z > z) \\\hline
\mu \text{ from two} & \mu_1 = \mu_2 & \mu_1 \neq \mu_2 & 2\times P(Z > |z|) \\
\text{groups/samples} & \mu_1 \geq \mu_2 & \mu_1 < \mu_2 & P(Z < z) \\
 & \mu_1 \leq \mu_2 & \mu_1 > \mu_2 & P(Z > z) \\\hline
\pi \text{ from two} & \pi_1 = \pi_2 & \pi_1 \neq \pi_2 & 2\times P(Z > |z|) \\
\text{groups/samples} & \pi_1 \geq \pi_2 & \pi_1 < \pi_2 & P(Z < z) \\
 & \pi_1 \leq \pi_2 & \pi_1 > \pi_2 & P(Z > z) \\\hline
\end{array}
\]
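As a minimal sketch of the p-value column, the three tail probabilities can be computed from a standard normal distribution with scipy; the test statistic $z = 1.8$ below is a made-up value for illustration only.

```python
from scipy.stats import norm

z = 1.8  # hypothetical observed test statistic

# Two-sided alternative: 2 * P(Z > |z|)
p_two_sided = 2 * norm.sf(abs(z))

# Left-tailed alternative: P(Z < z)
p_left = norm.cdf(z)

# Right-tailed alternative: P(Z > z)
p_right = norm.sf(z)

print(f"two-sided: {p_two_sided:.4f}, left: {p_left:.4f}, right: {p_right:.4f}")
```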
In hypothesis testing a decision is made by using what is known as a {\it p-value}. The p-value is the probability of observing what was observed, or something more extreme, assuming the null hypothesis is true. If this probability is very small, the researcher rejects the null hypothesis, because we trust the data over the null hypothesis. Typically p-values less than 0.1, 0.05, or 0.01 are considered too small to be due to random chance alone, and the null hypothesis is rejected. The value at which the null hypothesis will be rejected is called the {\it level of significance}, denoted $\alpha$. For large data sets a significance level of 0.01 is often used, while in the classroom setting $\alpha=0.05$ is typical.
\defm{ Important: \\ If p-value $< \alpha$ then reject $H_0$ \\ If p-value $\ge \alpha$ then fail to reject $H_0$} Regardless of the test chosen and the test statistic used, the steps in hypothesis testing are generally the same. This book covers only the p-value approach to hypothesis testing; other books also cover a rejection-region approach. The rejection-region approach is useful when a p-value cannot be calculated, for example when the researcher does not have access to a computer, as on exams. In this day and age a working researcher will almost certainly have access to a computer, and almost all, if not all, statistical software calculates a p-value for hypothesis testing. For this reason only the p-value approach is covered. The steps are listed below, with a worked sketch in code after the list.\\
- Determine the null hypothesis, $H_{0}$, and the alternative hypothesis, $H_{A}$.
- Decide on the appropriate level of significance, $\alpha$.
- Determine the sample size and sampling design to use.
  - The tests in this chapter are appropriate when the data come from a simple random sample.
  - The tests in this chapter and other statistical tests are {\bf not} appropriate when the data come from a convenience or other type of non-probability sample.
- Determine the appropriate test statistic given the data and sampling design.
- Collect the data and calculate the appropriate test statistic.
- Calculate the p-value for the $H_{0}$ and $H_{A}$ combination.
- Make a decision whether to reject $H_{0}$ or fail to reject $H_{0}$ by comparing the p-value to $\alpha$.
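The sketch below walks through these steps for the earlier example, $H_0: \mu \leq 30{,}000$ versus $H_A: \mu > 30{,}000$ Baht/month, using a one-sample t test as one reasonable choice of test statistic. The income data are simulated here purely for illustration; a real analysis would use a simple random sample from the population of interest.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Steps 1-3: hypotheses, level of significance, sample size and design.
alpha = 0.05
n = 200
incomes = rng.normal(loc=31_000, scale=8_000, size=n)  # hypothetical SRS

# Steps 4-6: compute the test statistic and the p-value for the
# one-sided alternative H_A: mu > 30,000.
t_stat, p_value = stats.ttest_1samp(incomes, popmean=30_000,
                                    alternative="greater")

# Step 7: compare the p-value to alpha and make a decision.
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject H_0")
else:
    print(f"p = {p_value:.4f} >= {alpha}: fail to reject H_0")
```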
1.2. Confidence Intervals
In general, when creating what is called a confidence interval, we wish to obtain a range of plausible values for a quantity concerning the population, a parameter. Typically the population mean, $\mu$, or proportion, $\pi$, is the parameter, but not always. It is also often desired to obtain a plausible range for the difference between two groups/samples, such as when comparing the average salary of men, say $\mu_1$, to the average salary of women, say $\mu_2$. Under what is known as a Bayesian approach, a $(1-\alpha) \times 100\%$ confidence interval is interpreted as having probability $(1-\alpha)$ of containing the parameter of interest, and this is often the way a confidence interval is explained; Bayesians consider the parameter of interest a random variable. The author is a frequentist and considers the parameter to be an unknown constant. Under the frequentist approach, $(1-\alpha) \times 100\%$ is the percentage of confidence intervals that are expected to contain the true value of the parameter, assuming an infinite number of simple random samples of the same size. Of course, in practice only a single sample is taken. The confidence interval is thus often considered the range of plausible values for the parameter; what the parameter actually is remains unknown, and it may or may not be within the interval.
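As a minimal sketch, assuming hypothetical values for the mean, standard deviation, and sample size, the snippet below computes a t-based 95% confidence interval from one sample and then simulates repeated sampling to illustrate the frequentist interpretation described above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mu_true, sigma, n, alpha = 30_000, 8_000, 200, 0.05  # made-up values

# One sample, one t-based interval: x_bar +/- t* s / sqrt(n).
sample = rng.normal(mu_true, sigma, size=n)
low, high = stats.t.interval(1 - alpha, df=n - 1,
                             loc=sample.mean(), scale=stats.sem(sample))
print(f"95% CI: ({low:,.0f}, {high:,.0f})")

# Frequentist interpretation: over many repeated simple random samples of
# the same size, roughly (1 - alpha) of such intervals contain mu_true.
reps, covered = 10_000, 0
for _ in range(reps):
    s = rng.normal(mu_true, sigma, size=n)
    lo, hi = stats.t.interval(1 - alpha, df=n - 1,
                              loc=s.mean(), scale=stats.sem(s))
    covered += lo <= mu_true <= hi
print(f"coverage over {reps:,} samples: {covered / reps:.3f}")  # about 0.95
```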