Understanding random variables is essential to a deeper understanding of statistics. This section introduces some key terminology related to random variables.
\defs{Definitions}
- {\bf Random variable:} An outcome or observation whose value is determined by a random process and therefore cannot be predicted with certainty in advance. Random variables are often denoted by capital letters, and the possible values that a random variable can take by lowercase letters.
- {\bf Categorical random variable:} A random variable that results in a categorical (non-numeric) response, such as gender (male or female) or opinion (strongly disagree, disagree, ..., or strongly agree).
- {\bf Dummy coding:} Turning a variable with two or more outcomes into one or more variables whose possible values are 0 and 1. Categorical variables are often dummy coded for analysis purposes. For example, males might be assigned the value 0 and females the value 1. If there are more than two categories, several dummy variables are needed to capture all the information. The dummy coded data can then be treated as a numerical random variable.
- {\bf Numerical random variable:} A random variable that results in a numerical response. Examples include the height, weight, age, or income of a randomly selected individual.
- {\bf Cumulative distribution function (c.d.f.):} The probability $P(X \leq x)$ that a random variable $X$ takes a value less than or equal to a real number $x$. The c.d.f. is often denoted by a capital $F$, i.e. $F(x)=P(X \leq x)$.
- {\bf Probability distribution function (p.d.f.):}
- For a discrete random variable, the p.d.f. is simply the probability that a particular value occurs, $f(x)=P(X=x)$.
- The probability distribution function has the following properties:
- $f(x_i) \geq 0, \quad \forall i.$
- $\sum_{\forall i} f(x_i)=1$
- For a continuous random variable, $P(X=x)=0$ for every $x$, so the discrete definition does not apply. Instead, the p.d.f. of a continuous random variable is a curve described by a function $f(x)$. The area under the curve over a given interval is the probability that the random variable falls within that interval.
- The probability distribution function has the following properties:
- $f(x) \geq 0$
- $\int_{-\infty}^{\infty}{f(x)dx}=1$
- $F(b)-F(a)=P(a\leq X\leq b) = \int_{a}^{b}{f(x)dx}$ for $a\leq b$, which is the area under the curve $f(x)$ from $a$ to $b$.
- Note: $P(X=b)=F(b)-F(b)=\int_{b}^{b}{f(x)dx}=0$; that is, the probability that a continuous random variable equals a specific constant, say $b$, is zero.
- {\bf Expectation:} The mean value (a weighted mean) of a random variable $X$ over the sample space, or population, of possible outcomes. The {\em expected value} can also be interpreted as the mean value that would be obtained from an infinite number of observations of the random variable.
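The definitions above can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the discrete example is a fair six-sided die and the continuous example is the density $f(x)=e^{-x}$ for $x \geq 0$, neither of which comes from the text. The expectation is computed in the usual weighted-sum form $\sum_{\forall i} x_i f(x_i)$, and the integral is approximated with a simple midpoint rule rather than evaluated exactly.

```python
import math

# --- Discrete case: a fair six-sided die (an assumed example) ---
values = [1, 2, 3, 4, 5, 6]
f = {x: 1 / 6 for x in values}  # p.d.f.: f(x) = P(X = x)

# Property checks: every probability is non-negative and they sum to 1.
assert all(p >= 0 for p in f.values())
assert math.isclose(sum(f.values()), 1.0)


def cdf_discrete(x):
    """F(x) = P(X <= x): add up f over all values not exceeding x."""
    return sum(p for v, p in f.items() if v <= x)


# Expectation as a probability-weighted mean of the possible values.
expected_value = sum(x * p for x, p in f.items())


# --- Continuous case: density f(x) = exp(-x) for x >= 0 (an assumed example) ---
def pdf(x):
    return math.exp(-x) if x >= 0 else 0.0


def cdf(x):
    """Closed-form c.d.f. of the density above: F(x) = 1 - exp(-x)."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def area(a, b, steps=100_000):
    """Approximate the integral of pdf from a to b with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width


a, b = 0.5, 2.0
prob_from_area = area(a, b)      # integral of f(x) from a to b
prob_from_cdf = cdf(b) - cdf(a)  # F(b) - F(a)

print(f"P(X <= 4) for the die: {cdf_discrete(4):.4f}")
print(f"E(X) for the die:      {expected_value:.4f}")
print(f"Area under f on [{a}, {b}]: {prob_from_area:.6f}")
print(f"F(b) - F(a):                {prob_from_cdf:.6f}")
print(f"Area over a single point:   {area(b, b):.1f}")
```

The two probabilities for the continuous case should agree up to the error of the numerical integration, and the area over the single point $b$ is zero because the interval has no width.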
\begin{table}
\centering
\begin{tabular}{|c|c|}\hline
Discrete & Continuous\\\hline
0& 736.1918273\\
1& 759.5668806\\
2& 812.7593044\\
3& 562.2359305\\
4& 798.2952718\\\hline
\end{tabular}
\caption{Example of Discrete and Continuous Data}
\label{contdisc1}
\end{table}
\defl{Examples of Categorical, Continuous and Discrete Data.}
- Categorical:
- Gender
- Blood Type
- Marital Status
- Eye Color
- Political Party
- Discrete:
- Number of people using the ATM at a certain location within the past hour.
- Number of brothers or sisters a person has.
- Number of times a person won at roulette within the past 20 spins.
- Continuous:
- Income
- Age
- Height
- Weight
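To connect the categorical examples above with dummy coding, here is a minimal Python sketch. It assumes a hypothetical sample of six blood types and uses reference-level coding, one common convention in which a variable with $k$ categories is represented by $k-1$ dummy variables. The function name, the column names, and the choice of reference category are all illustrative assumptions.

```python
def dummy_code(observations, reference):
    """Return one 0/1 column per category other than the reference category."""
    categories = sorted(set(observations) - {reference})
    return {
        f"is_{category}": [1 if value == category else 0 for value in observations]
        for category in categories
    }


# Hypothetical sample of blood types for six individuals.
blood_types = ["A", "O", "B", "AB", "O", "A"]

# Four categories need three dummy variables here; "O" is the reference
# category, identified by all three dummy variables being 0.
coded = dummy_code(blood_types, reference="O")
for name, column in coded.items():
    print(name, column)
```

Each observation has at most one dummy variable equal to 1, so the original category can always be recovered from the coded columns.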