Random Variables

$\text{[math]}$

Understanding the concept of a random variable is important for a deeper understanding of statistics. This section covers key terminology related to random variables.

\defs{Definitions}

{\bf Random variable:}
An outcome or observation whose value is determined by a random process and thus cannot be predicted with certainty. Random variables are often denoted by capital letters, and the possible values that a random variable can take by lower case letters.
1. {\bf Categorical random variable:} A random variable whose outcome is a category rather than a number, such as gender (male or female) or opinion (strongly disagree, disagree, ..., or strongly agree).
  - {\bf Dummy coding:} Dummy coding turns a variable with two or more categories into one or more variables whose only possible values are 0 and 1. Categorical variables are often dummy coded for analysis purposes. For example, males might be assigned the value 0 and females the value 1. If there are more than two categories, several dummy variables are needed to capture all the information (typically one fewer than the number of categories). The dummy coded data can then be treated as numerical random variables.
2. {\bf Numerical random variable:} A random variable whose outcome is a number. Examples include the height, weight, age, or income of a randomly selected individual.
  1. {\bf Discrete random variable:} Takes countable values, typically integers, like the number of heads observed when flipping a coin four times, $x=0,1,2,3$ or $4$. For an example, see Table~\ref{contdisc1}.
  2. {\bf Continuous random variable:} Takes any value within an interval, like income. For an example, see Table~\ref{contdisc1}.
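
The dummy coding described above can be sketched in a few lines of Python. This is only an illustration: the helper \texttt{dummy\_code} and the sample data are hypothetical, and the sketch assumes the common convention of omitting one baseline category, so that $k$ categories yield $k-1$ dummy variables.

```python
def dummy_code(values, baseline):
    """Return one 0/1 indicator list per category, skipping the baseline.

    Hypothetical helper for illustration; the baseline category is the one
    represented by all dummy variables being 0.
    """
    categories = sorted(set(values) - {baseline})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


# Two categories -> one dummy variable (male = 0, female = 1).
gender = ["male", "female", "female", "male"]
gender_dummies = dummy_code(gender, baseline="male")

# Three categories -> two dummy variables (baseline "O" is coded 0, 0).
blood_type = ["A", "B", "O", "A"]
blood_dummies = dummy_code(blood_type, baseline="O")

print(gender_dummies)
print(blood_dummies)
```

Each resulting list can be analysed like any other numerical variable; an observation in the baseline category has a 0 in every dummy variable.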
{\bf Cumulative distribution function (c.d.f.):}
The probability that a random variable $X$ takes a value less than or equal to a real number $x$, that is, $P(X \leq x)$.
The c.d.f. is often denoted with a capital $F$ as $F(x)$, i.e. $F(x)=P(X \leq x)$.
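
As a small illustration of $F(x)=P(X \leq x)$, the sketch below evaluates the c.d.f. for the number of heads in four coin flips. It assumes the coin is fair and the flips are independent (assumptions made here for the example), and the function names \texttt{prob} and \texttt{cdf} are hypothetical.

```python
from fractions import Fraction
from math import comb

N_FLIPS = 4  # four flips of a coin assumed to be fair


def prob(k):
    """P(X = k): probability of exactly k heads in N_FLIPS fair flips."""
    if k not in range(N_FLIPS + 1):
        return Fraction(0)
    return Fraction(comb(N_FLIPS, k), 2 ** N_FLIPS)


def cdf(x):
    """F(x) = P(X <= x): add up P(X = k) over every possible k <= x."""
    return sum((prob(k) for k in range(N_FLIPS + 1) if k <= x), Fraction(0))


for x in range(N_FLIPS + 1):
    print(x, cdf(x))
```

Because $x$ may be any real number, \texttt{cdf(2.5)} is also defined and equals \texttt{cdf(2)}: the c.d.f. of a discrete random variable is a step function that only jumps at the possible values.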
{\bf Probability distribution function (p.d.f.):}
1. For a discrete random variable it is the probability of a particular value occurring, $f(x)=P(X=x)$.
  - The probability distribution function has the following properties:
    1. $f(x_i) \geq 0, \quad \forall i.$
    2. $\sum_{\forall i} f(x_i)=1$
2. For a continuous random variable, $P(X=x)=0$ for every $x$, so the definition is not the same. The p.d.f. for a continuous random variable is a curve described by the function $f(x)$. The area under the curve within a given interval yields the probability of the continuous random variable falling within that interval.
  - The probability distribution function has the following properties:
    1. $f(x) \geq 0$
    2. $\int_{-\infty}^{\infty}{f(x)dx}=1$
    3. $F(b)-F(a)=P(a\leq X\leq b) = \int_{a}^{b}{f(x)dx}$, which is the area under the curve $f(x)$ from $a$ to $b$, $a\leq b$.
  - Note: $P(X=b)=F(b)-F(b)=\int_{b}^{b}{f(x)dx}=0$; that is, the probability of a continuous random variable equaling a specific constant, say $b$, is zero.
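
The properties of a continuous p.d.f. listed above can be checked numerically. The sketch below uses a hypothetical density chosen only for illustration, $f(x)=2x$ on $[0,1]$ (zero elsewhere), whose c.d.f. on that interval is $F(x)=x^2$, and approximates the integrals with a simple midpoint rule.

```python
def f(x):
    """Illustrative density (an assumption for this example): 2x on [0, 1]."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def F(x):
    """C.d.f. matching f: 0 below 0, x^2 on [0, 1], 1 above 1."""
    return min(max(x, 0.0), 1.0) ** 2


def integrate(func, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of func from a to b."""
    width = (b - a) / steps
    return sum(func(a + (i + 0.5) * width) for i in range(steps)) * width


# Property 2: total area under the curve (f is zero outside [0, 1]).
total_area = integrate(f, 0.0, 1.0)

# Property 3: P(a <= X <= b) as an area, compared with F(b) - F(a).
a, b = 0.25, 0.75
interval_prob = integrate(f, a, b)

# Note: the "area" over a single point is zero.
point_prob = integrate(f, b, b)

print(total_area, interval_prob, F(b) - F(a), point_prob)
```

The midpoint rule is used here only because it needs no external libraries; any numerical integration routine would serve the same purpose.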
{\bf Expectation:} The expectation of a random variable $X$ is its mean value, a probability-weighted mean over the sample space, or population, of possible outcomes. The {\em expected value} can also be interpreted as the mean value that would be obtained from an infinite number of observations of the random variable.
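
Both readings of expectation can be illustrated with the number of heads in four coin flips. The sketch below assumes a fair coin (an assumption made for this example), computes the weighted mean $\sum_i x_i P(X=x_i)$ exactly, and compares it with the mean of a large, but finite, number of simulated observations; the seed and sample size are arbitrary choices.

```python
import random
from fractions import Fraction
from math import comb

N_FLIPS = 4  # four flips of a coin assumed to be fair

# P(X = k) for k = 0, ..., 4 heads.
pmf = {k: Fraction(comb(N_FLIPS, k), 2 ** N_FLIPS) for k in range(N_FLIPS + 1)}

# Expectation as a weighted mean: each value weighted by its probability.
expected_value = sum(k * p for k, p in pmf.items())

# Expectation as a long-run mean: average many simulated observations.
rng = random.Random(0)  # fixed seed so the run is repeatable


def observe():
    """One observation of X: count heads (1s) in N_FLIPS simulated flips."""
    return sum(rng.randint(0, 1) for _ in range(N_FLIPS))


n_obs = 100_000
sample_mean = sum(observe() for _ in range(n_obs)) / n_obs

print(expected_value, sample_mean)
```

The exact weighted mean is 2, and the simulated mean approaches it as the number of observations grows.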

\begin{table}
\centering
\begin{tabular}{|c|c|}\hline
Discrete & Continuous\\\hline
0& 736.1918273\\
1& 759.5668806\\
2& 812.7593044\\
3& 562.2359305\\
4& 798.2952718\\\hline
\end{tabular}
\caption{Example of Discrete and Continuous Data}
\label{contdisc1}
\end{table}

\defl{Examples of Categorical, Continuous and Discrete Data.}

Categorical:
1. Gender
2. Blood Type
3. Marital Status
4. Eye Color
5. Political Party
Discrete:
1. Number of people using the ATM at a certain location within the past hour.
2. Number of brothers or sisters a person has.
3. Number of times a person won at roulette within the past 20 spins.
Continuous:
1. Income
2. Age
3. Height
4. Weight

[\latex]

Binomial

\defl{Binomial Distribution has the following properties:} There are a fixed number of trials or observations, $n$, determined in advance. Each trial can take on one of two possible outcomes, labeled ``success'' and ``failure''. Each trial's outcome is determined independently of all the other trials. The probability of a success and that of a failure remains the same from …

Exponential

Exponential Distribution has the following properties: Equals the distance between successive occurrences or arrivals of a Poisson process with mean $\lambda > 0$. $\lambda$ is the average number of occurrences or arrivals per unit of time (length, space, etc.). $\frac{1}{\lambda}$ is the average time between occurrences or arrivals. \defl{Exponential Distribution:} \[f(x) = \lambda e^{-{\lambda}x} \] \[F(x) = …

Hypergeometric

\defl{Hypergeometric distribution has the following properties:} It applies when units are selected from a finite population without replacement and the population consists of successes and failures. The major difference between the Hypergeometric distribution and the Binomial distribution is that the probability of selecting a success is {\bf not constant} from draw to draw and the draws are {\bf not independent}. \defm{Hypergeometric …

Normal

\defl{Normal Distribution has the following properties:} Symmetrical, with a bell-shaped appearance. The population mean and median are equal. An infinite range, $-\infty < x < \infty$. The approximate probability for certain ranges of $X$-values: $P(\mu - 1\sigma < X < \mu + 1\sigma) \approx 68\%$ $P(\mu - 2\sigma < X < \mu + 2\sigma) …
