A probability distribution model belongs to the field of probability theory.
Ways to choose the probability distribution model for a random variable:
- Study the characteristics of the probabilities
- Plot the data and check which model fits best
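As a rough illustration of the graphical check, the sketch below tabulates observed relative frequencies next to the probabilities of a candidate model. The data, the choice of a Poisson candidate, and the helper names are assumptions made for illustration only.

```python
from collections import Counter
from math import exp, factorial


def empirical_frequencies(sample):
    """Relative frequency of each value observed in the sample."""
    counts = Counter(sample)
    n = len(sample)
    return {value: count / n for value, count in sorted(counts.items())}


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam."""
    return exp(-lam) * lam**k / factorial(k)


# Hypothetical count data, made up for this example.
sample = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0, 2, 1, 4, 1, 0, 2, 1, 3, 0, 1]
lam_hat = sum(sample) / len(sample)  # sample mean used as the rate estimate

freqs = empirical_frequencies(sample)
for value, freq in freqs.items():
    print(f"{value}: observed {freq:.3f}  Poisson {poisson_pmf(value, lam_hat):.3f}")
```

If the two columns differ markedly, another candidate model may be worth trying.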
Probability distribution models
Probability distribution models featured in this post:
- Bernoulli
- Binomial
- Multinomial
- Poisson
- Uniform
- Exponential
- Normal
- Student’s t
- Geometric
- Pascal
- Hypergeometric
- Chi-square
- Snedecor
Used for discrete variables: Bernoulli, binomial, multinomial, Poisson, geometric, Pascal, hypergeometric.
Used for continuous variables: uniform, exponential, normal, Student’s t, chi-square, Snedecor’s F.
Bernoulli
A Bernoulli trial is a random experiment that has only two possible outcomes, usually labeled “success” and “failure”.
The Bernoulli distribution is the probability distribution of a random variable that represents the outcome of a single Bernoulli trial. The variable takes the value 1 for “success” (with probability \(\pi\)) and 0 for “failure” (with probability \(1-\pi\)).
The binomial and geometric distribution models are built upon it.
$$X =
\begin{cases}
1, & \text{with probability } \pi \\
0, & \text{with probability } 1 - \pi
\end{cases}$$
Expected value:
$$E[X] = 1 \cdot \pi + 0 \cdot (1 - \pi) = \pi$$
In the context of a dichotomous variable coded as 0 and 1, the mean is equal to the proportion.
Variance:
$$\text{Var}(X) = E[X^2] - (E[X])^2$$
Since \(X^2 = X\) (X only takes the values 0 and 1):
$$E[X^2] = E[X] = \pi$$
Therefore:
$$\text{Var}(X) = \pi - \pi^2 = \pi(1 - \pi)$$
Given n independent copies of X, the sampling variance of the sample proportion \(\widehat{\Pi}\) is:
$$\text{Var}(\widehat{\Pi}) = \frac{\pi(1 - \pi)}{n}$$
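The Bernoulli formulas above can be checked numerically. This is a minimal sketch: the function names, the value of \(\pi\), and the sample size are assumptions chosen for illustration.

```python
import random


def bernoulli_mean(pi):
    """E[X] = 1 * pi + 0 * (1 - pi)."""
    return 1 * pi + 0 * (1 - pi)


def bernoulli_variance(pi):
    """Var(X) = pi * (1 - pi)."""
    return pi * (1 - pi)


def proportion_variance(pi, n):
    """Sampling variance of the sample proportion over n independent trials."""
    return pi * (1 - pi) / n


random.seed(0)  # fixed seed so the run is reproducible
pi = 0.3        # illustrative success probability
n = 10_000      # illustrative number of trials

draws = [1 if random.random() < pi else 0 for _ in range(n)]
p_hat = sum(draws) / n  # the mean of a 0/1 variable is the proportion of ones

print("sample proportion:", p_hat, "theoretical mean:", bernoulli_mean(pi))
print("variance of X:", bernoulli_variance(pi))
print("variance of the sample proportion:", proportion_variance(pi, n))
```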
Binomial
A binomial distribution is built upon the Bernoulli distribution: it counts the number of successes in n independent Bernoulli trials that share the same success probability.
A random variable that follows a binomial distribution is written as:
$$X \sim \mathcal{B}(n,p)$$
This reads as: X follows a binomial distribution with parameters n, the number of times the experiment is repeated, and p, the probability that the result A is obtained each time the experiment is performed.
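The text does not spell out the probability mass function, so the sketch below assumes the standard one, \(P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}\), and checks that the probabilities sum to 1 and that the mean equals \(np\). The function name and the values of n and p are illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), using the standard binomial formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 4, 0.5  # illustrative parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
mean = sum(k * pk for k, pk in enumerate(pmf))

print("pmf:", pmf)
print("sum of probabilities:", sum(pmf))
print("mean:", mean, "n * p:", n * p)
```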
Multinomial
The multinomial distribution is a generalization of the binomial distribution.
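One way to see the generalization is to write the multinomial probability for any number of categories and check that two categories give back the binomial case. This is a sketch that assumes the standard formula; the function name and the example values are illustrative.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing the given counts per category.

    counts: number of occurrences of each category (they sum to n).
    probs: probability of each category (they sum to 1).
    """
    n = sum(counts)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)  # stays an exact integer at every step
    return coefficient * prod(p**c for p, c in zip(probs, counts))


# Three categories, n = 4 trials.
print(multinomial_pmf([2, 1, 1], [0.5, 0.25, 0.25]))

# Two categories: same value as the binomial B(4, 0.5) at k = 2.
print(multinomial_pmf([2, 2], [0.5, 0.5]))
```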
Poisson
The Poisson distribution is…
$$X \sim \mathcal{P}(\lambda)$$
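As a hedged illustration of the notation, the sketch below assumes the standard Poisson probability mass function, \(P(X=k)=e^{-\lambda}\lambda^k/k!\), and checks numerically that both the mean and the variance are close to \(\lambda\). The function name, the rate, and the truncation point of the sums are assumptions.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam."""
    return exp(-lam) * lam**k / factorial(k)


lam = 2.0  # illustrative rate
# Truncate the infinite support at 50; the remaining tail is negligible here.
pmf = [poisson_pmf(k, lam) for k in range(50)]
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print("mean:", mean)
print("variance:", variance)
```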
Uniform distribution
Formula of a uniform distribution over the interval [a,b]:
$$f_X(x) = \left\{ \begin{array}{ll} \frac{1}{(b-a)} & \text{if}\ a \le x \le b \\ 0 & \text{otherwise} \end{array} \right.$$
The formula for the expected value is:
$$\mathbb{E}[X]=\frac{a+b}{2}$$
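The density and the expected value can be checked against each other by integrating \(x f_X(x)\) numerically. This is a minimal sketch; the function names, the interval, and the number of integration steps are illustrative choices.

```python
def uniform_pdf(x, a, b):
    """Density of the uniform distribution over [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0


def uniform_mean(a, b):
    """Closed-form expected value (a + b) / 2."""
    return (a + b) / 2


a, b = 2.0, 6.0  # illustrative interval
steps = 100_000
width = (b - a) / steps

# Midpoint rule for the integral of x * f(x) over [a, b].
numeric_mean = 0.0
for i in range(steps):
    x = a + (i + 0.5) * width
    numeric_mean += x * uniform_pdf(x, a, b) * width

print("numeric mean:", numeric_mean, "closed form:", uniform_mean(a, b))
```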
Exponential distribution
$$X \sim \mathrm{Exp}(\lambda)$$
This means that X follows an exponential distribution with parameter \(\lambda\).
Possible values: x > 0.
Density function:
$$f_X(x) = \left\{ \begin{array}{ll} \lambda e^{-\lambda x} & \text{if}\ x > 0 \\ 0 & \text{otherwise} \end{array} \right.$$
Expected value and variance:
$$\mathbb{E}[X]=\frac{1}{\lambda}$$
$$\text{Var}(X)=\frac{1}{\lambda^2}$$
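A quick simulation can illustrate these two results. The sketch uses `random.expovariate` from the Python standard library; the rate, the seed, and the sample size are assumptions chosen for illustration.

```python
import random
from math import exp


def exponential_pdf(x, lam):
    """Density of the exponential distribution with rate lam."""
    return lam * exp(-lam * x) if x > 0 else 0.0


random.seed(1)  # fixed seed so the run is reproducible
lam = 2.0       # illustrative rate
sample = [random.expovariate(lam) for _ in range(100_000)]

sample_mean = sum(sample) / len(sample)
sample_var = sum((x - sample_mean) ** 2 for x in sample) / (len(sample) - 1)

print("sample mean:", sample_mean, "expected 1/lambda:", 1 / lam)
print("sample variance:", sample_var, "expected 1/lambda^2:", 1 / lam**2)
```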
Normal distribution
A normal distribution with mean \(\mu\) and variance \(\sigma^2\) is denoted as \(\mathcal{N}(\mu,\sigma^2)\).
The density function for a normal distribution is:
$$ f(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
$$X \sim \mathcal{N}(\mu,\sigma^2)$$
The standard normal distribution is the special case with \(\mu = 0\) and \(\sigma^2 = 1\), denoted as \(\mathcal{N}(0,1)\).
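The density formula can be coded directly and compared with `statistics.NormalDist` from the Python standard library. This is a sketch: `normal_pdf` and `standardize` are illustrative names, and note that `NormalDist` takes the standard deviation \(\sigma\) rather than the variance \(\sigma^2\).

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2), where sigma2 is the variance."""
    return exp(-((x - mu) ** 2) / (2 * sigma2)) / sqrt(2 * pi * sigma2)


def standardize(x, mu, sigma2):
    """Map a value from N(mu, sigma2) onto the standard normal scale."""
    return (x - mu) / sqrt(sigma2)


mu, sigma2 = 1.0, 4.0  # illustrative parameters
x = 1.5

print("formula:", normal_pdf(x, mu, sigma2))
print("NormalDist:", NormalDist(mu, sqrt(sigma2)).pdf(x))
print("standardized value:", standardize(5.0, mu, sigma2))
```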
Reference:
- Standard normal distribution [online]. Wikipedia. Available at: https://en.wikipedia.org/wiki/Normal_distribution
Student’s t distribution
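Student’s t distribution is used in place of the normal distribution when the sample size is small and the population variance is unknown. It approaches the standard normal distribution as the degrees of freedom increase.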
Statistic
The general formula for a statistic:
$$t = \frac{\text{observed value} - \text{theoretical value under } H_0}{\text{SE}}$$
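For the difference between the means of two independent samples, the variance and the standard error are:
$$\text{Var}(\bar{Y}_1 - \bar{Y}_2) = \frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}$$
$$\text{SE}(\bar{Y}_1 - \bar{Y}_2) = \sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}$$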
In conclusion, the statistic used in Student’s t-test for two samples is:
$$t = \frac{(\bar{Y}_1 - \bar{Y}_2)-(\mu_1 - \mu_2)}{\text{SE}(\bar{Y}_1 - \bar{Y}_2)}$$
$$t = \frac{(\bar{Y}_1 - \bar{Y}_2)-(\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}$$
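The two-sample formula translates into a short function. This is a sketch under stated assumptions: the function and parameter names are illustrative, the numbers are made up, and the variances are passed in as given values, whereas in practice they would usually be estimated from the samples.

```python
from math import sqrt


def standard_error(var1, n1, var2, n2):
    """Standard error of the difference between two independent sample means."""
    return sqrt(var1 / n1 + var2 / n2)


def two_sample_statistic(mean1, mean2, var1, n1, var2, n2, delta0=0.0):
    """Observed difference minus the difference under H0, divided by the SE.

    delta0 is the value of mu_1 - mu_2 under the null hypothesis.
    """
    return ((mean1 - mean2) - delta0) / standard_error(var1, n1, var2, n2)


# Hypothetical summary data for two groups.
se = standard_error(4.0, 16, 9.0, 36)
t = two_sample_statistic(10.0, 8.0, 4.0, 16, 9.0, 36)

print("standard error:", se)
print("statistic:", t)
```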
References
Student’s t distribution at Wikipedia
Geometric distribution
The geometric distribution is…
Reference:
- Geometric distribution [online]. Wikipedia. Available at: https://en.wikipedia.org/wiki/Geometric_distribution
Pascal distribution
The Pascal distribution is a generalization of the geometric distribution: it describes waiting for the r-th success rather than only the first one.
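The relationship between the two distributions can be sketched in code. Conventions differ between textbooks, so this example assumes one of them: X is the number of trials needed to obtain the r-th success, which gives \(P(X=k)=\binom{k-1}{r-1}p^r(1-p)^{k-r}\) for \(k \ge r\). The function names are illustrative.

```python
from math import comb


def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p


def pascal_pmf(k, r, p):
    """P(r-th success occurs on trial k), for k = r, r + 1, ...

    Assumes the 'number of trials' convention, not 'number of failures'.
    """
    if k < r:
        return 0.0
    return comb(k - 1, r - 1) * p**r * (1 - p) ** (k - r)


p = 0.5
# With r = 1 the Pascal distribution reduces to the geometric distribution.
print(pascal_pmf(3, 1, p), geometric_pmf(3, p))

# Probabilities for r = 2, summed over a long stretch of the support.
total = sum(pascal_pmf(k, 2, p) for k in range(2, 201))
print("sum of probabilities:", total)
```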
Hypergeometric distribution
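The hypergeometric distribution is a discrete probability distribution that describes the number of successes in draws made without replacement from a finite population.
It is used in Fisher’s exact test in hypothesis testing.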
The hypergeometric law:
$$A \sim \mathrm{HYP}(N, n_1, r_1)$$
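The text does not define the three parameters, so the sketch below makes an assumption: N is the population size, \(n_1\) is the number of draws (one margin of a 2×2 table), and \(r_1\) is the number of successes in the population (the other margin). Under that reading, \(P(A=a)=\binom{r_1}{a}\binom{N-r_1}{n_1-a}/\binom{N}{n_1}\). The function name and the example values are illustrative.

```python
from math import comb


def hypergeometric_pmf(a, N, n1, r1):
    """P(A = a): a successes among n1 draws without replacement.

    Assumed meaning of the parameters: population of size N that
    contains r1 successes, from which n1 items are drawn.
    """
    lowest = max(0, n1 + r1 - N)
    highest = min(n1, r1)
    if a < lowest or a > highest:
        return 0.0  # outside the support
    return comb(r1, a) * comb(N - r1, n1 - a) / comb(N, n1)


N, n1, r1 = 10, 4, 5  # illustrative values
pmf = [hypergeometric_pmf(a, N, n1, r1) for a in range(n1 + 1)]

print("pmf:", pmf)
print("sum of probabilities:", sum(pmf))
```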
Reference:
- Hypergeometric distribution [online]. Wikipedia. Available at: https://en.wikipedia.org/wiki/Hypergeometric_distribution
Chi-square distribution
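The chi-square distribution is a continuous probability distribution: it is the distribution of a sum of squares of k independent standard normal variables, where k is the number of degrees of freedom.
Reference:
- Chi-square distribution [online]. Wikipedia. Available at: https://en.wikipedia.org/wiki/Chi-squared_distribution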
Snedecor’s F distribution
Snedecor’s F distribution is denoted as \(F_{m,n}\), where m and n are the degrees of freedom of the numerator and the denominator.
Reference:
- F-distribution [online]. Wikipedia. Available at: https://en.wikipedia.org/wiki/F-distribution
Related entries
Main predecessor:
Other:
- Frequentist inferential statistics