Probability distribution model

A probability distribution model belongs to the field of probability theory.

Ways to choose the probability distribution model for a random variable:

  • Study the characteristics of the variable and of its probabilities (discrete or continuous, range of possible values)
  • Plot the data and check which model fits best

Probability distribution models

Probability distribution models covered in this post:

  • Bernoulli
  • Binomial
  • Multinomial
  • Poisson
  • Uniform
  • Exponential
  • Normal
  • Student’s t
  • Geometric
  • Pascal
  • Hypergeometric
  • Chi-square
  • Snedecor

Used for discrete variables: Bernoulli, binomial, multinomial, Poisson, geometric, Pascal, hypergeometric.

Used for continuous variables: uniform, exponential, normal, Student’s t, chi-square, Snedecor’s F.

Bernoulli

A Bernoulli trial is a random experiment that has only two possible outcomes, usually labeled “success” and “failure”.

The Bernoulli distribution is the probability distribution of a random variable that represents the outcome of a single Bernoulli trial. The variable takes the value 1 for “success” (with probability \(\pi\)) and 0 for “failure” (with probability \(1-\pi\)).

Binomial and geometric distribution models are built upon it.

$$X =
\begin{cases}
1, & \text{with probability } \pi \\
0, & \text{with probability } 1 - \pi
\end{cases}$$

Expected value:

$$E[X] = 1 \cdot \pi + 0 \cdot (1 - \pi) = \pi$$

In the context of a dichotomous variable coded as 0 and 1, the mean is equal to the proportion.

Variance:

$$\text{Var}(X) = E[X^2] - (E[X])^2$$

Since X only takes the values 0 and 1, \(X^2 = X\), so:

$$E[X^2] = E[X] = \pi$$

Therefore:

$$\text{Var}(X) = \pi - \pi^2 = \pi(1 - \pi)$$

Given n independent copies of X, the variance of the sample proportion \(\widehat{\Pi}\) (the mean of the n copies) is:

$$\text{Var}(\widehat{\Pi}) = \frac{\pi(1 - \pi)}{n}$$
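
The derivation above can be checked numerically. The following is a minimal sketch that computes the mean and variance directly from the two outcomes; the function names and the values of \(\pi\) and n are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def bernoulli_moments(pi):
    """Return (mean, variance) of a Bernoulli variable with success probability pi."""
    outcomes = {1: pi, 0: 1 - pi}  # value -> probability
    mean = sum(x * p for x, p in outcomes.items())
    second_moment = sum(x**2 * p for x, p in outcomes.items())
    return mean, second_moment - mean**2

def proportion_variance(pi, n):
    """Variance of the sample proportion over n independent Bernoulli trials."""
    _, var = bernoulli_moments(pi)
    return var / n

pi = Fraction(3, 10)  # illustrative value (assumption)
mean, var = bernoulli_moments(pi)
print(mean, var, proportion_variance(pi, 50))
```

Exact fractions are used so that the results can be compared with \(\pi\) and \(\pi(1-\pi)\) without rounding error.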

Binomial

The binomial distribution is built upon the Bernoulli distribution: it counts the number of successes in n independent Bernoulli trials that share the same success probability.

A random variable that follows a binomial distribution is written as:

$$X \sim \mathcal{B}(n,p)$$

This reads as: X follows a binomial distribution with parameters n, the number of times the experiment is repeated, and p, the probability of obtaining the result A each time the experiment is performed.
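
As an illustration of the notation, here is a short sketch that evaluates the standard binomial probability mass function, \(P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}\), which the text above does not state explicitly. The function name and the values of n and p are assumptions chosen for the example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # illustrative parameters (assumption)
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                              # probabilities should add up to 1
mean = sum(k * q for k, q in enumerate(pmf))  # should be close to n * p
print(total, mean)
```

The mean comes out as n times p, which is what one would expect from adding up n Bernoulli variables, each with mean p.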

Multinomial

The multinomial distribution is a generalization of the binomial distribution.

Poisson

The Poisson distribution is…

$$X \sim \mathcal{P}(\lambda)$$

Uniform distribution

Density function of a uniform distribution over the interval [a,b]:

$$f_X(x) = \left\{ \begin{array}{ll} \frac{1}{(b-a)} & \text{if}\ a \le x \le b \\ 0 & \text{otherwise} \end{array} \right.$$

The formula for the expected value is:

$$\mathbb{E}[X]=\frac{a+b}{2}$$
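
The expected value can be checked by integrating \(x \cdot f_X(x)\) numerically. This is only a sketch: the midpoint rule, the helper names and the interval [2, 6] are choices made for the example.

```python
def uniform_pdf(x, a, b):
    """Density of a uniform distribution over [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def expected_value(pdf, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of x * pdf(x) over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width  # midpoint of the i-th slice
        total += x * pdf(x) * width
    return total

a, b = 2.0, 6.0  # illustrative interval (assumption)
approx = expected_value(lambda x: uniform_pdf(x, a, b), a, b)
print(approx, (a + b) / 2)
```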

Exponential distribution

$$X \sim Exp(\lambda)$$

This means that X follows an exponential distribution with parameter \(\lambda\).

Possible values: x > 0.

Density function:

$$f_X(x) = \left\{ \begin{array}{ll} \lambda e^{-\lambda x} & \text{if}\ x > 0 \\ 0 & \text{otherwise} \end{array} \right.$$

$$\mathbb{E}[X]=1/\lambda$$

$$\text{Var}(X)=1/\lambda^2$$
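
A quick simulation can illustrate these two formulas. The sketch below draws values with Python's standard `random.expovariate` and compares the sample mean and variance with \(1/\lambda\) and \(1/\lambda^2\); the rate, the sample size and the seed are arbitrary choices for the example.

```python
import random

def simulate_exponential(lam, size, seed=0):
    """Draw `size` values from Exp(lam) with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [rng.expovariate(lam) for _ in range(size)]

lam = 2.0  # illustrative rate (assumption)
draws = simulate_exponential(lam, 100_000)

sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / len(draws)

print(sample_mean, 1 / lam)     # sample mean next to the theoretical mean
print(sample_var, 1 / lam**2)   # sample variance next to the theoretical variance
```

The simulated values will not match the formulas exactly, but they should get closer as the sample size grows.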

Normal distribution

A normal distribution with mean \(\mu\) and variance \(\sigma^2\) is denoted as \(\mathcal{N}(\mu,\sigma^2)\).

The density function for a normal distribution is:

$$ f(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

$$X \sim \mathcal{N}(\mu,\sigma^2)$$

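The standard normal distribution, denoted as N(0,1), is the special case with \(\mu = 0\) and \(\sigma^2 = 1\). Any normal variable X can be converted to it with \(Z = (X-\mu)/\sigma\).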
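
The density formula and its relation to the standard normal case can be sketched in a few lines. The function names and the numeric values are illustrative assumptions; the code only uses the formula given above.

```python
import math

def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Density of N(mu, sigma2) at x; the defaults give the standard normal."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def standardize(x, mu, sigma2):
    """Convert x from N(mu, sigma2) to the standard normal scale."""
    return (x - mu) / math.sqrt(sigma2)

mu, sigma2, x = 10.0, 4.0, 13.0  # illustrative values (assumption)
z = standardize(x, mu, sigma2)

# The general density equals the standard density at z, divided by sigma.
general = normal_pdf(x, mu, sigma2)
via_standard = normal_pdf(z) / math.sqrt(sigma2)
print(z, general, via_standard)
```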


Student’s t distribution

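The Student’s t distribution is used in place of the normal distribution when the sample size is small and the population standard deviation is unknown. It has heavier tails than the normal distribution and approaches it as the degrees of freedom grow.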

Statistic

The general formula for a test statistic:

$$t = \frac{\text{observed value} - \text{theoretical value under } H_0}{\text{SE}}$$

$$\text{Var}(\bar{Y}_1 - \bar{Y}_2) = \frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}$$

$$\text{SE}(\bar{Y}_1 - \bar{Y}_2) = \sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}$$

In conclusion, the statistic used in Student’s t-test for two samples is:

$$T = \frac{(\bar{Y}_1 - \bar{Y}_2)-(\mu_1 - \mu_2)}{\text{SE}(\bar{Y}_1 - \bar{Y}_2)}$$

$$t = \frac{(\bar{Y}_1 - \bar{Y}_2)-(\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}$$
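
The two-sample formula above can be turned into a small function. This is a sketch under stated assumptions: the variances are passed in as known inputs (in practice they are usually estimated from the samples), `delta0` is a hypothetical name for the value of \(\mu_1 - \mu_2\) under \(H_0\), and the numbers are made up for the example.

```python
import math

def two_sample_statistic(mean1, mean2, var1, var2, n1, n2, delta0=0.0):
    """(observed difference - difference under H0) / standard error."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    return ((mean1 - mean2) - delta0) / se

# Illustrative inputs (assumption): two groups with different sizes and variances.
t = two_sample_statistic(mean1=5.0, mean2=4.0, var1=4.0, var2=9.0, n1=16, n2=36)
print(t)
```

With these inputs each group contributes 0.25 to the variance of the difference, so the standard error is the square root of 0.5.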

References

Student’s t distribution at Wikipedia

Geometric distribution

The geometric distribution is…


Pascal distribution

The Pascal distribution is a generalization of the geometric distribution.

Hypergeometric distribution

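The hypergeometric distribution is a discrete probability distribution that describes the number of successes in draws made without replacement from a finite population.

It is used in Fisher’s exact test in hypothesis testing.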

The hypergeometric law:

$$A \sim \mathrm{HYP}(N, n_1, r_1)$$
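
The text does not say what each parameter stands for, so the sketch below makes an explicit assumption: N is the population size, \(r_1\) is the number of "success" items in the population, and \(n_1\) is the number of items drawn. Under that reading, the standard probability mass function is \(P(A=a)=\binom{r_1}{a}\binom{N-r_1}{n_1-a}/\binom{N}{n_1}\).

```python
from math import comb

def hypergeom_pmf(a, N, n1, r1):
    """P(A = a): a successes among n1 draws without replacement.

    Assumed meaning of the parameters (not stated in the text):
    N = population size, r1 = successes in the population, n1 = draws.
    """
    return comb(r1, a) * comb(N - r1, n1 - a) / comb(N, n1)

N, n1, r1 = 20, 5, 8  # illustrative values (assumption)
support = range(max(0, n1 - (N - r1)), min(n1, r1) + 1)
pmf = {a: hypergeom_pmf(a, N, n1, r1) for a in support}

total = sum(pmf.values())                  # should be close to 1
mean = sum(a * q for a, q in pmf.items())  # should be close to n1 * r1 / N
print(total, mean)
```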


Chi-square distribution

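The chi-square distribution is a continuous probability distribution: it is the distribution of a sum of squares of k independent standard normal variables, where k is the number of degrees of freedom.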

Snedecor’s F distribution

Snedecor’s F distribution is denoted as \(F_{m,n}\), where m and n are the numerator and denominator degrees of freedom.

