A probability model is a function that describes the random behavior and properties of a random variable. This application supports various probability models of two types:
Discrete Probability Models describe discrete random variables, which have a countable set of possible outcomes. This often means that we are counting; for example, we might count the number of times an event of interest occurs per week. Discrete probability models are defined by their probability mass function (PMF), which gives the probability of each possible outcome. The cumulative distribution function (CDF) is calculated from the PMF by summation and returns the cumulative probability; i.e., P(X ≤ x).
Continuous Probability Models describe continuous random variables, which can take any value in an interval and so have uncountably many possible outcomes. These observations are often made using units that can be made more precise. For example, we might report our age in years, months, weeks, days, hours, minutes, seconds, and so on. Continuous probability models are defined by their probability density function (PDF), which describes the relative likelihood of the possible outcomes. The cumulative distribution function (CDF) is calculated from the PDF by integration and returns the cumulative probability; i.e., P(X ≤ x).
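As a rough illustration of how a CDF follows from a PMF or a PDF, the sketch below sums a Poisson PMF and numerically integrates a Gaussian (Normal) PDF. The choice of these two distributions, the function names, and the integration settings are assumptions made for this example; they are not taken from the app itself.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_cdf(x, lam):
    """P(X <= x): sum the PMF over the outcomes 0, 1, ..., x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a Gaussian (Normal) random variable at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def normal_cdf_numeric(x, mu=0.0, sigma=1.0, steps=20000):
    """Approximate P(X <= x) by trapezoid-rule integration of the PDF.

    The integral starts 10 standard deviations below the mean, where the
    density is negligible (an assumption of this sketch).
    """
    a = mu - 10.0 * sigma
    if x <= a:
        return 0.0
    h = (x - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(x, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h


print(round(poisson_cdf(2, 3.0), 4))        # discrete: sum of PMF values
print(round(normal_cdf_numeric(1.0), 4))    # continuous: area under the PDF
```

The discrete case adds a finite number of probabilities, while the continuous case measures area under a curve, which is why the probability of any single exact value is zero for a continuous random variable.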
Each named distribution is described by a PMF or PDF, depending on whether the random variable it describes is discrete or continuous. The named distributions are indexed by one or more parameters that determine the center, shape, and spread of the probability distribution.
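To see how parameters change a distribution, the following sketch evaluates the same probability, P(-1 ≤ X ≤ 1), under Gaussian (Normal) distributions with different means and standard deviations. It uses Python's standard-library `statistics.NormalDist`; the helper name `prob_between` and the parameter values are illustrative assumptions.

```python
from statistics import NormalDist


def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ Normal(mu, sigma), computed as CDF(b) - CDF(a)."""
    dist = NormalDist(mu=mu, sigma=sigma)
    return dist.cdf(b) - dist.cdf(a)


# Same interval, different parameters: the center (mu) and spread (sigma)
# determine how much probability falls between -1 and 1.
for mu, sigma in [(0, 1), (0, 2), (1, 1)]:
    p = prob_between(-1, 1, mu, sigma)
    print(f"mu={mu}, sigma={sigma}: P(-1 <= X <= 1) = {p:.4f}")
```

Doubling the standard deviation spreads the distribution out, and shifting the mean moves its center, so both changes reduce the probability captured by the fixed interval.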
Step 1: To use this app, go to the 'Probability Calculator' tab.
Step 2: Next, you must select the named distribution and specify the necessary parameters.
Step 3: You can specify a probability of interest to calculate by selecting the appropriate expression, and specifying the necessary inputs.
Step 4: The resulting probability is calculated, plotted, and provided as output.
Please contact us if you have any questions at datascience@colgate.edu.
The Empirical Rule describes what portion of observations we can expect within one, two, and three standard deviations of the mean under the Gaussian (Normal) distribution. Specifically, the Empirical Rule states that we can expect about 68% of observations within one standard deviation of the mean, about 95% within two standard deviations of the mean, and about 99.7% within three standard deviations of the mean. This is depicted in the figure below.
We can confirm this using the Probability Calculator. First, we show that 68% of observations are between -1 and 1 under a standard Gaussian (Normal) distribution where μ = 0 and σ = 1.
Next, we show that 95% of observations are between -2 and 2 under a standard Gaussian (Normal) distribution where μ = 0 and σ = 1.
Finally, we show that 99.7% of observations are between -3 and 3 under a standard Gaussian (Normal) distribution where μ = 0 and σ = 1.
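The same three checks can be reproduced outside the app. The sketch below is one way to do so with Python's standard-library `statistics.NormalDist`; the helper name `within_k_sd` is an assumption of this example, not part of the Probability Calculator.

```python
from statistics import NormalDist

# Standard Gaussian (Normal) distribution: mu = 0, sigma = 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def within_k_sd(k):
    """P(-k <= Z <= k) for a standard Normal Z, computed as CDF(k) - CDF(-k)."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"P(-{k} <= Z <= {k}) = {within_k_sd(k):.4f}")
# P(-1 <= Z <= 1) = 0.6827
# P(-2 <= Z <= 2) = 0.9545
# P(-3 <= Z <= 3) = 0.9973
```

The rounded values 68%, 95%, and 99.7% quoted by the Empirical Rule are approximations of these probabilities.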
We have shown that the Empirical Rule holds under the standard Gaussian (Normal) distribution. Note that this demonstration shows that only a small portion of observations falls 'far' from the mean, that is, more than three standard deviations away. You may also note that observations with a standardized value (z-score) less than -3 or more than 3 occur only about 0.3% of the time; these rare observations are often treated as outliers.
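A z-score rescales an observation by the mean and standard deviation, z = (x - μ) / σ, so the three-standard-deviation threshold applies to any Gaussian (Normal) distribution. The sketch below computes a z-score and the two-sided tail probability P(|Z| > k); the function names and the example values (an observation of 130 with μ = 100 and σ = 10) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardize x: the number of standard deviations x lies from the mean."""
    return (x - mu) / sigma


def two_sided_tail(k):
    """P(|Z| > k) for a standard Normal Z, using symmetry: 2 * P(Z < -k)."""
    return 2 * NormalDist().cdf(-k)


# Hypothetical observation: x = 130 from a Normal with mu = 100, sigma = 10.
z = z_score(130, mu=100, sigma=10)
print(f"z = {z:.1f}")                              # z = 3.0
print(f"P(|Z| > 3) = {two_sided_tail(3):.4f}")     # P(|Z| > 3) = 0.0027
```

The tail probability of roughly 0.0027 is the complement of the 99.7% found within three standard deviations of the mean.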