# Measures of Dispersion

- Dispersion, together with central tendency, gives a fairly good idea of the distribution. Dispersion measures include **range, standard deviation, variance, coefficient of variation (CV)** and **percentiles**.

$$\text{CV} = \frac{\text{Standard deviation}}{\text{Mean}} \times 100$$

CV helps in comparing variability among different variables **(AIIMS May'08)**.

A **percentile** is the value at or below which the stated percentage of the distribution lies. E.g. the 95th percentile is the value below which 95% of the values in the given distribution lie. **Thus the 50th percentile is also the median.**
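As a rough illustration, the dispersion measures listed above can be computed with Python's standard `statistics` module. The blood pressure values are made up for the example, and the sample (n − 1) forms of variance and standard deviation are an assumption, since the text does not say which form is meant.

```python
import statistics

# Hypothetical sample: systolic blood pressure (mmHg) in 10 adults.
values = [110, 115, 118, 120, 122, 125, 128, 130, 135, 147]

value_range = max(values) - min(values)
mean = statistics.mean(values)
sd = statistics.stdev(values)            # sample standard deviation (n - 1)
variance = statistics.variance(values)   # sample variance, equal to sd squared
cv = sd / mean * 100                     # coefficient of variation, in percent

# quantiles(n=100) returns the 99 cut points P1..P99; index 49 is P50.
percentiles = statistics.quantiles(values, n=100, method="inclusive")
p50 = percentiles[49]
p95 = percentiles[94]

print(f"range={value_range}, mean={mean}, SD={sd:.2f}, CV={cv:.1f}%")
print(f"P50={p50}, median={statistics.median(values)}, P95={p95:.2f}")
```

Because CV is a unit-free percentage, it can be compared between variables measured on different scales (for example height in cm and weight in kg), which the standard deviation alone does not allow.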

**Standard normal distribution curve**

**For the standard normal curve:**

Mean ± 1 SD = 68.3% of values covered

Mean ± 2 SD = 95.4% of values covered

Mean ± 3 SD = 99.7% of values covered
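These coverage figures can be checked numerically. The sketch below uses the identity P(|Z| ≤ k) = erf(k/√2) for a normally distributed variable; the function name `coverage_within` is only illustrative.

```python
import math

def coverage_within(k: float) -> float:
    """Fraction of a normal distribution lying within mean ± k SD."""
    # For a normal variable, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Mean ± {k} SD covers {coverage_within(k) * 100:.1f}% of values")
```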

**Right-skewed curve**

Right skewed = Mean > Median > Mode

Outliers are larger than the rest of the values, so the tail extends to the right.

**Left-skewed curve**

Left skewed = Mode > Median > Mean

Outliers are smaller than the rest of the values, so the tail extends to the left.

**Skewness coefficient**

Pearson described two skewness coefficients, which measure the degree and direction of skewness (asymmetry) in the data.
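The two coefficients are commonly given as (mean − mode) / SD and 3 × (mean − median) / SD. A minimal sketch follows, assuming the sample standard deviation and using invented length-of-stay data; a positive value indicates right skew and a negative value left skew.

```python
import statistics

def pearson_skewness(values):
    """Return Pearson's first (mode) and second (median) skewness coefficients."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1)
    first = (mean - statistics.mode(values)) / sd
    second = 3 * (mean - statistics.median(values)) / sd
    return first, second

# Hypothetical right-skewed data: length of hospital stay in days.
stay_days = [2, 3, 3, 3, 4, 4, 5, 6, 9, 21]
first, second = pearson_skewness(stay_days)

# Here mean (6) > median (4) > mode (3), so both coefficients are positive.
print(f"first = {first:.2f}, second = {second:.2f}")
```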

**Probability & probability distributions (AIPG-09)**

- The probability of an event is a quantitative measure of the proportion of all possible, equally likely outcomes that are favourable to the event; it is denoted by p.
- Probabilities are usually expressed as decimal fractions, not as percentages, and must lie between zero (zero probability) and one (absolute certainty). The probability of an event cannot be negative.
- The probability of an event can be expressed as the ratio of the number of favourable outcomes to the number of possible outcomes.
- The probability of an event not occurring is equal to one minus the probability that it will occur; this is denoted by q.
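The ratio definition and the complement q = 1 − p can be sketched with exact fractions. The die example and the helper name `probability` are illustrative only.

```python
from fractions import Fraction

def probability(favourable: int, possible: int) -> Fraction:
    """Probability as favourable outcomes over all equally likely outcomes."""
    if possible <= 0 or not 0 <= favourable <= possible:
        raise ValueError("need possible > 0 and 0 <= favourable <= possible")
    return Fraction(favourable, possible)

# Illustrative event: rolling a 5 or a 6 with one fair six-sided die.
p = probability(2, 6)   # probability that the event occurs
q = 1 - p               # probability that the event does not occur

print(f"p = {p} ({float(p):.3f}), q = {q} ({float(q):.3f})")
```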

**Mutually Exclusive Events & the Addition Rule**

- Two events are said to be mutually exclusive when the occurrence of one precludes the occurrence of the other, e.g. male or female, pregnant or not, blood group A or O. The probability of mutually exclusive events is the probability that either one event occurs *or* the other event occurs, and it is found by adding the probabilities of the two events.
- Thus the probability of being either blood type A or blood type O is:

*P (O or A) = P (O) + P (A)*
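A small sketch of the addition rule for the blood group example. The frequencies below are hypothetical round numbers chosen for illustration, not population data.

```python
# Hypothetical blood group frequencies in a population (they sum to 1).
blood_group = {"O": 0.45, "A": 0.40, "B": 0.11, "AB": 0.04}

def prob_either(groups, freq=blood_group):
    """Addition rule: P(g1 or g2 or ...) for mutually exclusive groups."""
    # set() guards against counting the same group twice.
    return sum(freq[g] for g in set(groups))

p_o_or_a = prob_either(["O", "A"])
print(f"P(O or A) = {p_o_or_a:.2f}")
```

The simple sum is valid only because a person cannot have two blood groups at once; for events that can occur together, the overlap would have to be subtracted.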

**Independent Events & the Multiplicative Rule**

- Two different events are independent if the outcome or occurrence of one event has no effect on the outcome or occurrence of the second event. E.g. gender and blood type are independent; the sex of a person does not affect that person's blood type in any way.
- The probability of two independent events is the probability that both events occur, and it is found by multiplying the probabilities of the two events.
- Thus:

*P (male and blood type O) = P (male) × P (blood type O)*
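The multiplication can be sketched as below. Both probabilities are assumed values used only to show the arithmetic.

```python
# Hypothetical probabilities, chosen only for illustration.
p_male = 0.5
p_type_o = 0.45

# Multiplication rule for independent events: P(A and B) = P(A) * P(B).
p_male_and_o = p_male * p_type_o

print(f"P(male and blood type O) = {p_male_and_o:.3f}")
```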

**Binomial Distribution**

- The probability that a specific combination of mutually exclusive, independent events will occur can be determined using the binomial distribution.
- A binomial distribution is one in which there are only two possibilities, such as yes/no, male/female, healthy/sickly. A typical medical use of the binomial distribution is in genetic counseling.
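As a sketch of the genetic counseling use, the binomial formula P(k) = C(n, k) · p^k · (1 − p)^(n − k) gives the chance that exactly k of n children are affected. The scenario (two carrier parents of an autosomal recessive condition, so p = 0.25 per child) is an illustrative assumption, not taken from the text.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k events in n independent trials, each with probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative scenario: both parents are carriers of an autosomal
# recessive gene, so each child has a 1-in-4 chance of being affected.
n_children, p_affected = 3, 0.25

for k in range(n_children + 1):
    prob = binomial_pmf(k, n_children, p_affected)
    print(f"P({k} of {n_children} children affected) = {prob:.4f}")
```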

**Probability distributions**

- The values of a random variable can be summarized in a frequency distribution called a *probability distribution*.
- Three such distributions are important in medicine.
- **Binomial** and **Poisson** distributions are discrete probability distributions of a variable that counts how often an event with only two outcomes (yes or no) occurs. The associated random variable can take only integer values, 0, 1, 2, 3 … *n*.
- **Normal (Gaussian)** distribution is continuous, so the variable can take any value. The curve is smooth, symmetric and bell-shaped. The peak represents the mean, median and mode, all of which coincide. The variation is measured by the standard deviation. The area under the curve is equal to 1 (because it is a probability distribution). The area to the left of the mean is equal to the area to the right of the mean.
- We use the **standard normal distribution curve**, which has a mean of 0 and a standard deviation of 1. It is also called the *z-distribution*. 68.3% of all values lie between −1 and +1 SD around the mean. 95.4% of all values lie between −2 and +2 SD.

**Poisson's Distribution (AIIMS May'08)**

In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed period of time, if these events occur with a known average rate and independently of the time since the last event.

It can also be used for the number of events in other specified intervals such as distance, area or volume. The distribution was discovered by Siméon Denis Poisson. His work focused on certain random variables N that count, among other things, the number of discrete occurrences (sometimes called arrivals) that take place during a time interval of given length.
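For a known average rate λ per interval, the Poisson probability of exactly k events is λ^k · e^(−λ) / k!. A minimal sketch follows; the ward scenario and the rate of 2 admissions per night are invented for illustration.

```python
from math import exp, factorial

def poisson_pmf(k: int, rate: float) -> float:
    """P(exactly k events in an interval that averages `rate` events)."""
    return rate**k * exp(-rate) / factorial(k)

# Hypothetical: a ward averages 2 emergency admissions per night.
mean_admissions = 2.0

for k in range(5):
    print(f"P({k} admissions) = {poisson_pmf(k, mean_admissions):.4f}")

# Unlike the binomial, there is no upper limit on k, so the tail
# probability is obtained by subtraction.
p_five_or_more = 1 - sum(poisson_pmf(k, mean_admissions) for k in range(5))
print(f"P(5 or more admissions) = {p_five_or_more:.4f}")
```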

**Drawing inferences from data**

- The most common parameters of interest in health are the mean (μ) and the proportion (π).
- Their sample estimates tend to vary from sample to sample. If we take repeated samples from a target population, the sample means and proportions will have a Gaussian or non-Gaussian distribution, called the **sampling distribution**.
- The sample means will tend to vary around the population mean, and the standard deviation of this variation is called the **standard error of mean (SEM)**. Similarly we can get a **standard error of proportion**.
- The larger the standard error, the larger the variability and the less our confidence in the sample results.
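The idea of a sampling distribution can be illustrated by simulation: draw many samples from one population and look at how the sample means spread. The population (mean 120, SD 15), the sample size and the number of repeats below are all assumed values for the sketch.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: systolic BP with mean 120 and SD 15 (mmHg).
POP_MEAN, POP_SD = 120.0, 15.0
SAMPLE_SIZE, N_SAMPLES = 25, 2000

def one_sample_mean() -> float:
    """Draw one random sample and return its mean."""
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    return statistics.mean(sample)

sample_means = [one_sample_mean() for _ in range(N_SAMPLES)]

centre = statistics.mean(sample_means)     # should sit near the population mean
spread = statistics.stdev(sample_means)    # empirical standard error of the mean
expected_se = POP_SD / SAMPLE_SIZE ** 0.5  # 15 / sqrt(25) = 3.0

print(f"centre of sample means = {centre:.2f}")
print(f"SD of sample means = {spread:.2f}, SD / sqrt(n) = {expected_se:.2f}")
```

The sample means cluster around the population mean, and their standard deviation comes out close to the population SD divided by √n, which is what the standard error of the mean estimates.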

$$\text{SE (mean)} = \frac{\text{Sample standard deviation}}{\sqrt{n}}$$

where n = sample size

$$\text{SE (proportion)} = \sqrt{\frac{p(1-p)}{n}}$$

where p = proportion and n = sample size
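Both standard errors are short calculations. The sketch below uses hypothetical inputs and illustrative function names (`se_mean`, `se_proportion`).

```python
import math
import statistics

def se_mean(sample) -> float:
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def se_proportion(p: float, n: int) -> float:
    """Standard error of a proportion: sqrt(p(1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical measurements from 8 subjects.
measurements = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"SE (mean) = {se_mean(measurements):.3f}")

# Hypothetical survey: 20% prevalence observed at two sample sizes.
print(f"SE (proportion), n=100: {se_proportion(0.2, 100):.3f}")
print(f"SE (proportion), n=400: {se_proportion(0.2, 400):.3f}")
```

Quadrupling the sample size halves the standard error, because n enters the formula under a square root.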

**Confidence interval (CI):**

The sample mean and proportion are considered to be point estimates of the corresponding population parameters. When a mean (or any central value) is calculated from a random sample, a range (a, b) can be obtained that is likely to contain the true population parameter when repeated samples are taken. The degree of this assurance is called the level of confidence in that range. Thus a 95% CI of the mean is to be interpreted as: 95% of such confidence intervals will contain the true population mean (μ) if repeated samples are taken. In reality we take only one sample, so the 95% CI is read as a range that we are 95% confident contains the true mean.

**Null hypothesis (H₀):**

It states that there is no difference between the population and the sample parameter, or between two sample parameters. After a study on sample subjects, which serves as evidence, it is either rejected (if the evidence shows a difference) or not rejected (if it does not).

**Steps involved in hypothesis testing:**

- State the null and alternate hypotheses.
- Select the decision criterion α (level of significance).
- Establish the critical value of 't' associated with this criterion (using the degrees of freedom and the table of t scores).
- Draw a random sample from the population and calculate its mean.
- Calculate the standard deviation and the standard error of the mean.
- Calculate the value of 't' that corresponds to the mean of the sample.
- Compare the calculated value of 't' with the critical value selected above, and then reject or fail to reject the null hypothesis.
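The steps above can be sketched as a one-sample t-test, with the same standard error also giving a 95% confidence interval for the mean. The haemoglobin values and the hypothesised mean of 12.5 g/dL are invented, and the critical value 2.262 (two-tailed, α = 0.05, 9 degrees of freedom) is read from a standard t table rather than computed.

```python
import math
import statistics

# Hypothetical data: haemoglobin (g/dL) in a random sample of 10 patients.
sample = [11.2, 12.5, 10.8, 11.9, 12.1, 10.5, 11.4, 12.8, 11.0, 11.8]

mu_0 = 12.5         # H0: the population mean is 12.5 (assumed value)
alpha = 0.05        # level of significance
t_critical = 2.262  # t table: two-tailed, alpha = 0.05, df = n - 1 = 9

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)    # sample standard deviation
sem = sd / math.sqrt(n)          # standard error of the mean
t_value = (mean - mu_0) / sem    # t corresponding to the sample mean

reject_h0 = abs(t_value) > t_critical

# 95% CI for the mean, built from the same critical value and SEM.
ci_low = mean - t_critical * sem
ci_high = mean + t_critical * sem

print(f"mean = {mean:.2f}, SD = {sd:.3f}, SEM = {sem:.3f}")
print(f"t = {t_value:.2f}, critical t = ±{t_critical}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The two results agree by construction: H₀ is rejected at the 5% level exactly when the hypothesised mean falls outside the 95% CI.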