A statistician's toolbox
Vocabulary
- census: recensement
- average: moyenne
- mean: moyenne
- median: médiane
- standard deviation: écart-type
- variance: variance
- box plot: boîte à moustaches
- scatter plot: nuage de points
Formulas
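Let $x_1, \dots, x_n$ be real numbers.
- The mean is often denoted by $\bar{x}$ and is given by $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$.
- The variance measures the dispersion of the values $x_1, \dots, x_n$ around their mean. It is denoted by the letter $V$ and is given by $V = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$.
- The standard deviation is denoted by the Greek letter $\sigma$ and is given by $\sigma = \sqrt{V}$.
Let $y_1, \dots, y_n$ be real numbers. We now have two samples of data, $x_1, \dots, x_n$ and $y_1, \dots, y_n$.
- We can compute their covariance $\operatorname{cov}(x, y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$.
If the respective standard deviations of the values $x_i$ and $y_i$ are denoted by $\sigma_x$ and $\sigma_y$, we then obtain their
- Pearson correlation coefficient, denoted by $r$, and given by $r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$.
We always have $-1 \le r \le 1$. If the values $x_i$ and $y_i$ are mostly independent, we obtain $r \approx 0$, whereas a value $|r| \approx 1$ means that the values $x_i$ and $y_i$ are strongly (linearly) correlated.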
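As a quick check of these three definitions, here is a minimal Python sketch. It assumes the population convention (division by n rather than n - 1); the function names and the sample values are illustrative only.

```python
import math


def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)


def variance(values):
    """Population variance: average squared deviation from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def std_dev(values):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(values))


# Illustrative sample, chosen so that the results are round numbers.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(sample), variance(sample), std_dev(sample))  # prints 5.0 4.0 2.0
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev`, which use the same division by n; `statistics.variance` and `statistics.stdev` divide by n - 1 instead.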
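The covariance and the Pearson coefficient can be sketched the same way. This is one possible implementation, again assuming division by n; the helper names and the three small samples are made up for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance of two samples of equal length."""
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same length")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def pearson(xs, ys):
    """Pearson coefficient: covariance divided by both standard deviations."""
    sigma_x = math.sqrt(covariance(xs, xs))  # cov(x, x) is the variance of x
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]   # exactly 2 * xs: coefficient very close to 1
zs = [5, 4, 3, 2, 1]    # decreases as xs increases: coefficient very close to -1
print(pearson(xs, ys))
print(pearson(xs, zs))
```

Because of floating-point rounding, the printed values may differ from 1 and -1 in the last decimal places. The function is undefined when one sample is constant, since its standard deviation is then 0.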
Application
Three students, Alice, Beatriz, and Charles, are in the same class. Out of guilt, they admitted that there was cheating during their year together. By analyzing their marks, can you work out who cheated?
Alice | Beatriz | Charles |
---|---|---|
12 | 2 | 15 |
15 | 12 | 12 |
3 | 12 | 6 |
12 | 6 | 16 |
3 | 5 | 6 |
18 | 17 | 18 |
8 | 5 | 9 |
8 | 16 | 11 |
4 | 9 | 0 |
7 | 13 | 14 |
9 | 13 | 12 |
8 | 14 | 2 |
12 | 10 | 14 |
17 | 17 | 12 |
16 | 9 | 9 |
14 | 5 | 12 |
9 | 9 | 16 |
16 | 10 | 9 |
16 | 9 | 8 |
1 | 15 | 7 |
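One way to approach the exercise, sketched here in Python, is to compute the Pearson correlation coefficient for each pair of students. The marks are copied from the table above; the helper names are illustrative, and the population convention (division by n) is assumed.

```python
import math
from itertools import combinations

# Marks copied column by column from the table.
marks = {
    "Alice":   [12, 15, 3, 12, 3, 18, 8, 8, 4, 7, 9, 8, 12, 17, 16, 14, 9, 16, 16, 1],
    "Beatriz": [2, 12, 12, 6, 5, 17, 5, 16, 9, 13, 13, 14, 10, 17, 9, 5, 9, 10, 9, 15],
    "Charles": [15, 12, 6, 16, 6, 18, 9, 11, 0, 14, 12, 2, 14, 12, 9, 12, 16, 9, 8, 7],
}


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def pearson(xs, ys):
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Means and standard deviations alone: do they tell the students apart?
for name, values in marks.items():
    sigma = math.sqrt(covariance(values, values))
    print(f"{name}: mean = {mean(values):.2f}, standard deviation = {sigma:.2f}")

# Pearson coefficient for every pair of students.
correlations = {
    (a, b): pearson(marks[a], marks[b]) for a, b in combinations(marks, 2)
}
for (a, b), r in correlations.items():
    print(f"r({a}, {b}) = {r:.3f}")
```

A pair whose coefficient stands well away from 0 while the others stay close to 0 is the natural place to look first, although a correlation on its own does not prove that anyone copied.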
Correlations
Sometimes two things can be correlated (have a Pearson correlation coefficient $|r|$ close to 1) just by coincidence.
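A small simulation can illustrate how this happens. The sketch below draws many pairs of short, independent random samples and keeps the largest absolute coefficient found. The sample size, number of pairs, and seed are arbitrary choices made for illustration.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def pearson(xs, ys):
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


random.seed(0)    # fixed seed so that the run is reproducible
n_values = 5      # very short samples make chance correlations likely
n_pairs = 1000    # number of independent pairs examined

best = 0.0
for _ in range(n_pairs):
    # xs and ys are drawn independently: any correlation is pure chance.
    xs = [random.random() for _ in range(n_values)]
    ys = [random.random() for _ in range(n_values)]
    best = max(best, abs(pearson(xs, ys)))

print(f"largest |r| among {n_pairs} unrelated pairs: {best:.3f}")
```

With samples this short, some pair among the thousand typically shows a coefficient close to 1 in absolute value even though the data are unrelated by construction. Searching through many pairs of quantities for a high coefficient therefore tends to turn up coincidences.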