The quasi-variance (also written quasivariance), or unbiased variance, is a statistical measure of the dispersion of the data in a sample around the mean. The sample, in turn, consists of a series of data taken from a larger universe, called the population.
It is denoted in various ways; here \(s_c^2\) has been chosen, and the following formula is used to calculate it:
\[ s_c^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n-1} \]
Where:
-\(s_c^2\) = the quasi-variance of the sample (sample variance)
-\(x_i\) = each of the sample data
-\(n\) = number of observations
-\(\bar{X}\) = the sample mean
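As an illustration, the definition above can be sketched in Python. The function name quasi_variance and the small data set are made up for this example; they do not come from the original text.

```python
def quasi_variance(data):
    """Quasi-variance: sum of squared deviations from the mean, over n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

# Illustrative sample (hypothetical values, chosen only for easy arithmetic)
sample = [2, 4, 4, 4, 5, 5, 7, 9]
result = quasi_variance(sample)  # mean is 5, squared deviations add up to 32
print(result)                    # 32 / 7
```

Python's standard library performs the same calculation in statistics.variance, which also divides by n - 1.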
Since the unit of the quasi-variance is the square of the unit in which the sample data are measured, it is usually preferable to interpret results using the quasi-standard deviation, or sample standard deviation.
This is denoted \(s_c\) and is obtained by taking the square root of the quasi-variance:
\[ s_c = \sqrt{s_c^2} \]
The quasi-variance is similar to the variance \(s^2\), the only difference being that the denominator of the quasi-variance is \(n-1\), while that of the variance is \(n\). When \(n\) is very large, the two values tend to coincide.
When you know the value of the quasi-variance, you can immediately obtain the variance, since \(s^2 = \frac{n-1}{n}\, s_c^2\).
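Because the two measures share the same numerator, the variance follows from the quasi-variance through the factor (n - 1)/n. A minimal sketch of that conversion, checked against Python's statistics module, could look like this. The helper name variance_from_quasi and the sample values are assumptions made for illustration.

```python
import statistics

def variance_from_quasi(quasi_var, n):
    """Convert a quasi-variance (denominator n - 1) to a variance (denominator n)."""
    return quasi_var * (n - 1) / n

# Illustrative sample (hypothetical values)
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)

quasi_var = statistics.variance(sample)   # divides by n - 1
var_n = statistics.pvariance(sample)      # divides by n

converted = variance_from_quasi(quasi_var, n)
print(converted, var_n)  # the two numbers should agree up to rounding
```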
Examples of quasi-variance
Often you want to know the characteristics of any population: people, animals, plants and, in general, any type of object. But analyzing the entire population may not be an easy task, especially if the number of elements is very large.
Samples are therefore taken, in the hope that their behavior reflects that of the population, so that inferences can be made about it while saving resources. This is known as statistical inference.
Here are some examples in which the quasi-variance and the associated quasi-standard deviation serve as statistical indicators of how far the results obtained lie from the mean.
1.- The marketing director of a company that manufactures automotive batteries needs to estimate, in months, the average life of a battery.
To do this, he randomly selects a sample of 100 purchased batteries of that brand. The company keeps a record of buyers’ details and may interview them to find out how long the batteries last.
2.- The academic management of a university institution needs to estimate the enrollment of the following year, analyzing the number of students who are expected to pass the subjects they are currently studying.
For example, from each of the sections currently taking Physics I, the management can select a sample of students and analyze their performance in that course. In this way it can infer how many students will take Physics II in the next period.
3.- A group of astronomers focuses their attention on a part of the sky, where a certain number of stars with certain characteristics are observed: size, mass and temperature for example.
One wonders if stars in another similar region will have the same characteristics, even stars in other galaxies, such as the neighboring Magellanic Clouds or Andromeda.
Why divide by n-1?
In the quasi-variance, the sum is divided by \(n-1\) instead of \(n\) because this makes the quasi-variance an unbiased estimator, as mentioned at the beginning.
It happens that it is possible to extract many samples from the same population. The variance of each of these samples can also be averaged, but the average of these variances does not turn out to be equal to the variance of the population.
In fact, the mean of the sample variances tends to underestimate the population variance, unless \(n-1\) is used in the denominator. It can be verified that the expected value of the quasi-variance, \(E(s_c^2)\), is precisely the population variance, written here as \(\sigma^2\) to distinguish it from the sample variance \(s^2\).
For this reason, the quasi-variance is said to be unbiased and is a better estimator of the population variance \(\sigma^2\).
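The underestimation described above can be observed with a small simulation. The sketch below is only an illustration under stated assumptions: it draws many samples from a normal population whose variance is known (standard deviation 2, so variance 4), and the sample size, number of trials and seed are arbitrary choices, not values from the original text.

```python
import random

def average_estimates(pop_sd, n, trials, seed=0):
    """Average the n-denominator and (n - 1)-denominator estimates over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_div_n = 0.0
    total_div_n1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, pop_sd) for _ in range(n)]
        mean = sum(sample) / n
        squared_dev = sum((x - mean) ** 2 for x in sample)
        total_div_n += squared_dev / n          # variance of the sample
        total_div_n1 += squared_dev / (n - 1)   # quasi-variance
    return total_div_n / trials, total_div_n1 / trials

avg_div_n, avg_div_n1 = average_estimates(pop_sd=2.0, n=5, trials=20000)
print(avg_div_n)   # tends to fall below the population variance of 4
print(avg_div_n1)  # tends to land close to 4
```

With samples of size 5, the average of the n-denominator estimates should settle near (n - 1)/n times the population variance, while the quasi-variance average should settle near the population variance itself.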
Alternative way to calculate quasivariance
It is easily shown that the quasi-variance can also be calculated as follows:
\[ s_c^2 = \frac{\sum_{i=1}^{n} x_i^2}{n-1} - \frac{n\bar{X}^2}{n-1} \]
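One way to see this is the short derivation below, which uses only the definition of the sample mean, \(\bar{X} = \frac{1}{n}\sum x_i\), so that \(\sum x_i = n\bar{X}\):

\[
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{X})^2 &= \sum_{i=1}^{n} x_i^2 - 2\bar{X}\sum_{i=1}^{n} x_i + n\bar{X}^2 \\
&= \sum_{i=1}^{n} x_i^2 - 2n\bar{X}^2 + n\bar{X}^2 \\
&= \sum_{i=1}^{n} x_i^2 - n\bar{X}^2
\end{aligned}
\]

Dividing both sides by \(n-1\) gives \(s_c^2 = \frac{\sum x_i^2}{n-1} - \frac{n\bar{X}^2}{n-1}\). Note that the term \(n\bar{X}^2\) is a single product, not a sum over the data.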
The standard score
With the sample standard deviation, we can tell how many standard deviations a particular value \(x\) lies above or below the mean.
For this, the following dimensionless expression is used:
\[ \text{Standard score} = \frac{x - \bar{X}}{s_c} \]
Calculate the quasi-variance and quasi-standard deviation of the following data, which consist of monthly payments, in dollars, made by an insurance company to a private clinic.
863 903 957 1041 1138 1204 1354 1624 1698 1745 1802 1883
a) Use the definition of quasi-variance given at the beginning and also check the result using the alternative form given in the previous section.
b) Calculate the standard score of the second value in the list.
The problem can be solved by hand with a simple or scientific calculator, provided the work is done in order. The best way is to organize the data in a table with one column each for \(x\), \(\bar{X}\), \(x - \bar{X}\), \((x - \bar{X})^2\) and \(x^2\), and a final row holding the column sums.
With such a table, the quantities needed in the formulas sit at the bottom of their columns, ready to use: here \(\bar{X} = 1351\), \(\sum (x - \bar{X})^2 = 1{,}593{,}770\) and \(\sum x^2 = 23{,}496{,}182\).
The mean column repeats the same value in every row, but it is worth it because having the value in view makes it easier to fill in each row.
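For readers who prefer to build the working table programmatically, the following sketch prints one from the exercise data. The column layout is an assumption based on the quantities the formulas need; the variable names are illustrative.

```python
# Monthly payments from the exercise, in dollars
payments = [863, 903, 957, 1041, 1138, 1204,
            1354, 1624, 1698, 1745, 1802, 1883]

n = len(payments)
mean = sum(payments) / n

# One row per value: x, mean, deviation, squared deviation, x squared
rows = [(x, mean, x - mean, (x - mean) ** 2, x ** 2) for x in payments]
sum_sq_dev = sum(row[3] for row in rows)
sum_sq = sum(row[4] for row in rows)

print(f"{'x':>6} {'mean':>8} {'x-mean':>8} {'(x-mean)^2':>12} {'x^2':>10}")
for x, m, dev, dev2, x2 in rows:
    print(f"{x:>6} {m:>8.0f} {dev:>8.0f} {dev2:>12.0f} {x2:>10}")
# Final row with the two column sums used by the formulas
print(f"{'sum':>6} {'':>8} {'':>8} {sum_sq_dev:>12.0f} {sum_sq:>10}")
```

The last row carries the two sums that the definition and the alternative form of the quasi-variance require.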
Finally, the equation for the quasi-variance given at the beginning is applied; only the values need to be substituted, since the summation has already been calculated:
\[ s_c^2 = \frac{1{,}593{,}770}{12-1} = \frac{1{,}593{,}770}{11} = 144{,}888.2 \]
This is the value of the quasi-variance, and its units are “dollars squared”, which makes little practical sense. The quasi-standard deviation of the sample is therefore calculated, which is simply the square root of the quasi-variance:
\[ s_c = \sqrt{144{,}888.2} = \$\,380.64 \]
It is immediately confirmed that this value is also obtained with the alternative form of the quasi-variance. The sum needed, \(\sum x_i^2\), is at the bottom of the last column of the table:
\[ s_c^2 = \frac{\sum x_i^2}{n-1} - \frac{n\bar{X}^2}{n-1} = \frac{23{,}496{,}182}{11} - \frac{12 \times 1351^2}{11} = 2{,}136{,}016.55 - 1{,}991{,}128.36 \approx 144{,}888 \ \text{dollars squared} \]
It is the same value obtained with the formula given at the beginning.
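The agreement between the two forms can also be checked numerically. The sketch below recomputes part a) from the exercise data; the variable names are illustrative, and statistics.variance from Python's standard library is used only as an independent cross-check.

```python
import math
import statistics

# Monthly payments from the exercise, in dollars
payments = [863, 903, 957, 1041, 1138, 1204,
            1354, 1624, 1698, 1745, 1802, 1883]

n = len(payments)
mean = sum(payments) / n

# Definition: squared deviations from the mean, divided by n - 1
by_definition = sum((x - mean) ** 2 for x in payments) / (n - 1)

# Alternative form: (sum of squares - n * mean^2) / (n - 1)
by_alternative = (sum(x * x for x in payments) - n * mean ** 2) / (n - 1)

quasi_sd = math.sqrt(by_definition)

print(round(by_definition, 1), round(by_alternative, 1))
print(round(quasi_sd, 2))
print(statistics.variance(payments))  # library value, for comparison
```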
The second value in the list is 903; its standard score is:
\[ \text{Standard score of 903} = \frac{x - \bar{X}}{s_c} = \frac{903 - 1351}{380.64} = -1.177 \]
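As a check on part b), the standard score can be computed with a small helper. The function name standard_score is hypothetical; it relies on statistics.mean and statistics.stdev, the latter being the square root of the n - 1 variance, that is, the quasi-standard deviation.

```python
import statistics

# Monthly payments from the exercise, in dollars
payments = [863, 903, 957, 1041, 1138, 1204,
            1354, 1624, 1698, 1745, 1802, 1883]

def standard_score(x, data):
    """Number of quasi-standard deviations that x lies from the sample mean."""
    return (x - statistics.mean(data)) / statistics.stdev(data)

z = standard_score(payments[1], payments)  # payments[1] is 903
print(round(z, 3))
```

A negative score means the value lies below the sample mean, as is the case for 903.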