Analysis of Variance (ANOVA): what it is and how it is used in statistics

In statistics, when the means of two or more samples are compared in relation to some variable of interest (for example, anxiety after psychological treatment), tests are used to determine whether or not there are significant differences between the means.

One of them is the Analysis of Variance (ANOVA). In this article we will see what this parametric test consists of and what assumptions must be met in order to use it.

Analysis of Variance (ANOVA): what is it?

In statistics, analysis of variance (ANOVA) is a collection of statistical models and their associated procedures in which the variance is partitioned into components attributable to different explanatory variables. The acronym ANOVA comes from the English: ANalysis Of VAriance.

Analysis of Variance (ANOVA) is a type of parametric test. This means that a number of assumptions must be met in order to apply it, and that the variable of interest must be quantitative, i.e. measured on at least an interval scale (e.g. IQ, where the zero point is arbitrary rather than absolute).

Analysis of Variance techniques

The first Analysis of Variance techniques were developed in the 1920s and 1930s by R.A. Fisher, a statistician and geneticist. This is why analysis of variance (ANOVA) is also known as “Fisher’s Anova” or “Fisher’s analysis of variance”; the name also reflects the use of Fisher’s F distribution (a probability distribution) in the hypothesis test.

The analysis of variance (ANOVA) arises from the concepts of linear regression. Linear regression, in statistics, is a mathematical model used to approximate the dependence of a dependent variable Y (e.g. anxiety) on the independent variables Xi (e.g. different treatments) and a random term.
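
In a common notation, the model just described can be written as follows. The coefficient symbols \beta and the error symbol \varepsilon are assumptions of this sketch and do not come from the text above; Y and X_i are the variables already named.

```latex
Y = \beta_0 + \sum_{i=1}^{p} \beta_i X_i + \varepsilon
```

In an ANOVA setting, the X_i are typically indicator (0/1) variables that code which treatment each subject received, so the coefficients describe differences between group means.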

Function of this parametric test

Thus, an analysis of variance (ANOVA) serves to determine whether different treatments (e.g., psychological treatments) show significant differences, or whether, on the contrary, it can be concluded that their population means do not differ (they are practically the same, or their difference is not significant).

That is, ANOVA is used to test hypotheses about differences in means (typically among more than two groups; with only two, a t-test is usually used instead). The ANOVA involves a decomposition of the total variability, which can be attributed mainly to two sources of variation:

Intergroup variability (differences between the group means)
Intragroup variability or error (differences among the scores within each group)
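
The decomposition into these two sources can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `one_way_anova` and the anxiety scores are made up for the example. It computes the F ratio but not its p-value, which requires the F distribution.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, f_ratio) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)            # number of groups
    n_total = len(all_scores)  # total number of scores

    # Intergroup variability: group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Intragroup variability (error): scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)       # mean square between groups
    ms_within = ss_within / (n_total - k)   # mean square within groups
    return ss_between, ss_within, ms_between / ms_within

# Hypothetical anxiety scores under three treatments (made-up data)
treatment_a = [4, 5, 6]
treatment_b = [6, 7, 8]
treatment_c = [8, 9, 10]
ss_b, ss_w, f_ratio = one_way_anova([treatment_a, treatment_b, treatment_c])
print(ss_b, ss_w, f_ratio)
```

The larger the F ratio, the more the variability between groups exceeds what the variability within groups would lead us to expect.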

Types of ANOVA

There are two types of analysis of variance (ANOVA):

1. Anova I

Also known as one-way ANOVA, it is used when there is only one classification criterion (independent variable; for example, type of therapeutic technique). It can be either inter-group (there are several experimental groups) or intra-group (there is only one experimental group, measured under every condition).

2. Anova II

In this case, there is more than one classification criterion (independent variable). As in the previous case, it can be either inter-group or intra-group.
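
With two classification criteria, the total variability splits into one part per factor, their interaction, and error. The following is a minimal sketch for the inter-group case, assuming a balanced design (the same number of scores in every cell). The function name `two_way_sums_of_squares`, the factor labels A and B, and the data are hypothetical.

```python
from statistics import mean

def two_way_sums_of_squares(cells):
    """cells[i][j] holds the scores for level i of factor A and level j of factor B.

    Assumes a balanced design: every cell contains the same number of scores.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean(x for row in cells for cell in row for x in cell)
    a_means = [mean(x for cell in row for x in cell) for row in cells]
    b_means = [mean(x for row in cells for x in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    # Main effect of each factor: marginal means around the grand mean
    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    # Interaction: what the cell means show beyond the two main effects
    ss_ab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    # Error: scores around their own cell mean
    ss_error = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    return {"A": ss_a, "B": ss_b, "AxB": ss_ab, "error": ss_error}

# Made-up data: 2 therapies (factor A) x 2 session formats (factor B), 2 subjects per cell
cells = [[[1, 3], [5, 7]],
         [[3, 5], [11, 13]]]
ss = two_way_sums_of_squares(cells)
print(ss)
```

Each sum of squares divided by its degrees of freedom gives a mean square, and each effect's mean square divided by the error mean square gives the corresponding F ratio.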

Characteristics and assumptions

When analysis of variance (ANOVA) is applied in experimental studies, each group consists of a certain number of subjects, and the groups may differ in this number. When every group has the same number of subjects, we speak of a balanced model.

In statistics, in order to apply the analysis of variance (ANOVA), a number of assumptions must be met:

1. Normality

This means that scores on the dependent variable (e.g. anxiety) should follow a normal distribution within each group. This assumption is checked by means of so-called goodness-of-fit tests.
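
As one illustration of a goodness-of-fit check, the sketch below computes the Kolmogorov-Smirnov distance between the scores and a normal distribution fitted to them. The function name `ks_statistic_normal` and the data are hypothetical. Only the statistic is computed: because the mean and standard deviation are estimated from the same sample, the standard Kolmogorov-Smirnov critical values do not apply directly, and a corrected table (Lilliefors) or a dedicated test such as Shapiro-Wilk would be needed for a formal decision.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic_normal(scores):
    """Largest gap between the empirical CDF and a normal CDF fitted to the scores."""
    xs = sorted(scores)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))  # normal curve with the sample mean and SD
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Made-up anxiety scores for one treatment group
anxiety_scores = [12, 15, 14, 10, 13, 16, 11, 14, 13, 15]
d_statistic = ks_statistic_normal(anxiety_scores)
print(d_statistic)
```

Values of the statistic close to 0 indicate that the scores follow the fitted normal curve closely; larger values indicate a poorer fit.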

2. Independence

It implies that there is no autocorrelation between the scores, i.e. that the scores are independent of each other. To help satisfy this assumption, the sample to be studied should be selected by simple random sampling (SRS).
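
A simple random sample can be drawn with Python's standard library. This is a minimal sketch; the function name `simple_random_sample`, the sampling frame of 100 participant IDs, and the seed are hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)

participant_ids = list(range(1, 101))  # hypothetical sampling frame of 100 people
selected = simple_random_sample(participant_ids, 10, seed=42)
print(selected)
```

Sampling without replacement, as `sample` does, ensures that no participant is selected twice.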

3. Homoscedasticity

This term means “equality of variances of subpopulations”. Variance is a measure of variability or dispersion: the more spread out the scores, the larger it is.

The assumption of homoscedasticity is checked with the Levene Test or the Bartlett Test. If the assumption is not met, one option is to apply a logarithmic transformation to the scores.
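
The idea behind Levene's test can be sketched as a one-way ANOVA applied to the absolute deviations of each score from its group mean. The code below is a minimal illustration with hypothetical names (`levene_w`, `log_transform`) and made-up data. It computes only the W statistic; the p-value comes from an F distribution with k - 1 and N - k degrees of freedom, which is not in the standard library. Note that this is the mean-centered version of the test, whereas some libraries (for example SciPy's `scipy.stats.levene`) center on the median by default.

```python
from math import log
from statistics import mean

def levene_w(groups):
    """Levene's W: a one-way ANOVA F ratio on absolute deviations from each group mean."""
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    all_dev = [z for d in deviations for z in d]
    grand = mean(all_dev)
    k, n_total = len(groups), len(all_dev)
    between = sum(len(d) * (mean(d) - grand) ** 2 for d in deviations)
    within = sum((z - mean(d)) ** 2 for d in deviations for z in d)
    return (n_total - k) / (k - 1) * between / within

def log_transform(groups):
    """Natural-log transform; only valid for strictly positive scores."""
    return [[log(x) for x in g] for g in groups]

# Made-up scores whose spread grows in proportion to the group mean
raw_groups = [[10, 12, 14, 16], [100, 120, 140, 160]]
w_raw = levene_w(raw_groups)
w_log = levene_w(log_transform(raw_groups))
print(w_raw, w_log)
```

In this constructed example the spread is proportional to the mean, which is the situation where a logarithmic transformation tends to equalize the variances, so W drops to essentially zero after the transformation.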

Other assumptions

The above assumptions must be met when using an inter-group analysis of variance (ANOVA). However, when an intra-group ANOVA is used, two further assumptions must be fulfilled in addition to those above:

1. Sphericity

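Sphericity means that the variances of the differences between every pair of treatment conditions are equal. If this does not hold, it indicates that the different sources of error correlate with each other. A possible solution in that case is to perform a MANOVA (Multivariate Analysis of Variance).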
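
Sphericity is commonly defined as equal variances of the difference scores between every pair of conditions, and that definition can be inspected directly. The sketch below uses a hypothetical function name (`difference_variances`) and made-up data; it only computes the variances and is not a formal test. Mauchly's test is the usual formal procedure and is not implemented here.

```python
from itertools import combinations
from statistics import variance

def difference_variances(scores_by_condition):
    """Sample variance of the difference scores for every pair of conditions.

    scores_by_condition maps a condition name to its scores, with every list
    in the same subject order.
    """
    result = {}
    for (name_a, a), (name_b, b) in combinations(scores_by_condition.items(), 2):
        diffs = [x - y for x, y in zip(a, b)]  # one difference per subject
        result[(name_a, name_b)] = variance(diffs)
    return result

# Made-up anxiety scores for the same four subjects at three time points
scores = {
    "pre": [10, 12, 14, 16],
    "post": [8, 11, 11, 14],
    "follow_up": [7, 9, 12, 12],
}
pair_variances = difference_variances(scores)
print(pair_variances)
```

Under sphericity these variances should be roughly equal; markedly different values suggest the assumption is doubtful.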

2. Additivity

This assumes that there is no subject x treatment interaction; if such an interaction were present, it would increase the error variance.
