[STAT article] Two-Way ANOVA: An Essential Tool for Understanding Factorial Experiments
A factorial experiment involves the simultaneous manipulation of multiple factors or independent variables (x) to study their effects on a dependent variable (y). The experiment is called factorial because it involves testing multiple factors simultaneously. In factorial experiments, the combination of the different levels of each factor being tested is called a factorial, and each factorial represents a unique combination of these levels. For instance, N0_Genotyp1, N0_Genotyp2, N1_Genotyp1, N1_Genotyp2, etc. are different factorials used to conduct the experiment and analyze its results.
Two-way ANOVA, on the other hand, is a statistical technique used to analyze the effects of two independent variables on a dependent variable. It is called “two-way” because it involves two factors (i.e., independent variables).
In summary, factorial experiment involves manipulating multiple factors simultaneously, while two-way ANOVA is used to analyze the effects of two independent variables on a dependent variable.
For example, let’s say we want to see how the yield changes with four different nitrogen fertilizers (N0, N1, N2, N3) for two different cultivars (Cultivar 1, Cultivar 2). The experiment is designed with three blocks, and each block is repeated for accuracy.
Cultivar=rep(c("CV1","CV2"),each=12)
Nitrogen=rep(rep(c("N0","N1","N2","N3"), each=3),2)
Block=rep(c("I","II","III"),8)
Yield=c (99, 109, 89, 115, 142, 133, 121, 157, 142, 125, 150, 139, 82, 104, 99, 117, 125, 127, 145, 154, 154, 151, 166, 175)
dataA=data.frame(Cultivar,Nitrogen,Block,Yield)
Cultivar Nitrogen Block Yield
1 CV1 N0 I 99
2 CV1 N0 II 109
3 CV1 N0 III 89
4 CV1 N1 I 115
5 CV1 N1 II 142
6 CV1 N1 III 133
7 CV1 N2 I 121
8 CV1 N2 II 157
9 CV1 N2 III 142
10 CV1 N3 I 125
11 CV1 N3 II 150
12 CV1 N3 III 139
13 CV2 N0 I 82
14 CV2 N0 II 104
15 CV2 N0 III 99
16 CV2 N1 I 117
17 CV2 N1 II 125
18 CV2 N1 III 127
19 CV2 N2 I 145
20 CV2 N2 II 154
21 CV2 N2 III 154
22 CV2 N3 I 151
23 CV2 N3 II 166
24 CV2 N3 III 175
Let’s summarize the basic concepts of statistics.
□ What are experimental factors and what are the levels of each experimental factor? Factor 1 = Cultivar (2 levels: Cultivar 1, Cultivar 2) Factor 2 = Nitrogen (4 levels: N0, N1, N2, N3) □ How many treatments are there? Treatment = 2 levels x 4 levels = 8 treatments (Number of experimental combinations) □ How many experimental units are there? Experimental unit = 2 levels x 4 levels x 3 blocks = 24 experimental units (Number of observations) □ What experimental design should be chosen? Since the experiment has been designed with blocks as repeats, a Randomized Complete Block Design (RCBD) should be used as the experimental design. □ What statistical analysis method should be used? As this is an experiment to investigate whether two different factors affect the yield, a Two-way ANOVA should be performed.
From now on, I want to find out if there is a difference in yield based on cultivar and nitrogen fertilizer. As always, using a statistical program will give us results in 10 seconds. However, what’s important is understanding the principles behind it. Let’s take a step-by-step approach to understand how a statistical program produces results.
First, let’s look at the mathematical model for this data analysis
yijk = μ + αi + βj + δij + γk + εijk where yijk: observed values for treatment (ij; i= factor 1, j= factor 2) and replicates (k) μ: grand mean of observed values αi: treatment effect of factor 1 βj: treatment effect of factor 2 δij: interaction effect between factor 1 (i) and factor 2 (j) γk: block effect εijk: residuals
Although this equation may seem difficult, it is actually just simple addition and subtraction. It can be rewritten using the following calculation formula.
Let’s break down this formula step by step using Excel.
(1) Grand mean (ȳ…
) The average of the entire harvest yield value, which is 130.0 in this case. (2) Cultivar mean (ȳi
..) The average harvest yield for each cultivar. It is 126.75 for CV1 and 133.25 for CV2. (3) Nitrogen mean (ȳ.j.
) The average harvest yield for each nitrogen treatment. It is 97.0 for N0, 126.5 for N1, 145.5 for N2, and 151.0 for N3. (4) Interaction mean (ȳij.
) The average harvest yield value when cultivar and nitrogen treatment are combined. For example, the average yield value for CV1N0 combination is 99.0. (5) Block mean (ȳ..k
) The average yield value for each block. It is 119.4 for Block 1, 138.4 for Block 2, and 132.3 for Block 3. (6) Cultivar effect (ȳi..– ȳ…
) = αi The cultivar effect is essentially a comparison between the average yield of each cultivar and the overall average. The average yield for CV1 is 126.75, and for cultivar 2 it is 133.25. Therefore, the effect of CV1 is -3.25 (=126.75 - 130.0), and the effect of cultivar 2 is 3.25 (=133.25 - 130.0). (7) Nitrogen effect (ȳ
.j.– ȳ…
) = βj Similar to the cultivar effect, the nitrogen effect compares the average yield of each nitrogen treatment with the overall average. (8) Interaction effect (ȳij. – ȳi.. – ȳ.j. + ȳ…
) = δij The interaction effect measures the difference between the average yield value of a cultivar-nitrogen combination and the average yield values of the corresponding cultivar and nitrogen treatments. For example, for the CV1N0 combination, the interaction effect is 5.25 (=99.0 - 126.75 - 97.0 + 130.0). (9) Block effect (ȳ..k – ȳ…
) = γk The block effect measures the difference between the average yield of each block and the overall average. (10) Residual (ȳijk – ȳij.– ȳ..k + ȳ…
) = εijk The residual, or error, value for each yield value is calculated by subtracting the corresponding interaction and block averages and then adding the overall average. For example, the error value for the yield value of CV1 and N1 in the first block is 10.63 (=99 - 99 - 119.4 + 130). (11) Total (ȳijk – ȳ…
) Although it is labeled as "Total," it would be more accurate to describe it as the variation in the entire dataset. It measures how much each yield value deviates from the mean. In other words, it is the difference between each yield value and the overall mean.
Each harvest value is partitioned by the above calculation. That is, the harvest value of 99 in the CV1, N1 of the first block is partitioned as follows: 99 = 130 + (126.75 – 130) + (97.0 – 130) + (99 – 126.75 – 97 + 130) + (119.4 – 130) + (99 – 99- 119.4 + 130)
Therefore, yijk = μ + αi + βj + δij + γk + εijk
is: 99 = 130 + (-3.25) + (-33.0) + 5.25 + (-10.6) + 10.63.
Now let’s calculate the sum of squares for these partitioned values. You may wonder why we need to calculate the sum of squares (SS), but actually this is a fundamental statistical knowledge.
Note!!
Let's say we have observation values of 4, 7, 6, and 3. The mean of these values is 5. We want to see how far our data is from the mean, so we subtracted the mean from each observation value. This is called deviation.
4 - 5 = -1
7 - 5 = 2
6 - 5 = 1
3 - 5 = -2
We added these four values to see how far the entire data is from the mean, and we got 0. The sum of deviations is always 0. To avoid the sum of deviations being 0, we squared the deviations and added them up. This is called the sum of squares.
(-1)2 + (2)2 + (1)2 + (-2)2 = 10
We can assume that our data is about 10 away from the mean. This is because we squared and added the deviations. Therefore, it is called the sum of squares.
While we're at it, let's look at variance and standard deviation.
We call the value obtained by dividing the sum of squares by n-1 variance. In this case, 10 / (4 - 1) = 3.33 is the variance. When discussing the distribution of data, we first talk about variance. In other words, our data of 4, 7, 6, and 3 are spread out by about 3.33 from the mean. Let's say our data values were 5, 5, 5, 5. The variance would be 0. Is a high or low variance better? It depends on the nature of the data. In the case of data that emphasizes uniformity, a large variance would mean producing defective products in a factory that produces standardized products. In the case of data that emphasizes diversity, a large variance would mean that there is a higher level of diversity.
The standard deviation is the value obtained by taking the square root (√) of the variance. Therefore, we represent the variance as s2 and the standard deviation as s. The standard deviation is also represented by σ. In the case above, the standard deviation would be 1.83. This means that our data is spread out by about 1.83. Initially, to avoid the sum of deviations being 0, we squared the deviations to find the variance. Therefore, the standard deviation is a process of adjusting the increased value.
Let’s get back to the main topic and check the SS of our data.
We obtained the same result as above. You can easily calculate it in Excel by using =sumsq()
function. From the above result, we can discover an interesting fact.
Sum of Squares (Total) = Sum of Squares (Treatments) + Sum of Squares (Blocks) + Sum of Squares (Error) 14204 = 11988 (=253.5+ 10695 + 1039.5) + 1504.75 + 711.25
Now, let’s create an ANOVA table with this result.
This is the two-way ANOVA with blocks table for yijk = μ + αi + βj + δij + γk + εijk
Source of variation | df | Sum of squares | Mean squares | F-ratio |
Cultivar | a-1 | SSA | MSA = SSA / (a-1) | MSA / MSE |
Nitrogen | b-1 | SSB | MSB = SSB / (b-1) | MSB / MSE |
Cultivar*Nitrogen | (a-1)*(b-1) | SSAB | MSAB = SSAB / (a-1)*(b-1) | MSAB / MSE |
Block | r-1 | SSBk | MSBk = SSBk / (r-1) | MSBk / MSE |
Error | N-ab-r+1 | SSE | MSE = SSE / ab(r-1) | |
Total | abr-1 | SST |
Number of levels for cultivar (a) and nitrogen (b) are 2 and 4 respectively. The number of levels for block (r) is 3. We have already calculated the sum of squares for each source of variation, and now we need to match them accordingly.
Source of variation | df | Sum of squares | Mean squares | F-ratio |
Cultivar | 1 | 253.50 | 253.5 = 253.50 / 1 | 4.99 = 253.5 / 50.80 |
Nitrogen | 3 | 10695.0 | 3565.0 = 10695.0 / 3 | 70.18 = 3565.0 / 50.80 |
Cultivar*Nitrogen | 3 | 1039.50 | 346.5 = 1039.50 / 3 | 6.82 = 346.5 / 50.80 |
Block | 2 | 1504.75 | 752.375 = 1504.75 / 2 | 14.81 = 752.375 / 50.80 |
Error | 14 | 711.25 | 50.8 = 711.25 / 14 | |
Total | 23 | 14204.00 |
Then, let’s check the p-value. In Excel, we can check the p-value using the FDIST()
function. We can also check the p-value visually using PQRS.
=FDIST(4.99,1,14) # 0.0423236
=FDIST(70.18,3,14) # 0.0000000
=FDIST(6.82,3,14) # 0.0046055
=FDIST(14.81,2,14) # 0.0003508
Let’s check the p-value visually using PQRS. To use PQRS, go to https://pqrs.software.informer.com/ and download the program. Then, run it and select F in the distribution. Enter the degrees of freedom for each treatment and error, respectively. What is the critical value of the F-ratio for α=0.05 when the F-ratio is 4.99 (df1:1, df2:14)? PQRS makes it easy to check.
When the degrees of freedom for treatment and error are 1 and 14, respectively, the F-value corresponding to the 0.05 significance level is 4.6. If our F-value is greater than 4.6, it can be considered statistically significant. In our case, the F-value is 4.99, which is smaller than 4.6. Therefore, we can conclude that the difference between cultivars is statistically significant (p<0.05).
To illustrate how changing the F-value affects the p-value, let’s use an example where the F-value is 4.99. The significance level corresponding to this F-value is 0.0423, which means that if our F-value is 4.99 or higher, we can reject the null hypothesis at the 0.05 significance level (p<0.05). Running a statistical software program can help confirm our findings and calculate the p-value accurately.
Source of variation | df | Sum of squares | Mean squares | F-ratio | p-value |
Cultivar | 1 | 253.50 | 253.50 | 4.99 | 0.0423236 |
Nitrogen | 3 | 10695.0 | 3565.0 | 70.18 | 0.0000000 |
Cultivar*Nitrogen | 3 | 1039.50 | 346.50 | 6.82 | 0.0046055 |
Block | 2 | 1504.75 | 752.375 | 14.81 | 0.0003508 |
Error | 14 | 711.25 | 50.80 | ||
Total | 23 | 14204.00 |
When the significance level (α) was set to 0.05, statistically significant results were observed for each treatment and their combinations. Let’s reconsider the sum of squares (SS). For the case of cultivar, the F-value of cultivar is the sum of squares for cultivar divided by the sum of squares for error. This means that as the sum of squares for cultivar becomes smaller or the sum of squares for error becomes larger, the F-value of cultivar will decrease. In other words, a smaller sum of squares for cultivar means that the difference between the means of cultivar 1 and cultivar 2 is small when compared to the overall mean.
For example, if the mean of cultivar 1 is 90 and the mean of cultivar 2 is 170 with an overall mean of 130, and if the mean of cultivar 1 is 129 and the mean of cultivar 2 is 131 with an overall mean of 130, the former case will have a much larger sum of squares for cultivar, resulting in a larger F-value for cultivar, indicating a higher probability of statistical significance.
An increase in the Sum of Squares for Error means that there is a significant amount of variation within each treatment group. This suggests that the differences in harvest yield are not caused by the different cultivars, but rather by environmental factors or experimental error.
For example, let’s say there was a flood in the cultivation area and a particular plot was submerged. This would lead to a sharp decrease in the harvest yield, but it would not be caused by the cultivar itself.
In this case, the data will be partitioned as shown above. Do you see it? You can see that the residual values have significantly increased.
Let’s take another look at the ANOVA table.
Cultivar=rep(c("CV1","CV2"),each=12)
Nitrogen=rep(rep(c("N0","N1","N2","N3"), each=3),2)
Block=rep(c("I","II","III"),8)
Yield=c(99,109,89,115,142,133,121,157,142,125,150,139,82,104,99,117,125,127,145,154,154,151,166,175)
dataA=data.frame(Cultivar,Nitrogen,Block,Yield)
ANOVA=aov(Yield~ Cultivar+Nitrogen+Cultivar:Nitrogen+ factor(Block), data=dataA)
summary(ANOVA)
Df Sum Sq Mean Sq F value Pr(>F)
Cultivar 1 253 253 4.99 0.042327 *
Nitrogen 3 10695 3565 70.17 1.12e-08 ***
factor(Block) 2 1505 752 14.81 0.000351 ***
Cultivar:Nitrogen 3 1040 347 6.82 0.004604 **
Residuals 14 711 51
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
This result is obtained from R programming and confirms the values we manually calculated earlier. It is the same as the values we calculated manually.
Cultivar=rep(c("CV1","CV2"),each=12)
Nitrogen=rep(rep(c("N0","N1","N2","N3"), each=3),2)
Block=rep(c("I","II","III"),8)
Yield=c (1,109,89,2,142,133,121,3,142,125,150,1,82,2,99,117,2,127,145,154,154,151,166,175)
dataB=data.frame(Cultivar,Nitrogen,Block,Yield)
ANOVA=aov(Yield~ Cultivar+Nitrogen+Cultivar:Nitrogen+ factor(Block), data=dataB)
summary(ANOVA)
Df Sum Sq Mean Sq F value Pr(>F)
Cultivar 1 5281 5281 1.335 0.267
Nitrogen 3 15970 5323 1.346 0.300
factor(Block) 2 2837 1419 0.359 0.705
Cultivar:Nitrogen 3 8526 2842 0.719 0.557
Residuals 14 55373 3955
This result is the difference caused by assuming that there was a flood. If we assume that there was a flood, then Cultivar is not significant (p=0.267). We recognize that the difference in harvest yield is not due to the differences in cultivars, but rather to the experimental error within the same treatment group. Therefore, we can see that the Sum of Squares for Error has increased from 711 to 55373.
Df Sum Sq Mean Sq F value Pr(>F)
Cultivar 1 253 253 4.99 0.042327 *
Nitrogen 3 10695 3565 70.17 1.12e-08 ***
factor(Block) 2 1505 752 14.81 0.000351 ***
Cultivar:Nitrogen 3 1040 347 6.82 0.004604 **
Residuals 14 711 51
Df Sum Sq Mean Sq F value Pr(>F)
Cultivar 1 5281 5281 1.335 0.267
Nitrogen 3 15970 5323 1.346 0.300
factor(Block) 2 2837 1419 0.359 0.705
Cultivar:Nitrogen 3 8526 2842 0.719 0.557
Residuals 14 55373 3955
Using a statistical program, you can obtain results in just 10 seconds. However, it is important to understand the principles underlying the program rather than just knowing how to run a statistical program that produces results in 10 seconds. By understanding the principles, you can examine your data in detail and identify what may be causing issues.