[Coding article] A Guide to Analyzing Statistical Tests for Each Level of a Factor in R without Manual Specification
This is my experimental data.
install.packages("readr")
library (readr)
github="https://raw.githubusercontent.com/agronomy4future/r_code/main/corn_grain_yield.csv"
dataA=data.frame(read_csv(url(github),show_col_types = FALSE))
variety reps nitrogen grain_yield
1 CV1 1 N1 16.59
2 CV1 2 N1 16.59
3 CV1 3 N1 15.91
4 CV1 4 N1 17.57
5 CV1 5 N1 16.40
6 CV1 6 N1 16.20
7 CV1 7 N1 18.45
8 CV1 8 N1 14.35
9 CV1 9 N1 13.27
10 CV1 10 N1 16.15
.
.
.
There are 10 corn varieties, and I want to analyze the effect of nitrogen treatments (N0, N1) on grain yield for each variety. This is One-Way ANOVA analysis. Let’s assume that there are no blocks for the replicates. Therefore, the statistical model will be a One-Way ANOVA with no blocks.
ONE_WAY_ANOVA=aov(grain_yield~nitrogen, data=dataA)
summary(ONE_WAY_ANOVA)
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 41.9 41.93 19.02 2.08e-05 ***
Residuals 198 436.4 2.20
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
If we run the above analysis, we can observe the overall effect of nitrogen treatments on grain yield across all varieties, as they are pooled together. However, as I mentioned earlier, what I’m interested in is the effect of nitrogen treatments (N0, N1) on grain yield for each individual variety.
In this case, we should divide each varieties in the data like following codes;
ONE_WAY_ANOVA=aov(grain_yield~nitrogen, data=subset(dataA, variety=="CV1"))
ONE_WAY_ANOVA=aov(grain_yield~nitrogen, data=subset(dataA, variety=="CV2"))
ONE_WAY_ANOVA=aov(grain_yield~nitrogen, data=subset(dataA, variety=="CV3"))
.
.
.
However, if we had 100 varieties, analyzing the nitrogen effect for each variety using the subset()
function one by one would be impractical. In this case, we need to use a specific code to analyze the nitrogen effect for each variety simultaneously. Today, I will explain how to do this.
lapply(split(dataA, dataA$variety), function(df) {
ANOVA=aov(grain_yield~nitrogen, data=df)
summary(ANOVA)
})
$CV1
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 111.4 111.44 79.28 5.17e-08 ***
Residuals 18 25.3 1.41
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$CV10
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 4.334 4.334 4.755 0.0427 *
Residuals 18 16.404 0.911
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$CV2
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 2.783 2.7826 3.3 0.086 .
Residuals 18 15.177 0.8431
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$CV3
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 22.22 22.218 39.54 6.28e-06 ***
Residuals 18 10.12 0.562
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$CV4
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 0.182 0.1824 0.329 0.574
Residuals 18 9.992 0.5551
$CV5
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 34.72 34.72 49.53 1.45e-06 ***
Residuals 18 12.62 0.70
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$CV6
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 2.701 2.701 3.573 0.0749 .
Residuals 18 13.608 0.756
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$CV7
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 0.173 0.1730 0.283 0.601
Residuals 18 10.992 0.6107
$CV8
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 12.17 12.168 7.709 0.0124 *
Residuals 18 28.41 1.578
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$CV9
Df Sum Sq Mean Sq F value Pr(>F)
nitrogen 1 1.625 1.6245 3.199 0.0905 .
Residuals 18 9.142 0.5079
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1