[Coding article] A Guide to Analyzing Statistical Tests for Each Level of a Factor in R without Manual Specification

[Coding article] A Guide to Analyzing Statistical Tests for Each Level of a Factor in R without Manual Specification


This is my experimental data.

install.packages("readr")
library (readr)
github="https://raw.githubusercontent.com/agronomy4future/r_code/main/corn_grain_yield.csv"
dataA=data.frame(read_csv(url(github),show_col_types = FALSE))

    variety reps nitrogen grain_yield
1       CV1    1       N1       16.59
2       CV1    2       N1       16.59
3       CV1    3       N1       15.91
4       CV1    4       N1       17.57
5       CV1    5       N1       16.40
6       CV1    6       N1       16.20
7       CV1    7       N1       18.45
8       CV1    8       N1       14.35
9       CV1    9       N1       13.27
10      CV1   10       N1       16.15
.
.
.

There are 10 corn varieties, and I want to analyze the effect of nitrogen treatments (N0, N1) on grain yield for each variety. This is One-Way ANOVA analysis. Let’s assume that there are no blocks for the replicates. Therefore, the statistical model will be a One-Way ANOVA with no blocks.

ONE_WAY_ANOVA=aov(grain_yield~nitrogen, data=dataA)
summary(ONE_WAY_ANOVA)

             Df    Sum Sq   Mean Sq   F value   Pr(>F)    
nitrogen     1     41.9     41.93     19.02     2.08e-05 ***
Residuals    198   436.4    2.20                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

If we run the above analysis, we can observe the overall effect of nitrogen treatments on grain yield across all varieties, as they are pooled together. However, as I mentioned earlier, what I’m interested in is the effect of nitrogen treatments (N0, N1) on grain yield for each individual variety.

In this case, we should divide each varieties in the data like following codes;

ONE_WAY_ANOVA=aov(grain_yield~nitrogen, data=subset(dataA, variety=="CV1"))
ONE_WAY_ANOVA=aov(grain_yield~nitrogen, data=subset(dataA, variety=="CV2"))
ONE_WAY_ANOVA=aov(grain_yield~nitrogen, data=subset(dataA, variety=="CV3"))
.
.
.

However, if we had 100 varieties, analyzing the nitrogen effect for each variety using the subset() function one by one would be impractical. In this case, we need to use a specific code to analyze the nitrogen effect for each variety simultaneously. Today, I will explain how to do this.



lapply(split(dataA, dataA$variety), function(df) {
  ANOVA=aov(grain_yield~nitrogen, data=df)
  summary(ANOVA)
})

$CV1
            Df Sum Sq Mean Sq F value   Pr(>F)    
nitrogen     1  111.4  111.44   79.28 5.17e-08 ***
Residuals   18   25.3    1.41                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$CV10
            Df Sum Sq Mean Sq F value Pr(>F)  
nitrogen     1  4.334   4.334   4.755 0.0427 *
Residuals   18 16.404   0.911                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$CV2
            Df Sum Sq Mean Sq F value Pr(>F)  
nitrogen     1  2.783  2.7826     3.3  0.086 .
Residuals   18 15.177  0.8431                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$CV3
            Df Sum Sq Mean Sq F value   Pr(>F)    
nitrogen     1  22.22  22.218   39.54 6.28e-06 ***
Residuals   18  10.12   0.562                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$CV4
            Df Sum Sq Mean Sq F value Pr(>F)
nitrogen     1  0.182  0.1824   0.329  0.574
Residuals   18  9.992  0.5551               

$CV5
            Df Sum Sq Mean Sq F value   Pr(>F)    
nitrogen     1  34.72   34.72   49.53 1.45e-06 ***
Residuals   18  12.62    0.70                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$CV6
            Df Sum Sq Mean Sq F value Pr(>F)  
nitrogen     1  2.701   2.701   3.573 0.0749 .
Residuals   18 13.608   0.756                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$CV7
            Df Sum Sq Mean Sq F value Pr(>F)
nitrogen     1  0.173  0.1730   0.283  0.601
Residuals   18 10.992  0.6107               

$CV8
            Df Sum Sq Mean Sq F value Pr(>F)  
nitrogen     1  12.17  12.168   7.709 0.0124 *
Residuals   18  28.41   1.578                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$CV9
            Df Sum Sq Mean Sq F value Pr(>F)  
nitrogen     1  1.625  1.6245   3.199 0.0905 .
Residuals   18  9.142  0.5079                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Source: https://stackoverflow.com/questions/76159449/in-r-how-to-automatically-analyze-statistical-test-i-e-anova-per-treatment



Comments are closed.