How to calculate responsiveness in response to control using R?

How to calculate responsiveness in response to control using R?


In my previous post, I explained how to quantify phenotypic plasticity and introduced the concept of ‘responsiveness.’


Quantifying Phenotypic Plasticity of Crops


I introduced a formula to calculate responsiveness as (Treatment - Control) / Control.

GenotypeControlTreatmentResponsiveness
A10090-10.0%
B12070-41.7%
C11590-21.7%
D9585-10.5%
E110105-4.5%

However, when analyzing data, the format may not always be the same as above. Mostly, treatments (independent variable) are arranged in one column, and the dependent variable is arranged in another column. In this case, how to calculate responsiveness calculated as (Treatment - Control) / Control?


Let’s upload one data.

install.packages("readr")
library(readr)

github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/fertilizer_treatment.csv"

dataA= data.frame(read_csv(url(github),show_col_types = FALSE))
dataA

     Genotype Block    variable value
1  Genotype_A     I     Control  42.9
2  Genotype_A    II     Control  41.6
3  Genotype_A   III     Control  28.9
4  Genotype_A    IV     Control  30.8
5  Genotype_B     I     Control  53.3
6  Genotype_B    II     Control  69.6
7  Genotype_B   III     Control  45.4
8  Genotype_B    IV     Control  35.1
9  Genotype_C     I     Control  62.3
10 Genotype_C    II     Control  58.5
11 Genotype_C   III     Control  44.6
12 Genotype_C    IV     Control  50.3
13 Genotype_D     I     Control  75.4
14 Genotype_D    II     Control  65.6
15 Genotype_D   III     Control  54.0
16 Genotype_D    IV     Control  52.7
17 Genotype_A     I Fertilizer1  53.8
18 Genotype_A    II Fertilizer1  58.5
19 Genotype_A   III Fertilizer1  43.9
20 Genotype_A    IV Fertilizer1  46.3
21 Genotype_B     I Fertilizer1  57.6
22 Genotype_B    II Fertilizer1  69.6
23 Genotype_B   III Fertilizer1  42.4
24 Genotype_B    IV Fertilizer1  51.9
25 Genotype_C     I Fertilizer1  63.4
26 Genotype_C    II Fertilizer1  50.4
27 Genotype_C   III Fertilizer1  45.0
28 Genotype_C    IV Fertilizer1  46.7
29 Genotype_D     I Fertilizer1  70.3
30 Genotype_D    II Fertilizer1  67.3
31 Genotype_D   III Fertilizer1  57.6
32 Genotype_D    IV Fertilizer1  58.5
33 Genotype_A     I Fertilizer2  49.5
34 Genotype_A    II Fertilizer2  53.8
35 Genotype_A   III Fertilizer2  40.7
36 Genotype_A    IV Fertilizer2  39.4
37 Genotype_B     I Fertilizer2  59.8
38 Genotype_B    II Fertilizer2  65.8
39 Genotype_B   III Fertilizer2  41.4
40 Genotype_B    IV Fertilizer2  45.4
41 Genotype_C     I Fertilizer2  64.5
42 Genotype_C    II Fertilizer2  46.1
43 Genotype_C   III Fertilizer2  62.6
44 Genotype_C    IV Fertilizer2  50.3
45 Genotype_D     I Fertilizer2  68.8
46 Genotype_D    II Fertilizer2  65.3
47 Genotype_D   III Fertilizer2  45.6
48 Genotype_D    IV Fertilizer2  51.0
49 Genotype_A     I Fertilizer3  44.4
50 Genotype_A    II Fertilizer3  41.8
51 Genotype_A   III Fertilizer3  28.3
52 Genotype_A    IV Fertilizer3  34.7
53 Genotype_B     I Fertilizer3  64.1
54 Genotype_B    II Fertilizer3  57.4
55 Genotype_B   III Fertilizer3  44.1
56 Genotype_B    IV Fertilizer3  51.6
57 Genotype_C     I Fertilizer3  63.6
58 Genotype_C    II Fertilizer3  56.1
59 Genotype_C   III Fertilizer3  52.7
60 Genotype_C    IV Fertilizer3  51.8
61 Genotype_D     I Fertilizer3  71.6
62 Genotype_D    II Fertilizer3  69.4
63 Genotype_D   III Fertilizer3  56.6
64 Genotype_D    IV Fertilizer3  47.4

This data has two treatments (independent variables) with 4 replicates.

xtabs(~variable + Genotype, data=dataA)

                Genotype
variable        Genotype_A Genotype_B Genotype_C Genotype_D
Control              4          4          4          4
Fertilizer1          4          4          4          4
Fertilizer2          4          4          4          4
Fertilizer3          4          4          4          4

Now, I’ll calculate the ‘responsiveness’ of yield for each fertilizer in response to control.

library(dplyr)

dataA=data.frame(dataA %>%
      group_by(Genotype) %>%
      mutate(responsiveness=ifelse(variable!="Control", 
      (value-value[variable=="Control"])/value[variable=="Control"], NA)))
dataA

     Genotype Block    variable value responsiveness
1  Genotype_A     I     Control  42.9             NA
2  Genotype_A    II     Control  41.6             NA
3  Genotype_A   III     Control  28.9             NA
4  Genotype_A    IV     Control  30.8             NA
5  Genotype_B     I     Control  53.3             NA
6  Genotype_B    II     Control  69.6             NA
7  Genotype_B   III     Control  45.4             NA
8  Genotype_B    IV     Control  35.1             NA
9  Genotype_C     I     Control  62.3             NA
10 Genotype_C    II     Control  58.5             NA
11 Genotype_C   III     Control  44.6             NA
12 Genotype_C    IV     Control  50.3             NA
13 Genotype_D     I     Control  75.4             NA
14 Genotype_D    II     Control  65.6             NA
15 Genotype_D   III     Control  54.0             NA
16 Genotype_D    IV     Control  52.7             NA
17 Genotype_A     I Fertilizer1  53.8    0.254079254
18 Genotype_A    II Fertilizer1  58.5    0.406250000
19 Genotype_A   III Fertilizer1  43.9    0.519031142
20 Genotype_A    IV Fertilizer1  46.3    0.503246753
21 Genotype_B     I Fertilizer1  57.6    0.080675422
22 Genotype_B    II Fertilizer1  69.6    0.000000000
23 Genotype_B   III Fertilizer1  42.4   -0.066079295
24 Genotype_B    IV Fertilizer1  51.9    0.478632479
25 Genotype_C     I Fertilizer1  63.4    0.017656501
26 Genotype_C    II Fertilizer1  50.4   -0.138461538
27 Genotype_C   III Fertilizer1  45.0    0.008968610
28 Genotype_C    IV Fertilizer1  46.7   -0.071570577
29 Genotype_D     I Fertilizer1  70.3   -0.067639257
30 Genotype_D    II Fertilizer1  67.3    0.025914634
31 Genotype_D   III Fertilizer1  57.6    0.066666667
32 Genotype_D    IV Fertilizer1  58.5    0.110056926
33 Genotype_A     I Fertilizer2  49.5    0.153846154
34 Genotype_A    II Fertilizer2  53.8    0.293269231
35 Genotype_A   III Fertilizer2  40.7    0.408304498
36 Genotype_A    IV Fertilizer2  39.4    0.279220779
37 Genotype_B     I Fertilizer2  59.8    0.121951220
38 Genotype_B    II Fertilizer2  65.8   -0.054597701
39 Genotype_B   III Fertilizer2  41.4   -0.088105727
40 Genotype_B    IV Fertilizer2  45.4    0.293447293
41 Genotype_C     I Fertilizer2  64.5    0.035313002
42 Genotype_C    II Fertilizer2  46.1   -0.211965812
43 Genotype_C   III Fertilizer2  62.6    0.403587444
44 Genotype_C    IV Fertilizer2  50.3    0.000000000
45 Genotype_D     I Fertilizer2  68.8   -0.087533156
46 Genotype_D    II Fertilizer2  65.3   -0.004573171
47 Genotype_D   III Fertilizer2  45.6   -0.155555556
48 Genotype_D    IV Fertilizer2  51.0   -0.032258065
49 Genotype_A     I Fertilizer3  44.4    0.034965035
50 Genotype_A    II Fertilizer3  41.8    0.004807692
51 Genotype_A   III Fertilizer3  28.3   -0.020761246
52 Genotype_A    IV Fertilizer3  34.7    0.126623377
53 Genotype_B     I Fertilizer3  64.1    0.202626642
54 Genotype_B    II Fertilizer3  57.4   -0.175287356
55 Genotype_B   III Fertilizer3  44.1   -0.028634361
56 Genotype_B    IV Fertilizer3  51.6    0.470085470
57 Genotype_C     I Fertilizer3  63.6    0.020866774
58 Genotype_C    II Fertilizer3  56.1   -0.041025641
59 Genotype_C   III Fertilizer3  52.7    0.181614350
60 Genotype_C    IV Fertilizer3  51.8    0.029821074
61 Genotype_D     I Fertilizer3  71.6   -0.050397878
62 Genotype_D    II Fertilizer3  69.4    0.057926829
63 Genotype_D   III Fertilizer3  56.6    0.048148148
64 Genotype_D    IV Fertilizer3  47.4   -0.100569260

I’ll check if this calculation is correct. Let’s download this data as an Excel file.

#install.packages("writexl")
library(writexl)

write_xlsx (dataA,"C:/Users/Desktop/dataA.xlsx")

Then, I compared what I calculated manually with what R calculated, and they are the same.



How to calculate when several groups exist?

The previous data only includes replicates from a single group (genotype). My next question is how to calculate when multiple groups are present?

This dataset involves different planting rows, including outside, east, west, and middle divisions. It analyzes biomass yield per wheat head and stem in response to three conditions: control, tr1, and tr2. How should I calculate responsiveness for each group combination in this scenario? I already calculated responsiveness using excel.

library(readr)
github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/responsiveness.csv"
dataA= data.frame(read_csv(url(github),show_col_types = FALSE))

dataA
    tissue     row plot treatment  yield responsiveness
1     Head outside    5   Control 159.30           <NA>
2     Head outside    5       Tr1  59.70         -62.5%
3     Head outside    5       Tr2  97.50         -38.8%
4     Stem outside    5   Control  74.65           <NA>
5     Stem outside    5       Tr1  34.05         -54.4%
6     Stem outside    5       Tr2  76.65           2.7%
7     Head outside    6   Control  99.30           <NA>
8     Head outside    6       Tr1 109.00           9.8%
9     Head outside    6       Tr2  82.30         -17.1%
10    Stem outside    6   Control  53.95           <NA>
11    Stem outside    6       Tr1  52.95          -1.9%
12    Stem outside    6       Tr2  65.45          21.3%
.
.
.

Now I’ll calculate the responsiveness using R by groups.

library(dplyr)

dataB=data.frame(dataA %>%
      group_by(tissue, row, plot) %>%
      mutate(Calculation=ifelse(treatment!="Control", 
      (yield-yield[treatment=="Control"])/yield[treatment=="Control"], NA)))
dataB

    tissue     row plot treatment  yield responsiveness  Calculation
1     Head outside    5   Control 159.30           <NA>           NA
2     Head outside    5       Tr1  59.70         -62.5% -0.200267917
3     Head outside    5       Tr2  97.50         -38.8% -0.018126888
4     Stem outside    5   Control  74.65           <NA>           NA
5     Stem outside    5       Tr1  34.05         -54.4% -0.714824121
6     Stem outside    5       Tr2  76.65           2.7%  0.243309002
7     Head outside    6   Control  99.30           <NA>           NA
8     Head outside    6       Tr1 109.00           9.8%  0.405544810
9     Head outside    6       Tr2  82.30         -17.1%  0.567619048
10    Stem outside    6   Control  53.95           <NA>           NA
11    Stem outside    6       Tr1  52.95          -1.9% -0.278610354
12    Stem outside    6       Tr2  65.45          21.3%  0.348094748
.
.
.

However, as you can see, the calculated value differs from what I calculated using excel This issue can occur when the ‘plyr’ package is installed. To resolve it, you can either uninstall the ‘plyr’ package or use the dplyr::mutate function.

dataB=data.frame(dataA %>%
      group_by(tissue, row, plot) %>%
      dplyr::mutate(Calculation=ifelse(treatment!="Control", 
      (yield-yield[treatment=="Control"])/yield[treatment=="Control"], NA)))
dataB

    tissue     row plot treatment  yield responsiveness Calculation
1     Head outside    5   Control 159.30           <NA>          NA
2     Head outside    5       Tr1  59.70         -62.5% -0.62523540
3     Head outside    5       Tr2  97.50         -38.8% -0.38794727
4     Stem outside    5   Control  74.65           <NA>          NA
5     Stem outside    5       Tr1  34.05         -54.4% -0.54387140
6     Stem outside    5       Tr2  76.65           2.7%  0.02679169
7     Head outside    6   Control  99.30           <NA>          NA
8     Head outside    6       Tr1 109.00           9.8%  0.09768379
9     Head outside    6       Tr2  82.30         -17.1% -0.17119839
10    Stem outside    6   Control  53.95           <NA>          NA
11    Stem outside    6       Tr1  52.95          -1.9% -0.01853568
12    Stem outside    6       Tr2  65.45          21.3%  0.21316033

Now, the value will be the same.



How about calculating responsiveness using the mean?

Next, I want to determine responsiveness by averaging the yield. Therefore, the first step is to calculate the average yield.

library (plyr)

dataC=ddply(dataA, c("tissue","row","treatment"), summarise, mean=mean(yield))
colnames(dataC)[4]=c("yield")

dataC
   tissue     row treatment    yield
1    Head    east   Control  71.8000
2    Head    east       Tr1  52.7375
3    Head    east       Tr2  45.1750
4    Head  middle   Control  62.9000
5    Head  middle       Tr1  57.8250
6    Head  middle       Tr2  37.2250
7    Head outside   Control 118.0500
8    Head outside       Tr1 137.6500
9    Head outside       Tr2  96.8250
10   Head    west   Control  73.8750
11   Head    west       Tr1  60.9125
12   Head    west       Tr2  47.6000
13   Stem    east   Control  48.5500
14   Stem    east       Tr1  35.1875
15   Stem    east       Tr2  53.9375
16   Stem  middle   Control  41.3750
17   Stem  middle       Tr1  36.3750
18   Stem  middle       Tr2  63.6250
19   Stem outside   Control  66.9500
20   Stem outside       Tr1  71.7750
21   Stem outside       Tr2  79.1000
22   Stem    west   Control  50.6250
23   Stem    west       Tr1  38.1500
24   Stem    west       Tr2  55.2250

Then, I’ll calculate responsiveness again.

library(dplyr)

dataD=data.frame(dataC %>%
      group_by(tissue, row) %>%
      dplyr::mutate(responsiveness=ifelse(treatment!="Control", 
      (yield-yield[treatment=="Control"])/yield[treatment=="Control"], NA)))

dataD
   tissue     row treatment    yield responsiveness
1    Head    east   Control  71.8000             NA
2    Head    east       Tr1  52.7375    -0.26549443
3    Head    east       Tr2  45.1750    -0.37082173
4    Head  middle   Control  62.9000             NA
5    Head  middle       Tr1  57.8250    -0.08068362
6    Head  middle       Tr2  37.2250    -0.40818760
7    Head outside   Control 118.0500             NA
8    Head outside       Tr1 137.6500     0.16603134
9    Head outside       Tr2  96.8250    -0.17979670
10   Head    west   Control  73.8750             NA
11   Head    west       Tr1  60.9125    -0.17546531
12   Head    west       Tr2  47.6000    -0.35566836
13   Stem    east   Control  48.5500             NA
14   Stem    east       Tr1  35.1875    -0.27523172
15   Stem    east       Tr2  53.9375     0.11096807
16   Stem  middle   Control  41.3750             NA
17   Stem  middle       Tr1  36.3750    -0.12084592
18   Stem  middle       Tr2  63.6250     0.53776435
19   Stem outside   Control  66.9500             NA
20   Stem outside       Tr1  71.7750     0.07206871
21   Stem outside       Tr2  79.1000     0.18147872
22   Stem    west   Control  50.6250             NA
23   Stem    west       Tr1  38.1500    -0.24641975
24   Stem    west       Tr2  55.2250     0.09086420


Tip!! How about we consider reshaping the data format??

Let’s go back to dataA. After going through the above process to calculate responsiveness due to variables being in the same column, how about we reshape the data format to facilitate the calculation of responsiveness more easily?

dataA
     Genotype Block    variable value
1  Genotype_A     I     Control  42.9
2  Genotype_A    II     Control  41.6
3  Genotype_A   III     Control  28.9
4  Genotype_A    IV     Control  30.8
5  Genotype_B     I     Control  53.3
6  Genotype_B    II     Control  69.6
7  Genotype_B   III     Control  45.4
8  Genotype_B    IV     Control  35.1
9  Genotype_C     I     Control  62.3
10 Genotype_C    II     Control  58.5
.
.
.

Let’s use the below code to reshape data format.

library(dplyr)
library(tidyr)
dataB= data.frame(dataA %>%
                    group_by(Genotype, Block, variable) %>%
                    spread(key=variable, value=value))

dataB
     Genotype Block Control Fertilizer1 Fertilizer2 Fertilizer3
1  Genotype_A     I    42.9        53.8        49.5        44.4
2  Genotype_A    II    41.6        58.5        53.8        41.8
3  Genotype_A   III    28.9        43.9        40.7        28.3
4  Genotype_A    IV    30.8        46.3        39.4        34.7
5  Genotype_B     I    53.3        57.6        59.8        64.1
6  Genotype_B    II    69.6        69.6        65.8        57.4
7  Genotype_B   III    45.4        42.4        41.4        44.1
8  Genotype_B    IV    35.1        51.9        45.4        51.6
9  Genotype_C     I    62.3        63.4        64.5        63.6
10 Genotype_C    II    58.5        50.4        46.1        56.1
11 Genotype_C   III    44.6        45.0        62.6        52.7
12 Genotype_C    IV    50.3        46.7        50.3        51.8
13 Genotype_D     I    75.4        70.3        68.8        71.6
14 Genotype_D    II    65.6        67.3        65.3        69.4
15 Genotype_D   III    54.0        57.6        45.6        56.6
16 Genotype_D    IV    52.7        58.5        51.0        47.4

Then we can easily calculate the responsiveness.

dataB$R_Fertilizer1=(dataB$Fertilizer1-dataB$Control)/dataB$Control
dataB$R_Fertilizer2=(dataB$Fertilizer2-dataB$Control)/dataB$Control
dataB$R_Fertilizer3=(dataB$Fertilizer2-dataB$Control)/dataB$Control

dataB
     Genotype Block Control Fertilizer1 Fertilizer2 Fertilizer3 R_Fertilizer1 R_Fertilizer2 R_Fertilizer3
1  Genotype_A     I    42.9        53.8        49.5        44.4    0.25407925   0.153846154   0.153846154
2  Genotype_A    II    41.6        58.5        53.8        41.8    0.40625000   0.293269231   0.293269231
3  Genotype_A   III    28.9        43.9        40.7        28.3    0.51903114   0.408304498   0.408304498
4  Genotype_A    IV    30.8        46.3        39.4        34.7    0.50324675   0.279220779   0.279220779
5  Genotype_B     I    53.3        57.6        59.8        64.1    0.08067542   0.121951220   0.121951220
6  Genotype_B    II    69.6        69.6        65.8        57.4    0.00000000  -0.054597701  -0.054597701
7  Genotype_B   III    45.4        42.4        41.4        44.1   -0.06607930  -0.088105727  -0.088105727
8  Genotype_B    IV    35.1        51.9        45.4        51.6    0.47863248   0.293447293   0.293447293
9  Genotype_C     I    62.3        63.4        64.5        63.6    0.01765650   0.035313002   0.035313002
10 Genotype_C    II    58.5        50.4        46.1        56.1   -0.13846154  -0.211965812  -0.211965812
11 Genotype_C   III    44.6        45.0        62.6        52.7    0.00896861   0.403587444   0.403587444
12 Genotype_C    IV    50.3        46.7        50.3        51.8   -0.07157058   0.000000000   0.000000000
13 Genotype_D     I    75.4        70.3        68.8        71.6   -0.06763926  -0.087533156  -0.087533156
14 Genotype_D    II    65.6        67.3        65.3        69.4    0.02591463  -0.004573171  -0.004573171
15 Genotype_D   III    54.0        57.6        45.6        56.6    0.06666667  -0.155555556  -0.155555556
16 Genotype_D    IV    52.7        58.5        51.0        47.4    0.11005693  -0.032258065  -0.032258065

Then, let’s reshape data again.

dataC= data.frame(pivot_longer(dataB, 
                   cols= starts_with("R_Fertilizer"), 
                   names_to= "Responsiveness", 
                   values_to= "Value"))
dataC
     Genotype Block Control Fertilizer1 Fertilizer2 Fertilizer3 Responsiveness        Value
1  Genotype_A     I    42.9        53.8        49.5        44.4  R_Fertilizer1  0.254079254
2  Genotype_A     I    42.9        53.8        49.5        44.4  R_Fertilizer2  0.153846154
3  Genotype_A     I    42.9        53.8        49.5        44.4  R_Fertilizer3  0.153846154
4  Genotype_A    II    41.6        58.5        53.8        41.8  R_Fertilizer1  0.406250000
5  Genotype_A    II    41.6        58.5        53.8        41.8  R_Fertilizer2  0.293269231
6  Genotype_A    II    41.6        58.5        53.8        41.8  R_Fertilizer3  0.293269231
7  Genotype_A   III    28.9        43.9        40.7        28.3  R_Fertilizer1  0.519031142
8  Genotype_A   III    28.9        43.9        40.7        28.3  R_Fertilizer2  0.408304498
9  Genotype_A   III    28.9        43.9        40.7        28.3  R_Fertilizer3  0.408304498
10 Genotype_A    IV    30.8        46.3        39.4        34.7  R_Fertilizer1  0.503246753


Comments are closed.