How to analyze linear plateau model in R Studio?

How to analyze linear plateau model in R Studio?


# This is the full code to generate the above graph. You can simply copy and paste this code in your R console to obtain the graph. 

# to uplopad upload
library (readr)
github= "https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/sulphur%20application.csv"
dataA=data.frame(read_csv(url(github), show_col_types = FALSE))

# to find reasonable initial values for parameters
fit.lm= lm(yield~sulphur,data=dataA)
a_parameter= fit.lm$coefficients[1]
b_parameter= fit.lm$coefficients[2]
x_mean= mean(dataA$sulphur)

# to define quadratic  plateau function
#a = intercept
#b = slope
#jp = join point or break point 

linplat<- function(x, a, b, jp){
  ifelse(x<jp, 
         a+b*x,
         a+b*jp)
}

# to find the best fit parameters
library(dplyr)
library(minpack.lm)

model= nlsLM(formula= yield ~ linplat(sulphur, a, b, jp),
             data= dataA,
             start= list(a=a_parameter,
                         b=b_parameter,
                         jp=x_mean))

summary(model)

# to generate a graph
library(ggplot2)
library(minpack.lm)
library(nlraa)

dataA %>% 
  ggplot(aes(sulphur, yield)) +
  geom_point(size=4, alpha = 0.5) +
  geom_line(stat="smooth",
            method="nlsLM",
            formula=y~SSlinp(x,a,b,jp),
            se=FALSE,
            color="Dark red") +
  geom_vline(xintercept=23.2722, linetype="solid", color="grey") +
  annotate("text", label=paste("sulphur=","23.3","kg/ha"),
           x=23.3, y=1000, angle=90, hjust=0, vjust=1.5, alpha=0.5)+
  labs(x="Sulphur application (kg/ha)", y="Yield (kg/ha)") +
  theme_grey(base_size=15, base_family="serif")+
  theme(legend.position="none",
        axis.line=element_line(size=0.5, colour="black")) +
  windows(width=5.5, height=5)

When we talk about regression, it’s usually about simple linear regression model. This is about the relationship between two variables.


FYI
Simple linear regression (1/5)- correlation and covariance
Simple linear regression (2/5)- slope and intercept of linear regression model


Linear plateau model is similar with simple linear model, but linear plateau model is a segmented model, and this statistical model is interested in the critical value (the x-value above which there is no further increase in y), indicating the plateau value (the statistically highest value that y reaches).

I always talk with data. I’ll run the below R code.

library (readr)
github= "https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/sulphur%20application.csv"
dataA=data.frame(read_csv(url(github), show_col_types= FALSE))

Or you can directly generate the data like below.

variety= rep(c("CV1","CV2","CV3","CV4","CV5"), each=9)
sulphur= c(20.67, 19.9, 19.73,21.16, 21.61, 20.9, 22.36, 
            21.97, 21.9, 21.34, 21.28, 21.9, 20.7, 22.65, 
            22.16, 22.03, 20.4, 22.47, 22.65, 21.82, 24.41, 
            22.34, 24.03, 24.38, 23.6, 23.44, 23.51, 23.89, 
            24.7, 22.8, 23.08, 25.05, 24.09, 24.3, 25.23, 
            23.14, 24.06, 26.98, 25.71, 23.93, 25.16, 23.79, 
            26.34, 24.64, 24.77)
yield= c(1286, 1165, 1176, 1262, 1271, 1221, 1374, 1288, 
          1285, 1261, 1281, 1302, 1199, 1368, 1340, 1272, 
          1276, 1329, 1361, 1267, 1380, 1386, 1408, 1380, 
          1378, 1413, 1400, 1412, 1422, 1381, 1410, 1410, 
          1381, 1422, 1389, 1403, 1420, 1403, 1400, 1420, 
          1379, 1393, 1417, 1415, 1383)

dataA= data.frame(variety,sulphur,yield)
Also, you can download this data in my Github> https://github.com/agronomy4future/raw_data_practice/blob/main/sulphur%20application.csv

Let’s assume that this data is about yield data in five different crop variety according to different sulphur application. First, this is about analysis for simple linear regression.

library (ggplot2)
ggplot(data=dataA,aes(x=sulphur, y=yield)) +
  geom_smooth(method=lm, level=0.95, se=FALSE, linetype=1, size=0.5, formula= y ~ x) +
  geom_point(aes(shape=variety, fill=variety),col="Black", size=5, stroke = 0.5) +
  scale_shape_manual(values=rep(c(21),5)) +
  scale_fill_manual(values=rep(c("Dark grey"),5))+
  scale_x_continuous(breaks=seq(15,30,5), limits=c(15,30)) +
  scale_y_continuous(breaks=seq(500,1800,300), limits=c(500,1800)) +
  labs(x="sulphur application (kg/ha)", y="Yield (kg/ha)") +
  theme_grey(base_size=15, base_family="serif")+
  theme(legend.position="none",
        axis.line=element_line(size=0.5, colour="black")) +
  windows(width=5.5, height=5)
regression= lm (yield~sulphur,data=dataA)
summary(regression)

Call:
lm(formula = yield ~ sulphur, data = dataA)

Residuals:
    Min      1Q  Median      3Q     Max 
-85.236 -25.614  -1.877  30.319  64.933 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  516.204     78.601   6.567 5.46e-08 ***
sulphur       36.028      3.402  10.591 1.47e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 38.79 on 43 degrees of freedom
Multiple R-squared:  0.7229,	Adjusted R-squared:  0.7164 
F-statistic: 112.2 on 1 and 43 DF,  p-value: 1.467e-13

We obtained the model equation, y= 516.204 + 36.025x

However, this model equation would be different from actual data because the model equation predicts yield increase when sulphur increases, but in the actual data, yield does not increase at certain point.

I’m interestedin this certain point. Therefore, I’d like to know the optimun sulphur application to bring about the grestest yield, using a linear plateau model.



1) Find reasonable initial values for parameters

fit.lm= lm(yield~sulphur,data=dataA)
a_parameter= fit.lm$coefficients[1]
b_parameter= fit.lm$coefficients[2]
x_mean= mean(dataA$sulphur)

fit.lm= lm(yield~sulphur,data=dataA)
a_parameter= fit.lm$coefficients[1]
b_parameter= fit.lm$coefficients[2]
x_mean= mean(dataA$sulphur)

> a_parameter
(Intercept) 
516.2043 

> b_parameter
sulphur 
36.02786
 
> x_mean
23.04378

Now, the gereral values for simple linear regresion were calculated again. a_parameter is for intercept, b_parameter is for slope, and x_mean is for average of sulphur.


2) Define linear plateau function

Linear plateau model was defined as below.

#a = intercept
#b = slope
#jp = join point or break point 

linplat= function(x, a, b, jp){
         ifelse(condition= x<jp, 
                true= a+b*x,
                false= a+b*jp)
}

“Sometimes, this code causes errors. If an error occurs, you can delete the text identifying the equation.”

linplat= function(x, a, b, jp){
         ifelse(x<jp, 
                a+b*x,
                a+b*jp)
}

3) Find best fit parameters

model= nls(yield~linplat(sulphur, a, b, jp),
            data=dataA,
            start=list(a=a_parameter,
                       b=b_parameter,
                       jp=x_mean),
            trace=FALSE,
            nls.control(maxiter=1000))

summary(model)

----------- or ----------- 

library(minpack.lm)
model= nlsLM(formula= yield ~ linplat(sulphur, a, b, jp),
             data= dataA,
             start=list(a=a_parameter,
                        b=b_parameter,
                        jp=x_mean))

summary(model)

We obtained the linear plateau model equation. The below code is the whole code to obtain this model equation.

# Find reasonable initial values for parameters
fit.lm= lm(yield~sulphur,data=dataA)
a_parameter= fit.lm$coefficients[1]
b_parameter= fit.lm$coefficients[2]
x_mean= mean(dataA$sulphur)

# Define linear plateau function
linplat= function(x, a, b, jp){
         ifelse(x<jp, a+b*x,
         a+b*jp)
}

# Find best fit parameters
library(minpack.lm)
model= nlsLM(formula= yield ~ linplat(sulphur, a, b, jp),
             data= dataA,
             start=list(a=a_parameter,
                         b=b_parameter,
                         jp=x_mean))

summary(model)

Also, we can modify the code as below.

# Find reasonable initial values for parameters
linear <- lm(yield~sulphur, data=dataA)
start_values <- coef(linear)

# Define linear plateau function
linplat<- function(x, a, b, jp){
  ifelse(x<jp, a+b*x,
         a+b*jp)
}

# Find best fit parameters
library(minpack.lm)
model <- nlsLM(formula=yield~linplat(sulphur, a, b, jp),
  data=dataA,
  start=list(
    a=start_values[1],
    b=start_values[2],
    jp=mean(dataA$sulphur)))

summary(model)

Tip!! nlraa() package

install.packages("nlraa")
library(nlraa)
model= nlsLM(formula=yield~SSlinp(sulphur,a,b,jp),data=dataA)
summary(model)

If we use nlraa(), we can obtain the model eqution at once without extra coding. However, it would be better to follow what I suggested step by step to understand the concept of linear plateau model.



How to interpret linear plateau model?

In the model equation; y= – 125.4 + 65.6, the important thing is critical value, 23.3. Statistically, at this point, yield does not increase although sulphur application increases. Therefore, critical value is regarded as the highest value that yield reaches.

Therefore, the model equation is expressed as below.

y= -125.4 + 65.6x if x < 23.3

and when sulphur application is 23.3, yield would be

y= -125.3667 + 65.5958*23.2722 ≈ 1401.2

That is, when 23.3 S kg/ha was applied, yield would be 1,401.2 kg/ha and no more yield increases, indicating the optimum sulphur application would be 23.3 kg/ha.


How to make a graph?

Using the below code, we can easily draw a graph for linear plateau.

dataA %>% 
  ggplot(aes(sulphur, yield)) +
  geom_point(size=4, alpha = 0.5) +
  geom_line(stat="smooth",
            method="nlsLM",
            formula=y~SSlinp(x,a,b,jp),
            se=FALSE,
            color="Dark red") +
  labs(x="Sulphur application (kg/ha)", y="Yield (kg/ha)") +
  theme_grey(base_size=15, base_family="serif")+
  theme(legend.position="none",
        axis.line=element_line(size=0.5, colour="black")) +
  windows(width=5.5, height=5)

The breaking point is 23.3 and at this point, yield would be 1,401.2

To visualize more clearly, more code will be added.

dataA %>% 
  ggplot(aes(sulphur, yield)) +
  geom_point(size=4, alpha = 0.5) +
  geom_line(stat="smooth",
            method="nlsLM",
            formula=y~SSlinp(x,a,b,jp),
            se=FALSE,
            color="Dark red") +
  geom_vline(xintercept=23.2722, linetype="solid", color="grey") +
  annotate("text", label=paste("sulphur=","23.3","kg/ha"),
           x=23.3, y=1000, angle=90, hjust=0, vjust=1.5, alpha=0.5)+
  labs(x="Sulphur application (kg/ha)", y="Yield (kg/ha)") +
  theme_grey(base_size=15, base_family="serif")+
  theme(legend.position="none",
        axis.line=element_line(size=0.5, colour="black")) +
  windows(width=5.5, height=5)

p-value and pseudo R-squared

# Define null model
nullfunct= function(x, m){m}
m.ini= mean(dataA$sulphur)
null= nls(yield~nullfunct(sulphur, m),
          data=dataA,
          start=list(m=m.ini),
          trace=FALSE,
          nls.control(maxiter=1000))

# Find p-value and pseudo R-squared
library(rcompanion)
nagelkerke(model,null)

Reference
https://gradcylinder.org/post/linear-plateau/
https://rcompanion.org/handbook/I_11.html


How to analyze the same data as quadratic plateau model? The answer is below.

How to analyze quadratic plateau model in R Studio?



Comments are closed.