[R package] Prediction of Grain Weight and Area in Bread Wheat (feat. kimindex)
Image analysis equipment can now easily provide grain area measurements (mm²), and the large datasets it generates almost instantly offer more insight into wheat grains. Grain weight is a good indicator of wheat yield, but it is difficult to measure with the same equipment. Currently, average grain weight is calculated from thousand kernel weight (TKW), a process that is time-consuming and labor-intensive. Predicting grain weight from grain area would therefore let us generate large grain weight datasets far more efficiently.
A couple of years ago, I developed a model equation to predict grain weight from the area of bread wheat.
Kim, J., Savin, R. and Slafer, G.A., 2021. Weight of individual wheat grains estimated from high-throughput digital images of grain area. European Journal of Agronomy, 124, p.126237.
Using this model, $y = x^{1.32}$, where y is the predicted grain weight and x is the grain area, we can estimate grain weight from grain area.
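The power model is simple enough to try in base R before installing anything. The following is only a minimal sketch, not the package's source code: the helper name `area_to_weight()` and the example areas are hypothetical, and the units (mm² in, mg out) are assumed from the column names of the dataset used later in this post.

```r
# Hypothetical helper (not part of kimindex): the power model y = x^1.32
area_to_weight <- function(area_mm2, exponent = 1.32) {
  area_mm2^exponent
}

# Illustrative grain areas in mm2 (assumed values, not measured data)
areas <- c(15, 18.6, 20)

# Predicted grain weight for each area
data.frame(area_mm2 = areas, predicted_weight_mg = area_to_weight(areas))
```

Because the exponent is greater than 1, predicted weight rises faster than area: a grain with twice the area is predicted to weigh about 2.5 times as much (2^1.32).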
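This is a simple R package that I recently developed to predict grain weight from area, or vice versa. I named the package kimindex, after the equation from Kim et al. (2021). The reverse direction, from weight to area, is handled by the companion package kimindex1, which is installed and used below.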
1. Install the package
# remotes is only needed for install_github()
if (!requireNamespace("remotes", quietly = TRUE)) install.packages("remotes")
# kimindex: predicts grain weight from grain area
if (!requireNamespace("kimindex", quietly = TRUE)) {
  remotes::install_github("agronomy4future/kimindex")
}
# kimindex1: predicts grain area from grain weight
if (!requireNamespace("kimindex1", quietly = TRUE)) {
  remotes::install_github("agronomy4future/kimindex1")
}
library(kimindex)
library(kimindex1)
2. Practice with the package using an actual dataset
if(!require(readr)) install.packages("readr")
library(readr)
github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/Philipp_et_al_2018.csv"
df = data.frame(read_csv(url(github), show_col_types=FALSE))
head(df,5)
  grain_weight_mg grain_area_mm2
1           49.08           18.6
2           45.43           18.3
3           41.78           16.7
4           53.42           20.3
5           44.40           17.5
...
This dataset is from Philipp et al. (2018).
Philipp, N., Weichert, H., Bohra, U., Weschke, W., Schulthess, A.W. and Weber, H., 2018. Grain number and grain yield distribution along the spike remain stable despite breeding for high yield in winter wheat. PloS one, 13(10), p.e0205452.
In that study, grain area (mm²) was measured with the MARVIN Seed Analyser (MARViTECH GmbH), along with TKW (g). Each value is the average of several grains rather than a measurement of an individual grain, but I will use this dataset to verify the model equation.
3. Run the package
First, I’ll predict grain weight from the area of bread wheat using the following code.
predicted_gw=kimindex(df, "grain_area_mm2", remove_na= TRUE)
head(predicted_gw,5)
  grain_weight_mg grain_area_mm2 predicted_gw
1           49.08           18.6     47.39768
2           45.43           18.3     46.39118
3           41.78           16.7     41.11362
4           53.42           20.3     53.19793
5           44.40           17.5     43.73310
...
I passed the df dataset with the grain area column (grain_area_mm2) and excluded missing values; the package then predicted grain weight from grain area using the model equation $y = x^{1.32}$, where y is the predicted grain weight and x is the grain area.
if(!require(ggplot2)) install.packages("ggplot2")
library(ggplot2)
ggplot(data=predicted_gw, aes(x=grain_weight_mg, y=predicted_gw))+
geom_point(fill= "orange", color= "black", size= 5, shape= 21) +
scale_x_continuous(breaks=seq(0,70,10), limits = c(0,70)) +
scale_y_continuous(breaks=seq(0,70,10), limits = c(0,70)) +
geom_abline(slope=1, intercept=0, linetype="dashed", color="grey15", linewidth=1) +
labs(x="Actual grain weight (mg)", y="Predicted grain weight (mg)") +
theme_classic(base_size=18, base_family="serif")+
theme(legend.position=c(0.8,0.8),
legend.title=element_blank(),
legend.key=element_rect(color="white", fill="white"),
legend.text=element_text(family="serif", face="plain",
size=13, color= "Black"),
legend.background=element_rect(fill="white"),
axis.line=element_line(linewidth=0.5, colour="black"))
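The 1:1 plot gives a visual check; a couple of summary numbers can complement it. The sketch below is an addition rather than part of the package: the choice of RMSE and mean bias is an assumption about what is useful here, and it uses only the five rows printed above, so the values are illustrative and say nothing about the full dataset. With the full output, the same lines can be run on `predicted_gw` directly.

```r
# First five rows of the kimindex() output printed above
check <- data.frame(
  grain_weight_mg = c(49.08, 45.43, 41.78, 53.42, 44.40),
  predicted_gw    = c(47.39768, 46.39118, 41.11362, 53.19793, 43.73310)
)

# Residuals: actual minus predicted grain weight (mg)
residual <- check$grain_weight_mg - check$predicted_gw

rmse <- sqrt(mean(residual^2))  # root mean square error (mg)
bias <- mean(residual)          # mean error (mg); positive means under-prediction

round(c(rmse = rmse, bias = bias), 2)
```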
Second, I’ll predict grain area from the grain weight of bread wheat using the following code.
predicted_area=kimindex1(df, "grain_weight_mg", remove_na= TRUE)
head(predicted_area,5)
  grain_weight_mg grain_area_mm2 predicted_area
1           49.08           18.6       19.09802
2           45.43           18.3       18.01203
3           41.78           16.7       16.90466
4           53.42           20.3       20.36416
5           44.40           17.5       17.70180
...
I passed the df dataset with the grain weight column (grain_weight_mg) and excluded missing values; the package then predicted grain area from grain weight using the inverse of the model equation, y = exp(log(x)/1.32), where y is the predicted grain area and x is the grain weight.
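The inverse equation can also be checked in base R. This is a minimal sketch under the assumption that kimindex1 applies the equation exactly as written; the helper name `weight_to_area()` is hypothetical and not part of either package. It shows that the log form is the same as raising the weight to the power 1/1.32, and that converting an area to a weight and back returns the original area.

```r
# Hypothetical helper (not part of kimindex1): inverse of y = x^1.32
weight_to_area <- function(weight_mg, exponent = 1.32) {
  exp(log(weight_mg) / exponent)
}

# First three grain weights (mg) from the dataset printed earlier
weights <- c(49.08, 45.43, 41.78)

# The log form and the power form agree
same_form <- all.equal(weight_to_area(weights), weights^(1 / 1.32))

# Round trip: area -> weight -> area gives back the starting area
round_trip <- all.equal(weight_to_area(18.6^1.32), 18.6)

c(same_form = same_form, round_trip = round_trip)
```

Note that the round trip only recovers the area the model itself predicted from; the predicted area for a measured weight will generally differ from the measured area, which is what the plot in this section compares.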
if(!require(ggplot2)) install.packages("ggplot2")
library(ggplot2)
ggplot(data=predicted_area, aes(x=grain_area_mm2, y=predicted_area))+
geom_point(fill= "orange", color= "black", size= 5, shape= 21) +
scale_x_continuous(breaks=seq(0,30,5), limits = c(0,30)) +
scale_y_continuous(breaks=seq(0,30,5), limits = c(0,30)) +
geom_abline(slope=1, intercept=0, linetype="dashed", color="grey15", linewidth=1) +
labs(x="Actual grain area (mm2)", y="Predicted grain area (mm2)") +
theme_classic(base_size=18, base_family="serif")+
theme(legend.position=c(0.8,0.8),
legend.title=element_blank(),
legend.key=element_rect(color="white", fill="white"),
legend.text=element_text(family="serif", face="plain",
size=13, color= "Black"),
legend.background=element_rect(fill="white"),
axis.line=element_line(linewidth=0.5, colour="black"))
GitHub: https://github.com/agronomy4future/kimindex
GitHub: https://github.com/agronomy4future/kimindex1