Browsed by
Category: Statistics

A Practical Approach to Linear Mixed-Effects Modeling in R

A Linear Mixed-Effects Model (LMM) is a statistical model that combines both fixed effects and random effects to analyze data with repeated measurements or hierarchical structure. Let’s break down the key components and concepts of a Linear Mixed-Effects Model: 1) Fixed Effects: 2) Random Effects: 3) Linear Mixed-Effects Model Equation: The general equation of a Linear Mixed-Effects Model can be written as: Y = Xβ + Zb + ε 4) Estimation: In summary, Linear Mixed-Effects Models are a powerful statistical tool…
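The terms of the model equation quoted in this excerpt can be spelled out as below. This is only a sketch in the usual textbook notation: the dimensions n, p and q and the covariance matrices G and R are assumptions introduced here for illustration, not symbols taken from the post.

```latex
\[
  Y = X\beta + Zb + \varepsilon,
  \qquad b \sim \mathcal{N}(0,\, G),
  \qquad \varepsilon \sim \mathcal{N}(0,\, R)
\]
% Y            : n x 1 vector of observed responses
% X            : n x p design matrix for the fixed effects
% \beta        : p x 1 vector of fixed-effect coefficients
% Z            : n x q design matrix for the random effects
% b            : q x 1 vector of random effects (assumed normal, covariance G)
% \varepsilon  : n x 1 vector of residual errors (assumed normal, covariance R)
```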

Understanding Multiple Linear Regression Easily (Part 2: Calculating the Coefficient of Determination Manually)

□ Understanding Multiple Linear Regression Easily (Part 1: Calculating the Regression Equation Manually)
In the previous post, we explained how to manually calculate the regression equation in multiple linear regression analysis. Now, in this post, I will explain how to calculate the coefficient of determination (R²) in multiple linear regression analysis. No. Yield (yi) Time (xi1) Moisture (xi2) 1 4.3 4 0.2 2 5.5 5 0.2 3 6.8 6 0.2 4 8.0 7 0.2 5 4.0 4 0.3 6 5.2…

Understanding Multiple Linear Regression Easily (Part 1: Calculating the Regression Equation Manually)

In my previous posts, I explained the simple linear regression model in five parts. I recommend reading the following posts first.
□ Simple linear regression (1/5)- correlation and covariance
□ Simple linear regression (2/5)- slope and intercept of linear regression model
□ Simple linear regression (3/5)- standard error of slope and intercept
□ Simple linear regression (4/5)- t value on the slope and intercept
□ Simple linear regression (5/5)- R_squared
In this session, I will explain multiple regression analysis. Multiple regression analysis refers to…

Step-by-Step Guide: Uploading Data and Conducting Statistical Analysis in SAS Studio

SAS Studio is a web version of the SAS program, and it can be used for free. As my current license for the statistical program I’ve been using is about to expire, I was searching for alternatives. Upon discovering SAS Studio, I decided to give it a try. Although I have never used SAS before, I’ve decided to take this opportunity to learn. I will now summarize the very basic learning materials I have covered up to this point. First,…

Easy-to-Understand Guide to Factorial Experiments and Two-Way ANOVA

Today, I’ll try to explain factorial experiments in the simplest way. When you apply multiple different factors simultaneously to derive experimental results, it’s called a factorial experiment. The different treatments within the experiment are referred to as ‘factorials.’ In other words, a factorial is a combination of factor levels. [Note 1] A factorial experiment is a research design in which multiple independent variables, also known as factors, are manipulated simultaneously to analyze their combined effects on a dependent variable. The goal of…

Two-Way ANOVA Tutorial Using SAS Studio

I will introduce how to perform a Two-Way ANOVA analysis using SAS Studio. Here is the data that you have available: Upload this Excel file to SAS Studio. After uploading the Excel file to SAS Studio, create a data table named “EXP1” in My Libraries. Then, click on the EXP1 data table. Then, select the icon for generating code located at the top. By doing so, a new tab named “Program 1” will be created, allowing you to generate the…

Quantifying Phenotypic Plasticity of Crops

Phenotypic plasticity refers to the ability of an individual organism, in this case, a plant, to display varying phenotypic traits or characteristics in response to different environmental conditions. These traits can include physical features, physiological processes, and behaviors. Phenotypic plasticity is a crucial adaptive mechanism that allows organisms to optimize their survival and reproduction in varying environments. Crops are particularly reliant on phenotypic plasticity to cope with changes in factors such as light, temperature, moisture, nutrient availability, and other environmental…

Statistical Inference on Binomially Distributed Data

The primary purpose of our experiment is to validate hypotheses regarding the population of the subjects under study. As a result, the experimenter must determine whether to accept or reject these hypotheses based on the experiment’s results. In this context, the method of statistical analysis will vary depending on whether the sample data follows a normal distribution or a binomial distribution. Today, we will introduce statistical testing methods for data that conform to a binomial distribution. Let’s delve into an…

[STAT Article] Easy Guide to Cook’s Distance Calculation Using Excel and R

I have 1,000 data points of measurements of the length (mm) and weight (mg) of wheat grains. With this data, I want to analyze the relationship between the length and weight of the wheat grain to propose an equation model that can predict grain weight. I will draw a graph to visualize the data. If you are new to R, you can copy and paste the following code into your R script window to obtain the same graph as shown…

R-Squared Calculation in Linear Regression with Zero Intercept

Previously, I scanned wheat grains to obtain the area of each grain, and then measured the weight of each grain corresponding to its area in order to develop a model equation. The following regression demonstrates the relationship between grain area and weight. # Data download https://www.kaggle.com/datasets/agronomy4future/wheat-grain-area-vs-weight I obtained the equation y = 3.3333x - 13.7155, where y is the grain weight (mg) and x is the grain area (mm²), using both Excel and R. However, this model predicts negative values…
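To make the negative-prediction problem concrete, here is a minimal Python sketch that evaluates the equation quoted in this excerpt. The coefficients come from the excerpt; the function name and the example areas (4.0 and 5.0 mm²) are hypothetical choices for illustration, and the post itself works in Excel and R rather than Python.

```python
# Coefficients quoted in the excerpt: y = 3.3333x - 13.7155
SLOPE = 3.3333        # mg of grain weight per mm^2 of grain area
INTERCEPT = -13.7155  # mg

def predict_weight(area_mm2: float) -> float:
    """Predicted grain weight (mg) for a given grain area (mm^2)."""
    return SLOPE * area_mm2 + INTERCEPT

# Area at which the fitted line crosses zero weight; any smaller area
# yields a negative (physically meaningless) predicted weight.
zero_crossing = -INTERCEPT / SLOPE

# Hypothetical example areas, chosen only to sit on either side of the crossing.
for area in (4.0, 5.0):
    print(f"area = {area} mm^2 -> predicted weight = {predict_weight(area):.4f} mg")
print(f"predictions turn negative below about {zero_crossing:.2f} mm^2")
```

Any grain area below the zero crossing gives a negative predicted weight, which is the limitation the post goes on to discuss.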

[STAT article] Two-Way ANOVA: An Essential Tool for Understanding Factorial Experiments

A factorial experiment involves the simultaneous manipulation of multiple factors or independent variables (x) to study their effects on a dependent variable (y). The experiment is called factorial because it involves testing multiple factors simultaneously. In factorial experiments, the combination of the different levels of each factor being tested is called a factorial, and each factorial represents a unique combination of these levels. For instance, N0_Genotyp1, N0_Genotyp2, N1_Genotyp1, N1_Genotyp2, etc. are different factorials used to conduct the experiment and analyze…

Augment Models: How to Calculate Contrasts and Analyze Your Data with Excel and R?

I have the following data.

Nitrogen  Sulphur  Rep  Yield
0         0        1    1.0
0         0        2    0.9
0         0        3    0.8
N1        S1       1    1.0
N1        S1       2    1.2
N1        S1       3    1.3
N1        S2       1    2.1
N1        S2       2    2.2
N1        S2       3    2.3
N2        S1       1    1.4
N2        S1       2    1.6
N2        S1       3    1.7
N2        S2       1    2.5
N2        S2       2    2.6
N2        S2       3    2.8

Let’s assume that this data is the result of investigating how…

The Best Linear Unbiased Estimator (BLUE): Step-by-Step Guide using R (with AllInOne Package)

In this session, I will introduce the method of calculating the Best Linear Unbiased Estimator (BLUE). Instead of simply listing formulas as many websites do to explain BLUE, this post aims to help readers understand the process of calculating BLUE with an actual dataset using R. I have the following data. location sulphur (kg/ha) block yield Cordoba 0 1 750 Cordoba 24 1 1250 Cordoba 36 1 1550 Cordoba 48 1 1120 Cordoba 0 2 780 Cordoba 24 2 1280…

What is the statistical method for comparing whether the slopes and y-intercepts in a regression model are the same or not (Feat. ANCOVA using R and SAS)?

To gain a basic understanding of the topic, I recommend reading the following post: Analysis of Covariance (ANCOVA). I have a dataset as shown below, and I would like to analyze crop yield and height based on different fertilizer types (Control, Slow-release, and Fast-release). The experimental design is a Completely Randomized Design (CRD) with 10 replicates. Rep Fertilizer Yield Height Fertilizer Yield Height Fertilizer Yield Height 1 Control 12.2 45.0 Slow 16.6 63.0 Fast 9.5 52.0 2 Control 12.4 52.0…

What is the F-ratio in statistics?

Today, I will explain the meaning of the F-value in significance testing. Let me give you an example. Suppose we want to determine whether there are differences in yield among the varieties (A, B, C). There are 12 experimental units in total (3 varieties x 4 replicates). What would happen if there is a significant difference in yield between varieties A and C? If there is a large difference in yield between these varieties, the…

Simple linear regression (5/5)- Coefficient of determination

Here is data for x and y. I would like to perform regression analysis to understand how y changes with x.

n   x    y
1   10   30
2   20   40
3   30   50
4   40   80
5   50   90
6   60   100
7   70   120

I have data for x and y as described above, and want to determine the regression model for this data, where the dependent variable y changes according to the independent variable x, in the form…
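As a rough illustration of where this post is heading, the coefficient of determination for the seven (x, y) pairs listed in the excerpt can be computed by hand in a few lines. This is a minimal Python sketch using ordinary least squares; the function and variable names are hypothetical, and the post itself may lay out the calculation differently.

```python
# The seven (x, y) observations listed in the excerpt.
x = [10, 20, 30, 40, 50, 60, 70]
y = [30, 40, 50, 80, 90, 100, 120]

def fit_simple_regression(x, y):
    """Least-squares slope, intercept and R^2 for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - SSE / SST: share of the variation in y explained by x.
    sst = sum((yi - mean_y) ** 2 for yi in y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - sse / sst
    return slope, intercept, r_squared

slope, intercept, r_squared = fit_simple_regression(x, y)
print(f"y = {intercept:.4f} + {slope:.4f}x, R^2 = {r_squared:.4f}")
```

Computing R² as 1 - SSE/SST keeps the residual and total sums of squares visible, which mirrors a by-hand calculation more closely than calling a library routine.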

What is logistic regression (feat. odds, odds ratio and model equation)?

Logistic regression is a type of statistical analysis used to model the relationship between a binary (yes/no) dependent variable and independent variables. The goal of logistic regression is to find a relationship between the independent variables (x) and the probability of a particular outcome for the dependent variable (y). The logistic regression model calculates the probability of a certain outcome by applying a logistic function to the linear combination of the independent variables. Here is one example. Sulphur improves plant…
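The phrase "applying a logistic function to the linear combination of the independent variables" can be sketched in Python as follows. The coefficients B0 and B1 and the example x value are hypothetical placeholders, not values from the post; only the functional form is standard.

```python
import math

def logistic(z: float) -> float:
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x: float, b0: float, b1: float) -> float:
    """P(y = 1 | x) for a one-predictor logistic regression model."""
    linear_predictor = b0 + b1 * x  # the linear combination, on the log-odds scale
    return logistic(linear_predictor)

# Hypothetical coefficients, chosen only to illustrate the calculation.
B0, B1 = -2.0, 0.1

p = predict_probability(30.0, B0, B1)
log_odds = math.log(p / (1.0 - p))  # recovers b0 + b1 * x
print(f"p = {p:.4f}, log-odds = {log_odds:.4f}")
```

Taking the log-odds of the predicted probability returns the linear predictor, which is why the model is linear on the logit scale even though the probability curve is S-shaped.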

What is split-split-plot design in agronomy research (feat. using R and SAS)?

In my previous post (What is split-plot design in agronomy research?), I explained what a split-plot design and its statistical model are, and how it differs from RCBD. I explained that the main difference between a split-plot design and RCBD is that in a split-plot design the error is divided into two (error a and b), increasing the significance of the interaction between the main plot and sub-plot. Now our interest lies in cases where we have three factors. In a split-plot design, we typically…

What is split-plot design in agronomy research?

Split-plot design has been widely used, particularly in agronomy research. In a split-plot design, the experimental units are divided into smaller units. Split-plot designs are useful when some factors are difficult or expensive to change or when the levels of the factors cannot be randomized (I’ll explain in detail later). A split-plot design consists of one whole plot and one subplot. The whole plot factor is randomly assigned to the experimental units, while the subplot factor is applied to a smaller…

An Introduction to Residual Analysis in Simple Linear Regression Models

Sample No.   x    y
1            10   30
2            20   40
3            30   50
4            40   80
5            50   90
6            60   100
7            70   120

Here is a dataset that allows us to analyze the relationship between x and y and obtain the model equation, y = β0 + β1x. Although statistical programs can provide us with results in just 10 seconds, it is more important to understand the principles behind the calculations than to simply know how to run the…

What is odds, log odds and logit (feat. Slam Dunk story)?

Odds and logit are the basic concepts needed to understand logistic regression. Today I’ll explain them as simply as I can. Do you know the comic book ‘Slam Dunk’? I’ll explain odds with this story. 1) Odds Now, Shohoku High School is playing games against other high schools in the tournament. In the first round, Shohoku High School won 4 games and lost 6 games out of 10 games. The winning odds of Shohoku High School are therefore 4/6 ≈…
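The arithmetic behind the Shohoku example (4 wins, 6 losses) can be sketched in Python as follows. The win and loss counts come from the excerpt; the function names are hypothetical, and the log-odds step assumes the natural logarithm, which is the usual convention for the logit.

```python
import math

# Counts from the excerpt: 4 wins and 6 losses out of 10 games.
wins, losses = 4, 6

def odds_from_counts(wins: int, losses: int) -> float:
    """Odds of winning = number of wins / number of losses."""
    return wins / losses

def probability_from_odds(odds: float) -> float:
    """Convert odds back to a probability: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)

def logit(p: float) -> float:
    """Log odds (logit) of a probability p, using the natural logarithm."""
    return math.log(p / (1.0 - p))

odds = odds_from_counts(wins, losses)
p = probability_from_odds(odds)
log_odds = logit(p)
print(f"odds = {odds:.3f}, probability = {p:.3f}, log odds = {log_odds:.3f}")
```

Odds below 1 correspond to a winning probability below 0.5 and therefore to a negative log odds, which is the link the logit provides between the two scales.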

How to analyze quadratic plateau model in R Studio?

Previous post: □ How to analyze linear plateau model in R Studio? In my previous post, I explained how to analyze a linear plateau model. I simulated yield data for five different crop varieties with different sulphur applications, and suggested that the optimum sulphur application would be 23.3 kg/ha based on the linear plateau model. This time, I’ll explain how to analyze a quadratic plateau model with the same data using R Studio. 1) Data upload If you run the below code, the…

How to analyze linear plateau model in R Studio?

When we talk about regression, we usually mean the simple linear regression model, which describes the relationship between two variables. FYI
□ Simple linear regression (1/5)- correlation and covariance
□ Simple linear regression (2/5)- slope and intercept of linear regression model
A linear plateau model is similar to a simple linear model, but it is a segmented model, and it focuses on the critical value (the x-value above which there is no further increase in y), indicating the plateau value (the statistically highest value…

Simple linear regression (4/5)- t value on the slope and intercept

Simple Linear Regression Series
1) Simple linear regression (1/5)- correlation and covariance
2) Simple linear regression (2/5)- slope and intercept of linear regression model
3) Simple linear regression (3/5)- standard error of slope and intercept
4) Simple linear regression (4/5)- t value on the slope and intercept
5) Simple linear regression (5/5)- Coefficient of determination
In my previous post, I explained how to calculate the standard error of the slope and intercept in a simple linear regression model. Now, I’ll explain how to calculate t…

Simple linear regression (3/5)- standard error of slope and intercept

Previous posts:
□ Simple linear regression (1/5)- correlation and covariance
□ Simple linear regression (2/5)- slope and intercept of linear regression model
In my previous post, I explained how to calculate the slope (β1) and intercept (β0) of a linear regression model. If you followed my previous posts, you will get the above result, y = 89.0 + 1.5x. Now our interest is how to calculate the standard error of the intercept and slope (red box). Here is the equation to obtain standard error of…

Simple linear regression (2/5)- slope and intercept of linear regression model

□ Simple linear regression (1/5)- correlation and covariance
In my previous post, I explained correlation and covariance. Now, I’ll explain the slope (β1) and intercept (β0) of a linear regression model. In the overall picture of a linear regression model, β1 is calculated as β1 = r * Sy / Sx. We already know how to calculate the correlation (r), so we only need to calculate the ratio of the standard deviation of y to that of x. Let’s go back to the…
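The relationship β1 = r * Sy / Sx can be checked numerically with a short Python sketch. The excerpt is cut off before its dataset appears, so the x and y values below are an assumption: they are borrowed from the part 5/5 post listed on this page. The helper names are hypothetical, and the intercept uses the standard identity β0 = ȳ - β1·x̄.

```python
import math

# Assumed data: borrowed from the (5/5) post listed on this page,
# because this excerpt is truncated before its own dataset.
x = [10, 20, 30, 40, 50, 60, 70]
y = [30, 40, 50, 80, 90, 100, 120]

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def pearson_r(x, y):
    """Pearson correlation = sample covariance / (Sx * Sy)."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
    return cov / (sample_sd(x) * sample_sd(y))

r = pearson_r(x, y)
slope_from_r = r * sample_sd(y) / sample_sd(x)  # beta1 = r * Sy / Sx

# Cross-check against the direct least-squares formula Sxy / Sxx.
mx, my = mean(x), mean(y)
slope_direct = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
                / sum((xi - mx) ** 2 for xi in x))
intercept = my - slope_from_r * mx  # beta0 = mean(y) - beta1 * mean(x)

print(f"r = {r:.4f}, slope = {slope_from_r:.4f}, intercept = {intercept:.4f}")
```

Both routes give the same slope because r * Sy / Sx simplifies algebraically to Sxy / Sxx.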

Simple linear regression (1/5)- correlation and covariance

Starting today, I’ll explain the simple linear regression model. There is a lot of information about linear regression on websites, but I believe I can tell you what most people don’t mention. My philosophy on data analysis and statistics is to fully understand the concept, not simply follow what software programs say. Therefore I usually calculate statistical concepts by hand, and only when my hand calculation exactly matches what the software programs provide do I say I understand the concept. In this context,…

What is Probability Density Function (PDF) and Cumulative Distribution Function (CDF): How to calculate using Excel and R ?

When we analyze data, we may need to show graphs depicting normal distributions. These graphs differ from density graphs as they convey various concepts that simple bar graphs cannot. While it is easy to draw these graphs in Excel, understanding the underlying concepts is crucial. In this article, I will explain what the Probability Density Function (PDF) is, and I will show how we can calculate it in both Excel and R. Here is a dataset of 1,000 individual wheat…

R-Squared in ANOVA: A Practical Approach to Calculation and Interpretation

Every time we discuss R², we typically associate it with regression models. However, R² also has a significant role in ANOVA. There seems to be less information available on how to calculate and interpret R² in ANOVA, so today’s topic will focus on how to interpret this measure in the context of ANOVA. Let’s consider an example dataset. Suppose we measured the final yield at varying nitrogen levels. We established three replicates as a block. Consequently, this model will be…

How to calculate PCA (Principal Components Analysis) by hand?

When you conduct PCA (Principal Components Analysis), do you simply accept the result that software programs provide? If we just accept results without question, we never understand the principle of PCA. This time, I’ll introduce how PCA is calculated step by step, and if you read this post, I believe you can fully understand the concept of PCA. Here is a dataset. Let’s say we measured kernel number per ear (KN), average of kernel weight per ear (KW)…
