Browsed by
Category: R programming

Practices in Data Normalization using normtools() in R

■ [R package] Normalization Methods for Data Scaling (Feat. normtools)

In my previous post, I introduced the R package normtools(), which I developed to normalize data using various methods. This time, I’ll demonstrate how to use it for data normalization.

1. Data upload
This data includes kernel number (KN), average kernel weight (AGW), and grain yield (GY) for different corn varieties across various years, populations, and locations.

2. Data normalization
This is the normtools() package. First, I’ll…

[R package] Normalization Methods for Data Scaling (Feat. normtools)

■ [Data article] Data Normalization Techniques: Excel and R as the Initial Steps in Machine Learning

In my previous post, I explained how to normalize data using various methods and demonstrated how to perform the calculations for each method. To simplify these calculations, I recently developed an R package that easily generates normalized data.

1. Install the normtools() package
2. Basic code format
3. Practice with actual dataset (data upload)
4. Normalize data
4.1. Z-test normalization
4.2. Robust Scaling
4.3….

R GIS: Interpolating and Plotting Corn Grain Yield Data

■ Python GIS: Interpolating and Plotting Corn Grain Yield Data In my previous post, I explained how to create a GIS map using Python. Today, I’ll introduce how to create the same GIS map using R. First, let’s install all the required packages. Then I’ll upload a dataset for practice. Next, I’ll extract the columns for latitude, longitude, and y (output) and interpolate the data. Finally, I’ll create a GIS map using ggplot(). Full code If you copy and paste the…

Graphing Normal Distributions with Varied Variances

I want to create a normal distribution graph with a specific variance. First, I need to create the data. I’ll generate data with a mean of 100 and a variance of 100 (which means the standard deviation is 10). It’s also important to establish a range, so I’ll set up a range of 6σ, and the dataset will contain 1,000 rows. Then I’ll create a normal distribution graph. These are graphs with different variances, ranging from 1σ to…
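As a minimal sketch of the data-generation step described above, the base R code below builds 1,000 evenly spaced x values and their normal densities for a mean of 100 and a variance of 100. It assumes that "a range of 6σ" means the mean ± 6σ; the names `mu`, `sigma`, and `df` are illustrative and are not taken from the original post.

```r
mu <- 100            # mean
sigma <- sqrt(100)   # a variance of 100 gives a standard deviation of 10

# Assumption: the plotting range is the mean plus or minus 6 standard deviations
x <- seq(mu - 6 * sigma, mu + 6 * sigma, length.out = 1000)

# One row per x value, with the normal density at that point
df <- data.frame(x = x, density = dnorm(x, mean = mu, sd = sigma))

# Base R line plot of the density curve
plot(df$x, df$density, type = "l", xlab = "x", ylab = "Density")
```

Changing `sigma` and re-running the last three statements would give curves with different variances on the same scale.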

[R package] Calculation for Growing Degree Days (GDDs, °Cd)

Growing Degree Days (GDDs) are a measure of heat accumulation used to predict crop development rates. They provide a simple model for estimating the growth and development of plants, especially crops, from daily temperature. To calculate GDDs, first identify the base temperature for the crop, that is, the temperature below which crop growth is minimal or stops. This temperature varies by crop. For example,…
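The excerpt does not show the formula, so the sketch below uses the commonly cited averaging method, GDD = max((Tmax + Tmin)/2 - Tbase, 0), accumulated over days. The function name `gdd_daily()`, the example temperatures, and the base temperature of 10 °C are assumptions for illustration; this is not the API of the package introduced in the post.

```r
# Daily GDD by the averaging method; negative values are set to 0
# (hypothetical helper, not part of any package)
gdd_daily <- function(tmax, tmin, tbase) {
  pmax((tmax + tmin) / 2 - tbase, 0)
}

# Made-up daily maximum and minimum temperatures (°C)
tmax <- c(25, 30, 12, 8)
tmin <- c(15, 18, 6, 2)

daily <- gdd_daily(tmax, tmin, tbase = 10)  # assumed base temperature
cumulative <- cumsum(daily)                 # heat accumulation over time (°Cd)

print(data.frame(tmax, tmin, daily, cumulative))
```

Days whose mean temperature falls below the base temperature contribute nothing, so the cumulative value stays flat on those days.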

[R package] Prediction of Grain Weight and Area in Bread Wheat (feat. kimindex)

These days, image analysis equipment can easily provide grain area measurements (mm²), and the large datasets acquired instantly from this equipment offer more insights into wheat grains. While grain weight can be a good indicator of wheat yield, obtaining data on grain weight is challenging with the available equipment. Currently, average grain weight is calculated using thousand kernel weight (TKW), a process that is time-consuming and labor-intensive. Therefore, predicting wheat grain weight from the grain area would allow us to…

[R package] Probability Distribution and Z-Score Calculation Function (feat. probdistz)

■ Introduction ■ What is Probability Density Function (PDF) and Cumulative Distribution Function (CDF): How to calculate using Excel and R? In my previous post, I explained what the Probability Density Function (PDF) and the Cumulative Distribution Function (CDF) are. I also explained the formula for the PDF and demonstrated how to manually calculate it in Excel. Additionally, I mentioned the Excel function that performs the same calculation for the PDF, as follows: I then introduced how to create a probability distribution…
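For readers who want the R side of the PDF and CDF calculation, here is a small sketch using only base R functions (`dnorm()` and `pnorm()`), not the package described in the post. The mean, standard deviation, and x value are made-up numbers.

```r
mu <- 100     # assumed mean
sigma <- 10   # assumed standard deviation
x <- 110      # value of interest

# Z-score: distance from the mean in standard deviations
z <- (x - mu) / sigma

# PDF computed manually from the normal density formula
pdf_manual <- 1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * z^2)

# The same PDF value, and the CDF, from base R
pdf_value <- dnorm(x, mean = mu, sd = sigma)
cdf_value <- pnorm(x, mean = mu, sd = sigma)

print(c(z = z, pdf = pdf_value, cdf = cdf_value))
```

The manual formula and `dnorm()` should agree, which is a quick way to check a spreadsheet calculation against R.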

[R package] Finlay-Wilkinson Regression model (feat. fwrmodel)

■ What is Finlay-Wilkinson Regression Model? In my previous post, I introduced what the Finlay-Wilkinson Regression Model is and how to calculate adaptability (or stability). In fact, adaptability and stability are opposite concepts derived from the same data. Have you ever heard of heritability (h²)? Heritability is a key concept in genetics and breeding that measures how much of the variation in a trait within a population is due to genetic differences among individuals. In other words, it quantifies the proportion of phenotypic variation…

In R Studio, how to exclude missing value (NA)?

I’ll create a dataset. For genotype D, the yield data is missing, so it is recorded as NA. Now I’ll calculate the mean yield across all genotypes. As you can see above, we can’t calculate the mean due to the NA. To obtain the mean, we should exclude the NA. Using subset(), we can simply exclude genotype D. But a much simpler way is to use the argument na.rm=TRUE, which lets you avoid subset() altogether. When the data…
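The two approaches mentioned above can be sketched as follows. The data frame and its yield values are made up for illustration; only `subset()` and `na.rm=TRUE` come from the post.

```r
# Made-up data: genotype D has a missing yield
df <- data.frame(
  genotype = c("A", "B", "C", "D"),
  yield = c(100, 120, 110, NA),
  stringsAsFactors = FALSE
)

# The mean is NA because one value is missing
mean_raw <- mean(df$yield)

# Option 1: drop genotype D with subset(), then take the mean
mean_subset <- mean(subset(df, genotype != "D")$yield)

# Option 2: keep the data as-is and ignore NA inside mean()
mean_narm <- mean(df$yield, na.rm = TRUE)

print(c(raw = mean_raw, subset = mean_subset, na.rm = mean_narm))
```

Both options give the same mean here; `na.rm = TRUE` has the advantage of not requiring you to know in advance which rows contain NA.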

How to Sample a Portion of Data using R?

I have one big dataset. Let’s upload it to R. This dataset has 96,319 rows. I want to use only part of it. How can I randomly extract some rows from the whole dataset? First, I’ll number the rows from 1 to the last row to give each row an ID.

Caret package
The caret package (short for Classification And REgression Training) is a set of functions that attempt to streamline the process for creating predictive models. You can find…
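As a simple alternative to the caret approach the post goes on to describe, the sketch below uses base R `sample()` to draw random rows by ID. The simulated data frame, the `id` column name, the seed, and the sample size of 100 are all assumptions for illustration.

```r
set.seed(100)  # fixed seed so the same rows are drawn each time

# Stand-in for the large dataset (simulated values)
df <- data.frame(value = rnorm(1000))

# Give each row an ID from 1 to the last row
df$id <- seq_len(nrow(df))

# Draw 100 IDs at random, without replacement
sampled_ids <- sample(df$id, size = 100)

# Keep only the sampled rows
df_sample <- df[df$id %in% sampled_ids, ]

print(nrow(df_sample))
```

Keeping the ID column makes it easy to trace each sampled row back to the original dataset later.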

Stepwise Regression: A Practical Approach for Model Selection using R

Stepwise selection, forward selection, and backward elimination are all methods used when building statistical models, specifically regression models, where the goal is to select the most relevant predictors. In this post, I’ll introduce them one by one. Let’s generate a dataset. This dataset includes grain yield data, along with measurements of stem biomass, grain weight (agw), and grain number (gn). I would now like to determine which variables are the most critical factors in influencing the final grain…
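The three selection methods can be sketched with base R `lm()` and `step()`, which selects by AIC. The simulated data, coefficients, and column names other than `agw` and `gn` are assumptions for illustration, and the post itself may use a different function or criterion.

```r
set.seed(1)
n <- 100

# Simulated predictors (made-up means and spreads)
df <- data.frame(
  stem_biomass = rnorm(n, mean = 50, sd = 10),
  agw = rnorm(n, mean = 40, sd = 5),
  gn = rnorm(n, mean = 3000, sd = 400)
)

# Simulated yield that depends on agw and gn but not on stem_biomass
df$yield <- 2 + 0.1 * df$agw + 0.002 * df$gn + rnorm(n, sd = 0.3)

null_model <- lm(yield ~ 1, data = df)                        # intercept only
full_model <- lm(yield ~ stem_biomass + agw + gn, data = df)  # all predictors

# Forward selection: start empty, add predictors one at a time
forward <- step(null_model, scope = formula(full_model),
                direction = "forward", trace = 0)

# Backward elimination: start full, drop predictors one at a time
backward <- step(full_model, direction = "backward", trace = 0)

# Stepwise selection: predictors can be added or dropped at each step
both <- step(null_model,
             scope = list(lower = ~1, upper = formula(full_model)),
             direction = "both", trace = 0)

print(formula(both))
```

Setting `trace = 1` instead would print the AIC table at each step, which helps when comparing how the three directions arrive at their final models.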

In R, how to check the data structure?

When uploading data to R, we first need to check the data structure before analyzing it. Here are some tips for checking the data structure in R. First, I’ll upload a dataset from my GitHub. In this dataset, let’s check the structure of the data. ■ Code to display the first or last certain rows When we examine the data, we can simply run the variable df or use print(df) to display it. However, if we want to quickly understand…
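A few base R functions cover most of these checks. The sketch below uses a small made-up data frame in place of the GitHub dataset, so the column names and values are assumptions.

```r
# Made-up stand-in for the uploaded dataset
df <- data.frame(
  genotype = rep(c("A", "B"), each = 5),
  yield = c(10.2, 11.5, 9.8, 10.9, 11.1, 12.3, 12.8, 11.9, 13.0, 12.5),
  stringsAsFactors = FALSE
)

first_rows <- head(df, 3)  # first 3 rows
last_rows <- tail(df, 3)   # last 3 rows

print(first_rows)
print(last_rows)

str(df)         # column types and a preview of values
print(dim(df))  # number of rows and columns
```

`head()` and `tail()` default to 6 rows when the second argument is omitted.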

Coding Light Spectrum Curves for Plant Growth in R

Let’s say we collected relative light intensity data across a wide range of the light spectrum in an LED experiment, and I’d like to create light spectrum curves of relative light intensity. First, I’ll define wavelength colors. The color at different ranges of wavelengths is always the same, so if we run this code, we can obtain the same color range at wavelength (which would be the x-axis of the graph). Then let’s create a curve graph. I’ll highlight the ranges…

[Data article] Data Normalization Techniques: Excel and R as the Initial Steps in Machine Learning

In my previous post, I introduced the necessity of data normalization in visualizing data. By following that post, you may gain an understanding of how we can organize data according to our preferences. □ Why is data normalization necessary when visualizing data? Today, I’ll introduce various methods for data normalization, utilizing the biomass with N and P uptake data available on my GitHub. R coding Python coding I also aim to create regression graphs illustrating the relationship between biomass and…

[Data article] Why is data normalization necessary when visualizing data?

Data normalization is necessary when visualizing data for several key reasons, and I believe the most important is scale uniformity. Different variables can have vastly different scales and units. For example, grain yield might be in Mg/ha, while nutrient contents are typically expressed in %. Normalizing these data to a common scale (like 0 to 1) allows them to be compared and visualized on the same axis without one overshadowing the other due to its scale. Additionally,…
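Scaling to the 0 to 1 range mentioned above is usually done with min-max normalization, (x - min) / (max - min). Here is a minimal sketch; the helper name `min_max()` and the example values are made up for illustration.

```r
# Min-max normalization to the range 0 to 1 (hypothetical helper)
min_max <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Made-up data on very different scales
df <- data.frame(
  yield_Mg_ha = c(8.2, 10.5, 12.1, 9.4),
  nitrogen_pct = c(1.1, 1.4, 1.8, 1.2)
)

# Apply the same scaling to every column
df_norm <- as.data.frame(lapply(df, min_max))

print(df_norm)
```

After scaling, both columns span 0 to 1, so they can share one axis without the yield values dominating the plot.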

How to draw a y-axis border when using facet_wrap() in R? (feat. scales="free")

Here is one dataset, and I’ll use facet_wrap() to create bar graphs. First, let’s summarize the data. Then, I’ll create a bar graph using facet_wrap() to divide panels by irrigation. Now, I want to draw a y-axis border for the ‘Irrigation_Yes’ panel. We can achieve this simply by adding scales="free". © 2022 – 2023 https://agronomy4future.com

How to randomize treatments using R?

Setting up an experimental design that matches your goal is the first step toward a successful experiment. In agronomy studies, experimental design involves the combination of treatments deployed in the field, and these treatments should be randomized. Randomization is important in experimental design as it helps our experiments avoid biases due to physical or biological factors. Of course, there are no specific, unconditional rules for randomization. In a very old-fashioned way, you can write treatment numbers on paper, and…
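In R, the paper-drawing approach corresponds to `sample()`, which shuffles a vector. The sketch below assumes a randomized complete block design with 4 treatments and 3 blocks; the design, treatment labels, and seed are illustrative choices, not taken from the post.

```r
set.seed(2023)  # fixed seed so the layout can be reproduced

treatments <- paste0("T", 1:4)  # hypothetical treatment labels
blocks <- 3                     # assumed number of blocks

# Shuffle the treatments independently within each block
layout <- do.call(rbind, lapply(seq_len(blocks), function(b) {
  data.frame(
    block = b,
    plot = seq_along(treatments),
    treatment = sample(treatments),
    stringsAsFactors = FALSE
  )
}))

print(layout)
```

Because `sample()` is called once per block, every treatment appears exactly once in each block but in a different random order.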

Achieving Smooth Curve Graphs with R

□ How to convert character to POSIXct format in R? In my previous post, I created a curve graph like the one shown below. The curve does not look very smooth, and I want to make it smoother. Therefore, I will add geom_smooth() with method="gam". Code summary: https://github.com/agronomy4future/r_code/blob/main/Achieving_Smooth_Curve_Graphs_with_R.ipynb

How to convert character to POSIXct format in R?

Here is one dataset. Let’s check the data type of each variable. The time column is in character format; when the data is opened in Excel, it is treated as text. I wish to create a time series graph, but this cannot be done while the variable is in text format. Therefore, we need to convert the text to a time format. Now we can adjust the time axis using scale_x_datetime(). Full summary: https://github.com/agronomy4future/r_code/blob/main/How_to_convert_character_to_POSIXct_format_in_R.ipynb
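The conversion step can be sketched with base R `as.POSIXct()`. The example timestamps, the `"%Y-%m-%d %H:%M:%S"` format, and the UTC time zone are assumptions; the format string has to match how the text is actually written in your data.

```r
# Made-up data with time stored as text
df <- data.frame(
  time = c("2023-06-01 08:00:00", "2023-06-01 12:30:00"),
  temp = c(18.5, 24.1),
  stringsAsFactors = FALSE
)

class_before <- class(df$time)  # "character"

# Convert text to date-time; format and time zone are assumptions
df$time <- as.POSIXct(df$time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

class_after <- class(df$time)   # "POSIXct" "POSIXt"

# Date-time values now support arithmetic, e.g. elapsed hours
elapsed_hours <- as.numeric(difftime(df$time[2], df$time[1], units = "hours"))

print(class_before)
print(class_after)
print(elapsed_hours)
```

If the format string does not match the text, `as.POSIXct()` returns NA rather than raising an error, so it is worth checking for NA after converting.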

How to Convert Time to Numeric for Line Graphs in R?

Here is one dataset. With this data, I’ll create a line graph to show the change in day length over time. First, let’s transpose the columns to rows using pivot_longer(). I’ll sort the data by Day and Month, but since the month column is in text format, sorting it from January to December directly isn’t feasible. Therefore, I’ll add a number corresponding to each month for sorting purposes. Now, I can sort by ‘month1’ and ‘Day’ from January 1 to…
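The month-number step can be sketched with base R's built-in `month.name` vector. This assumes the month column holds full English month names; the example rows and the `day_length` column are made up, while `month1` follows the name used in the post.

```r
# Made-up long-format data with months stored as text
df <- data.frame(
  Month = c("March", "January", "February", "January"),
  Day = c(1, 15, 1, 1),
  day_length = c(11.4, 9.6, 10.2, 9.3),
  stringsAsFactors = FALSE
)

# Sorting text would be alphabetical (February, January, March),
# so add a month number from the position in month.name
df$month1 <- match(df$Month, month.name)

# Sort by month number, then by day
df <- df[order(df$month1, df$Day), ]

print(df)
```

If the data used abbreviated names such as "Jan", matching against `month.abb` instead would work the same way.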

Converting Character Values to Numeric in R: A How-To Guide

First, let’s create a dataset and observe the different data formats of each value. I have two sets of yield data: one in character format (yield column) and the other in numeric format (yield1 column). How to convert missing values to 0 when data is numeric? When the data is numeric (yield1 column) and there are missing values, how can we replace them with 0? You can also use the following code. How to convert missing values to 0…
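Both cases can be sketched in base R. The example values are made up; the column names `yield` (character) and `yield1` (numeric) follow the post, and the exact code in the post may differ.

```r
# Made-up data: yield is text, yield1 is numeric, both with a missing value
df <- data.frame(
  yield = c("10.5", "12.1", NA, "9.8"),
  yield1 = c(10.5, NA, 11.2, 9.8),
  stringsAsFactors = FALSE
)

# Numeric column: replace NA with 0 by logical indexing
df$yield1[is.na(df$yield1)] <- 0

# Character column: convert to numeric first, then replace NA with 0
df$yield <- as.numeric(df$yield)
df$yield[is.na(df$yield)] <- 0

str(df)
```

Note that replacing a missing value with 0 changes means and other statistics, so it only makes sense when 0 is a true value for the missing observation.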

How to add separate text to panels divided by facet_wrap() in R?

□ Graph Partitioning Using facet_wrap() in R Studio □ How to customize the title format in facet_wrap()? In my previous posts, I introduced how to divide panels in one figure using facet_wrap(). Today, I’ll introduce how to add separate text to panels. First, let’s make sure we have the required packages installed. I’ll create a dataset as shown below: Next, I’ll reshape the dataset into columns to facilitate data analysis. And then, I’ll summarize this data using descriptive statistics. Finally, I’ll…

In R, Drawing Lines with Different X-axis Starting Positions

In R, I want to draw a line in a graph, and first, I’ll create the data. Next, I’ll create a bar graph. In this graph, I want to draw a horizontal line. The code to draw lines is introduced in the post below. □ Drawing Lines in ggplot() I added a horizontal line to represent the mean yield of all cultivars. Next, I would like to draw a horizontal line starting from Cultivar B. How can this be achieved?…

Matching Datasets in R: An Approach Comparable to Excel’s VLOOKUP Function

I have two datasets. Now, I want to combine these two datasets, but the row numbers differ between the two datasets. In dataB, the 3rd replicate for Tr1 and the 2nd replicate for Tr3 were deleted due to environmental errors. In this case, simply combining the two datasets is not feasible. One solution is to merge them row-wise using the rbind() function. This way, the two datasets will be combined by row. However, my goal is to combine the two…
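A VLOOKUP-style match on key columns can be sketched with base R `merge()`, which may or may not be the function the post goes on to use. The column names (`treatment`, `rep`, `height`, `yield`) and all values are made up; only the missing rows (3rd replicate of Tr1, 2nd replicate of Tr3) follow the description above.

```r
# dataA: complete set of 3 treatments x 3 replicates (made-up values)
dataA <- data.frame(
  treatment = rep(c("Tr1", "Tr2", "Tr3"), each = 3),
  rep = rep(1:3, times = 3),
  height = c(50, 52, 51, 60, 61, 59, 55, 54, 56),
  stringsAsFactors = FALSE
)

# dataB: same keys, but Tr1 rep 3 and Tr3 rep 2 are missing
keep <- !(dataA$treatment == "Tr1" & dataA$rep == 3) &
  !(dataA$treatment == "Tr3" & dataA$rep == 2)
dataB <- dataA[keep, c("treatment", "rep")]
dataB$yield <- c(7.1, 7.4, 8.0, 8.2, 7.9, 7.5, 7.7)

# Match rows by treatment and replicate; all.x = TRUE keeps every row
# of dataA and fills unmatched yield values with NA
combined <- merge(dataA, dataB, by = c("treatment", "rep"), all.x = TRUE)

print(combined)
```

Unlike `rbind()`, which stacks rows, `merge()` lines up rows that share the same key values, so the two deleted replicates simply show NA for yield.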

How to run R codes in Google Colab?

Google Colab is essentially a Jupyter notebook environment, which means that typically only Python code works. However, it is also possible to use R code in Google Colab. If you’re unfamiliar with Google Colab, please read the post below to grasp its general concept. □ How to use Google Colab for Python (power tool to analyze data)? When opening a new Google Colab window, navigate to Runtime in the menu, choose Change runtime type, and a new window will appear,…

How to convert an uploaded data table to a data frame in R?

Let’s say I uploaded a dataset to R. Now, I want to save this data as code so that I can store it in my web note. This is because it would be difficult to find the original dataset after a long time. Therefore, I want to save it as text code in a list on my web note. 1) using dput() First, we can use the dput() function. 2) using datapasta() Second, we can use the datapasta() function. 3) using constructive()…
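The first option, `dput()`, can be sketched as follows. The small data frame is made up; the round trip through `parse()` and `eval()` is only there to show that the printed text is enough to rebuild the data.

```r
# Made-up stand-in for the uploaded dataset
df <- data.frame(
  genotype = c("A", "B"),
  yield = c(10.5, 12.1),
  stringsAsFactors = FALSE
)

# Print the data as R code that can be pasted into a note
dput(df)

# Capture the same text as a character vector
code_text <- capture.output(dput(df))

# Later: rebuild the data frame from the saved text
df_restored <- eval(parse(text = code_text))

print(all.equal(df, df_restored))
```

For large datasets the `dput()` output gets long, so this approach suits small tables that you want to keep alongside the code.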

How to Upload Data from GitHub Using R and Python?

I have soybean yield data that I want to upload to GitHub and access from R. First, let’s upload the data to GitHub. The data should be in .csv format. Click Add file, choose Upload files, and, after uploading, select the Raw button to view the data in .csv format as text. You can then find the address for this data, starting with https://raw.githubusercontent.com/… Let’s copy this address. Next, I’ll bring this data into R from GitHub. Before that, let’s…

Customizing R Graphs: Splitting Text into Two Rows

I have a dataset as below. Now, I want to create a bar graph from this data. First, let’s summarize the data. Then, let’s create a bar graph. Now, to save space, I’d like to split the x-axis text into two rows using the following code. When you run the same code to create a bar graph, the resulting graph is shown below. Code summary: https://github.com/agronomy4future/r_code/blob/main/Customizing_R_Graphs_Splitting_Text_into_Two_Rows.ipynb