Category: R programming

Displaying Axis Values as Percentages in R Studio with Simple Code

Let’s create a simple dataset and draw a bar graph with it. The values on the y-axis are shown as decimals, and I would like to display them as percentages. To do so, I insert labels=scales::percent inside the scale_y_continuous() function. After adding this to the complete code, the values on the y-axis change to percentages.
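
As a minimal sketch of this step, assuming the ggplot2 and scales packages are installed, the example below uses a hypothetical three-row dataset (the column names `group` and `ratio` are invented for illustration):

```r
library(ggplot2)

# Hypothetical data: proportions stored as decimals
df <- data.frame(
  group = c("A", "B", "C"),
  ratio = c(0.25, 0.50, 0.75)
)

p <- ggplot(df, aes(x = group, y = ratio)) +
  geom_col() +
  # scales::percent turns decimal axis labels into percentage labels
  scale_y_continuous(labels = scales::percent)

p
```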

Uploading Excel Data in R and Converting it to Code for Improved Management

Recently, I uploaded a large Excel dataset into R. The data consists of 7,849 columns, and the Excel file is approximately 1 MB in size. Now I’d like to share this data with someone else. However, instead of attaching it as an Excel file, I want to send it as R code so that they can work with the data directly in R. All I need to do, then, is convert the data into code. Below is how you…
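
The excerpt does not show which function the post uses. One common base R approach, offered here only as an assumption, is dput(), which prints code that recreates an object. The small data frame below is a hypothetical stand-in for the Excel data:

```r
# Hypothetical small data frame standing in for the uploaded Excel data
df <- data.frame(
  genotype = c("CV1", "CV2", "CV3"),
  yield = c(10.5, 12.1, 9.8),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object; that text can be shared
dput(df)

# Capture the same code as text and evaluate it to confirm the round trip
code <- paste(deparse(df), collapse = "\n")
restored <- eval(parse(text = code))
all.equal(df, restored)
```

The recipient only needs to paste the printed code after `df <-` in their own script.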

Exporting Individual Graph Images with R Studio and ggsave()

After creating a graph in R, repeatedly copying and pasting it to move it elsewhere becomes cumbersome. Today, I’ll demonstrate how to export the graph easily. Let’s generate some data and draw a graph. Running the code displays the graph in the Plots window. Each time, you then need to click Export and save it under a different name, or copy the image, to place the graph where you want it. In reality, this task is…
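
A minimal sketch of saving a plot with ggsave(), assuming ggplot2 is installed. The data, the file name `graph_01.png`, and the size settings are all hypothetical choices for illustration:

```r
library(ggplot2)

# Hypothetical data for a simple scatter plot
df <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))
p <- ggplot(df, aes(x = x, y = y)) + geom_point()

# Write the plot straight to a file instead of using the Export button
out_file <- file.path(tempdir(), "graph_01.png")
ggsave(filename = out_file, plot = p, width = 5, height = 4, dpi = 300)

file.exists(out_file)
```

Changing only the file name in each call gives one image per graph.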

Exploring Axis Title and Text Spacing Adjustment in R Studio for Graphs

If you visit FAOSTAT (https://www.fao.org/faostat/en/), you can download high-quality data related to agriculture. Recently, I analyzed trends in global and European wheat harvest quantities and drew a graph of the results. In that graph, the axis titles seem too close to the axis text, so I’d like to increase the spacing a bit. From now on, I’ll be…
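
The excerpt stops before the method is shown. One way to widen the gap in ggplot2, given here as an assumption rather than as the post's own code, is the margin argument of element_text(). The numbers below are placeholders, not FAOSTAT figures:

```r
library(ggplot2)

# Placeholder values for illustration only (not real FAOSTAT data)
df <- data.frame(
  year = 2015:2020,
  harvest = c(730, 750, 770, 735, 765, 760)
)

p <- ggplot(df, aes(x = year, y = harvest)) +
  geom_line() +
  labs(x = "Year", y = "Wheat harvest") +
  theme(
    # t = top margin pushes the x title away from the tick labels
    axis.title.x = element_text(margin = margin(t = 15)),
    # r = right margin pushes the y title away from the tick labels
    axis.title.y = element_text(margin = margin(r = 15))
  )

p
```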

Creating a Data Frame in R Studio

Today, I will show you how to create a data frame using R Studio. We have several variables that we will combine into a data frame. The ‘nation’ variable consists of five countries: “USA”, “GERMANY”, “NETHERLANDS”, “DENMARK”, and “KOREA”. We also have some survey data on the happiness and economic power of each country. To create a data frame in R, we can use the data.frame() function to combine all variables. In this example, I have written the code as…
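
A short sketch of the data.frame() call described above. The five country names come from the text; the `happiness` and `economic_power` values are invented placeholders, since the excerpt does not list the survey numbers:

```r
nation <- c("USA", "GERMANY", "NETHERLANDS", "DENMARK", "KOREA")

# Placeholder survey values, invented for illustration only
happiness <- c(7.0, 7.1, 7.4, 7.6, 5.9)
economic_power <- c(10, 8, 6, 5, 7)

# Each vector becomes one column; all vectors must have the same length
df <- data.frame(nation, happiness, economic_power)
str(df)
```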

Creating Visual Emphasis: Adding Dotted Boxes to Graphs in R Studio

I’ll explain how to insert a box in a graph to highlight part of it. First, I’ll generate some data on the yield and standard error for five different genotypes and create a bar chart to visualize it. In this graph, genotypes D and E exhibit greater yields than the other genotypes. My objective is to emphasize genotypes D and E by adding a dotted box. To achieve this, we can use geom_rect(). For geom_rect(), I set…
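
The settings the post passes to geom_rect() are cut off, so the sketch below uses assumed coordinates and hypothetical yield values. It relies on the fact that bars on a discrete x-axis sit at positions 1, 2, 3, and so on:

```r
library(ggplot2)

# Hypothetical yields and standard errors for five genotypes
df <- data.frame(
  genotype = c("A", "B", "C", "D", "E"),
  yield = c(40, 42, 41, 55, 58),
  se = c(2, 2.5, 1.8, 2.2, 2.4)
)

# D and E are the 4th and 5th bars, so the box spans x = 3.5 to 5.5
box <- data.frame(xmin = 3.5, xmax = 5.5, ymin = 0, ymax = 65)

p <- ggplot(df, aes(x = genotype, y = yield)) +
  geom_col(fill = "grey60") +
  geom_errorbar(aes(ymin = yield - se, ymax = yield + se), width = 0.2) +
  geom_rect(
    data = box,
    aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
    inherit.aes = FALSE,   # the box has no genotype or yield columns
    fill = NA, color = "red", linetype = "dotted"
  )

p
```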

How to automatically insert linear regression equation in graph in RSTUDIO?

Sometimes we need to insert a linear regression equation inside a graph, but it’s annoying to type the equation every time we generate a regression graph. Using stat_poly_eq(), we can insert the equation automatically. Let’s generate a data frame and then draw a regression graph. Next, let’s fit the linear regression. The linear model equation is y = 9.1429 + 1.5357x and R² is 0.9245. Now I’ll insert this equation automatically using stat_poly_eq(). I’ll add the…

In R STUDIO, how to apply the same font type and size in ggplot?

First, let’s generate some simple data. Then I’ll make a bar graph using ggplot2. In the code for this bar graph, I repeated the font type and size over and over to apply the same font to the graph title and text (and to the x and y axes). I want to reduce this repeated code, and the solution is to use theme_grey(). axis.title.x= element_text (family="serif", size=15, color="black"), axis.title.y=…
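
As a sketch of the idea, assuming ggplot2 is installed: theme_grey() accepts base_size and base_family, which set the font for all text elements at once. The data below is hypothetical, and whether the "serif" family renders depends on the graphics device:

```r
library(ggplot2)

# Hypothetical data for a small bar graph
df <- data.frame(
  treatment = c("T1", "T2", "T3"),
  yield = c(12, 15, 9)
)

p <- ggplot(df, aes(x = treatment, y = yield)) +
  geom_col() +
  labs(title = "Yield by treatment") +
  # One call sets font family and base size for title, axis titles, and axis text
  theme_grey(base_size = 15, base_family = "serif") +
  # Only the settings that differ from the base need to be listed separately
  theme(axis.text = element_text(color = "black"))

p
```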

Creating Stacked Bar Graphs in R Studio: A Step-by-Step Guide

Today, I’ll introduce how to create stacked bar graphs using R Studio. To start, I’ll generate a data table and make stacked bar graphs from it. First of all, the data needs to be summarized, and I’ll use the ddply() function for that. If I use this code, an error message pops up. This is because, when generating the data, I wrapped the values in double quotation marks, such as yield = c(rep("15", 5), rep("18", 5), rep("20", 8), rep("14", 7), rep("21",…
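
The cause described above can be reproduced on a reduced, hypothetical dataset: quoted numbers are stored as text, so they must be converted before summarizing. The sketch uses base R aggregate() as a stand-in for ddply(), which is a swapped-in function and not the post's own code:

```r
# Hypothetical reduced data: yields typed with quotes are stored as text
df <- data.frame(
  genotype = c(rep("CV1", 3), rep("CV2", 3)),
  yield = c(rep("15", 3), rep("18", 3)),
  stringsAsFactors = FALSE
)
is.numeric(df$yield)   # FALSE, because the values are character strings

# Convert the text to numbers before summarizing
df$yield <- as.numeric(df$yield)

# Group means, computed here with base R instead of ddply()
res <- aggregate(yield ~ genotype, data = df, FUN = mean)
res
```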

How to conduct Least Significant difference (LSD) test using R STUDIO?

For mean comparison among treatments, the Least Significant Difference (LSD) test is a commonly used method. Today I’ll introduce the LSD test using R Studio. Here is a dataset on the yield difference of CV1 in response to four nitrogen fertilizer levels (N0, N1, N2, N3). First of all, let’s check the mean for each nitrogen fertilizer level. It seems that yield differs among nitrogen fertilizer levels, but we need to confirm it statistically. First, I’ll run One-Way ANOVA…
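
The first two steps named above (group means, then one-way ANOVA) can be sketched in base R. The yield values are hypothetical, invented only so the example runs:

```r
# Hypothetical yields for four nitrogen levels, three replicates each
df <- data.frame(
  nitrogen = factor(rep(c("N0", "N1", "N2", "N3"), each = 3)),
  yield = c(10, 11, 12, 14, 15, 16, 18, 19, 20, 21, 22, 23)
)

# Mean yield per nitrogen level
means <- tapply(df$yield, df$nitrogen, mean)
means

# One-way ANOVA: does mean yield differ among nitrogen levels?
fit <- aov(yield ~ nitrogen, data = df)
summary(fit)
```

The LSD comparison itself is often run on the fitted model with LSD.test() from the agricolae package. That is an assumption here, because the excerpt ends before it names the function.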

Graph Partitioning Using facet_wrap() in R Studio

While creating graphs, you can certainly draw multiple graphs in a single panel. However, you can also use the facet_wrap() function to divide graphs based on specific variables. First, let’s generate a dataset. I intend to create a bar graph using this data. Therefore, I need to summarize the data. To do this, I must reorganize the data from being divided into columns to being arranged in rows. I have reorganized the data into rows using the reshape2::melt() function. Now,…

Utilizing stat_summary() in R Studio to Summarize Data Graphically

When creating graphs from data, especially those involving error bars, the standard error must first be calculated by summarizing the data. There are various ways to summarize data (see the post “Utilizing R Studio for Data Grouping and Mean/Standard Error Calculation (feat ddply)”). Today I will introduce a way to create such graphs in one step using stat_summary(), without the need for that data summarization. Below is an example dataset: Now I want to display the data as points by placing…
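
A minimal sketch of the stat_summary() approach, assuming ggplot2 3.3.0 or later (for the `fun` argument). The raw data is hypothetical; mean_se() is the ggplot2 helper that returns the mean plus and minus one standard error:

```r
library(ggplot2)

# Hypothetical raw (unsummarized) data: four replicates per treatment
df <- data.frame(
  treatment = rep(c("Control", "Stress"), each = 4),
  yield = c(20, 22, 21, 23, 15, 14, 17, 16)
)

p <- ggplot(df, aes(x = treatment, y = yield)) +
  # Mean of each group drawn as a point
  stat_summary(fun = mean, geom = "point", size = 3) +
  # Mean +/- standard error drawn as an error bar
  stat_summary(fun.data = mean_se, geom = "errorbar", width = 0.2)

p
```

No separate summary table is needed, because each stat_summary() layer computes its statistic from the raw rows.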

Transforming Data: Stacking Multiple Columns into Rows Using R

One common mistake when organizing data is collecting it as depicted below (## represents the result). While it might seem easy to input values into each column, listing data in this manner complicates statistical analysis. To conduct statistical analysis on such listed data, it is necessary for the data to be arranged as shown below. In other words, statistical analysis of an experimental design with three replicates for the two variables, Genotype and Field, can be conducted when the data…
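
The post's own data layout is not visible in the excerpt, so the sketch below assumes a hypothetical wide table with one column per field and uses base R stack() to arrange it in rows:

```r
# Hypothetical wide layout: one column of values per field
wide <- data.frame(
  Genotype = c("A", "B"),
  Field1 = c(10, 12),
  Field2 = c(11, 14),
  stringsAsFactors = FALSE
)

# stack() puts the value columns on top of each other and
# records the source column name in a factor called `ind`
long <- data.frame(
  Genotype = rep(wide$Genotype, times = 2),
  stack(wide[, c("Field1", "Field2")])
)
names(long) <- c("Genotype", "Yield", "Field")
long
```

In the long layout, Genotype and Field are ordinary columns, so they can be used directly as factors in a model formula.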

Stacking Data Vertically from Multiple Columns in R Studio (feat. reshape package)

Previously, I posted about how to change the data structure in another scenario (see “Combining Factors from Separate Datasets into a Single Column Using R Studio (feat. dplyr package)”). This time, I will introduce a method for changing the structure of data as shown below. Specifically, this is about cases where there are multiple columns within a single dataset, not two different datasets. I will create a simple dataset for illustration purposes. This is a method for taking input…

Combining Factors from Separate Datasets into a Single Column Using R Studio (feat. dplyr package)

When data is divided into two separate datasets, it needs to be combined into a single column. Using R, we can simply combine the two datasets. I will create a simple dataset. Now I will combine these two datasets into one. 1) using data.frame() To explain the below code simply, we are using the function data.frame() to combine two datasets. Regarding the repetition of the text “Defoliation,” it indicates repeating it by the number of times corresponding to the values…

Utilizing R Studio for Data Grouping and Mean/Standard Error Calculation (feat ddply)

The function I will introduce today is ddply(). This function is convenient for summarizing large amounts of data and can also calculate standard errors, making it easy to create bar graphs. First, install the package. Once the installation is complete, let’s upload some data. This dataset consists of results from cultivating 4 genotypes under 4 different nitrogen treatment conditions with 4 replicates each. In other words, it comprises a total of 64 data points (4 x 4 x 4). When…

Performing a Two-Way ANOVA with Blocks using R Studio

I’ll load a dataset into R. I have 10 corn varieties and want to analyze the impact of nitrogen treatment (N0 and N1), variety, and their interaction on grain yield. Since replicates are considered blocks, I will use a two-way ANOVA with blocks as the statistical model: $y_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + \gamma_k + \varepsilon_{ijk}$, where $y_{ijk}$: observed values for treatment (ij; i…
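
A sketch of fitting this kind of model with base R aov(). To keep it short, the example simulates a hypothetical dataset with 3 varieties instead of 10; the column names and simulated yields are assumptions:

```r
set.seed(1)

# Hypothetical design: 3 varieties x 2 nitrogen levels x 3 blocks = 18 plots
df <- expand.grid(
  variety = paste0("V", 1:3),
  nitrogen = c("N0", "N1"),
  block = paste0("B", 1:3)
)

# Simulated yields: N1 adds 5 units on average, plus random noise
df$yield <- 50 + 5 * (df$nitrogen == "N1") + rnorm(nrow(df), sd = 2)

# nitrogen * variety expands to both main effects plus their interaction;
# block enters as an additive term
fit <- aov(yield ~ nitrogen * variety + block, data = df)
summary(fit)
```

With 18 observations, the terms use 1 + 2 + 2 + 2 degrees of freedom, which leaves 10 for the residual.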

Streamlined Data Summary in R STUDIO: Enhancing Bar Graphs with Error Bars

When working with data in R, there are situations where you might need to examine summarized information, such as means, standard deviations, and more. Today, I will introduce the methods that can be employed for this purpose. Let’s start by loading a dataset. As I engage in various tasks involving this data, I aim to summarize it. Therefore, I will introduce methods applicable to such situations. 1) using plyr package First, install and activate the package. I want to summarize…

Enhancing Graph Points in R Studio: Adding Distinct Borders

I have the dataset below. It describes the differences in grain yield and height resulting from various fertilizer treatments. Now I’ll create a graph using the following code. Since I assigned the color based on “Fertilizer” with geom_point(aes(color=Fertilizer)), the colors are automatically distinguished in the plotted graph. However, I would like to add borders to these points in order to represent them more distinctly. By the way, if you include shape=Fertilizer within aes(),…
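
The excerpt ends before the border method is shown. One common ggplot2 approach, given here as an assumption, is a fillable point shape such as shape = 21, where `fill` sets the inside and `color` sets the border. The data is hypothetical:

```r
library(ggplot2)

# Hypothetical heights and yields for three fertilizer treatments
df <- data.frame(
  Fertilizer = rep(c("Control", "N", "NPK"), each = 3),
  height = c(60, 62, 61, 70, 72, 71, 80, 83, 81),
  yield = c(30, 32, 31, 40, 41, 43, 50, 52, 51)
)

p <- ggplot(df, aes(x = height, y = yield)) +
  geom_point(
    aes(fill = Fertilizer),  # inside color follows the treatment
    shape = 21,              # circle with separate fill and border
    color = "black",         # border color
    size = 4,
    stroke = 1               # border thickness
  )

p
```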

How to select/delete specific variables using R STUDIO?

In my previous post, “How to select and delete specific columns using R STUDIO?”, I explained how to select or delete specific columns. This time, I’ll elaborate on selecting or deleting specific variables within columns. Once again, I’ll generate a new set of data. If I want to divide the data by genotype, I use the code below. But what if I simply want to delete all instances of the CV2 genotype? The code is below. Alternatively, the code…
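
The two tasks described above can be sketched in base R on a hypothetical dataset. The post's own code is not visible, so split() and subset() are assumed choices:

```r
# Hypothetical data: two yield observations per genotype
df <- data.frame(
  genotype = rep(c("CV1", "CV2", "CV3"), each = 2),
  yield = c(10, 11, 20, 21, 30, 31),
  stringsAsFactors = FALSE
)

# Divide the data by genotype: a list with one data frame per genotype
by_genotype <- split(df, df$genotype)
names(by_genotype)

# Delete every row where genotype is CV2
no_cv2 <- subset(df, genotype != "CV2")

# The same result with logical indexing
no_cv2_alt <- df[df$genotype != "CV2", ]
no_cv2
```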

Modifying Graph Axis Style using ggplot in R Studio

This is a method for adjusting the axis formatting of a graph drawn using ggplot() in R Studio. I will create a simple dataset as shown below. Next, I will use this data to create a bar graph. I have created the bar graph as shown above. Now, let’s proceed to adjust the axis formatting of the graph. 1) Setting Axis Range I will adjust the range of the y-axis values. I want to set the range from -1 to…

Illustrating Data Trends with a Line Graph in R Studio

I’ll introduce a method of creating a line graph using R, with geom_line() inside ggplot(). First, let’s load the file. This data describes the changes in chlorophyll content and leaf greenness over time (days after planting). The dataset contains information from two distinct locations (Northern, Southern) and two genotypes (CV1, CV2), each with three stress treatments (Control, Stress_1, Stress_2). Now, I’d like to know how chlorophyll content changes under the stress treatments for each genotype at each location….

How to select/delete specific columns using R STUDIO?

It would be helpful to read the post “Data filtering using R Studio” before starting. I’ll generate a dataset. Let’s say it contains the math and English scores of 8 students from different countries. Let’s do several things with this data. ■ How to delete a certain column? I’d like to delete the math column, so I use the code below. In case I want to delete both the math and eng columns, I use the code below. Without using subset(),…
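
A sketch of the column deletions described above, using subset() and an indexing alternative. The student names, countries, and scores are invented; only the column names `math` and `eng` come from the text:

```r
# Hypothetical scores for 8 students
df <- data.frame(
  student = paste0("S", 1:8),
  country = c("USA", "KOREA", "JAPAN", "SPAIN", "CHINA", "FRANCE", "ITALY", "INDIA"),
  math = c(90, 85, 70, 65, 95, 80, 75, 88),
  eng = c(80, 75, 85, 90, 70, 95, 60, 78),
  stringsAsFactors = FALSE
)

# Delete the math column
no_math <- subset(df, select = -math)

# Delete both math and eng
no_scores <- subset(df, select = -c(math, eng))

# The same deletion without subset(), by filtering on column names
no_scores_alt <- df[, !(names(df) %in% c("math", "eng"))]
names(no_scores_alt)
```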

How to Combine Data in R Studio using c/r bind() and merge() Functions?

I will post a method for combining the columns and rows of two datasets in R. I have created two datasets simply for this purpose. And now, let’s combine these two datasets. I have combined these two tables into one. In fact, combining columns is simple because you can just put them side by side. However, when combining rows, it is important to check if the names of each column are the same before merging to prevent data from being…
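
The three functions named in the title can be sketched on small hypothetical tables. All table and column names below are invented for illustration:

```r
# Two hypothetical tables with the same column names
a <- data.frame(id = 1:3, yield = c(10, 12, 14))
b <- data.frame(id = 4:5, yield = c(16, 18))

# rbind() stacks rows; the column names must match
rows <- rbind(a, b)

# cbind() places columns side by side; the row counts must match
heights <- data.frame(height = c(100, 110, 120))
cols <- cbind(a, heights)

# merge() joins two tables on a shared key column
info <- data.frame(
  id = c(1, 2, 3),
  site = c("North", "South", "North"),
  stringsAsFactors = FALSE
)
merged <- merge(a, info, by = "id")
merged
```

Checking `identical(names(a), names(b))` before calling rbind() is a simple guard against mismatched columns.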

Calculating Group Means for Each Variable in R and Inserting Them into a New Column

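Suppose we have the data below. I want to calculate the mean for each treatment in this data and insert it into a new column, so I will try the code below. However, this code looks too long. It would be very convenient to have code that performs the same calculation much more simply. tapply() makes the same calculation possible. I want to create a column named mean2 and insert the mean of each variable into it. This gives the same values as the first method, in which each mean was calculated one by one. In other words, tapply() provides the same calculation as the long code below. If there is not just one variable but…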
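
A minimal sketch of the tapply() step, on a hypothetical dataset. The column names `treatment` and `mean2` come from the text; `yield` and its values are assumptions:

```r
# Hypothetical data: two observations per treatment
df <- data.frame(
  treatment = rep(c("A", "B", "C"), each = 2),
  yield = c(10, 12, 20, 22, 30, 34),
  stringsAsFactors = FALSE
)

# tapply() returns one mean per treatment, named by treatment
group_means <- tapply(df$yield, df$treatment, mean)

# Look up each row's treatment in that named vector to fill the new column
df$mean2 <- as.vector(group_means[as.character(df$treatment)])
df
```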
