How to select/delete specific columns using R STUDIO?


It would be helpful to read the post below before starting.


Data filtering using R Studio


I’ll create a small sample data set.

name=c("Jack","Kate","John","Jane","David","Min","Hyuk","Jisoo")
math=c(90,85,95,75,80,90,90,85)
eng=c(85,90,90,88,95,85,87,88)
country=c("USA","Spain","France","Germany","Netherlands", rep("Korea",3))
gender=c(rep(c("Male","Female"),times=4))
enroll=c(rep(c("Yes","No"),each=4))
grade=data.frame(name,math,eng,country,gender,enroll)

grade
   name math eng     country gender enroll
1  Jack   90  85         USA   Male    Yes
2  Kate   85  90       Spain Female    Yes
3  John   95  90      France   Male    Yes
4  Jane   75  88     Germany Female    Yes
5 David   80  95 Netherlands   Male     No
6   Min   90  85       Korea Female     No
7  Hyuk   90  87       Korea   Male     No
8 Jisoo   85  88       Korea Female     No

Let’s say these are the math and English scores of 8 students from different countries.

Let’s do several things with this data.


How to delete certain columns?

I’d like to delete the math column, so I use the code below. A minus sign in front of a column name in select= drops that column.

grade2=subset(grade, select=-math)
grade2
   name eng     country gender enroll
1  Jack  85         USA   Male    Yes
2  Kate  90       Spain Female    Yes
3  John  90      France   Male    Yes
4  Jane  88     Germany Female    Yes
5 David  95 Netherlands   Male     No
6   Min  85       Korea Female     No
7  Hyuk  87       Korea   Male     No
8 Jisoo  88       Korea Female     No

To delete both the math and eng columns, I use the code below.

grade3=subset(grade,select=c(-math,-eng))
grade3
   name     country gender enroll
1  Jack         USA   Male    Yes
2  Kate       Spain Female    Yes
3  John      France   Male    Yes
4  Jane     Germany Female    Yes
5 David Netherlands   Male     No
6   Min       Korea Female     No
7  Hyuk       Korea   Male     No
8 Jisoo       Korea Female     No

Without using subset(), we can delete columns by position with square brackets, using the pattern below.

data_frame_name[-row_number, -column_number]

For example, grade[, -2] returns the data without the 2nd column. In the same way, grade[-2, ] returns the data without the 2nd row. The original grade is not changed unless the result is assigned back to it.

grade4=grade[,-2]
grade4
   name eng     country gender enroll
1  Jack  85         USA   Male    Yes
2  Kate  90       Spain Female    Yes
3  John  90      France   Male    Yes
4  Jane  88     Germany Female    Yes
5 David  95 Netherlands   Male     No
6   Min  85       Korea Female     No
7  Hyuk  87       Korea   Male     No
8 Jisoo  88       Korea Female     No
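The row form mentioned above can be tried the same way. This is a minimal sketch that rebuilds the same grade data so it runs on its own; grade_norow is a name made up for this illustration.

```r
# Rebuild the sample data so this snippet is self-contained
grade <- data.frame(
  name    = c("Jack","Kate","John","Jane","David","Min","Hyuk","Jisoo"),
  math    = c(90,85,95,75,80,90,90,85),
  eng     = c(85,90,90,88,95,85,87,88),
  country = c("USA","Spain","France","Germany","Netherlands", rep("Korea",3)),
  gender  = rep(c("Male","Female"), times = 4),
  enroll  = rep(c("Yes","No"), each = 4)
)

# A negative index before the comma drops rows; here the 2nd row (Kate)
grade_norow <- grade[-2, ]

nrow(grade_norow)   # 7 rows remain, all 6 columns are kept
```

The remaining rows keep their original row names (1, 3, 4, ...), so the printed output skips 2.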

How about deleting both the 2nd and 3rd columns? The code is below.

grade5=grade[,c(-2,-3)]
grade5
   name     country gender enroll
1  Jack         USA   Male    Yes
2  Kate       Spain Female    Yes
3  John      France   Male    Yes
4  Jane     Germany Female    Yes
5 David Netherlands   Male     No
6   Min       Korea Female     No
7  Hyuk       Korea   Male     No
8 Jisoo       Korea Female     No
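Position numbers change whenever columns are added or reordered, so it can be safer to delete by column name. The sketch below shows two base R ways to do that, assuming the same grade data (rebuilt here so the snippet is self-contained); drop_cols, grade_a and grade_b are names made up for this illustration.

```r
# Rebuild the sample data so this snippet is self-contained
grade <- data.frame(
  name    = c("Jack","Kate","John","Jane","David","Min","Hyuk","Jisoo"),
  math    = c(90,85,95,75,80,90,90,85),
  eng     = c(85,90,90,88,95,85,87,88),
  country = c("USA","Spain","France","Germany","Netherlands", rep("Korea",3)),
  gender  = rep(c("Male","Female"), times = 4),
  enroll  = rep(c("Yes","No"), each = 4)
)

# Columns to delete, given by name instead of position
drop_cols <- c("math", "eng")

# Option 1: logical index, TRUE for every column NOT in drop_cols
grade_a <- grade[, !(names(grade) %in% drop_cols)]

# Option 2: keep the set difference of all names and drop_cols
grade_b <- grade[, setdiff(names(grade), drop_cols)]

names(grade_a)   # "name" "country" "gender" "enroll"
```

Note that a minus sign does not work with names in brackets: grade[, -"math"] gives an error, which is why the logical index or setdiff() is used instead.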

Using the dplyr package

if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
library(dplyr)

grade_1=grade %>%
              dplyr::select(-math,-eng)

grade_1
   name     country gender enroll
1  Jack         USA   Male    Yes
2  Kate       Spain Female    Yes
3  John      France   Male    Yes
4  Jane     Germany Female    Yes
5 David Netherlands   Male     No
6   Min       Korea Female     No
7  Hyuk       Korea   Male     No
8 Jisoo       Korea Female     No
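When the column names to delete are stored in a character vector, select() can use all_of(). This is a minimal sketch, assuming dplyr 1.0.0 or later is installed; drop_cols and grade_drop are names made up for this illustration.

```r
library(dplyr)

# Rebuild the sample data so this snippet is self-contained
grade <- data.frame(
  name    = c("Jack","Kate","John","Jane","David","Min","Hyuk","Jisoo"),
  math    = c(90,85,95,75,80,90,90,85),
  eng     = c(85,90,90,88,95,85,87,88),
  country = c("USA","Spain","France","Germany","Netherlands", rep("Korea",3)),
  gender  = rep(c("Male","Female"), times = 4),
  enroll  = rep(c("Yes","No"), each = 4)
)

# Column names held in a variable (hypothetical helper vector)
drop_cols <- c("math", "eng")

# all_of() turns the character vector into a selection; the minus drops it
grade_drop <- grade %>%
  dplyr::select(-all_of(drop_cols))

names(grade_drop)   # "name" "country" "gender" "enroll"
```

all_of() stops with an error if one of the names does not exist in the data, which helps catch typos early.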


How to select certain columns?

Now I’ll explain how to select certain columns. I’d like to keep only the name, math and country columns.

grade6=subset(grade,select=c(name,math,country))

grade6
   name math     country
1  Jack   90         USA
2  Kate   85       Spain
3  John   95      France
4  Jane   75     Germany
5 David   80 Netherlands
6   Min   90       Korea
7  Hyuk   90       Korea
8 Jisoo   85       Korea

Alternatively, the code below, which selects the columns by position, also works.

grade7=grade[,c(1,2,4)]

grade7
   name math     country
1  Jack   90         USA
2  Kate   85       Spain
3  John   95      France
4  Jane   75     Germany
5 David   80 Netherlands
6   Min   90       Korea
7  Hyuk   90       Korea
8 Jisoo   85       Korea
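The brackets also accept column names, which reads more clearly than positions. The sketch below assumes the same grade data (rebuilt so the snippet runs on its own) and also shows what happens when only one column is selected; grade_byname, math_vec and math_df are names made up for this illustration.

```r
# Rebuild the sample data so this snippet is self-contained
grade <- data.frame(
  name    = c("Jack","Kate","John","Jane","David","Min","Hyuk","Jisoo"),
  math    = c(90,85,95,75,80,90,90,85),
  eng     = c(85,90,90,88,95,85,87,88),
  country = c("USA","Spain","France","Germany","Netherlands", rep("Korea",3)),
  gender  = rep(c("Male","Female"), times = 4),
  enroll  = rep(c("Yes","No"), each = 4)
)

# Select several columns by name; the result is a data frame
grade_byname <- grade[, c("name", "math", "country")]

# Selecting ONE column this way returns a plain vector by default
math_vec <- grade[, "math"]

# drop = FALSE keeps the one-column result as a data frame
math_df <- grade[, "math", drop = FALSE]

class(math_vec)   # "numeric"
class(math_df)    # "data.frame"
```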

Using the dplyr package

if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
library(dplyr)

grade_2=grade %>%
              dplyr::select(name,math,country)

grade_2
   name math     country
1  Jack   90         USA
2  Kate   85       Spain
3  John   95      France
4  Jane   75     Germany
5 David   80 Netherlands
6   Min   90       Korea
7  Hyuk   90       Korea
8 Jisoo   85       Korea
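select() also accepts ranges of neighbouring columns and selection helpers, so the columns do not always have to be listed one by one. This is a minimal sketch, assuming dplyr 1.0.0 or later is installed; grade_range and grade_num are names made up for this illustration.

```r
library(dplyr)

# Rebuild the sample data so this snippet is self-contained
grade <- data.frame(
  name    = c("Jack","Kate","John","Jane","David","Min","Hyuk","Jisoo"),
  math    = c(90,85,95,75,80,90,90,85),
  eng     = c(85,90,90,88,95,85,87,88),
  country = c("USA","Spain","France","Germany","Netherlands", rep("Korea",3)),
  gender  = rep(c("Male","Female"), times = 4),
  enroll  = rep(c("Yes","No"), each = 4)
)

# A range: every column from name through eng, in data order
grade_range <- grade %>%
  dplyr::select(name:eng)

# A helper: name plus every numeric column (math and eng here)
grade_num <- grade %>%
  dplyr::select(name, where(is.numeric))

names(grade_range)   # "name" "math" "eng"
names(grade_num)     # "name" "math" "eng"
```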

