How to delete and change specific text within a column in R?
When you want to change text within a column, there are several methods, which I introduced in an earlier post.
□ How to Rename Variables within Columns in R?
However, replacing an entire value is different from changing only a specific part of the text within it. Let’s load some data.
library(readr)
github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/phosphorous_data.csv"
dataA= data.frame(read_csv(url(github), show_col_types= FALSE))
dataA
            Location      ID  OM  P  pH
1          West.Hill Delta_1 4.4 30 6.0
2          West.Hill Delta_2 4.2 16 6.1
3          West.Hill Delta_3 3.6 11 6.2
4          West.Hill Delta_4 3.6 12 5.4
5          West.Hill Delta_5 4.7 15 6.3
6          West.Hill Delta_6 4.5 33 6.5
7          West.Hill Delta_7 3.7 11 6.7
8         East.Plain Delta_1 3.6 11 6.6
9         East.Plain Delta_2 2.8  6 7.1
10        East.Plain Delta_3 4.0 11 6.4
11        East.Plain Delta_4 5.8 29 7.0
12        East.Plain Delta_5 4.3 14 6.2
13        East.Plain Delta_6 4.2 14 5.8
14        East.Plain Delta_7 4.6 16 6.2
15 North.Rainshelter Delta_1 4.5 19 6.7
16 North.Rainshelter Delta_2 4.6 35 5.6
17 North.Rainshelter Delta_3 3.8 27 6.2
18 North.Rainshelter Delta_4 4.1 19 6.0
19 North.Rainshelter Delta_5 3.5 26 5.8
20 North.Rainshelter Delta_6 4.1 19 6.0
21 North.Rainshelter Delta_7 3.6 15 5.9
Now we can change a value in the Location column with the following code:
dataA$Location[dataA$Location=="East.Plain"]="East farmland"
dataA
            Location      ID  OM  P  pH
1          West.Hill Delta_1 4.4 30 6.0
2          West.Hill Delta_2 4.2 16 6.1
3          West.Hill Delta_3 3.6 11 6.2
4          West.Hill Delta_4 3.6 12 5.4
5          West.Hill Delta_5 4.7 15 6.3
6          West.Hill Delta_6 4.5 33 6.5
7          West.Hill Delta_7 3.7 11 6.7
8      East farmland Delta_1 3.6 11 6.6
9      East farmland Delta_2 2.8  6 7.1
10     East farmland Delta_3 4.0 11 6.4
11     East farmland Delta_4 5.8 29 7.0
12     East farmland Delta_5 4.3 14 6.2
13     East farmland Delta_6 4.2 14 5.8
14     East farmland Delta_7 4.6 16 6.2
15 North.Rainshelter Delta_1 4.5 19 6.7
16 North.Rainshelter Delta_2 4.6 35 5.6
17 North.Rainshelter Delta_3 3.8 27 6.2
18 North.Rainshelter Delta_4 4.1 19 6.0
19 North.Rainshelter Delta_5 3.5 26 5.8
20 North.Rainshelter Delta_6 4.1 19 6.0
21 North.Rainshelter Delta_7 3.6 15 5.9
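The == comparison used above only matches cells whose entire value equals the given string, which is why it cannot be used to change part of a value. The following is a minimal sketch of that behavior on a small, hypothetical data frame (df is an illustrative stand-in, not the dataset loaded above).

```r
# Hypothetical three-row stand-in for the Location column
df <- data.frame(
  Location = c("West.Hill", "East.Plain", "North.Rainshelter"),
  stringsAsFactors = FALSE
)

# Whole-value match: the cell equal to "East.Plain" is replaced
df$Location[df$Location == "East.Plain"] <- "East farmland"

# Partial text never equals a whole cell, so this line changes nothing
df$Location[df$Location == "Rainshelter"] <- "shelter"

print(df$Location)
```

In this sketch the third value stays "North.Rainshelter", because no cell is exactly "Rainshelter".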
How about changing the text in the ID column? I want to remove ‘Delta_’ and keep only the numbers. Would you change the values one by one, as follows?
dataA$ID[dataA$ID=="Delta_1"]="1"
dataA$ID[dataA$ID=="Delta_2"]="2"
dataA$ID[dataA$ID=="Delta_3"]="3"
.
.
.
dataA$ID[dataA$ID=="Delta_7"]="7"
That is a waste of time. A simpler way is sub(), which replaces the first match of a pattern in each element of the column.
dataA$ID=sub("Delta_","", dataA$ID)
dataA
            Location ID  OM  P  pH
1          West.Hill  1 4.4 30 6.0
2          West.Hill  2 4.2 16 6.1
3          West.Hill  3 3.6 11 6.2
4          West.Hill  4 3.6 12 5.4
5          West.Hill  5 4.7 15 6.3
6          West.Hill  6 4.5 33 6.5
7          West.Hill  7 3.7 11 6.7
8      East farmland  1 3.6 11 6.6
9      East farmland  2 2.8  6 7.1
10     East farmland  3 4.0 11 6.4
11     East farmland  4 5.8 29 7.0
12     East farmland  5 4.3 14 6.2
13     East farmland  6 4.2 14 5.8
14     East farmland  7 4.6 16 6.2
15 North.Rainshelter  1 4.5 19 6.7
16 North.Rainshelter  2 4.6 35 5.6
17 North.Rainshelter  3 3.8 27 6.2
18 North.Rainshelter  4 4.1 19 6.0
19 North.Rainshelter  5 3.5 26 5.8
20 North.Rainshelter  6 4.1 19 6.0
21 North.Rainshelter  7 3.6 15 5.9
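Two details may be worth keeping in mind here. First, sub() returns character strings, so the ID values are still text until they are converted. Second, sub() replaces only the first match in each element, whereas gsub() replaces every match. A small sketch, using hypothetical vectors (id and x are made up for illustration):

```r
# Hypothetical ID values in the same style as the ID column
id <- c("Delta_1", "Delta_2", "Delta_7")

# Remove the prefix; the result is still a character vector
id_chr <- sub("Delta_", "", id)

# Convert to integer if the IDs should behave as numbers
id_num <- as.integer(id_chr)

# sub() stops after the first match; gsub() replaces all matches
x <- "Delta_Delta_1"
first_only <- sub("Delta_", "", x)    # "Delta_1"
all_matches <- gsub("Delta_", "", x)  # "1"

print(id_num)
print(c(first_only, all_matches))
```

Whether to convert the IDs to numbers depends on how they are used later; as labels for grouping, leaving them as text is usually fine.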
How about changing Rainshelter to shelter in the Location column?
dataA$Location=sub("Rainshelter","shelter", dataA$Location)
dataA
        Location ID  OM  P  pH
1      West.Hill  1 4.4 30 6.0
2      West.Hill  2 4.2 16 6.1
3      West.Hill  3 3.6 11 6.2
4      West.Hill  4 3.6 12 5.4
5      West.Hill  5 4.7 15 6.3
6      West.Hill  6 4.5 33 6.5
7      West.Hill  7 3.7 11 6.7
8  East farmland  1 3.6 11 6.6
9  East farmland  2 2.8  6 7.1
10 East farmland  3 4.0 11 6.4
11 East farmland  4 5.8 29 7.0
12 East farmland  5 4.3 14 6.2
13 East farmland  6 4.2 14 5.8
14 East farmland  7 4.6 16 6.2
15 North.shelter  1 4.5 19 6.7
16 North.shelter  2 4.6 35 5.6
17 North.shelter  3 3.8 27 6.2
18 North.shelter  4 4.1 19 6.0
19 North.shelter  5 3.5 26 5.8
20 North.shelter  6 4.1 19 6.0
21 North.shelter  7 3.6 15 5.9
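One caveat if you want to go further and change the dots in names such as West.Hill: sub() and gsub() treat the pattern as a regular expression by default, and "." matches any character. A minimal sketch, assuming a hypothetical vector loc that copies a few Location values, shows two ways to match a literal dot:

```r
# Hypothetical copy of a few Location values
loc <- c("West.Hill", "East farmland", "North.shelter")

# fixed = TRUE makes the pattern a literal string, so "." means a dot
loc_fixed <- gsub(".", " ", loc, fixed = TRUE)

# Equivalent: escape the dot inside a regular expression
loc_escaped <- gsub("\\.", " ", loc)

# Without either, "." matches every character and all of them are replaced
loc_wrong <- gsub(".", " ", "West.Hill")

print(loc_fixed)
```

Patterns such as "Delta_" or "Rainshelter" contain no special characters, so the plain sub() calls used earlier behave as expected.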