Equivalent Functions: IF function in Excel vs. ifelse() in R

Equivalent Functions: IF function in Excel vs. ifelse() in R


When working with Excel, I believe you use the IF function from time to time, especially when categorizing values. The IF function is particularly useful for this purpose.

Here is one example.

=IF(AND(E2<1,E2>=0),"<1.0",IF(AND(E2>=1,E2<2),"1.0≤OM<2.0",IF(AND(E2>=2,E2<3),"2.0≤OM<3.0",IF(AND(E2>=3,E2<4),"3.0≤OM<4.0",IF(AND(E2>=4,E2<5),"4.0≤OM<5.0","OM≥5.0")))))

I want to categorize organic matter (%) by unit 1.0. This process involves converting numeric variables to categorical variables. To achieve this, I have used the IF function as shown above.

Then, you can categorize organic matter in the F column as shown above. Now, my next question is how to perform this process in R.

Let’s practice together. First, I’ll upload the same dataset as above.

library(readr)
github="https://raw.githubusercontent.com/agronomy4future/raw_data_practice/main/organic_matter.csv"
dataA= data.frame(read_csv(url(github), show_col_types= FALSE))

dataA
    Crop Block Treatment Yield_Mg_ha Organic_Matter_.
1  Wheat     1   Control        3.90              3.2
2  Wheat     2   Control        4.03              3.5
3  Wheat     3   Control        3.80              3.4
4  Wheat     1       TR1        2.80              1.2
5  Wheat     2       TR1        0.90              0.9
6  Wheat     3       TR1        1.10              4.5
7  Wheat     1       TR2        3.99              3.9
8  Wheat     2       TR2        4.03              4.0
9  Wheat     3       TR2        4.10              3.7
10 Wheat     1       TR3        3.99              3.6
11 Wheat     2       TR3        2.50              4.1
12 Wheat     3       TR3        4.09              4.7
13 Wheat     1       TR4        4.07              3.8
14 Wheat     2       TR4        4.13              3.4
15 Wheat     3       TR4        4.10              4.6
16 Wheat     1       TR5        4.09              3.8
17 Wheat     2       TR5        4.03              4.0
18 Wheat     3       TR5        3.90              4.1
19 Wheat     1       TR6        3.94              4.2
20 Wheat     2       TR6        4.06              4.4
21 Wheat     3       TR6        4.02              2.8
22 Wheat     1       TR7        3.98              4.8
23 Wheat     2       TR7        4.00              1.9
24 Wheat     3       TR7        3.96              5.3
25 Wheat     1       TR8        4.10              4.0
26 Wheat     2       TR8        4.07              4.6
27 Wheat     3       TR8        5.10              3.3
28 Wheat     1       TR9        5.50              5.1
29 Wheat     2       TR9        4.50              3.3
30 Wheat     3       TR9        4.90              3.7

Let’s perform a similar operation to the Excel IF function in R. I’ll use ifelse().

When uploading data to R, the column name ‘Organic_Matter_%’ was changed to ‘Organic_Matter_’—missing the ‘%.’ To avoid any text errors, I’ll further change the column name to ‘OM.’

Tip: When organizing data, it’s better to avoid using symbols like % and (), or spaces in column names, as they can lead to changes in text formatting.

colnames(dataA)[5]=c("OM")

Then, I’ll use the code below.

dataA$OM_Category= ifelse(dataA$OM < 1 & dataA$OM >= 0, "<1.0",
                   ifelse(dataA$OM >= 1.0 & dataA$OM < 2.0, "1.0≤OM<2.0",
                   ifelse(dataA$OM >= 2.0 & dataA$OM < 3.0, "2.0≤OM<3.0",
                   ifelse(dataA$OM >= 3.0 & dataA$OM < 4.0, "3.0≤OM<4.0",
                   ifelse(dataA$OM >= 4.0 & dataA$OM < 5.0, "4.0≤OM<5.0",
                   ifelse(dataA$OM >= 5.0 & dataA$OM < 6.0, "5.0≤OM<6.0",
                   ifelse(dataA$OM >= 6.0 & dataA$OM < 7.0, "6.0≤OM<7.0",
                   "OM≥7.0")))))))

Let’s check data again.

    Crop Block Treatment Yield_Mg_ha  OM  OM_Category
1  Wheat     1   Control        3.90 3.2  3.0≤OM<4.0
2  Wheat     2   Control        4.03 3.5  3.0≤OM<4.0
3  Wheat     3   Control        3.80 3.4  3.0≤OM<4.0
4  Wheat     1       TR1        2.80 1.2  1.0≤OM<2.0
5  Wheat     2       TR1        0.90 0.9        <1.0
6  Wheat     3       TR1        1.10 4.5  4.0≤OM<5.0
7  Wheat     1       TR2        3.99 3.9  3.0≤OM<4.0
8  Wheat     2       TR2        4.03 4.0  4.0≤OM<5.0
9  Wheat     3       TR2        4.10 3.7  3.0≤OM<4.0
10 Wheat     1       TR3        3.99 3.6  3.0≤OM<4.0
11 Wheat     2       TR3        2.50 4.1  4.0≤OM<5.0
12 Wheat     3       TR3        4.09 4.7  4.0≤OM<5.0
13 Wheat     1       TR4        4.07 3.8  3.0≤OM<4.0
14 Wheat     2       TR4        4.13 3.4  3.0≤OM<4.0
15 Wheat     3       TR4        4.10 4.6  4.0≤OM<5.0
16 Wheat     1       TR5        4.09 3.8  3.0≤OM<4.0
17 Wheat     2       TR5        4.03 4.0  4.0≤OM<5.0
18 Wheat     3       TR5        3.90 4.1  4.0≤OM<5.0
19 Wheat     1       TR6        3.94 4.2  4.0≤OM<5.0
20 Wheat     2       TR6        4.06 4.4  4.0≤OM<5.0
21 Wheat     3       TR6        4.02 2.8  2.0≤OM<3.0
22 Wheat     1       TR7        3.98 4.8  4.0≤OM<5.0
23 Wheat     2       TR7        4.00 1.9  1.0≤OM<2.0
24 Wheat     3       TR7        3.96 5.3  5.0≤OM<6.0
25 Wheat     1       TR8        4.10 4.0  4.0≤OM<5.0
26 Wheat     2       TR8        4.07 4.6  4.0≤OM<5.0
27 Wheat     3       TR8        5.10 3.3  3.0≤OM<4.0
28 Wheat     1       TR9        5.50 5.1  5.0≤OM<6.0
29 Wheat     2       TR9        4.50 3.3  3.0≤OM<4.0
30 Wheat     3       TR9        4.90 3.7  3.0≤OM<4.0

We can obtain the exact same data as we did in Excel using the IF function.



Comments are closed.