What is the Wilcoxon Rank-Sum Test?
We use the t-test rather than the z-test because we do not know the variance (σ²) of the population. The sample mean (x̄) is an unbiased estimator of the population mean (μ), and therefore we can estimate μ from x̄ (E(x̄) = μ). What about the variance? If we know σ², the variance of the sample mean is obtained by dividing σ² by the sample size (n): σ²_x̄ = σ²/n. However, if we do not know σ², we have to use the standard deviation (s) of the sample instead: s²_x̄ = s²/n.
The next question is: why use the Wilcoxon Rank-Sum Test instead of the t-test? When the data do not show normality, or the sample size is too small, it may not be appropriate to use the t-test, which assumes that the population is normally distributed. Instead, we should use non-parametric methods. The Wilcoxon Rank-Sum Test is one of these non-parametric methods.
Here are two groups of yield data, with four observations per treatment. We do not know whether the population is normally distributed.
Now, I’d like to know whether yield in treatment A is greater than in treatment B.
The simplest approach is a t-test. In this case, it is a two-sample t-test.
Mean of treatment_A: 170.2
Mean of treatment_B: 161.025
pooled_variance ≈ 69.75
How to calculate pooled variance when including block in the experimental design?
t = (170.2 – 161.025) / (√69.75 × √(1/4 + 1/4)) ≈ 1.5536
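This is a one-tailed test (Ha: yield in treatment A is greater than in treatment B). With 6 degrees of freedom (4 + 4 – 2), the p-value is 0.0856, which is greater than α = 0.05. Therefore, we cannot reject the null hypothesis: there is no significant evidence that yield in treatment A is greater than in treatment B.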
# Two-sample t-test (equal variances, one-tailed: A > B)
a <- c(166.7, 172.2, 165.0, 176.9)
b <- c(158.6, 176.4, 153.1, 156.0)
t.test(a, b, mu = 0, var.equal = TRUE, conf.level = 0.95, alternative = "greater")
Running this code in R gives the same result.
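The hand calculation above can also be reproduced step by step in R. This is only a minimal sketch: the variable names (n_a, n_b, pooled_var, t_stat, p_value) are my own, and I assume the one-tailed alternative (A > B) used in the t.test call.

```r
# Yield data from the example
a <- c(166.7, 172.2, 165.0, 176.9)  # treatment A
b <- c(158.6, 176.4, 153.1, 156.0)  # treatment B

n_a <- length(a)
n_b <- length(b)

# Pooled variance: sample variances weighted by their degrees of freedom
pooled_var <- ((n_a - 1) * var(a) + (n_b - 1) * var(b)) / (n_a + n_b - 2)

# t statistic for the difference between the two means
t_stat <- (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))

# One-tailed p-value (upper tail) with n_a + n_b - 2 degrees of freedom
p_value <- pt(t_stat, df = n_a + n_b - 2, lower.tail = FALSE)

round(c(pooled_var = pooled_var, t = t_stat, p = p_value), 4)
```

The printed values should agree with the pooled variance, t value and p-value calculated by hand.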
It would be enough to conclude that there is no significant yield difference between the two treatments, but in principle we should not use the t-test here, because we do not know whether the population is normally distributed (and the sample size is also too small). Therefore, I’ll analyze the data with a non-parametric method, the Wilcoxon Rank-Sum Test.
1. Hypothesis
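• the null hypothesis (H0): the yield distributions of treatments A and B are the same (A = B)
• the alternative hypothesis (Ha): yield in treatment A tends to be greater than in treatment B (A > B)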
2. Rank transformation (low to high)
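First of all, we need to arrange all the data from lowest to highest (ascending order) and give rank 1 to the lowest value. Here the ranks are 153.1 (B) = 1, 156.0 (B) = 2, 158.6 (B) = 3, 165.0 (A) = 4, 166.7 (A) = 5, 172.2 (A) = 6, 176.4 (B) = 7 and 176.9 (A) = 8.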
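The ranking from lowest to highest can be sketched in R with rank(). The names yield, treatment, ranks and ranked are only illustrative, and this assumes there are no tied values (which is the case here).

```r
a <- c(166.7, 172.2, 165.0, 176.9)  # treatment A
b <- c(158.6, 176.4, 153.1, 156.0)  # treatment B

# Combine both groups and rank all 8 values together (rank 1 = lowest)
yield <- c(a, b)
treatment <- rep(c("A", "B"), each = 4)
ranks <- rank(yield)

# Show the combined data sorted from lowest to highest yield
ranked <- data.frame(treatment, yield, rank = ranks)
ranked[order(ranked$yield), ]
```

Sorting by yield makes it easy to see which ranks belong to treatment A and which to treatment B.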
3. Test statistic W
Then, calculate W for the treatment we hypothesized to be greater (treatment A). W is the sum of the ranks in that treatment.
W = 4 + 5 + 6 + 8 = 23
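As a quick check, the rank sum can be computed in R. This is a sketch with my own variable names (w_a, w_b); the two rank sums must add up to N(N + 1)/2 = 36 for N = 8 observations.

```r
a <- c(166.7, 172.2, 165.0, 176.9)  # treatment A
b <- c(158.6, 176.4, 153.1, 156.0)  # treatment B

# Rank all 8 values together; the first 4 ranks belong to treatment A
ranks <- rank(c(a, b))

w_a <- sum(ranks[seq_along(a)])    # rank sum of treatment A
w_b <- sum(ranks[-seq_along(a)])   # rank sum of treatment B

c(W_A = w_a, W_B = w_b, total = w_a + w_b)
```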
4. p-value
For a clear visualization, I use the PQRS program (https://pqrs.software.informer.com/).
In Distribution, choose ‘Wilcoxon Rank-Sum’ and enter 4 for both m (sample size of the first group) and n (sample size of the second group).
Then, enter 23 (our test statistic W). The probability that W ≥ 23 is P(W = 23) + P(W > 23) = 0.0429 + 0.0571 = 0.1. That is, the p-value is 0.1. With α = 0.05, we cannot reject the null hypothesis (H0): there is no significant evidence of a yield difference between the two treatments. This is the same conclusion as with the t-test.
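The probabilities read from PQRS can also be obtained by brute force, because with m = n = 4 there are only 70 ways to assign 4 of the 8 ranks to the first group. The sketch below enumerates them with combn(); the variable names are my own, and it assumes no ties, so that every assignment is equally likely under H0.

```r
m <- 4  # sample size of the first group
n <- 4  # sample size of the second group

# Each column of combn() is one possible set of ranks for the first group;
# colSums() gives the rank sum W for every such set
rank_sums <- colSums(combn(m + n, m))

n_assignments <- length(rank_sums)   # choose(8, 4) = 70

p_eq_23 <- mean(rank_sums == 23)     # P(W = 23) = 3/70
p_gt_23 <- mean(rank_sums > 23)      # P(W > 23) = 4/70
p_value <- mean(rank_sums >= 23)     # P(W >= 23) = 7/70 = 0.1

round(c(p_eq_23 = p_eq_23, p_gt_23 = p_gt_23, p_value = p_value), 4)
```

Calling table(rank_sums) shows the whole null distribution of W, from 10 (ranks 1 to 4) up to 26 (ranks 5 to 8).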
a<- c(166.7,172.2,165.0,176.9)
b<- c(158.6,176.4,153.1,156.0)
wilcox.test(a, b, alternative="greater")
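R reports W = 13 rather than 23, and the p-value is 0.1. This is because wilcox.test subtracts the minimum possible rank sum, m(m + 1)/2 = 10, from the rank sum of the first sample (23 – 10 = 13). Here this happens to equal the rank sum of treatment B (1 + 2 + 3 + 7 = 13). Therefore, we again cannot reject the null hypothesis (H0).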
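To see where the 13 reported by R comes from, the sketch below compares it with the hand-calculated rank sum. As far as I understand the wilcox.test documentation, R subtracts m(m + 1)/2 from the rank sum of the first sample; the names rank_sum_a, w_r and res are my own.

```r
a <- c(166.7, 172.2, 165.0, 176.9)  # treatment A
b <- c(158.6, 176.4, 153.1, 156.0)  # treatment B

m <- length(a)

# Rank sum of treatment A, as calculated by hand
rank_sum_a <- sum(rank(c(a, b))[seq_len(m)])

# Subtract the smallest possible rank sum, 1 + 2 + ... + m
w_r <- rank_sum_a - m * (m + 1) / 2

res <- wilcox.test(a, b, alternative = "greater")

c(rank_sum_a = rank_sum_a, w_r = w_r,
  W_reported = unname(res$statistic), p_value = res$p.value)
```

Both versions of the statistic lead to the same p-value, because they differ only by a constant.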
What if the test statistic W becomes greater?
If the yields in treatment A were 200, 210, 220 and 230, how would W change? All of these values are greater than the values in treatment B, so the ranks would be 5, 6, 7 and 8, and W would be 5 + 6 + 7 + 8 = 26.
In this case, the p-value is 0.0143 (1/70, since only one of the 70 possible rank assignments gives W = 26), and we can reject the null hypothesis (H0). That is, yield in treatment A is significantly greater than in treatment B.
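This extreme case can be checked in R as well. In the sketch below, a2 is my own name for the hypothetical treatment A yields; the exact p-value is simply 1 divided by the number of possible rank assignments.

```r
a2 <- c(200, 210, 220, 230)         # hypothetical yields for treatment A
b <- c(158.6, 176.4, 153.1, 156.0)  # treatment B (unchanged)

# Rank sum of treatment A: all four values take the top ranks 5 to 8
w_a2 <- sum(rank(c(a2, b))[seq_along(a2)])

# Only one of the choose(8, 4) assignments gives this rank sum
p_exact <- 1 / choose(8, 4)

res2 <- wilcox.test(a2, b, alternative = "greater")

c(W = w_a2, p_exact = round(p_exact, 4), p_R = round(res2$p.value, 4))
```

With four observations per group, 1/70 is the smallest p-value a one-tailed Wilcoxon Rank-Sum Test can produce.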