I am writing to ask your opinion on how to interpret this case.
I have two vectors “a” and “b” that I am trying to compare.
The Wilcoxon test gives me a p-value of 5.139217e-303 for a over b with the alternative “greater”. Now if I run summary on each of them, I get the following:
> summary(a)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000000 0.0001411 0.0002381 0.0002671 0.0003623 0.0012910
> summary(c)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000000 0.0000000 0.0000000 0.0004947 0.0002972 1.0000000
The ratio of the means is then around 0.5399031, which naively goes in the opposite direction to the Wilcoxon test (I was expecting to find a ratio >> 1).
Even after removing outliers using the outlier package, I still see the same thing.
Can someone help me understand why I get this result and how to interpret it?
Thanks in advance
As you haven’t posted your data, this is indeed a difficult question; posting your data (e.g., via dput()) would make things a lot easier. Other than that, we can only speculate. See here for more on posing a nice question with a reproducible example.

However, your data has some properties we can see from the summary that allow for an answer.

The Wilcoxon test does seem to be inappropriate for your data. Remember that Wilcoxon uses the rank of each observation. Ranks are difficult to obtain when there are ties in the data, and you seem to have a lot of ties (min and median for c are both 0). There are ways to deal with ties, but other methods are better.

As you do not seem to want to use the t-test (reasonable given that the distributions appear to be really different, e.g. median(a) is close to mean(a), but median(c) is 0 and far below mean(c)), another approach would be to use a permutation test.

My package afex (on CRAN, based on coin) contains the function compare.2.vectors, which compares two vectors using (e.g.) the t-test, the Wilcoxon test and, most notably, a permutation test. If your n is small, you can even use an exact test distribution for the permutation test. Given two vectors a and c, the result could be (trying to simulate your data):

You see the same pattern: positive test statistics for Wilcoxon, but negative for all other tests. So better not to use the Wilcoxon test, but one of the other tests, which all agree.
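A rough base-R sketch of this kind of simulation is given here for illustration. It is not the afex output and not the poster's data: the vectors a_sim and c_sim, their sizes and distributions, and the helper perm_mean_diff are all assumptions chosen only to mimic the shape of the two summaries (a has small positive values, c is mostly zeros with a few values of 1).

```r
set.seed(42)

# Hypothetical stand-ins for the poster's vectors (assumed shapes, not real data):
# a_sim: small positive values; c_sim: mostly zeros plus a few values of 1.
a_sim <- rexp(2000, rate = 1 / 0.00027)
c_sim <- c(rep(0, 1400), rexp(590, rate = 1 / 0.0003), rep(1, 10))

# The mean of c_sim is driven by its few large values; the median is not.
mean_ratio <- mean(a_sim) / mean(c_sim)
medians <- c(a = median(a_sim), c = median(c_sim))

# Rank-based test: do values of a_sim tend to exceed values of c_sim?
# exact = FALSE requests the normal approximation, which is what R falls
# back to anyway when there are ties.
w <- wilcox.test(a_sim, c_sim, alternative = "greater", exact = FALSE)

# Hand-rolled Monte Carlo permutation test on the difference in means
# (a hypothetical helper, not the implementation used by afex or coin).
perm_mean_diff <- function(x, y, n_perm = 2000) {
  observed <- mean(x) - mean(y)
  pooled <- c(x, y)
  n_x <- length(x)
  perm <- replicate(n_perm, {
    idx <- sample.int(length(pooled), n_x)   # random relabelling of the groups
    mean(pooled[idx]) - mean(pooled[-idx])
  })
  # One-sided p-value for "x greater than y", with the usual +1 correction
  p_greater <- (sum(perm >= observed) + 1) / (n_perm + 1)
  list(statistic = observed, p.greater = p_greater)
}

p <- perm_mean_diff(a_sim, c_sim)

print(mean_ratio)   # ratio of means
print(medians)      # medians of both vectors
print(w$p.value)    # Wilcoxon p-value for "greater"
print(p)            # mean difference and permutation p-value for "greater"
```

Under these assumptions the two approaches answer different questions: the Wilcoxon test favours a_sim because most individual values of a_sim exceed the many zeros in c_sim, while any mean-based comparison is dominated by the handful of large values in c_sim.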
PS: I am happy for comments on the function. Any more tests that would make sense?