Wednesday, May 28, 2014

Comparing climbing performance

This post shows how to use the IDP statistical package to compare sport performance. 


As a case study, I consider (just for fun) my climbing performance in two consecutive editions (2013 and 2014) of the "Tre Valli Bresciane" cycling race.

The following table reports my ascent times on 6 different climbs in the 2013 and 2014 editions of the race.
                                        
Climb 2013 2014
A 29m16s 29m14s
B 29m03s 29m02s
C 23m28s 23m01s
D 48m04s 45m56s
E 28m51s 30m43s
F 15m09s 14m53s
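
As a quick look at the raw numbers, the following Python sketch converts the table entries to seconds and prints the 2013 minus 2014 difference for each climb. The helper `to_seconds` and the dictionary `TIMES` are names made up for this illustration; they are not part of any package mentioned in this post.

```python
import re

# Ascent times copied from the table above: climb -> (2013, 2014).
TIMES = {
    "A": ("29m16s", "29m14s"),
    "B": ("29m03s", "29m02s"),
    "C": ("23m28s", "23m01s"),
    "D": ("48m04s", "45m56s"),
    "E": ("28m51s", "30m43s"),
    "F": ("15m09s", "14m53s"),
}


def to_seconds(text):
    """Convert a string such as '29m16s' to a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    minutes, seconds = map(int, match.groups())
    return 60 * minutes + seconds


# A positive difference means the 2013 ascent was slower than the 2014 one.
diffs = {
    climb: to_seconds(t13) - to_seconds(t14)
    for climb, (t13, t14) in TIMES.items()
}

for climb, d in diffs.items():
    print(f"{climb}: {d:+d} s")
```

Five of the six differences are positive (only climb E was slower in 2014), but two of them are just one or two seconds.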

A classical non-parametric statistical hypothesis test for comparing two matched samples is the Wilcoxon signed-rank test. Our goal is to use this test to assess whether my climbing performance in 2013 was worse (larger ascent times) than in 2014. Therefore, we perform a one-sided test.

In R this test can be performed by means of the function wilcox.test as follows:
 T14 <- c(29*60+14, 29*60+02, 23*60+01, 45*60+56, 30*60+43, 14*60+53)
 T13 <- c(29*60+16, 29*60+03, 23*60+28, 48*60+04, 28*60+51, 15*60+09)
 wilcox.test(T13, T14, alternative="greater", paired=TRUE)

while, in Matlab, this test can be performed by means of the function signrank:

 T14=[29*60+14, 29*60+02, 23*60+01, 45*60+56, 30*60+43, 14*60+53]; 
 T13=[29*60+16, 29*60+03, 23*60+28, 48*60+04, 28*60+51, 15*60+09]; 
 [p,h]=signrank(T13,T14,'tail','right')

Note that the conversion to seconds is not strictly necessary: a rank test depends only on the signs and the ordering of the paired differences, so any consistent time unit gives the same result.
As a result (in both cases) we obtain the p-value p = 0.156 and, since p > 0.05 (0.05 is the default significance level), the null hypothesis cannot be rejected (h = 0).
Therefore, the difference is declared not significant.
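
To see where the value 0.156 comes from, the p-value can be reproduced by hand. The sketch below is plain Python (standard library only) and is not how `wilcox.test` or `signrank` are implemented; the function name `signed_rank_p_greater` is made up for this illustration, and the enumeration assumes that there are no zero differences and no ties, which holds for these data.

```python
from itertools import product

# Ascent times in seconds, as in the R and Matlab snippets above.
T13 = [29*60+16, 29*60+3, 23*60+28, 48*60+4, 28*60+51, 15*60+9]
T14 = [29*60+14, 29*60+2, 23*60+1, 45*60+56, 30*60+43, 14*60+53]


def signed_rank_p_greater(x, y):
    """Exact one-sided p-value of the signed-rank test (x greater than y).

    Assumes no zero differences and no tied absolute differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    abs_sorted = sorted(abs(d) for d in diffs)
    if 0 in abs_sorted or len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("this sketch assumes no zeros and no ties")

    # Rank the absolute differences (1 = smallest) and sum the ranks
    # that belong to positive differences.
    ranks = [abs_sorted.index(abs(d)) + 1 for d in diffs]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Under the null hypothesis each rank carries a positive sign with
    # probability 1/2, so all 2^n sign patterns are equally likely.
    n = len(diffs)
    at_least = sum(
        1
        for signs in product((0, 1), repeat=n)
        if sum(r for r, s in zip(range(1, n + 1), signs) if s) >= w_plus
    )
    return w_plus, at_least / 2**n


w_plus, p = signed_rank_p_greater(T13, T14)
print(w_plus, p)  # 16 0.15625
```

The only negative difference (climb E) has rank 5, so the sum of the positive ranks is 16 out of a maximum of 21, and 10 of the 64 equally likely sign patterns give a sum at least that large.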

Now let us perform the same test using the Imprecise Dirichlet Process (IDP). 
The details of the test can be found in this [paper]; to run the test you need to download the code [here].
In R, this test can be performed as follows:

 isignrank.test(T14,T13,"greater")

while in Matlab:

 [prob,h]=isignrank(T13,T14,'tail','right','alpha',0.05);

The result is shown (in both cases) in the figure below. The main differences with respect to the classical Wilcoxon signed-rank test are: (i) the test is Bayesian and, thus, it returns the posterior probability of the hypothesis "T13 is larger than T14"; (ii) the test is imprecise, which means that it actually returns the lower and upper probabilities of the hypothesis "T13 is larger than T14".
The lower and upper probabilities are obtained by considering the set of all possible probability base measures for the Dirichlet process. This means that the test is also robust to the choice of the probability base measure.

Looking at the figure, we can observe that, since the upper (and, thus, also the lower) probability is less than 0.95, we cannot say that "T13 is larger than T14" with posterior probability 1-alpha=0.95.
The IDP-based test and the Wilcoxon signed-rank test agree in this case.

However, the IDP gives us additional information: the posterior probabilities. In fact, since the lower probability is about 0.75, we can declare that "T13 is larger than T14" if we lower the required posterior probability to 1-alpha=0.75.
Therefore, we can say that my performance in 2014 was better than in 2013 with a reliability (posterior probability) of 75%.

For all intermediate values, 0.75<1-alpha<0.93, we are in an indeterminate situation.
This means that the result of the hypothesis test is prior dependent, i.e., it changes with the choice of the prior base measure of the Dirichlet process. In other words, the evidence from the observations is not enough to declare that the probability of the hypothesis being true is either larger or smaller than the desired value 1-alpha; more measurements are necessary to take a decision.
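
The three situations discussed above can be summarised as a small decision rule. The Python sketch below is only an illustration: the function `idp_decision` and its string labels are made up here and do not reproduce the return values of the `isignrank` code, and the bounds 0.75 and 0.93 are the approximate lower and upper probabilities quoted in this post.

```python
def idp_decision(lower, upper, level):
    """Compare lower/upper posterior probabilities with the level 1-alpha.

    Returns a hypothetical label:
      "declare"        - even the lower probability reaches the level
      "do not declare" - even the upper probability does not exceed it
      "indeterminate"  - the level falls strictly between the two bounds
    """
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("need 0 <= lower <= upper <= 1")
    if lower >= level:
        return "declare"
    if upper <= level:
        return "do not declare"
    return "indeterminate"


# Approximate bounds for "T13 is larger than T14" quoted in the text.
LOWER, UPPER = 0.75, 0.93

for level in (0.70, 0.85, 0.95):
    print(level, idp_decision(LOWER, UPPER, level))
# 0.7 declare
# 0.85 indeterminate
# 0.95 do not declare
```

With a precise prior the lower and upper probabilities coincide and the indeterminate case disappears; the width of the gap between them shows how much the conclusion depends on the prior.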


