References

Methods

Interquartile Mean (IQM)

Agarwal, R., Schwarzer, M., Castro, P. S., Courville, A. C., & Bellemare, M. G. (2021). Deep Reinforcement Learning at the Edge of the Statistical Precipice. Advances in Neural Information Processing Systems, 34. https://arxiv.org/abs/2108.13264


Bayesian Pairwise Comparison

Benavoli, A., Corani, G., Demšar, J., & Zaffalon, M. (2017). Time for a Change: a Tutorial for Comparing Multiple Classifiers Through Bayesian Analysis. Journal of Machine Learning Research, 18(77), 1–36. https://jmlr.org/papers/v18/16-305.html


Dolan-Moré Performance Profiles

Dolan, E. D., & Moré, J. J. (2002). Benchmarking Optimization Software with Performance Profiles. Mathematical Programming, 91(2), 201–213. https://doi.org/10.1007/s101070100263


Libraries

baycomp

Demšar, J. baycomp: Bayesian comparison of classifiers. https://github.com/janezd/baycomp