evaluma.methods.bayesian

Functions

compute_bayesian(→ evaluma.results.BayesianResult)

Compute pairwise Bayesian comparisons using a signed-rank test.

Module Contents

evaluma.methods.bayesian.compute_bayesian(scores_matrix: pandas.DataFrame, *, rope=0.01, reference=None, pairs=None, random_state=None) → evaluma.results.BayesianResult

Compute pairwise Bayesian comparisons using a signed-rank test.

For each pair, baycomp.two_on_multiple() returns three posterior probabilities: that model A is better, that the two models are practically equivalent (their difference lies within rope), and that model B is better.
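The role of the rope half-width can be pictured with a small sketch. This is not the posterior computation performed by baycomp; it only shows how a symmetric region of practical equivalence splits a score difference into three outcomes. The helper name rope_region, the sign convention (model A minus model B), the treatment of a difference exactly equal to rope as equivalent, and the sample differences are all assumptions made for illustration.

```python
def rope_region(diff, rope=0.01):
    """Classify one score difference (model A minus model B) against a ROPE.

    Hypothetical helper for illustration; not part of evaluma or baycomp.
    """
    if diff > rope:
        return "a_better"
    if diff < -rope:
        return "b_better"
    # Assumption: differences on the boundary count as equivalent.
    return "equivalent"


# Made-up per-dataset differences in normalized score space.
diffs = [0.04, 0.005, -0.002, -0.03]
regions = [rope_region(d) for d in diffs]
print(regions)
```

A larger rope widens the middle region, so more differences count as practically equivalent.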

Parameters:
  • scores_matrix – Normalized model × dataset score matrix.

  • rope – Region of practical equivalence half-width in normalized score space (0–1). Differences smaller than rope are treated as practically equivalent.

  • reference – If provided, compare each other model against this reference model only, instead of testing every pair.

  • pairs – Explicit list of (model_a, model_b) pairs to test. Overrides reference.

  • random_state – Seed forwarded to baycomp.
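The interaction between reference and pairs described above can be sketched as follows. The helper resolve_pairs is hypothetical and is not taken from the evaluma source. The default of testing every unordered pair and the ordering (other_model, reference) are assumptions, since the documentation only states that pairs overrides reference.

```python
from itertools import combinations


def resolve_pairs(models, reference=None, pairs=None):
    """Hypothetical helper: decide which (model_a, model_b) pairs to test."""
    if pairs is not None:
        # Explicit pairs take precedence over `reference`.
        return list(pairs)
    if reference is not None:
        if reference not in models:
            raise ValueError(f"unknown reference model: {reference!r}")
        # Assumed ordering: each other model first, the reference second.
        return [(m, reference) for m in models if m != reference]
    # Assumed default: every unordered pair of models.
    return list(combinations(models, 2))


models = ["model_x", "model_y", "model_z"]
all_pairs = resolve_pairs(models)
vs_ref = resolve_pairs(models, reference="model_x")
explicit = resolve_pairs(
    models, reference="model_x", pairs=[("model_y", "model_z")]
)
```

With n models, the assumed default yields n(n-1)/2 comparisons, while a reference reduces this to n-1.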

Returns:

Result with .table containing columns model_a, model_b, p_a_better, p_equiv, p_b_better.

Return type:

BayesianResult
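One way to read the rows of .table is to apply a decision threshold to the three probabilities. The sketch below uses plain dictionaries as stand-ins for table rows so that it runs without evaluma installed. The verdict helper, the 0.95 threshold, and the probability values are illustrative assumptions, not output from the library.

```python
def verdict(row, threshold=0.95):
    """Hypothetical helper: turn one row of probabilities into a verdict."""
    if row["p_a_better"] >= threshold:
        return f"{row['model_a']} better"
    if row["p_b_better"] >= threshold:
        return f"{row['model_b']} better"
    if row["p_equiv"] >= threshold:
        return "equivalent"
    # No outcome reaches the threshold.
    return "undecided"


# Made-up placeholder rows using the documented column names.
rows = [
    {"model_a": "model_x", "model_b": "model_y",
     "p_a_better": 0.97, "p_equiv": 0.02, "p_b_better": 0.01},
    {"model_a": "model_x", "model_b": "model_z",
     "p_a_better": 0.10, "p_equiv": 0.85, "p_b_better": 0.05},
    {"model_a": "model_y", "model_b": "model_z",
     "p_a_better": 0.01, "p_equiv": 0.98, "p_b_better": 0.01},
]
verdicts = [verdict(r) for r in rows]
print(verdicts)
```

The threshold is a matter of choice; a lower value gives more decisive verdicts at the cost of weaker evidence behind each one.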