evaluma.plot
============

.. py:module:: evaluma.plot


Functions
---------

.. autoapisummary::

   evaluma.plot.plot_aggregate_ranking
   evaluma.plot.plot_iqm_ranking
   evaluma.plot.plot_bayesian_heatmap
   evaluma.plot.plot_bayesian_reference_bars
   evaluma.plot.plot_performance_profiles


Module Contents
---------------

.. py:function:: plot_aggregate_ranking(table: pandas.DataFrame, *, figsize=None, model_colors=None, title=None, ax=None)

   Render aggregate scores as a horizontal bar chart (no CI whiskers).

   :param table: DataFrame with columns ``model`` and ``score``.
   :param figsize: Figure size ``(width, height)`` in inches.
   :param model_colors: List of colors, one per model in row order.
   :param title: Optional axes title.
   :param ax: Existing axes to draw into; a new figure is created if ``None``.

   :returns: The rendered figure.
   :rtype: matplotlib.figure.Figure


.. py:function:: plot_iqm_ranking(table: pandas.DataFrame, *, figsize=None, model_colors=None, title=None, ax=None)

   Render IQM scores as a horizontal bar chart with CI error bars.

   :param table: DataFrame with columns ``model``, ``IQM``, ``CI_low``,
                 ``CI_high`` as produced by
                 :func:`~evaluma.methods.iqm.compute_iqm`.
   :param figsize: Figure size ``(width, height)`` in inches.
   :param model_colors: List of colors, one per model in row order.
   :param title: Optional axes title.
   :param ax: Existing axes to draw into; a new figure is created if
              ``None``.

   :returns: The rendered figure.
   :rtype: matplotlib.figure.Figure
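
   The CI columns hold absolute bounds, whereas matplotlib's ``xerr`` argument
   for bar charts takes non-negative offsets from the bar value, with lower
   offsets in the first row and upper offsets in the second. The sketch below
   shows that conversion on plain row dicts (for example from
   ``table.to_dict("records")``). The helper name ``ci_to_xerr`` is hypothetical,
   and whether this function performs the conversion in exactly this way is an
   assumption.

   .. code-block:: python

      def ci_to_xerr(rows):
          """Convert absolute CI bounds into [lower, upper] offsets from the IQM."""
          lower = [row["IQM"] - row["CI_low"] for row in rows]
          upper = [row["CI_high"] - row["IQM"] for row in rows]
          return [lower, upper]


      # Illustrative values only, using the documented column names.
      rows = [
          {"model": "model_x", "IQM": 0.5, "CI_low": 0.25, "CI_high": 1.0},
          {"model": "model_y", "IQM": 2.0, "CI_low": 1.5, "CI_high": 2.25},
      ]
      xerr = ci_to_xerr(rows)  # shape (2, N), suitable for ``Axes.barh(..., xerr=xerr)``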


.. py:function:: plot_bayesian_heatmap(table: pandas.DataFrame, *, title=None, figsize=None, **_kwargs)

   Render Bayesian pairwise probabilities as a matplotlib heatmap.

   Each cell ``(i, j)`` shows ``P(model_i > model_j)``.

   :param table: DataFrame with columns ``model_a``, ``model_b``,
                 ``p_a_better``, ``p_equiv``, ``p_b_better``.
   :param title: Optional figure title.
   :param figsize: Figure size ``(width, height)`` in inches.

   :returns: The rendered figure.
   :rtype: matplotlib.figure.Figure
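
   To illustrate how the long-format table maps onto heatmap cells, the sketch
   below pivots row dicts into a square matrix. It assumes one row per
   unordered model pair, fills the mirrored cell from ``p_b_better``, and leaves
   the diagonal empty; the helper name ``pairwise_matrix`` is hypothetical, and
   the function's actual handling of mirrored and diagonal cells may differ.

   .. code-block:: python

      def pairwise_matrix(rows):
          """Build matrix[i][j] = P(model_i > model_j) from long-format rows."""
          models = []
          for row in rows:
              for name in (row["model_a"], row["model_b"]):
                  if name not in models:
                      models.append(name)  # keep first-seen order

          index = {name: i for i, name in enumerate(models)}
          matrix = [[None] * len(models) for _ in models]
          for row in rows:
              i, j = index[row["model_a"]], index[row["model_b"]]
              matrix[i][j] = row["p_a_better"]  # P(model_a > model_b)
              matrix[j][i] = row["p_b_better"]  # P(model_b > model_a)
          return models, matrix


      # Illustrative values only; each row's three probabilities sum to 1.
      rows = [
          {"model_a": "A", "model_b": "B",
           "p_a_better": 0.5, "p_equiv": 0.25, "p_b_better": 0.25},
          {"model_a": "A", "model_b": "C",
           "p_a_better": 0.75, "p_equiv": 0.125, "p_b_better": 0.125},
          {"model_a": "B", "model_b": "C",
           "p_a_better": 0.625, "p_equiv": 0.25, "p_b_better": 0.125},
      ]
      models, matrix = pairwise_matrix(rows)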


.. py:function:: plot_bayesian_reference_bars(table: pandas.DataFrame, reference: str, *, title=None, figsize=None)

   Render Bayesian comparisons against a reference model as stacked horizontal bars.

   Each bar represents one model compared to the reference. Blue = P(model >
   reference), grey = P(equivalent), red = P(reference > model). Bars are
   sorted by P(model > reference) descending.

   :param table: DataFrame with columns ``model_a``, ``model_b``,
                 ``p_a_better``, ``p_equiv``, ``p_b_better``. Expects
                 ``model_a == reference`` for all rows (as produced by
                 :func:`~evaluma.methods.bayesian.compute_bayesian` in reference
                 mode).
   :param reference: Name of the reference model.
   :param title: Optional figure title.
   :param figsize: Figure size ``(width, height)`` in inches.

   :returns: The rendered figure.
   :rtype: matplotlib.figure.Figure
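
   Because ``model_a`` is the reference in every row, ``P(model > reference)``
   corresponds to ``p_b_better`` and ``P(reference > model)`` to ``p_a_better``.
   The sketch below makes that mapping and the descending sort explicit on plain
   row dicts. The helper name ``reference_segments`` and the output keys are
   hypothetical, and the mapping is inferred from the documented column
   convention rather than taken from the implementation.

   .. code-block:: python

      def reference_segments(rows, reference):
          """Return one record per model, sorted by P(model > reference) descending."""
          segments = []
          for row in rows:
              if row["model_a"] != reference:
                  raise ValueError("expected model_a == reference in every row")
              segments.append({
                  "model": row["model_b"],
                  "p_model_better": row["p_b_better"],      # blue segment
                  "p_equiv": row["p_equiv"],                # grey segment
                  "p_reference_better": row["p_a_better"],  # red segment
              })
          segments.sort(key=lambda seg: seg["p_model_better"], reverse=True)
          return segments


      # Illustrative values only.
      rows = [
          {"model_a": "ref", "model_b": "m1",
           "p_a_better": 0.5, "p_equiv": 0.25, "p_b_better": 0.25},
          {"model_a": "ref", "model_b": "m2",
           "p_a_better": 0.125, "p_equiv": 0.125, "p_b_better": 0.75},
      ]
      segments = reference_segments(rows, reference="ref")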


.. py:function:: plot_performance_profiles(table: pandas.DataFrame, *, figsize=None, model_colors=None, title=None, ax=None)

   Render Dolan-Moré performance profile curves.

   The x-axis uses a native log₁₀ scale with raw τ ratio values (1, 2, 5, 10, …),
   following ML-GYM (Batra et al., 2025) and the AutoML Decathlon (Roberts et al.,
   2022). τ = 1 means tied for best; τ = 10 means up to 10× worse than the best
   model.

   :param table: Long-format DataFrame with columns ``tau``, ``model``,
                 ``fraction_within_tau``.
   :param figsize: Figure size ``(width, height)`` in inches.
   :param model_colors: Dict mapping model names to colors, or a list in
                        model order.
   :param title: Optional axes title.
   :param ax: Existing axes to draw into; a new figure is created if
              ``None``.

   :returns: The rendered figure.
   :rtype: matplotlib.figure.Figure
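
   For readers unfamiliar with the input format, the sketch below computes
   ``fraction_within_tau`` under the standard Dolan-Moré definition: each
   model's per-task ratio to the best model on that task, then the fraction of
   tasks whose ratio is at most τ. It assumes positive, lower-is-better costs;
   the helper name ``performance_profile`` and the ``costs`` layout are
   hypothetical, and the library's own computation may differ.

   .. code-block:: python

      def performance_profile(costs, taus):
          """Return long-format rows with columns tau, model, fraction_within_tau.

          ``costs`` maps model -> {task: positive cost, lower is better}.
          """
          tasks = sorted(next(iter(costs.values())))
          # Best (smallest) cost achieved by any model on each task.
          best = {
              task: min(per_task[task] for per_task in costs.values())
              for task in tasks
          }
          rows = []
          for model, per_task in costs.items():
              ratios = [per_task[task] / best[task] for task in tasks]
              for tau in taus:
                  fraction = sum(ratio <= tau for ratio in ratios) / len(ratios)
                  rows.append(
                      {"tau": tau, "model": model, "fraction_within_tau": fraction}
                  )
          return rows


      # Illustrative values only.
      costs = {
          "model_x": {"t1": 1.0, "t2": 4.0},
          "model_y": {"t1": 2.0, "t2": 1.0},
      }
      profile = performance_profile(costs, taus=[1, 2, 5])

   Each curve is non-decreasing in τ, and its value at τ = 1 is the fraction of
   tasks on which that model is tied for best.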


