evaluma.methods.iqm
===================

.. py:module:: evaluma.methods.iqm


Functions
---------

.. autoapisummary::

   evaluma.methods.iqm.compute_iqm


Module Contents
---------------

.. py:function:: compute_iqm(raw_runs, norm_bounds, n_bootstrap=1000, random_state=None)

   Compute the interquartile mean (IQM) of Agarwal et al. over the flattened run × dataset score array, with stratified bootstrap confidence intervals.

   Implements the IQM from Agarwal et al. 2021 (rliable): pool the normalized
   scores across all datasets and seeds, discard the lowest 25% and the highest
   25%, and average the remaining middle 50%. Bootstrap CIs are stratified:
   seeds are resampled with replacement independently within each dataset
   stratum.
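
   The trimming and resampling described above can be sketched in plain NumPy.
   This is a minimal illustration, not the code used by this module: the helper
   names ``iqm`` and ``stratified_bootstrap_ci`` are hypothetical, the scores are
   assumed to be already normalized and arranged as a balanced
   ``(n_seeds, n_datasets)`` array, and a percentile interval is assumed for the
   CI.

   .. code-block:: python

      import numpy as np


      def iqm(values):
          """Mean of the middle 50% of a flat array (25% cut from each tail)."""
          x = np.sort(np.asarray(values, dtype=float).ravel())
          # Drop a whole number of elements per tail, as scipy.stats.trim_mean does.
          cut = int(0.25 * x.size)
          return x[cut:x.size - cut].mean()


      def stratified_bootstrap_ci(scores, n_bootstrap=1000, random_state=None, alpha=0.05):
          """Percentile CI for the IQM; ``scores`` has shape (n_seeds, n_datasets)."""
          rng = np.random.default_rng(random_state)
          scores = np.asarray(scores, dtype=float)
          n_seeds, n_datasets = scores.shape
          cols = np.arange(n_datasets)
          stats = np.empty(n_bootstrap)
          for b in range(n_bootstrap):
              # Draw seed indices with replacement, independently per dataset column.
              idx = rng.integers(0, n_seeds, size=(n_seeds, n_datasets))
              stats[b] = iqm(scores[idx, cols])
          low, high = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
          return float(low), float(high)


      # Assumed toy data: 4 seeds (rows) on 2 datasets (columns), already normalized.
      scores = np.array([[0.1, 0.5], [0.2, 0.6], [0.3, 0.7], [0.4, 0.8]])
      point = iqm(scores)
      ci_low, ci_high = stratified_bootstrap_ci(scores, n_bootstrap=200, random_state=0)

   Because each dataset column draws its own seed indices, a resampled replicate
   never mixes scores across datasets, which is what keeps the bootstrap
   stratified.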

   :param raw_runs: Long-format DataFrame with columns
                    ``["model", "dataset", "seed", "score"]``.
   :param norm_bounds: ``(low, high, metric_direction)`` where ``low`` and
                       ``high`` are per-dataset ``pd.Series`` of normalization bounds
                       and ``metric_direction`` is a ``{dataset: "min"|"max"}`` dict
                       (or ``None``).
   :param n_bootstrap: Number of stratified bootstrap replicates for the 95% CI.
   :param random_state: Seed for :func:`numpy.random.default_rng`.

   :returns: Result with ``.table`` sorted descending by IQM.
   :rtype: IQMResult
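
   The input shapes described by the parameters can be illustrated as follows.
   This is a sketch under stated assumptions: the data values are invented, and
   the ``normalize`` helper is hypothetical. It shows one plausible way that
   ``(low, high, metric_direction)`` bounds could be applied (min-max scaling,
   flipped for ``"min"`` metrics); the normalization actually performed by
   ``compute_iqm`` may differ.

   .. code-block:: python

      import pandas as pd

      # Long-format runs: one row per (model, dataset, seed).
      raw_runs = pd.DataFrame(
          {
              "model": ["a"] * 4,
              "dataset": ["d1", "d1", "d2", "d2"],
              "seed": [0, 1, 0, 1],
              "score": [0.8, 0.9, 30.0, 20.0],
          }
      )

      # Per-dataset bounds, plus the direction in which each metric improves.
      low = pd.Series({"d1": 0.5, "d2": 10.0})
      high = pd.Series({"d1": 1.0, "d2": 50.0})
      metric_direction = {"d1": "max", "d2": "min"}
      norm_bounds = (low, high, metric_direction)


      def normalize(runs, bounds):
          """Hypothetical helper: min-max scale scores, flipping "min" metrics."""
          lo, hi, direction = bounds
          lo_row = runs["dataset"].map(lo)
          hi_row = runs["dataset"].map(hi)
          scaled = (runs["score"] - lo_row) / (hi_row - lo_row)
          if direction is not None:
              is_min = runs["dataset"].map(direction).eq("min")
              # Assumed convention: lower-is-better metrics are mirrored so 1.0 is best.
              scaled = scaled.where(~is_min, 1.0 - scaled)
          return scaled


      normalized = normalize(raw_runs, norm_bounds)

   With inputs shaped like these, a call matching the signature above would be
   ``compute_iqm(raw_runs, norm_bounds, n_bootstrap=1000, random_state=0)``.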