evaluma.normalize
=================

.. py:module:: evaluma.normalize


Functions
---------

.. autoapisummary::

   evaluma.normalize.normalize
   evaluma.normalize._resolve_bound


Module Contents
---------------

.. py:function:: normalize(matrix, *, norm_ref_low=None, norm_ref_high=None, metric_direction=None)

   Apply per-dataset min-max normalization to a score matrix.

   :param matrix: Model × dataset score matrix (models as rows, datasets as
                  columns).
   :param norm_ref_low: Lower bound for normalization: a scalar, a model name
                        (row label), or a ``{dataset: value}`` dict. ``None`` uses the
                        per-dataset observed minimum and emits a ``UserWarning``.
   :param norm_ref_high: Upper bound for normalization, same format as
                         ``norm_ref_low``. ``None`` uses the per-dataset observed
                         maximum.
   :param metric_direction: Dict mapping dataset names to ``"min"`` or
                            ``"max"``. Entries mapped to ``"min"`` cause the matrix to be
                            negated before normalization.

   :returns: Normalized matrix with the same shape and index as ``matrix``.
             Scores that lie between the reference bounds map into ``[0, 1]``.
   :rtype: pandas.DataFrame

   :raises ValueError: If ``norm_ref_low`` or ``norm_ref_high`` is a string
       that does not name a row in ``matrix``.
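
A minimal sketch of the per-dataset min-max arithmetic described for ``normalize``, assuming the usual formula ``(x - low) / (high - low)``. It is not the library's implementation: ``minmax_sketch`` is a hypothetical helper, nested dicts stand in for the ``pandas.DataFrame``, and only the ``None`` and ``{dataset: value}`` bound forms are covered.

```python
def minmax_sketch(scores, low=None, high=None):
    """Min-max normalize each dataset column of a nested dict.

    scores: {model: {dataset: score}}
    low, high: {dataset: value} dicts, or None to use the observed
    per-dataset minimum / maximum (assumed behaviour, see lead-in).
    """
    datasets = list(next(iter(scores.values())))
    result = {model: {} for model in scores}
    for ds in datasets:
        column = [row[ds] for row in scores.values()]
        lo = min(column) if low is None else low[ds]
        hi = max(column) if high is None else high[ds]
        # Assumption: hi != lo. A constant column would divide by zero here.
        for model, row in scores.items():
            result[model][ds] = (row[ds] - lo) / (hi - lo)
    return result


# Illustrative scores only; model and dataset names are made up.
scores = {
    "model_a": {"qa": 40.0, "math": 10.0},
    "model_b": {"qa": 60.0, "math": 30.0},
    "model_c": {"qa": 80.0, "math": 20.0},
}

# Observed min/max per dataset: best model maps to 1.0, worst to 0.0.
normalized = minmax_sketch(scores)

# Fixed reference bounds per dataset.
bounded = minmax_sketch(
    scores,
    low={"qa": 0.0, "math": 0.0},
    high={"qa": 100.0, "math": 40.0},
)
```

Under this formula, a score outside explicitly supplied bounds lands outside ``[0, 1]``; the sketch does not clip.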


.. py:function:: _resolve_bound(mat, bound, use_min)

   Resolve a normalization bound specification to a per-column Series.

   :param mat: Score matrix (models × datasets).
   :param bound: ``None`` (use data min/max), a scalar, a model-name string,
                 or a ``{dataset: value}`` dict.
   :param use_min: When ``bound`` is ``None``, return the column-wise minimum
                   if ``True``, maximum if ``False``.

   :returns: Per-column (dataset) bound values.
   :rtype: pandas.Series

   :raises ValueError: If ``bound`` is a string not present in ``mat.index``.
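
The four accepted forms of ``bound`` can be illustrated with a small dispatch. This is a sketch under stated assumptions, not the code behind ``_resolve_bound``: ``resolve_bound_sketch`` is a hypothetical name, nested dicts stand in for the DataFrame and the returned Series, and the handling of a dict that omits a dataset is not documented above (here it raises ``KeyError``).

```python
def resolve_bound_sketch(scores, bound, use_min):
    """Resolve a bound specification to a {dataset: value} dict.

    scores: {model: {dataset: score}}
    bound: None, a scalar, a model-name string, or a {dataset: value} dict.
    use_min: with bound=None, pick the per-dataset minimum if True,
             the maximum if False.
    """
    datasets = list(next(iter(scores.values())))
    if bound is None:
        pick = min if use_min else max
        return {ds: pick(row[ds] for row in scores.values()) for ds in datasets}
    if isinstance(bound, str):
        # A string names a model; that model's scores become the bound.
        if bound not in scores:
            raise ValueError(f"unknown model name: {bound!r}")
        return dict(scores[bound])
    if isinstance(bound, dict):
        # Assumption: every dataset must be present in the dict.
        return {ds: bound[ds] for ds in datasets}
    # Anything else is treated as a scalar applied to every dataset.
    return {ds: bound for ds in datasets}


# Illustrative scores only; model and dataset names are made up.
scores = {
    "model_a": {"qa": 40.0, "math": 10.0},
    "model_b": {"qa": 60.0, "math": 30.0},
}

observed_low = resolve_bound_sketch(scores, None, use_min=True)
scalar_high = resolve_bound_sketch(scores, 100.0, use_min=False)
from_model = resolve_bound_sketch(scores, "model_b", use_min=False)
```

Checking for a string before the scalar fallback keeps a model name from being broadcast as if it were a number.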