evaluma.normalize

normalize(matrix, *[, norm_ref_low, norm_ref_high, ...]) – Apply per-dataset min-max normalization to a score matrix.
_resolve_bound(mat, bound, use_min) – Resolve a normalization bound specification to a per-column Series.

evaluma.normalize.normalize(matrix, *, norm_ref_low=None, norm_ref_high=None, metric_direction=None)

Apply per-dataset min-max normalization to a score matrix.

Parameters:

matrix – Model × dataset score matrix (models as rows, datasets as columns).
norm_ref_low – Lower bound for normalization: a scalar, a model name (row label in matrix), or a {dataset: value} dict. If None, the per-dataset observed minimum is used and a UserWarning is emitted.
norm_ref_high – Upper bound for normalization, in the same format as norm_ref_low. If None, the per-dataset observed maximum is used.
metric_direction – Dict mapping dataset names to "min" or "max". Entries mapped to "min" cause the matrix to be negated before normalization.

Returns:

Normalized matrix with the same shape and index as matrix. Scores that lie between the reference bounds map to values in [0, 1].

Return type:

pandas.DataFrame

Raises:

ValueError – If norm_ref_low or norm_ref_high is a string that does not name a row in matrix.
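
The min-max step described above can be sketched in plain pandas. This is an illustration, not evaluma's implementation: the helper name minmax_sketch is hypothetical, it covers only the case where both bounds are None (observed per-dataset minimum and maximum), and it assumes that datasets marked "min" are negated column by column before scaling.

```python
import pandas as pd


def minmax_sketch(matrix, metric_direction=None):
    """Hypothetical helper: per-dataset min-max scaling with observed bounds."""
    mat = matrix.astype(float).copy()
    # Assumption: datasets marked "min" (lower is better) are negated so that
    # larger values are always better before scaling.
    for dataset, direction in (metric_direction or {}).items():
        if direction == "min":
            mat[dataset] = -mat[dataset]
    low = mat.min(axis=0)   # per-dataset observed minimum
    high = mat.max(axis=0)  # per-dataset observed maximum
    # DataFrame/Series arithmetic aligns on column labels, so each dataset
    # is scaled by its own bounds.
    return (mat - low) / (high - low)


scores = pd.DataFrame(
    {"accuracy": [0.5, 0.75, 1.0], "error": [30.0, 20.0, 10.0]},
    index=["model_a", "model_b", "model_c"],
)
normalized = minmax_sketch(scores, metric_direction={"error": "min"})
print(normalized)
```

In this sketch both columns come out as 0.0, 0.5, 1.0 from model_a to model_c: the "error" column is flipped, so the model with the lowest error receives the highest normalized score.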

evaluma.normalize._resolve_bound(mat, bound, use_min)

Resolve a normalization bound specification to a per-column Series.

Parameters:

mat – Score matrix (models × datasets).
bound – None (use data min/max), a scalar, a model-name string, or a {dataset: value} dict.
use_min – When bound is None, return the column-wise minimum if True, maximum if False.

Returns:

Per-column (dataset) bound values.

Return type:

pandas.Series

Raises:

ValueError – If bound is a string not present in mat.index.
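
The four accepted forms of bound can be illustrated with a small re-creation of the documented rules. The function resolve_bound_sketch below is hypothetical and only mirrors the description above. In particular, how the real function treats datasets missing from a dict is not documented here, so the fallback to the observed extreme is an assumption.

```python
import pandas as pd


def resolve_bound_sketch(mat, bound, use_min):
    """Hypothetical re-creation of the documented bound-resolution rules."""
    if bound is None:
        # Fall back to the observed per-dataset extreme.
        return mat.min(axis=0) if use_min else mat.max(axis=0)
    if isinstance(bound, str):
        # A string names a reference model: use that row's scores.
        if bound not in mat.index:
            raise ValueError(f"model {bound!r} not found in matrix index")
        return mat.loc[bound]
    if isinstance(bound, dict):
        # Assumption: datasets missing from the dict fall back to the
        # observed extreme; the real function may behave differently.
        fallback = mat.min(axis=0) if use_min else mat.max(axis=0)
        return pd.Series(bound).reindex(mat.columns).fillna(fallback)
    # Scalar: the same value for every dataset.
    return pd.Series(float(bound), index=mat.columns)


mat = pd.DataFrame(
    {"d1": [0.2, 0.8], "d2": [10.0, 30.0]},
    index=["baseline", "best"],
)
low = resolve_bound_sketch(mat, "baseline", use_min=True)       # row lookup
high = resolve_bound_sketch(mat, {"d1": 1.0}, use_min=False)    # dict + fallback
flat = resolve_bound_sketch(mat, 0, use_min=True)               # scalar
```

Every branch returns a Series indexed by dataset name, which is what lets the caller subtract and divide column by column regardless of how the bound was specified.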