evaluma.normalize
Functions
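| normalize(matrix, *, norm_ref_low=None, norm_ref_high=None, metric_direction=None) | Apply per-dataset min-max normalization to a score matrix. |
| _resolve_bound(mat, bound, use_min) | Resolve a normalization bound specification to a per-column Series. |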
Module Contents
- evaluma.normalize.normalize(matrix, *, norm_ref_low=None, norm_ref_high=None, metric_direction=None)
Apply per-dataset min-max normalization to a score matrix.
- Parameters:
matrix – Model × dataset score matrix (models as rows, datasets as columns).
norm_ref_low – Lower bound for normalization: a scalar, a model name (row label), or a {dataset: value} dict. None uses the per-dataset observed minimum and emits a UserWarning.
norm_ref_high – Upper bound for normalization, in the same format as norm_ref_low. None uses the per-dataset observed maximum.
metric_direction – Dict mapping dataset names to "min" or "max". Entries mapped to "min" cause the matrix to be negated before normalization.
- Returns:
Normalized matrix with the same shape and index as matrix, with values in [0, 1] within the reference bounds.
- Return type:
pandas.DataFrame
- Raises:
ValueError – If norm_ref_low or norm_ref_high is a string that does not name a row in matrix.
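The arithmetic described above can be illustrated with a small standalone sketch. This is not evaluma's implementation: the helper name minmax_sketch and the sample scores are hypothetical, only the observed per-dataset minimum and maximum are used as bounds, and the sketch assumes that just the columns mapped to "min" are negated.

```python
import pandas as pd


def minmax_sketch(matrix, metric_direction=None):
    """Hypothetical stand-in for illustration; not evaluma's own code."""
    mat = matrix.astype(float)  # astype returns a new frame, so the input is untouched
    # Assumption: only datasets mapped to "min" are negated, so that
    # "larger is better" holds for every column before scaling.
    for dataset, direction in (metric_direction or {}).items():
        if direction == "min":
            mat[dataset] = -mat[dataset]
    low = mat.min(axis=0)   # per-dataset observed minimum
    high = mat.max(axis=0)  # per-dataset observed maximum
    # DataFrame - Series aligns the Series on the column labels.
    return (mat - low) / (high - low)


# Made-up scores: models as rows, datasets as columns.
scores = pd.DataFrame(
    {"acc": [0.5, 0.75, 1.0], "loss": [4.0, 2.0, 1.0]},
    index=["model_a", "model_b", "model_c"],
)
normalized = minmax_sketch(scores, metric_direction={"acc": "max", "loss": "min"})
print(normalized)
```

In this sketch the model with the lowest loss ends up at 1.0 in the loss column, because negation turns the smallest loss into the largest value before scaling.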
- evaluma.normalize._resolve_bound(mat, bound, use_min)
Resolve a normalization bound specification to a per-column Series.
- Parameters:
mat – Score matrix (models × datasets).
bound – None (use data min/max), a scalar, a model-name string, or a {dataset: value} dict.
use_min – When bound is None, return the column-wise minimum if True, maximum if False.
- Returns:
Per-column (dataset) bound values.
- Return type:
pandas.Series
- Raises:
ValueError – If bound is a string not present in mat.index.
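The four accepted forms of bound can be sketched as a simple dispatch. The function below is a hypothetical re-creation written from the description above, not the library's source; the name resolve_bound_sketch, the error message, and the handling of datasets missing from a dict (they become NaN here) are all assumptions.

```python
import pandas as pd


def resolve_bound_sketch(mat, bound, use_min):
    """Hypothetical illustration of resolving a bound to one value per dataset."""
    if bound is None:
        # Fall back to the observed column-wise minimum or maximum.
        return mat.min(axis=0) if use_min else mat.max(axis=0)
    if isinstance(bound, str):
        # A string names a model: that model's row supplies the bounds.
        if bound not in mat.index:
            raise ValueError(f"{bound!r} is not a row label in the matrix")
        return mat.loc[bound]
    if isinstance(bound, dict):
        # Assumption: datasets absent from the dict are left as NaN.
        return pd.Series(bound, dtype=float).reindex(mat.columns)
    # Anything else is treated as a scalar and broadcast to every dataset.
    return pd.Series(float(bound), index=mat.columns)


# Made-up scores: models as rows, datasets as columns.
scores = pd.DataFrame(
    {"acc": [0.5, 0.75, 1.0], "loss": [4.0, 2.0, 1.0]},
    index=["baseline", "model_b", "model_c"],
)
observed_low = resolve_bound_sketch(scores, None, use_min=True)
from_model = resolve_bound_sketch(scores, "baseline", use_min=True)
from_scalar = resolve_bound_sketch(scores, 0.0, use_min=True)
from_dict = resolve_bound_sketch(scores, {"acc": 0.0, "loss": 1.0}, use_min=True)
```

Every branch returns a Series indexed by dataset name, which is what lets a caller subtract it from the score matrix column by column.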