timeeval.metrics.thresholding package

timeeval.metrics.thresholding.ThresholdingStrategy

class timeeval.metrics.thresholding.ThresholdingStrategy

Bases: ABC

Takes an anomaly scoring and ground truth labels to compute and apply a threshold to the scoring.

Subclasses of this abstract base class define different strategies to put a threshold over the anomaly scorings. All strategies produce binary labels (0 or 1; 1 for anomalous) in the form of an integer NumPy array. The strategy NoThresholding is a special no-op strategy that checks for already existing binary labels and keeps them untouched. This allows applying the metrics on existing binary classification results.

abstract find_threshold(y_true: ndarray, y_score: ndarray) → float

Abstract method containing the actual code to determine the threshold. Must be overridden by subclasses.

fit(y_true: ndarray, y_score: ndarray) → None

Calls find_threshold() to compute and set the threshold.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

fit_transform(y_true: ndarray, y_score: ndarray) → ndarray

Determines the threshold and applies it to the scoring in one go.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

See also

fit

fit-function to determine the threshold.

transform

transform-function to calculate the binary predictions.

transform(y_score: ndarray) → ndarray

Applies the threshold to the anomaly scoring and returns the corresponding binary labels.

Parameters

y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray
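
The interface described above can be illustrated with a small stand-in. This is a minimal sketch, not TimeEval's code: SketchStrategy and MedianThresholding are hypothetical names, and the >= comparison in transform is an assumption about how the threshold is applied.

```python
from abc import ABC, abstractmethod

import numpy as np


class SketchStrategy(ABC):
    """Hypothetical stand-in mirroring the documented fit/transform interface."""

    def __init__(self) -> None:
        self.threshold = None

    @abstractmethod
    def find_threshold(self, y_true: np.ndarray, y_score: np.ndarray) -> float:
        """Subclasses put their threshold logic here."""

    def fit(self, y_true: np.ndarray, y_score: np.ndarray) -> None:
        self.threshold = self.find_threshold(y_true, y_score)

    def transform(self, y_score: np.ndarray) -> np.ndarray:
        # Assumption: scores at or above the threshold count as anomalous.
        return (y_score >= self.threshold).astype(np.int_)

    def fit_transform(self, y_true: np.ndarray, y_score: np.ndarray) -> np.ndarray:
        self.fit(y_true, y_score)
        return self.transform(y_score)


class MedianThresholding(SketchStrategy):
    """Hypothetical strategy: uses the median score as the threshold."""

    def find_threshold(self, y_true: np.ndarray, y_score: np.ndarray) -> float:
        return float(np.nanmedian(y_score))


y_true = np.array([0, 0, 0, 1, 1])
y_score = np.array([0.1, 0.2, 0.3, 0.8, 0.9])
y_pred = MedianThresholding().fit_transform(y_true, y_score)
```

Only find_threshold differs between strategies in this sketch; fit, transform, and fit_transform are shared.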

timeeval.metrics.thresholding.NoThresholding

class timeeval.metrics.thresholding.NoThresholding

Bases: ThresholdingStrategy

Special no-op strategy that checks for already existing binary labels and keeps them untouched. This allows applying the metrics on existing binary classification results.

find_threshold(y_true: ndarray, y_score: ndarray) → float

Does nothing (no-op).

Parameters
  • y_true (np.ndarray) – Ignored.

  • y_score (np.ndarray) – Ignored.

Returns

threshold – 0.5 as the default threshold between 0 and 1.

Return type

float

fit(y_true: ndarray, y_score: ndarray) → None

Does nothing (no-op).

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

transform(y_score: ndarray) → ndarray

Checks if the provided scoring y_score is actually a binary classification prediction of integer type. If this is the case, the prediction is returned. If not, a ValueError is raised.

Parameters

y_score (np.ndarray) – Anomaly scoring with binary predictions.

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray
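
The check described for transform() could look roughly like the following. This is a sketch under assumptions: check_binary_labels is a hypothetical helper, and the exact conditions TimeEval tests may differ.

```python
import numpy as np


def check_binary_labels(y_score: np.ndarray) -> np.ndarray:
    """Return y_score unchanged if it holds integer 0/1 labels, else raise."""
    is_integer = np.issubdtype(y_score.dtype, np.integer)
    if not is_integer or not np.isin(y_score, [0, 1]).all():
        raise ValueError("Expected an integer array containing only 0 and 1.")
    return y_score


# Integer 0/1 labels pass through untouched.
labels = check_binary_labels(np.array([0, 1, 1, 0]))

# A continuous scoring is rejected.
try:
    check_binary_labels(np.array([0.1, 0.9]))
    rejected = False
except ValueError:
    rejected = True
```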

timeeval.metrics.thresholding.FixedValueThresholding

class timeeval.metrics.thresholding.FixedValueThresholding(threshold: float = 0.8)

Bases: ThresholdingStrategy

Thresholding approach using a fixed threshold value.

Parameters

threshold (float) – Fixed threshold to use. All anomaly scorings are scaled to the interval [0, 1], so the threshold should lie in that interval as well.

find_threshold(y_true: ndarray, y_score: ndarray) → float

Returns the fixed threshold.

fit(y_true: ndarray, y_score: ndarray) → None

Calls find_threshold() to compute and set the threshold.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

fit_transform(y_true: ndarray, y_score: ndarray) → ndarray

Determines the threshold and applies it to the scoring in one go.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

See also

fit

fit-function to determine the threshold.

transform

transform-function to calculate the binary predictions.

transform(y_score: ndarray) → ndarray

Applies the threshold to the anomaly scoring and returns the corresponding binary labels.

Parameters

y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray
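
As a rough NumPy illustration of fixed-value thresholding on a scoring in [0, 1]: the min-max scaling step and the >= comparison are assumptions made for this sketch, and the raw scores are made up.

```python
import numpy as np

raw_score = np.array([2.0, 4.0, 10.0, 12.0])  # made-up unscaled scores

# Assumption: min-max scaling brings the scoring into [0, 1].
y_score = (raw_score - raw_score.min()) / (raw_score.max() - raw_score.min())

threshold = 0.8  # default from the constructor signature
# Assumption: a score equal to the threshold is labelled anomalous.
y_pred = (y_score >= threshold).astype(np.int_)
```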

timeeval.metrics.thresholding.PercentileThresholding

class timeeval.metrics.thresholding.PercentileThresholding(percentile: int = 90)

Bases: ThresholdingStrategy

Uses the x-th percentile of the anomaly scoring as threshold, where x is the percentile parameter.

Parameters

percentile (int) – The percentile of the anomaly scoring to use. Must be between 0 and 100.

find_threshold(y_true: ndarray, y_score: ndarray) → float

Computes the x-th percentile, ignoring NaNs and using linear interpolation.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

threshold – The x-th percentile of the anomaly scoring as threshold.

Return type

float
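
A NaN-ignoring percentile with linear interpolation can be computed with np.nanpercentile. The snippet below is a sketch of that idea on made-up scores; the >= comparison used to derive labels is an assumption.

```python
import numpy as np

# Eleven finite scores 0..10 plus one NaN that must be ignored.
y_score = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, np.nan])

# np.nanpercentile skips NaNs and interpolates linearly by default.
threshold = float(np.nanpercentile(y_score, 90))

# Assumption: scores at or above the threshold are anomalous; NaN compares as False.
y_pred = (y_score >= threshold).astype(np.int_)
```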

fit(y_true: ndarray, y_score: ndarray) → None

Calls find_threshold() to compute and set the threshold.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

fit_transform(y_true: ndarray, y_score: ndarray) → ndarray

Determines the threshold and applies it to the scoring in one go.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

See also

fit

fit-function to determine the threshold.

transform

transform-function to calculate the binary predictions.

transform(y_score: ndarray) → ndarray

Applies the threshold to the anomaly scoring and returns the corresponding binary labels.

Parameters

y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

timeeval.metrics.thresholding.TopKPointsThresholding

class timeeval.metrics.thresholding.TopKPointsThresholding(k: Optional[int] = None)

Bases: ThresholdingStrategy

Calculates a threshold so that exactly k points are marked anomalous.

Parameters

k (optional int) – Number of expected anomalous points. If k is None, the ground truth data is used to calculate the real number of anomalous points.

find_threshold(y_true: ndarray, y_score: ndarray) → float

Computes a threshold based on the number of expected anomalous points.

The threshold is determined by using the complement of the ratio of expected anomalous points to all points (one minus that ratio, expressed as a percentage) as the target percentile. As in PercentileThresholding, NaNs are ignored and linear interpolation is used. If k is None, the ground truth data is used to calculate the real ratio of anomalous points to all points. Otherwise, k is used as the number of expected anomalous points.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

threshold – Threshold that yields k anomalous points.

Return type

float
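
The percentile-based rule can be sketched in NumPy as follows. top_k_points_threshold is a hypothetical helper, and the target percentile (1 - k / n) * 100 is an interpretation of the description above rather than a quote of TimeEval's code.

```python
from typing import Optional

import numpy as np


def top_k_points_threshold(
    y_true: np.ndarray, y_score: np.ndarray, k: Optional[int] = None
) -> float:
    """Pick the percentile that leaves roughly k points at or above it."""
    n = y_true.shape[0]
    if k is None:
        # Derive the number of anomalous points from the ground truth.
        k = int(y_true.sum())
    target_percentile = (1 - k / n) * 100
    return float(np.nanpercentile(y_score, target_percentile))


y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_score = np.arange(10.0)  # made-up scores 0.0 .. 9.0

threshold_from_truth = top_k_points_threshold(y_true, y_score)
threshold_k3 = top_k_points_threshold(y_true, y_score, k=3)

n_flagged_truth = int((y_score >= threshold_from_truth).sum())
n_flagged_k3 = int((y_score >= threshold_k3).sum())
```

Because of the linear interpolation, the threshold usually falls between two observed scores rather than on one of them.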

fit(y_true: ndarray, y_score: ndarray) → None

Calls find_threshold() to compute and set the threshold.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

fit_transform(y_true: ndarray, y_score: ndarray) → ndarray

Determines the threshold and applies it to the scoring in one go.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

See also

fit

fit-function to determine the threshold.

transform

transform-function to calculate the binary predictions.

transform(y_score: ndarray) → ndarray

Applies the threshold to the anomaly scoring and returns the corresponding binary labels.

Parameters

y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

timeeval.metrics.thresholding.TopKRangesThresholding

class timeeval.metrics.thresholding.TopKRangesThresholding(k: Optional[int] = None)

Bases: ThresholdingStrategy

Calculates a threshold so that exactly k anomalies are found. The anomalies are either single-point anomalies or continuous anomalous ranges.

Parameters

k (optional int) – Number of expected anomalies. If k is None, the ground truth data is used to calculate the real number of anomalies.

find_threshold(y_true: ndarray, y_score: ndarray) → float

Computes a threshold based on the number of expected anomalous subsequences / ranges (number of anomalies).

This method iterates over all possible thresholds from high to low to find the first threshold that yields k or more continuous anomalous ranges.

If k is None, the ground truth data is used to calculate the real number of anomalies (anomalous ranges).

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

threshold – Threshold that yields k anomalies.

Return type

float
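
The high-to-low search can be sketched as below. This is an illustration only: count_ranges and top_k_ranges_threshold are hypothetical helpers, the candidate thresholds are assumed to be the distinct score values, and the fallback for the case that k ranges are never reached is a choice made for this sketch.

```python
from typing import Optional

import numpy as np


def count_ranges(y_pred: np.ndarray) -> int:
    """Count runs of consecutive 1s (single points count as ranges too)."""
    padded = np.concatenate(([0], y_pred))
    # A range starts wherever a 1 follows a 0.
    return int(np.sum(np.diff(padded) == 1))


def top_k_ranges_threshold(
    y_true: np.ndarray, y_score: np.ndarray, k: Optional[int] = None
) -> float:
    if k is None:
        # Derive the number of anomalies from the ground truth.
        k = count_ranges(y_true)
    # Assumption: candidate thresholds are the distinct scores, highest first.
    candidates = np.unique(y_score[~np.isnan(y_score)])[::-1]
    for t in candidates:
        if count_ranges((y_score >= t).astype(np.int_)) >= k:
            return float(t)
    # Fewer than k ranges exist; fall back to the lowest score (sketch choice).
    return float(candidates[-1])


y_true = np.array([0, 1, 1, 0, 0, 1, 0, 0, 0])
y_score = np.array([0.1, 0.9, 0.8, 0.1, 0.2, 0.7, 0.1, 0.6, 0.1])

threshold = top_k_ranges_threshold(y_true, y_score)  # k derived from y_true
n_ranges = count_ranges((y_score >= threshold).astype(np.int_))
```

Thresholds 0.9 and 0.8 yield a single range here; lowering the threshold to 0.7 adds the second one, so the search stops there.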

fit(y_true: ndarray, y_score: ndarray) → None

Calls find_threshold() to compute and set the threshold.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

fit_transform(y_true: ndarray, y_score: ndarray) → ndarray

Determines the threshold and applies it to the scoring in one go.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

See also

fit

fit-function to determine the threshold.

transform

transform-function to calculate the binary predictions.

transform(y_score: ndarray) → ndarray

Applies the threshold to the anomaly scoring and returns the corresponding binary labels.

Parameters

y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

timeeval.metrics.thresholding.SigmaThresholding

class timeeval.metrics.thresholding.SigmaThresholding(factor: float = 3.0)

Bases: ThresholdingStrategy

Computes a threshold \(\theta\) based on the anomaly scoring’s mean \(\mu_s\) and the standard deviation \(\sigma_s\):

\[\theta = \mu_{s} + x \cdot \sigma_{s}\]

Parameters

factor (float) – Multiples of the standard deviation to be added to the mean to compute the threshold (\(x\)).

find_threshold(y_true: ndarray, y_score: ndarray) → float

Determines the mean and standard deviation of the anomaly scoring, ignoring NaNs, and computes the threshold using the equation above.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

threshold – Computed threshold based on mean and standard deviation.

Return type

float
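
A NumPy sketch of the mean-plus-sigma rule follows. sigma_threshold is a hypothetical helper; whether the population or the sample standard deviation is used is an assumption here (np.nanstd defaults to the population form, ddof=0).

```python
import numpy as np


def sigma_threshold(y_score: np.ndarray, factor: float = 3.0) -> float:
    """theta = mean + factor * std, both computed while ignoring NaNs."""
    return float(np.nanmean(y_score) + factor * np.nanstd(y_score))


# Made-up scores: mean 2.0 and population standard deviation 1.0 (NaN ignored).
y_score = np.array([1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0, np.nan])

threshold = sigma_threshold(y_score, factor=1.0)
# Assumption: scores at or above the threshold are anomalous.
y_pred = (y_score >= threshold).astype(np.int_)
```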

fit(y_true: ndarray, y_score: ndarray) → None

Calls find_threshold() to compute and set the threshold.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

fit_transform(y_true: ndarray, y_score: ndarray) → ndarray

Determines the threshold and applies it to the scoring in one go.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

See also

fit

fit-function to determine the threshold.

transform

transform-function to calculate the binary predictions.

transform(y_score: ndarray) → ndarray

Applies the threshold to the anomaly scoring and returns the corresponding binary labels.

Parameters

y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

timeeval.metrics.thresholding.PyThreshThresholding

class timeeval.metrics.thresholding.PyThreshThresholding(pythresh_thresholder: BaseThresholder, random_state: Any = None)

Bases: ThresholdingStrategy

Uses a thresholder from the PyThresh package to find a scoring threshold and to transform the continuous anomaly scoring into binary anomaly predictions.

Warning

You need to install PyThresh before you can use this thresholding strategy:

pip install "pythresh>=0.2.8"

Please note the additional package requirements for some available thresholders of PyThresh.

Parameters
  • pythresh_thresholder (pythresh.thresholds.base.BaseThresholder) – Instantiated PyThresh thresholder.

  • random_state (Any) –

    Seed for the NumPy random number generator that some PyThresh thresholders use. Note that PyThresh uses the legacy global RNG (np.random), and we try to reset the global RNG after calling PyThresh. The parameter can be left at its default value for most thresholders, which either do not use random numbers or provide their own way of seeding. Please consult the PyThresh documentation for details about the individual thresholders.

    Deprecated since version 1.2.8: Since pythresh version 0.2.8, thresholders provide a way to set their RNG state correctly. So the parameter random_state is not needed anymore. Please use the pythresh thresholder’s parameter to seed it. This function’s parameter is kept for compatibility with pythresh<0.2.8.
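
Seeding the legacy global RNG for one call and restoring it afterwards can be sketched as follows. call_with_temporary_seed is a hypothetical helper and not part of TimeEval or PyThresh; it only uses NumPy's np.random.get_state, np.random.seed, and np.random.set_state.

```python
import numpy as np


def call_with_temporary_seed(func, seed):
    """Run func() with the legacy global RNG seeded, then restore the old state."""
    saved_state = np.random.get_state()
    try:
        if seed is not None:
            np.random.seed(seed)
        return func()
    finally:
        # Restore the global RNG so that surrounding code is unaffected.
        np.random.set_state(saved_state)


state_before = np.random.get_state()
first = call_with_temporary_seed(lambda: np.random.rand(3), seed=42)
second = call_with_temporary_seed(lambda: np.random.rand(3), seed=42)
state_after = np.random.get_state()
```

Both calls draw the same numbers, and the global RNG state is the same before and after.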

Examples

from timeeval.metrics.thresholding import PyThreshThresholding
from pythresh.thresholds.regr import REGR
import numpy as np

thresholding = PyThreshThresholding(
    REGR(method="theil")
)

y_scores = np.random.default_rng().random(1000)
y_labels = np.zeros(1000)
y_pred = thresholding.fit_transform(y_labels, y_scores)

find_threshold(y_true: ndarray, y_score: ndarray) → float

Uses the passed thresholder from the PyThresh package to determine the threshold. Beforehand, the scores are forced to be finite by replacing NaNs with 0 and (Neg)Infs with 1.

PyThresh thresholders directly compute the binary predictions. Thus, we cache the predictions in the member _predictions and return them when calling transform().

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

threshold – Threshold computed by the internal thresholder.

Return type

float
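
Forcing the scores to be finite can be sketched with np.nan_to_num. The mapping below follows one literal reading of the description (NaN to 0, both positive and negative infinity to 1); treat it as an assumption, not as TimeEval's exact code.

```python
import numpy as np

y_score = np.array([0.2, np.nan, np.inf, -np.inf, 0.7])  # made-up scores

# Assumed mapping: NaN -> 0, +Inf -> 1, -Inf -> 1.
finite_score = np.nan_to_num(y_score, nan=0.0, posinf=1.0, neginf=1.0)
```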

fit(y_true: ndarray, y_score: ndarray) → None

Calls find_threshold() to compute and set the threshold.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

fit_transform(y_true: ndarray, y_score: ndarray) → ndarray

Determines the threshold and applies it to the scoring in one go.

Parameters
  • y_true (np.ndarray) – Ground truth binary labels.

  • y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray

See also

fit

fit-function to determine the threshold.

transform

transform-function to calculate the binary predictions.

transform(y_score: ndarray) → ndarray

Applies the threshold to the anomaly scoring and returns the corresponding binary labels.

Parameters

y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).

Returns

y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.

Return type

np.ndarray