timeeval.metrics.thresholding package¶
timeeval.metrics.thresholding.ThresholdingStrategy¶
- class timeeval.metrics.thresholding.ThresholdingStrategy¶
Bases: ABC
Takes an anomaly scoring and ground truth labels to compute and apply a threshold to the scoring.
Subclasses of this abstract base class define different strategies to put a threshold over the anomaly scorings. All strategies produce binary labels (0 or 1; 1 for anomalous) in the form of an integer NumPy array. The strategy NoThresholding is a special no-op strategy that checks for already existing binary labels and keeps them untouched. This allows applying the metrics on existing binary classification results.
- abstract find_threshold(y_true: ndarray, y_score: ndarray) float¶
Abstract method containing the actual code to determine the threshold. Must be overridden by subclasses!
- fit(y_true: ndarray, y_score: ndarray) None¶
Calls find_threshold() to compute and set the threshold.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- fit_transform(y_true: ndarray, y_score: ndarray) ndarray¶
Determines the threshold and applies it to the scoring in one go.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
- transform(y_score: ndarray) ndarray¶
Applies the threshold to the anomaly scoring and returns the corresponding binary labels.
- Parameters
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
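The interplay of find_threshold(), fit(), transform(), and fit_transform() described above can be sketched without any dependencies. The classes SketchStrategy and MeanThresholding are hypothetical stand-ins, not TimeEval classes. The sketch uses plain Python lists instead of NumPy arrays, and the `>=` comparison in transform() is an assumption about how the threshold is applied.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence


class SketchStrategy(ABC):
    """Hypothetical stand-in for the fit/transform contract described above."""

    def __init__(self) -> None:
        self.threshold: Optional[float] = None

    @abstractmethod
    def find_threshold(self, y_true: Sequence[int], y_score: Sequence[float]) -> float:
        """Subclasses put the actual threshold logic here."""

    def fit(self, y_true: Sequence[int], y_score: Sequence[float]) -> None:
        # Compute the threshold once and remember it.
        self.threshold = self.find_threshold(y_true, y_score)

    def transform(self, y_score: Sequence[float]) -> List[int]:
        if self.threshold is None:
            raise ValueError("call fit() before transform()")
        # Assumption: scores at or above the threshold count as anomalous (1).
        return [int(s >= self.threshold) for s in y_score]

    def fit_transform(self, y_true: Sequence[int], y_score: Sequence[float]) -> List[int]:
        self.fit(y_true, y_score)
        return self.transform(y_score)


class MeanThresholding(SketchStrategy):
    """Hypothetical strategy: use the mean score as the threshold."""

    def find_threshold(self, y_true: Sequence[int], y_score: Sequence[float]) -> float:
        return sum(y_score) / len(y_score)


y_true = [0, 0, 1, 0]
y_score = [0.1, 0.2, 0.9, 0.3]
y_pred = MeanThresholding().fit_transform(y_true, y_score)
```

Only find_threshold() differs between strategies in this sketch; the remaining methods are inherited unchanged, which mirrors how the subclasses below repeat the same fit(), fit_transform(), and transform() entries.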
timeeval.metrics.thresholding.NoThresholding¶
- class timeeval.metrics.thresholding.NoThresholding¶
Bases: ThresholdingStrategy
Special no-op strategy that checks for already existing binary labels and keeps them untouched. This allows applying the metrics on existing binary classification results.
- find_threshold(y_true: ndarray, y_score: ndarray) float¶
Does nothing (no-op).
- Parameters
y_true (np.ndarray) – Ignored.
y_score (np.ndarray) – Ignored.
- Returns
0.5 as the default threshold between 0 and 1.
- Return type
float
- fit(y_true: ndarray, y_score: ndarray) None¶
Does nothing (no-op).
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- transform(y_score: ndarray) ndarray¶
Checks if the provided scoring y_score is actually a binary classification prediction of integer type. If this is the case, the prediction is returned. If not, a ValueError is raised.
- Parameters
y_score (np.ndarray) – Anomaly scoring with binary predictions.
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
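The check that NoThresholding.transform() is described as performing can be illustrated with a small dependency-free sketch. The helper name check_binary_labels is hypothetical, and the exact validation rules of the library may differ; the sketch only shows the idea of accepting integer 0/1 labels and rejecting everything else.

```python
from typing import List, Sequence


def check_binary_labels(y_score: Sequence[object]) -> List[int]:
    """Return the labels unchanged if they are integer 0/1 values, else raise."""
    for value in y_score:
        # Continuous scores (floats) and integers other than 0 or 1 are rejected.
        if not isinstance(value, int) or value not in (0, 1):
            raise ValueError(f"expected binary integer labels, got {value!r}")
    return list(y_score)


labels = check_binary_labels([0, 1, 1, 0])
```

Passing a continuous scoring such as `[0.2, 0.9]` to this helper raises a ValueError instead of silently thresholding it.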
timeeval.metrics.thresholding.FixedValueThresholding¶
- class timeeval.metrics.thresholding.FixedValueThresholding(threshold: float = 0.8)¶
Bases: ThresholdingStrategy
Thresholding approach using a fixed threshold value.
- Parameters
threshold (float) – Fixed threshold to use. All anomaly scorings are scaled to the interval [0, 1].
- fit(y_true: ndarray, y_score: ndarray) None¶
Calls find_threshold() to compute and set the threshold.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- fit_transform(y_true: ndarray, y_score: ndarray) ndarray¶
Determines the threshold and applies it to the scoring in one go.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
- transform(y_score: ndarray) ndarray¶
Applies the threshold to the anomaly scoring and returns the corresponding binary labels.
- Parameters
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
timeeval.metrics.thresholding.PercentileThresholding¶
- class timeeval.metrics.thresholding.PercentileThresholding(percentile: int = 90)¶
Bases: ThresholdingStrategy
Uses the x-th percentile of the anomaly scoring as threshold.
- Parameters
percentile (int) – The percentile of the anomaly scoring to use. Must be between 0 and 100.
- find_threshold(y_true: ndarray, y_score: ndarray) float¶
Computes the x-th percentile, ignoring NaNs and using linear interpolation.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
threshold – The x-th percentile of the anomaly scoring as threshold.
- Return type
float
- fit(y_true: ndarray, y_score: ndarray) None¶
Calls
find_threshold() to compute and set the threshold.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- fit_transform(y_true: ndarray, y_score: ndarray) ndarray¶
Determines the threshold and applies it to the scoring in one go.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
- transform(y_score: ndarray) ndarray¶
Applies the threshold to the anomaly scoring and returns the corresponding binary labels.
- Parameters
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
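The percentile computation described for PercentileThresholding.find_threshold() (ignore NaNs, interpolate linearly) can be sketched in plain Python. The function nan_percentile is a hypothetical helper written for illustration; with NumPy, `np.nanpercentile` offers comparable behavior. The `>=` comparison used to derive the labels is an assumption.

```python
import math
from typing import List, Sequence


def nan_percentile(scores: Sequence[float], percentile: float) -> float:
    """Percentile with linear interpolation, skipping NaN entries."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    valid = sorted(s for s in scores if not math.isnan(s))
    if not valid:
        raise ValueError("scores contain no non-NaN values")
    # Fractional rank of the requested percentile within the sorted scores.
    rank = (len(valid) - 1) * percentile / 100
    lower = math.floor(rank)
    upper = math.ceil(rank)
    # Interpolate linearly between the two neighbouring scores.
    return valid[lower] + (valid[upper] - valid[lower]) * (rank - lower)


scores = [0.0, 4.0, 1.0, float("nan"), 3.0, 2.0]
threshold = nan_percentile(scores, 90)
# Assumption: scores at or above the threshold are labelled anomalous.
y_pred: List[int] = [int(s >= threshold) for s in scores]
```

In this example the NaN entry is skipped when computing the threshold and compares as "not anomalous" afterwards, because any comparison with NaN is false.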
timeeval.metrics.thresholding.TopKPointsThresholding¶
- class timeeval.metrics.thresholding.TopKPointsThresholding(k: Optional[int] = None)¶
Bases: ThresholdingStrategy
Calculates a threshold so that exactly k points are marked anomalous.
- Parameters
k (optional int) – Number of expected anomalous points. If k is None, the ground truth data is used to calculate the real number of anomalous points.
- find_threshold(y_true: ndarray, y_score: ndarray) float¶
Computes a threshold based on the number of expected anomalous points.
The threshold is determined by taking the complement of the ratio of expected anomalous points to all points (that is, 1 − k/n for n points) as target percentile. As in PercentileThresholding, NaNs are ignored and linear interpolation is used. If k is None, the ground truth data is used to calculate the real ratio of anomalous points to all points. Otherwise, k is used as the number of expected anomalous points.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
threshold – Threshold that yields k anomalous points.
- Return type
float
- fit(y_true: ndarray, y_score: ndarray) None¶
Calls
find_threshold() to compute and set the threshold.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- fit_transform(y_true: ndarray, y_score: ndarray) ndarray¶
Determines the threshold and applies it to the scoring in one go.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
- transform(y_score: ndarray) ndarray¶
Applies the threshold to the anomaly scoring and returns the corresponding binary labels.
- Parameters
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
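To make the TopKPointsThresholding.find_threshold() description more concrete, here is a dependency-free sketch. It assumes that the target percentile is (1 − k/n) · 100 for n points, which is one reading of the description and not confirmed by it. The helpers nan_percentile and top_k_points_threshold are hypothetical names, and the `>=` comparison is likewise an assumption.

```python
import math
from typing import Optional, Sequence


def nan_percentile(scores: Sequence[float], percentile: float) -> float:
    """Percentile with linear interpolation, skipping NaN entries."""
    valid = sorted(s for s in scores if not math.isnan(s))
    rank = (len(valid) - 1) * percentile / 100
    lower, upper = math.floor(rank), math.ceil(rank)
    return valid[lower] + (valid[upper] - valid[lower]) * (rank - lower)


def top_k_points_threshold(
    y_true: Sequence[int], y_score: Sequence[float], k: Optional[int] = None
) -> float:
    # Without an explicit k, count the anomalous points in the ground truth.
    n_anomalous = sum(y_true) if k is None else k
    # Assumed target percentile: the share of points expected to be normal.
    target = (1 - n_anomalous / len(y_score)) * 100
    return nan_percentile(y_score, target)


y_true = [0, 1, 0, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.0, 9.0, 1.0, 8.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

threshold = top_k_points_threshold(y_true, y_score)
y_pred = [int(s >= threshold) for s in y_score]

threshold_k3 = top_k_points_threshold(y_true, y_score, k=3)
n_flagged_k3 = sum(int(s >= threshold_k3) for s in y_score)
```

With distinct scores as in this example, the number of flagged points matches k. With tied scores, several points can reach the threshold at once, so the count may exceed k.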
timeeval.metrics.thresholding.TopKRangesThresholding¶
- class timeeval.metrics.thresholding.TopKRangesThresholding(k: Optional[int] = None)¶
Bases: ThresholdingStrategy
Calculates a threshold so that exactly k anomalies are found. The anomalies are either single-point anomalies or continuous anomalous ranges.
- Parameters
k (optional int) – Number of expected anomalies. If k is None, the ground truth data is used to calculate the real number of anomalies.
- find_threshold(y_true: ndarray, y_score: ndarray) float¶
Computes a threshold based on the number of expected anomalous subsequences / ranges (number of anomalies).
This method iterates over all possible thresholds from high to low to find the first threshold that yields k or more continuous anomalous ranges.
If k is None, the ground truth data is used to calculate the real number of anomalies (anomalous ranges).
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
threshold – Threshold that yields k anomalies.
- Return type
float
- fit(y_true: ndarray, y_score: ndarray) None¶
Calls
find_threshold() to compute and set the threshold.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- fit_transform(y_true: ndarray, y_score: ndarray) ndarray¶
Determines the threshold and applies it to the scoring in one go.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
- transform(y_score: ndarray) ndarray¶
Applies the threshold to the anomaly scoring and returns the corresponding binary labels.
- Parameters
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
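The search described for TopKRangesThresholding.find_threshold() (walk the candidate thresholds from high to low and stop at the first one that yields k or more anomalous ranges) can be sketched as follows. The function names are hypothetical. Using the distinct score values as candidates, the `>=` comparison, the fallback to the lowest score, and the absence of NaNs in the scoring are all assumptions of this sketch.

```python
from typing import Optional, Sequence


def count_ranges(labels: Sequence[int]) -> int:
    """Count maximal runs of consecutive 1s (single points count as ranges)."""
    count = 0
    previous = 0
    for label in labels:
        if label == 1 and previous == 0:
            count += 1  # a new anomalous range starts here
        previous = label
    return count


def top_k_ranges_threshold(
    y_true: Sequence[int], y_score: Sequence[float], k: Optional[int] = None
) -> float:
    # Without an explicit k, count the anomalous ranges in the ground truth.
    n_ranges = count_ranges(y_true) if k is None else k
    # Assumption: the distinct score values serve as candidate thresholds.
    candidates = sorted(set(y_score), reverse=True)
    for candidate in candidates:
        labels = [int(s >= candidate) for s in y_score]
        if count_ranges(labels) >= n_ranges:
            return candidate
    # Assumption: fall back to the lowest score if k ranges are never reached.
    return candidates[-1]


y_true = [0, 1, 1, 0, 0, 1, 0, 0]
y_score = [0.1, 0.8, 0.9, 0.2, 0.1, 0.6, 0.3, 0.1]

threshold = top_k_ranges_threshold(y_true, y_score)
y_pred = [int(s >= threshold) for s in y_score]
```

Here the candidates 0.9 and 0.8 each produce a single range, and 0.6 is the first threshold that yields two ranges, so the search stops there.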
timeeval.metrics.thresholding.SigmaThresholding¶
- class timeeval.metrics.thresholding.SigmaThresholding(factor: float = 3.0)¶
Bases: ThresholdingStrategy
Computes a threshold \(\theta\) based on the anomaly scoring’s mean \(\mu_s\) and the standard deviation \(\sigma_s\):
\[\theta = \mu_{s} + x \cdot \sigma_{s}\]
- Parameters
factor (float) – Multiples of the standard deviation to be added to the mean to compute the threshold (\(x\)).
- find_threshold(y_true: ndarray, y_score: ndarray) float¶
Determines the mean and standard deviation of the anomaly scoring, ignoring NaNs, and computes the threshold using the equation above.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
threshold – Computed threshold based on mean and standard deviation.
- Return type
float
- fit(y_true: ndarray, y_score: ndarray) None¶
Calls
find_threshold() to compute and set the threshold.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- fit_transform(y_true: ndarray, y_score: ndarray) ndarray¶
Determines the threshold and applies it to the scoring in one go.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
- transform(y_score: ndarray) ndarray¶
Applies the threshold to the anomaly scoring and returns the corresponding binary labels.
- Parameters
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
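A short worked example may help with the SigmaThresholding formula \(\theta = \mu_{s} + x \cdot \sigma_{s}\). The sketch below uses only the Python standard library; sigma_threshold is a hypothetical helper. Using the population standard deviation (as NumPy's `np.nanstd` does by default) and the `>=` comparison are assumptions.

```python
import math
import statistics
from typing import Sequence


def sigma_threshold(y_score: Sequence[float], factor: float = 3.0) -> float:
    """Mean plus `factor` standard deviations, computed over non-NaN scores."""
    valid = [s for s in y_score if not math.isnan(s)]
    mu = statistics.fmean(valid)
    # Assumption: population standard deviation (divide by n, not n - 1).
    sigma = statistics.pstdev(valid)
    return mu + factor * sigma


y_score = [1.0, 1.0, 1.0, 1.0, 6.0, float("nan")]
threshold = sigma_threshold(y_score, factor=1.5)
y_pred = [int(s >= threshold) for s in y_score]
```

For these scores, the mean of the five valid values is 2 and the standard deviation is 2, so a factor of 1.5 gives a threshold of 5, and only the score 6.0 is flagged.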
timeeval.metrics.thresholding.PyThreshThresholding¶
- class timeeval.metrics.thresholding.PyThreshThresholding(pythresh_thresholder: BaseThresholder, random_state: Any = None)¶
Bases: ThresholdingStrategy
Uses a thresholder from the PyThresh package to find a scoring threshold and to transform the continuous anomaly scoring into binary anomaly predictions.
Warning
You need to install PyThresh before you can use this thresholding strategy:
pip install "pythresh>=0.2.8"
Please note the additional package requirements for some available thresholders of PyThresh.
- Parameters
pythresh_thresholder (pythresh.thresholds.base.BaseThresholder) – Instantiated PyThresh thresholder.
random_state (Any) – Seed for the NumPy random number generator used in some thresholders of PyThresh. Note that PyThresh uses the legacy global RNG (np.random) and we try to reset the global RNG after calling PyThresh. Can be left at its default value for most thresholders that don’t use random numbers or provide their own way of seeding. Please consult the PyThresh Documentation for details about the individual thresholders.
Deprecated since version 1.2.8: Since pythresh version 0.2.8, thresholders provide a way to set their RNG state correctly, so the parameter random_state is not needed anymore. Please use the pythresh thresholder’s parameter to seed it. This function’s parameter is kept for compatibility with pythresh<0.2.8.
Examples
from timeeval.metrics.thresholding import PyThreshThresholding
from pythresh.thresholds.regr import REGR
import numpy as np

thresholding = PyThreshThresholding(
    REGR(method="theil")
)
y_scores = np.random.default_rng().random(1000)
y_labels = np.zeros(1000)
y_pred = thresholding.fit_transform(y_labels, y_scores)
- find_threshold(y_true: ndarray, y_score: ndarray) float¶
Uses the passed thresholder from the PyThresh package to determine the threshold. Beforehand, the scores are forced to be finite by replacing NaNs with 0 and positive or negative infinities with 1.
PyThresh thresholders directly compute the binary predictions. Thus, we cache the predictions in the member _predictions and return them when calling transform().
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
threshold – Threshold computed by the internal thresholder.
- Return type
float
- fit(y_true: ndarray, y_score: ndarray) None¶
Calls
find_threshold() to compute and set the threshold.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- fit_transform(y_true: ndarray, y_score: ndarray) ndarray¶
Determines the threshold and applies it to the scoring in one go.
- Parameters
y_true (np.ndarray) – Ground truth binary labels.
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
- transform(y_score: ndarray) ndarray¶
Applies the threshold to the anomaly scoring and returns the corresponding binary labels.
- Parameters
y_score (np.ndarray) – Anomaly scoring with continuous anomaly scores (same length as y_true).
- Returns
y_pred – Array of binary labels; 0 for normal points and 1 for anomalous points.
- Return type
np.ndarray
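The preprocessing step mentioned for PyThreshThresholding.find_threshold(), where non-finite scores are replaced before the thresholder runs, can be illustrated with a small standard-library sketch. The helper force_finite is a hypothetical name and not part of TimeEval or PyThresh; with NumPy, `np.nan_to_num(y_score, nan=0.0, posinf=1.0, neginf=1.0)` should give a comparable result.

```python
import math
from typing import List, Sequence


def force_finite(y_score: Sequence[float]) -> List[float]:
    """Replace NaN with 0 and positive or negative infinity with 1."""
    cleaned = []
    for score in y_score:
        if math.isnan(score):
            cleaned.append(0.0)
        elif math.isinf(score):
            cleaned.append(1.0)
        else:
            cleaned.append(score)
    return cleaned


cleaned = force_finite([0.2, float("nan"), float("inf"), float("-inf"), 0.7])
```

Finite scores pass through unchanged, so a scoring that is already scaled to [0, 1] stays in that interval after the replacement.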