timeeval.heuristics package
- timeeval.heuristics.TimeEvalHeuristic(signature: str) → TimeEvalParameterHeuristic
Factory function for TimeEval heuristics based on their string representation.
This wrapper allows using the heuristics by name without the need for imports. It is primarily used in the timeeval.heuristics.inject_heuristic_values() function. The currently supported heuristics are the classes documented in the sections below.
- Parameters
signature (str) – String representation of the heuristic to be created. Must be of the form <heuristic_name>(<heuristic_parameters>).
- Returns
heuristic – The created heuristic object.
- Return type
TimeEvalParameterHeuristic
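The signature format <heuristic_name>(<heuristic_parameters>) resembles a Python call expression. As a rough illustration of how such a string could be split into a name and keyword arguments, here is a minimal sketch built on the standard ast module. The helper parse_heuristic_signature is hypothetical and not part of TimeEval; it handles only literal keyword arguments, and the actual factory may work differently.

```python
import ast
from typing import Any, Dict, Tuple


def parse_heuristic_signature(signature: str) -> Tuple[str, Dict[str, Any]]:
    """Split '<name>(<kwargs>)' into the name and a dict of literal keyword arguments.

    Hypothetical helper for illustration only; not TimeEval's implementation.
    """
    node = ast.parse(signature, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError(f"Not a heuristic signature: {signature!r}")
    if node.args:
        raise ValueError("This sketch only supports keyword arguments")
    # literal_eval accepts AST nodes and rejects anything that is not a literal.
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return node.func.id, kwargs


name, kwargs = parse_heuristic_signature("PeriodSizeHeuristic(factor=1.0, fb_value=100)")
print(name, kwargs)
```

Because only literals are accepted, a signature that contains a lambda (as in the ParameterDependenceHeuristic examples) would be rejected by this sketch.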
- timeeval.heuristics.inject_heuristic_values(params: T, algorithm: Algorithm, dataset_details: Dataset, dataset_path: Path) → T
This function parses the supplied parameter mapping in params and replaces all heuristic definitions with their actual values.
The heuristics are generally evaluated in the order they are defined in the parameter mapping. However, the order among multiple ParameterDependenceHeuristic instances is not defined. If a heuristic returns None, the corresponding parameter is removed from the parameter mapping. Heuristics can be defined by using the following syntax as the parameter value:
"heuristic:<heuristic_name>(<heuristic_parameters>)"
Heuristics can use the following information to compute their values:
properties of the algorithm
properties of the dataset
the full dataset (supplied as a path to the dataset)
the (current) parameter mapping (later evaluated heuristics can see the changes of previous heuristics)
- Parameters
params (T) – The current parameter mapping, whose values should be updated by the heuristics. If an immutable mapping is passed, no changes will be made.
algorithm (Algorithm) – The algorithm for which the parameter mapping is valid.
dataset_details (Dataset) – The dataset for which the parameter mapping is supposed to be used.
dataset_path (Path) – The path to the dataset.
- Returns
params – The updated parameter mapping.
- Return type
T
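The replacement behaviour described above (values prefixed with "heuristic:" are resolved, a None result removes the parameter, and later heuristics see earlier results) can be sketched without TimeEval as follows. The names inject_values_sketch and toy_resolve are hypothetical stand-ins, the sketch returns a copy instead of updating the mapping, and it ignores the special ordering of ParameterDependenceHeuristic.

```python
from typing import Any, Callable, Dict, Mapping, Optional

PREFIX = "heuristic:"


def inject_values_sketch(
    params: Mapping[str, Any],
    resolve: Callable[[str, Dict[str, Any]], Optional[Any]],
) -> Dict[str, Any]:
    """Return a copy of params with every 'heuristic:...' value replaced."""
    updated = dict(params)
    for name, value in params.items():
        if isinstance(value, str) and value.startswith(PREFIX):
            # Later heuristics see the values written by earlier ones.
            result = resolve(value[len(PREFIX):], updated)
            if result is None:
                del updated[name]  # a None result removes the parameter
            else:
                updated[name] = result
    return updated


def toy_resolve(signature: str, current: Dict[str, Any]) -> Optional[Any]:
    # Hypothetical stand-ins for real heuristics.
    if signature == "Fixed100()":
        return 100
    if signature == "HalfWindow()":
        return current["window_size"] // 2
    return None


result = inject_values_sketch(
    {
        "window_size": "heuristic:Fixed100()",
        "latent_dim": "heuristic:HalfWindow()",
        "n_init": "heuristic:Unknown()",
        "alpha": 0.5,
    },
    toy_resolve,
)
print(result)
```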
timeeval.heuristics.TimeEvalParameterHeuristic
- class timeeval.heuristics.TimeEvalParameterHeuristic
Bases: ABC
Base class for TimeEval parameter heuristics.
Heuristics are used to calculate parameter values for algorithms based on information about the algorithm, the dataset, or other parameters. They are evaluated in the driver process when TimeEval is configured. This means that the datasets must be available on the node executing the driver process. The calculated parameter values are then injected into the algorithm configuration and the algorithm is executed on the cluster.
See also
timeeval.heuristics.inject_heuristic_values(): Function that uses the heuristics to calculate parameter values for algorithms.
- classmethod get_param_names() → List[str]
Get parameter names (arguments) for the heuristic.
Adopted from https://github.com/scikit-learn/scikit-learn/blob/2beed5584/sklearn/base.py.
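For orientation, the scikit-learn approach referenced here introspects the constructor signature. A minimal sketch of that technique, using a hypothetical ExampleHeuristic class in place of a real TimeEval heuristic, could look like this; the actual TimeEval method may differ in its details.

```python
import inspect
from typing import List


class ExampleHeuristic:
    """Hypothetical heuristic used only to demonstrate signature introspection."""

    def __init__(self, factor: float = 1.0, fb_value: int = 1) -> None:
        self.factor = factor
        self.fb_value = fb_value

    @classmethod
    def get_param_names(cls) -> List[str]:
        # Read the constructor arguments, skipping 'self' and any **kwargs.
        signature = inspect.signature(cls.__init__)
        return sorted(
            p.name
            for p in signature.parameters.values()
            if p.name != "self" and p.kind != inspect.Parameter.VAR_KEYWORD
        )


print(ExampleHeuristic.get_param_names())
```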
timeeval.heuristics.AnomalyLengthHeuristic
- class timeeval.heuristics.AnomalyLengthHeuristic(agg_type: str = 'median')
Bases: TimeEvalParameterHeuristic
Heuristic to use the anomaly length of the dataset as parameter value. Uses ground-truth labels, and should therefore only be used for testing purposes.
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({"window_size": "heuristic:AnomalyLengthHeuristic(agg_type='max')"})
- Parameters
agg_type (str) – Type of aggregation to use for calculating the anomaly length when multiple anomalies are present in the time series. Must be one of min, median, or max. (default: median)
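To make the aggregation concrete, the following sketch derives the lengths of consecutive anomalous runs from a binary label sequence and aggregates them. It is an illustration under assumptions: the function names are hypothetical, labels are taken to be 0/1 values, at least one anomaly is assumed to exist, and TimeEval may read and round the values differently.

```python
from itertools import groupby
from statistics import median
from typing import List, Sequence


def anomaly_lengths(labels: Sequence[int]) -> List[int]:
    """Lengths of all runs of consecutive anomalous (non-zero) labels."""
    return [sum(1 for _ in run) for is_anomaly, run in groupby(labels) if is_anomaly]


def aggregate_anomaly_length(labels: Sequence[int], agg_type: str = "median") -> float:
    aggregators = {"min": min, "median": median, "max": max}
    if agg_type not in aggregators:
        raise ValueError("agg_type must be one of min, median, or max")
    return aggregators[agg_type](anomaly_lengths(labels))


labels = [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1]
print(anomaly_lengths(labels))
print(aggregate_anomaly_length(labels, "max"))
```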
timeeval.heuristics.CleanStartSequenceSizeHeuristic
- class timeeval.heuristics.CleanStartSequenceSizeHeuristic(max_factor: float = 0.1)
Bases: TimeEvalParameterHeuristic
Heuristic to compute the number of time steps until the first anomaly occurs, and use it as parameter value. Uses ground-truth labels. A maximum fraction of the entire time series length can be specified; the parameter value is the minimum of the computed value and that fraction of the series length.
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({"n_init": "heuristic:CleanStartSequenceSizeHeuristic(max_factor=0.1)"})
- Parameters
max_factor (float) – Maximum fraction of the entire time series length to use as parameter value. This limits the parameter value. (default: 0.1)
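The capping rule can be illustrated with a short sketch. The function clean_start_size is hypothetical; it assumes 0/1 labels, truncates the cap to an integer, and treats a series without anomalies as entirely clean, none of which is confirmed by the documentation above.

```python
from typing import Sequence


def clean_start_size(labels: Sequence[int], max_factor: float = 0.1) -> int:
    """Number of steps before the first anomaly, capped at max_factor * length."""
    n = len(labels)
    # Index of the first anomalous point; falls back to n if there is none.
    first_anomaly = next((i for i, label in enumerate(labels) if label), n)
    return min(first_anomaly, int(max_factor * n))


# 100 points, first anomaly at index 30.
labels = [0] * 30 + [1] * 5 + [0] * 65
print(clean_start_size(labels, max_factor=0.1))
print(clean_start_size(labels, max_factor=0.5))
```

With max_factor=0.1 the cap of 10 points applies; with max_factor=0.5 the 30 clean points before the first anomaly are used.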
timeeval.heuristics.ContaminationHeuristic
- class timeeval.heuristics.ContaminationHeuristic
Bases: TimeEvalParameterHeuristic
Heuristic to use the time series’ contamination as parameter value. The contamination is defined as the fraction of anomalous points to all points in the time series.
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({"fraction": "heuristic:ContaminationHeuristic()"})
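The definition of contamination translates directly into a ratio. The helper below is a hypothetical illustration that assumes 0/1 ground-truth labels; it is not taken from TimeEval.

```python
from typing import Sequence


def contamination(labels: Sequence[int]) -> float:
    """Fraction of anomalous points among all points of the series."""
    return sum(1 for label in labels if label) / len(labels)


# 5 anomalous points out of 100.
labels = [0] * 95 + [1] * 5
print(contamination(labels))
```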
timeeval.heuristics.DatasetIdHeuristic
- class timeeval.heuristics.DatasetIdHeuristic
Bases: TimeEvalParameterHeuristic
Heuristic to pass the dataset ID as a parameter value.
The dataset ID is a tuple of the collection name and the dataset name, such as ("KDD-TSAD", "022_UCR_Anomaly_DISTORTEDGP711MarkerLFM5z4").
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({"dataset_id": "heuristic:DatasetIdHeuristic()"})
timeeval.heuristics.DefaultExponentialFactorHeuristic
- class timeeval.heuristics.DefaultExponentialFactorHeuristic(exponent: int = 0, zero_fb: float = 1.0)
Bases: TimeEvalParameterHeuristic
Heuristic to use the default value multiplied by a factor of $10^{exponent}$ as parameter value.
This allows easier specification of exponential parameter search spaces based on the default value. E.g. if we consider a learning rate parameter with default value 0.01, we can use this heuristic to specify a search space of [0.0001, 0.001, 0.01, 0.1, 1] by using the following parameter values:
"heuristic:DefaultExponentialFactorHeuristic(exponent=-2)"
"heuristic:DefaultExponentialFactorHeuristic(exponent=-1)"
"heuristic:DefaultExponentialFactorHeuristic()"
"heuristic:DefaultExponentialFactorHeuristic(exponent=1)"
"heuristic:DefaultExponentialFactorHeuristic(exponent=2)"
But if the default parameter value is 0.5, the search space would be [0.005, 0.05, 0.5, 5, 50].
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({"window_size": "heuristic:DefaultExponentialFactorHeuristic(exponent=1, zero_fb=200)"})
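The arithmetic behind the search space can be checked with a small sketch. Note the assumption: the zero_fb parameter is not described above, and this sketch guesses from its name that it replaces a default value of zero before scaling. The function default_exponential_factor is hypothetical.

```python
import math


def default_exponential_factor(default: float, exponent: int = 0, zero_fb: float = 1.0) -> float:
    """Scale the default value by 10**exponent.

    Assumption: zero_fb is used in place of a default of 0, since scaling
    zero would always yield zero. This is inferred from the name only.
    """
    base = zero_fb if default == 0 else default
    return base * 10 ** exponent


search_space = [default_exponential_factor(0.01, e) for e in range(-2, 3)]
print(search_space)
```

Floating-point rounding can produce values such as 0.09999999999999999 instead of 0.1, so comparisons should use a tolerance.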
timeeval.heuristics.DefaultFactorHeuristic
- class timeeval.heuristics.DefaultFactorHeuristic(factor: float = 1.0, zero_fb: float = 1.0)
Bases: TimeEvalParameterHeuristic
Heuristic to use the default value multiplied by a factor as parameter value.
This allows easier specification of parameter search spaces based on the default value. E.g. if we consider a n_clusters parameter with default value 50, we can use this heuristic to specify a search space of [10, 25, 50, 75, 100] by using the following parameter values:
"heuristic:DefaultFactorHeuristic(factor=0.2)"
"heuristic:DefaultFactorHeuristic(factor=0.5)"
"heuristic:DefaultFactorHeuristic()"
"heuristic:DefaultFactorHeuristic(factor=1.5)"
"heuristic:DefaultFactorHeuristic(factor=2.0)"
But if the default parameter value is 100, the search space would be [20, 50, 100, 150, 200].
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({"window_size": "heuristic:DefaultFactorHeuristic(factor=1, zero_fb=200)"})
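When many factors are needed, the parameter strings can be generated instead of typed by hand. The sketch below builds the strings for a list of factors and, separately, computes the values they would resolve to for a given default. The helper resolved_values is hypothetical and only mirrors the multiplication described above; whether TimeEval rounds the result for integer parameters is not stated here.

```python
import math
from typing import List

factors = [0.2, 0.5, 1.0, 1.5, 2.0]

# One heuristic definition string per factor, ready to be used as parameter values.
search_space = [f"heuristic:DefaultFactorHeuristic(factor={f})" for f in factors]


def resolved_values(default: float, factors: List[float]) -> List[float]:
    """Values the search space corresponds to for a given default value."""
    return [default * f for f in factors]


print(search_space)
print(resolved_values(50, factors))
```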
timeeval.heuristics.EmbedDimRangeHeuristic
- class timeeval.heuristics.EmbedDimRangeHeuristic(base_factor: float = 1.0, base_fb_value: int = 50, dim_factors: Optional[List[float]] = None)
Bases: TimeEvalParameterHeuristic
Heuristic to use a range of embedding dimensions as parameter value.
The base dimensionality is calculated based on the PeriodSizeHeuristic, base factor, and base fallback value. The base dimensionality is then multiplied by the factors specified in dim_factors to create the embedding dimension range.
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({
...     "embed_dim": "heuristic:EmbedDimRangeHeuristic(base_factor=1, base_fb_value=50, dim_factors=[0.5, 1.0, 1.5])"
... })
- Parameters
base_factor (float) – Factor to use for the base dimensionality. Directly passed on to the PeriodSizeHeuristic. (default: 1.0)
base_fb_value (int) – Fallback value to use for the base dimensionality. Directly passed on to the PeriodSizeHeuristic. (default: 50)
dim_factors (List[float]) – Factors to use for the creation of the embedding dimension range. (default: [0.5, 1.0, 1.5])
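As a worked example of the two-step calculation, the sketch below first derives a base dimensionality from an optional period size and then scales it by each factor. The function embed_dim_range is hypothetical, the period size is passed in directly instead of being read from a dataset, and truncating to integers is an assumption.

```python
from typing import List, Optional


def embed_dim_range(
    period_size: Optional[int],
    base_factor: float = 1.0,
    base_fb_value: int = 50,
    dim_factors: Optional[List[float]] = None,
) -> List[int]:
    """Scale a base dimensionality by each factor in dim_factors."""
    if dim_factors is None:
        dim_factors = [0.5, 1.0, 1.5]
    # Base dimensionality: scaled period size, or the fallback if none is known.
    base = base_fb_value if period_size is None else int(period_size * base_factor)
    return [int(base * f) for f in dim_factors]


print(embed_dim_range(period_size=100))
print(embed_dim_range(period_size=None))
```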
timeeval.heuristics.ParameterDependenceHeuristic
- class timeeval.heuristics.ParameterDependenceHeuristic(source_parameter: str, fn: Optional[Callable[[Any], Any]] = None, factor: Optional[float] = None)
Bases: TimeEvalParameterHeuristic
Heuristic to use the value of another parameter as parameter value.
ParameterDependenceHeuristic can be used to create a parameter value that depends on another parameter. This can be done by either supplying a mapping function or a factor. If a mapping function is supplied, it is called with the value of the source parameter as the only argument. If a factor is supplied, the value of the source parameter is multiplied by the factor. You cannot supply both a mapping function and a factor! This heuristic is evaluated after all other heuristics, so you can use it to create a parameter value that depends on the values of other parameters filled by heuristics.
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({
...     "latent_dim": "heuristic:ParameterDependenceHeuristic(source_parameter='window_size', factor=0.5)"
... })
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({
...     "latent_dims": "heuristic:ParameterDependenceHeuristic(source_parameter='window_size', fn=lambda x: [x // 2, x, x * 2])"
... })
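The two modes (mapping function or factor) and their mutual exclusion can be sketched in a few lines. The function resolve_dependence is hypothetical; in particular, returning the source value unchanged when neither fn nor factor is given is an assumption, not documented behaviour.

```python
from typing import Any, Callable, Dict, Optional


def resolve_dependence(
    params: Dict[str, Any],
    source_parameter: str,
    fn: Optional[Callable[[Any], Any]] = None,
    factor: Optional[float] = None,
) -> Any:
    """Derive a value from another, already resolved parameter."""
    if fn is not None and factor is not None:
        raise ValueError("Supply either a mapping function or a factor, not both")
    value = params[source_parameter]
    if fn is not None:
        return fn(value)
    if factor is not None:
        return value * factor
    return value  # assumption: pass the source value through unchanged


params = {"window_size": 100}
latent_dim = resolve_dependence(params, "window_size", factor=0.5)
latent_dims = resolve_dependence(params, "window_size", fn=lambda x: [x // 2, x, x * 2])
print(latent_dim, latent_dims)
```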
timeeval.heuristics.PeriodSizeHeuristic
- class timeeval.heuristics.PeriodSizeHeuristic(factor: float = 1.0, fb_anomaly_length_agg_type: Optional[str] = None, fb_value: int = 1)
Bases: TimeEvalParameterHeuristic
Heuristic to use the period size of the dataset as parameter value.
Not all datasets have a period size, so this heuristic uses the following fallbacks in order:
1. If fb_anomaly_length_agg_type is specified, the AnomalyLengthHeuristic with the specified aggregation type is used as fallback.
2. If fb_value is specified, it is directly used as fallback.
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({
...     "window_size": "heuristic:PeriodSizeHeuristic(factor=1.0, fb_anomaly_length_agg_type='median', fb_value=100)"
... })
- Parameters
factor (float) – Factor to use for the period size. (default: 1.0)
fb_anomaly_length_agg_type (str, optional) – Aggregation type to use for the AnomalyLengthHeuristic fallback. (default: None)
fb_value (int, optional) – Value to use as fallback if no period size is available. (default: 1)
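The fallback order can be expressed as a simple chain. In this sketch the function period_size_value is hypothetical and receives the already computed anomaly length instead of an aggregation type. Two points are assumptions: that the factor is also applied to the anomaly-length fallback, and that the result is truncated to an integer. The documentation above only states that fb_value is used directly.

```python
from typing import Optional


def period_size_value(
    period_size: Optional[int],
    factor: float = 1.0,
    fb_anomaly_length: Optional[float] = None,
    fb_value: int = 1,
) -> int:
    """Period size times factor, with two fallbacks when no period is known."""
    if period_size is not None:
        return int(period_size * factor)
    if fb_anomaly_length is not None:
        # Assumption: the factor is applied to this fallback as well.
        return int(fb_anomaly_length * factor)
    return fb_value  # used directly, without scaling


print(period_size_value(50, factor=2.0))
print(period_size_value(None, fb_anomaly_length=20, fb_value=100))
print(period_size_value(None, fb_value=100))
```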
timeeval.heuristics.RelativeDatasetSizeHeuristic
- class timeeval.heuristics.RelativeDatasetSizeHeuristic(factor: float = 0.1)
Bases: TimeEvalParameterHeuristic
Heuristic to set a parameter value depending on the size of the dataset (length of the time series).
Examples
>>> from timeeval.params import FixedParameters
>>> params = FixedParameters({"n_init": "heuristic:RelativeDatasetSizeHeuristic(factor=0.1)"})
- Parameters
factor (float) – Factor to multiply the dataset length with to get the parameter value. (default: 0.1)