timeeval package¶
An Evaluation Tool for Anomaly Detection Algorithms on Time Series Data.
How to use the documentation:
Documentation is available in two forms: docstrings provided with the code, and a standalone reference guide, available at timeeval.readthedocs.io.
Code snippets in the docstring examples are indicated by three greater-than signs:
>>> x = 42
>>> x = x + 1
Use the built-in help function to view a function’s or class’ docstring:
>>> from timeeval import TimeEval
>>> help(TimeEval)
...
timeeval.TimeEval¶
- class timeeval.TimeEval(dataset_mgr: Datasets, datasets: List[Tuple[str, str]], algorithms: List[Algorithm], results_path: Path = PosixPath('results'), repetitions: int = 1, distributed: bool = False, remote_config: Optional[RemoteConfiguration] = None, resource_constraints: Optional[ResourceConstraints] = None, disable_progress_bar: bool = False, metrics: Optional[List[Metric]] = None, skip_invalid_combinations: bool = True, force_training_type_match: bool = False, force_dimensionality_match: bool = False, n_jobs: int = - 1, experiment_combinations_file: Optional[Path] = None, module_configs: Mapping[str, Any] = {})¶
Main class of TimeEval.
This class is the main utility to configure and execute evaluation experiments. First, select your algorithms and datasets, then pass them to TimeEval and use its constructor arguments to configure your evaluation run. By default, TimeEval evaluates all algorithms on all datasets (cross product). You can use the parameters skip_invalid_combinations, force_training_type_match, force_dimensionality_match, and experiment_combinations_file to control which algorithm runs on which dataset. See the description of the other arguments for further configuration details.
After you have created your TimeEval object, which holds the experiment run configuration, you can execute the experiments by calling run(). Afterward, the evaluation summary results are accessible in the results_path and from get_results().
Examples
Simple example experiment evaluating a single algorithm on the test datasets using the default metrics (just ROC_AUC):
>>> from pathlib import Path
>>> from timeeval import TimeEval, DefaultMetrics, Algorithm, TrainingType, InputDimensionality, DatasetManager
>>> from timeeval.adapters import DockerAdapter
>>> from timeeval.params import FixedParameters
>>>
>>> dm = DatasetManager(Path("tests/example_data"))
>>> datasets = dm.select()
>>>
>>> algorithms = [
...     Algorithm(
...         name="COF",
...         main=DockerAdapter(image_name="ghcr.io/timeeval/cof"),
...         param_config=FixedParameters({"n_neighbors": 20, "random_state": 42}),
...         data_as_file=True,
...         training_type=TrainingType.UNSUPERVISED,
...         input_dimensionality=InputDimensionality.MULTIVARIATE
...     ),
... ]
>>>
>>> timeeval = TimeEval(dm, datasets, algorithms, metrics=DefaultMetrics.default_list())
>>> timeeval.run()
>>> results = timeeval.get_results(aggregated=False)
>>> print(results)
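The effect of the combination-filtering flags can be pictured with a small, self-contained sketch. It restates the documented default rules (supervised and semi-supervised algorithms only run on datasets of their own training type, univariate algorithms do not run on multivariate datasets) and the two stricter force_* flags. The enums are local stand-ins and is_valid_combination is a hypothetical helper; this is not TimeEval's internal implementation.

```python
from enum import Enum


class TrainingType(Enum):
    """Stand-in for timeeval.TrainingType (same member values)."""
    UNSUPERVISED = "unsupervised"
    SEMI_SUPERVISED = "semi-supervised"
    SUPERVISED = "supervised"


class InputDimensionality(Enum):
    """Stand-in for timeeval.InputDimensionality (same member values)."""
    UNIVARIATE = "univariate"
    MULTIVARIATE = "multivariate"


def is_valid_combination(algo_training, algo_dim, data_training, data_dim,
                         force_training_type_match=False,
                         force_dimensionality_match=False):
    """Hypothetical helper mirroring the documented rules; not TimeEval's code."""
    if force_training_type_match:
        training_ok = algo_training == data_training
    else:
        # Unsupervised algorithms need no training data, so any dataset works;
        # (semi-)supervised algorithms need a dataset of their own training type.
        training_ok = (algo_training == TrainingType.UNSUPERVISED
                       or algo_training == data_training)

    if force_dimensionality_match:
        dim_ok = algo_dim == data_dim
    else:
        # Only univariate algorithms on multivariate datasets are excluded.
        dim_ok = not (algo_dim == InputDimensionality.UNIVARIATE
                      and data_dim == InputDimensionality.MULTIVARIATE)

    return training_ok and dim_ok
```

With the defaults, an unsupervised multivariate algorithm is considered valid for a supervised univariate dataset; setting force_training_type_match=True in the sketch rejects that pairing.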
- Parameters
dataset_mgr (Datasets) – The dataset manager provides the metadata about the datasets. You can either use a DatasetManager or a MultiDatasetManager.
datasets (List[Tuple[str, str]]) – List of dataset IDs, each consisting of collection name and dataset name, to uniquely identify each dataset. The datasets must be known by the dataset_mgr. You can call select() on the dataset_mgr to get a list of dataset IDs.
algorithms (List[Algorithm]) – List of algorithms to evaluate on the datasets. The algorithm specification also contains the hyperparameter configurations that TimeEval will test.
results_path (Path) – Use this parameter to change the path where all evaluation results are stored. If TimeEval is used in distributed mode, this path will be created on all nodes!
repetitions (int) – Execute each unique combination of dataset, algorithm, and hyperparameter setting multiple times. This allows you to use TimeEval to measure runtimes more precisely by aggregating the runtime measurements over multiple repetitions.
distributed (bool) – Run TimeEval in distributed mode. In this case, you should also supply a remote_config.
remote_config (Optional[RemoteConfiguration]) – Configuration of the Dask cluster used for distributed execution of TimeEval. See RemoteConfiguration for details.
resource_constraints (Optional[ResourceConstraints]) – You can supply a ResourceConstraints object to limit the amount of (CPU, memory, or runtime) resources available to each experiment. These options apply to each experiment to ensure a fair comparison.
Warning
Resource constraints are currently only implemented by the DockerAdapter. If you rely on resource constraints, please make sure that all algorithms use the DockerAdapter implementation.
disable_progress_bar (bool) – Set to True to hide the tqdm progress bars.
metrics (Optional[List[Metric]]) – Supply a list of Metric to evaluate the algorithms with. TimeEval computes all supplied metrics over all experiments. If you don’t specify any metric (None), the default metric list default_list() is used instead.
skip_invalid_combinations (bool) – Not all algorithms can be executed on all datasets. If this flag is set to True, TimeEval will skip all invalid combinations of algorithms and datasets based on their input dimensionality and training type. It is automatically enabled if either force_training_type_match or force_dimensionality_match is set to True. By default (force_training_type_match == force_dimensionality_match == False), the following combinations are not executed:
supervised algorithms on semi-supervised or unsupervised datasets (datasets cannot be used to train the algorithm)
semi-supervised algorithms on supervised or unsupervised datasets (datasets cannot be used to train the algorithm)
univariate algorithms on multivariate datasets (algorithm cannot process the dataset)
force_training_type_match (bool) – Narrow down the algorithm-dataset combinations further by executing an algorithm only on datasets with the same training type, e.g. unsupervised algorithms only on unsupervised datasets. This flag implies skip_invalid_combinations == True.
force_dimensionality_match (bool) – Narrow down the algorithm-dataset combinations further by executing an algorithm only on datasets with the same input dimensionality, e.g. multivariate algorithms only on multivariate datasets. This flag implies skip_invalid_combinations == True.
n_jobs (int) – Set the number of jobs / processes used to fetch the results from the remote machines. This setting is used only in distributed mode. -1 instructs TimeEval to use all locally available cores.
experiment_combinations_file (Optional[Path]) – Supply a path to an experiment combinations CSV file. Using this file, you can specify explicitly which combinations of algorithms, datasets, and hyperparameters should be executed. The file should contain CSV data with a single header line and four columns with the following names:
algorithm - name of the algorithm
collection - name of the dataset collection
dataset - name of the dataset
hyper_params_id - ID of the hyperparameter configuration
Only experiments that are present in the TimeEval configuration and this file are scheduled and executed. This allows you to circumvent the cross-product that TimeEval will perform in its default configuration.
module_configs (Mapping[str, Any], optional) – Use this parameter to pass additional configuration options for automatically loaded TimeEval modules. This is currently used only for the implementation of the Bayesian hyperparameter optimization procedure using Optuna. See timeeval.integration.optuna.OptunaModule and timeeval.params.bayesian.OptunaParameterSearch for details.
You can access loaded modules via the modules attribute (Dict[str, TimeEvalModule]) of the TimeEval instance, e.g. timeeval.modules["optuna"].
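As an illustration of the experiment combinations file described for experiment_combinations_file, the following sketch writes a CSV file with the four documented column names using only the standard library. The row values (collection name, dataset names, and hyperparameter IDs) are placeholders chosen for this example; in a real run they would have to match the algorithms, datasets, and hyperparameter configuration IDs of your TimeEval configuration.

```python
import csv
import tempfile
from pathlib import Path

# Column names as listed in the parameter description above.
COLUMNS = ["algorithm", "collection", "dataset", "hyper_params_id"]

# Placeholder rows for illustration only (not taken from a real run).
rows = [
    {"algorithm": "COF", "collection": "test", "dataset": "dataset-a", "hyper_params_id": "example-id-1"},
    {"algorithm": "COF", "collection": "test", "dataset": "dataset-b", "hyper_params_id": "example-id-1"},
]

# Write the file with a single header line followed by one row per experiment.
combinations_file = Path(tempfile.mkdtemp()) / "experiment_combinations.csv"
with combinations_file.open("w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to check its structure.
with combinations_file.open(newline="") as fh:
    loaded = list(csv.DictReader(fh))
```

The resulting path could then be passed as experiment_combinations_file=combinations_file when constructing TimeEval.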
- DEFAULT_RESULT_PATH = PosixPath('results')¶
Default path for the results.
If you don’t specify the results_path, TimeEval will store the evaluation results in the folder results within the current working directory.
- RESULT_KEYS = ['algorithm', 'collection', 'dataset', 'algo_training_type', 'algo_input_dimensionality', 'dataset_training_type', 'dataset_input_dimensionality', 'train_preprocess_time', 'train_main_time', 'execute_preprocess_time', 'execute_main_time', 'execute_postprocess_time', 'status', 'error_message', 'repetition', 'hyper_params', 'hyper_params_id']¶
This list contains all the fixed column headers of the result data frame. TimeEval dynamically adds the metrics and execution times depending on its configuration.
For metrics, their name() will be used as column header, and TimeEval will add the following runtime measurements depending on whether they are applicable to the algorithms in the run or not:
train_preprocess_time: if preprocess() is defined
train_main_time: if the algorithm is semi-supervised or supervised
execute_preprocess_time: if preprocess() is defined
execute_main_time: always
execute_postprocess_time: if postprocess() is defined
- get_results(aggregated: bool = True, short: bool = True) DataFrame¶
Return the (aggregated) evaluation results of a previous evaluation run.
The results are returned in a Pandas DataFrame and contain the mean runtime and metrics of the algorithms for each dataset. You can tweak the output using the parameters.
Note
Must be called after run(), otherwise the returned DataFrame is empty.
- Parameters
aggregated (bool) – If True, returns the aggregated results (controlled by parameter short), otherwise all collected information is returned.
short (bool) – This parameter is used only in aggregation mode and controls the aggregation level and functions. If True, the aggregation is over algorithms and datasets, and the mean of the metrics, training time, and execution time is returned. If False, the aggregation is over algorithms, datasets, and parameter combinations, and the mean and standard deviation of all runtime measurements and metrics are computed.
- Return type
DataFrame containing the evaluation results.
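The two aggregation levels can be illustrated without TimeEval or Pandas. The sketch below groups hypothetical result records once per algorithm and dataset (comparable to short=True) and once additionally per hyperparameter configuration (comparable to short=False). The record values, the helper group_values, and the use of the sample standard deviation are assumptions made for this example; the actual DataFrame layout may differ.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical per-experiment records (two repetitions for each of two settings).
records = [
    {"algorithm": "COF", "collection": "test", "dataset": "d1", "hyper_params_id": "a", "ROC_AUC": 0.75},
    {"algorithm": "COF", "collection": "test", "dataset": "d1", "hyper_params_id": "a", "ROC_AUC": 0.875},
    {"algorithm": "COF", "collection": "test", "dataset": "d1", "hyper_params_id": "b", "ROC_AUC": 0.5},
    {"algorithm": "COF", "collection": "test", "dataset": "d1", "hyper_params_id": "b", "ROC_AUC": 0.625},
]


def group_values(rows, keys, column):
    """Collect the values of `column` for each distinct combination of `keys`."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row[column])
    return dict(groups)


dataset_keys = ["algorithm", "collection", "dataset"]

# Comparable to short=True: one mean per algorithm and dataset.
short_view = {
    key: mean(values)
    for key, values in group_values(records, dataset_keys, "ROC_AUC").items()
}

# Comparable to short=False: mean and standard deviation per parameter combination.
detailed_view = {
    key: (mean(values), stdev(values))
    for key, values in group_values(records, dataset_keys + ["hyper_params_id"], "ROC_AUC").items()
}
```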
- rsync_results() None¶
Fetches the evaluation results of the current evaluation run from all remote machines, merging the temporary data and results together on the local host. This method is automatically executed by TimeEval at the end of an evaluation run started by calling run().
See also
- static rsync_results_from(results_path: Path, hosts: List[str], disable_progress_bar: bool = False, n_jobs: int = - 1) None¶
Fetches evaluation results of an independent TimeEval run from remote machines merging the temporary data and results together on the local host.
- Parameters
results_path (Path) – Path to the evaluation results. Must be the same for all hosts.
hosts (List[str]) – List of hostnames or IP addresses that took part in the evaluation run.
disable_progress_bar (bool) – Set to True to hide the progress bar.
n_jobs (int) – Number of parallel processes used to fetch the results. The parallelism is limited by the number of external hosts and the maximum number of available CPU cores.
- run() None¶
Starts the configured evaluation run.
Each TimeEval run consists of a number of experiments that are executed independently of each other. There are three phases: PREPARE, EVALUATION, FINALIZE.
PREPARE phase: In the first phase, the execution environment is prepared, the result folder is created, and algorithm adapter-dependent preparation steps, such as pulling Docker images for the DockerAdapter, are executed.
EVALUATION phase: In the evaluation phase, the experiments are executed and the results are recorded and stored to disk.
FINALIZE phase: In the last phase, the execution environment is cleaned up, and algorithm adapter-dependent finalization steps, such as removing the temporary Docker containers for the DockerAdapter, are executed.
This method executes all three phases one after another and returns after they are finished. You can access the evaluation results either programmatically using get_results() or in the results folder on the file system.
- save_results(results_path: Optional[Path] = None) None¶
Store the evaluation results to a CSV file in the provided results_path. This method is automatically executed by TimeEval at the end of an evaluation run when calling run().
- Parameters
results_path (Optional[Path]) – Path where the results should be stored. If it is not supplied, the results path of the current TimeEval run (timeeval.TimeEval.results_path) is used.
timeeval.Status¶
timeeval.Algorithm¶
- class timeeval.Algorithm(name: str, main: Adapter, preprocess: Optional[TSFunction] = None, postprocess: Optional[TSFunctionPost] = None, data_as_file: bool = False, param_schema: Dict[str, Dict[str, Any]] = <factory>, param_config: ParameterConfig = <timeeval.params.base.FixedParameters object>, training_type: TrainingType = TrainingType.UNSUPERVISED, input_dimensionality: InputDimensionality = InputDimensionality.UNIVARIATE)¶
This class is a wrapper for any Adapter and an instruction plan for the TimeEval tool. It tells TimeEval what algorithm to execute, what pre- and post-steps to perform and how the parameters and data are provided to the algorithm. Moreover, it defines attributes that are necessary to help TimeEval know what kind of time series can be put into the algorithm.
- Parameters
name (str) – The name of the Algorithm shown in the results.
main (timeeval.adapters.base.Adapter) – The adapter implementation that contains the algorithm to evaluate.
preprocess (Optional[TSFunction]) – Optional function to perform before main to modify input data.
postprocess (Optional[TSFunctionPost]) – Optional function to perform after main to modify output data.
data_as_file (bool) – Whether the data input is passed as a Path (True) or as a numpy.ndarray (False).
param_schema (Dict[str, Dict[str, Any]]) – Optional schema of the algorithm’s input parameters needed by timeeval_experiments.algorithm_configurator.AlgorithmConfigurator. Schema definition:
[
  "param_name": {
    "name": str
    "defaultValue": Any
    "description": str
    "type": str
  },
]
param_config (timeeval.params.ParameterConfig) – Optional object of type ParameterConfig to define a search grid or fixed parameters.
training_type (timeeval.data_types.TrainingType) – Definition of training type to receive the correct dataset formats (needed if TimeEval is run with the force_training_type_match config option).
input_dimensionality (timeeval.data_types.InputDimensionality) – Definition of input dimensionality to receive the correct dataset formats (needed if TimeEval is run with the force_dimensionality_match config option).
Examples
Create a baseline algorithm that always assigns a normal anomaly score:
>>> import numpy as np
>>> from timeeval import Algorithm
>>> from timeeval.adapters import FunctionAdapter
>>> my_fn = lambda X, args: np.zeros(len(X))
>>> Algorithm(name="Test Algorithm", main=FunctionAdapter(my_fn), data_as_file=False)
- input_dimensionality: InputDimensionality = 'univariate'¶
- param_config: ParameterConfig = <timeeval.params.base.FixedParameters object>¶
- postprocess: Optional[TSFunctionPost] = None¶
- preprocess: Optional[TSFunction] = None¶
- training_type: TrainingType = 'unsupervised'¶
timeeval.InputDimensionality¶
- class timeeval.InputDimensionality(value)¶
Bases: Enum
Input dimensionality supported by an algorithm or of a dataset.
TimeEval distinguishes between univariate and multivariate datasets / time series.
- MULTIVARIATE = 'multivariate'¶
Multivariate datasets have 2 or more features/dimensions/channels.
A multivariate algorithm can process univariate or multivariate datasets.
- UNIVARIATE = 'univariate'¶
Univariate datasets consist of a single feature/dimension/channel.
A univariate algorithm can process only a dataset with a single feature/dimension/channel.
- static from_dimensions(n: int) InputDimensionality¶
Converts the feature/dimension/channel count to an Enum-object.
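A minimal sketch of how such a conversion could look, based only on the definitions above (one channel is univariate, two or more channels are multivariate). The enum is a local stand-in, and raising ValueError for counts below one is an assumption of this example, not documented behavior.

```python
from enum import Enum


class InputDimensionality(Enum):
    """Local stand-in for timeeval.InputDimensionality."""
    UNIVARIATE = "univariate"
    MULTIVARIATE = "multivariate"

    @staticmethod
    def from_dimensions(n: int) -> "InputDimensionality":
        # Assumption: a time series needs at least one channel.
        if n < 1:
            raise ValueError(f"Expected at least one dimension, got {n}")
        if n == 1:
            return InputDimensionality.UNIVARIATE
        return InputDimensionality.MULTIVARIATE
```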
timeeval.TrainingType¶
- class timeeval.TrainingType(value)¶
Bases: Enum
Training type of algorithm or dataset.
TimeEval distinguishes between unsupervised, semi-supervised, and supervised algorithms.
- SEMI_SUPERVISED = 'semi-supervised'¶
A semi-supervised algorithm requires normal data for training.
A semi-supervised dataset consists of a training time series with normal data (no anomalies; all labels are 0) and a test time series.
- SUPERVISED = 'supervised'¶
A supervised algorithm requires training data with anomalies.
A supervised dataset consists of a training time series with anomalies and a test time series.
- UNSUPERVISED = 'unsupervised'¶
An unsupervised algorithm does not require any training data.
An unsupervised dataset consists only of a single test time series.
- static from_text(name: str) TrainingType¶
Converts the string-representation to an Enum-object.
timeeval.RemoteConfiguration¶
- class timeeval.RemoteConfiguration(scheduler_host: str = 'localhost', scheduler_port: int = 8786, worker_hosts: ~typing.List[str] = <factory>, remote_python: str = <factory>, kwargs_overwrites: ~typing.Dict[str, ~typing.Any] = <factory>, dask_logging_file_level: str = 'INFO', dask_logging_console_level: str = 'INFO', dask_logging_filename: str = 'dask.log')¶
This class holds the configuration for distributed TimeEval.
TimeEval uses a dask.distributed.SSHCluster to distribute the evaluation tasks to multiple compute nodes. Please read the Dask documentation carefully and then use the constructor arguments to set up a TimeEval cluster.
- Parameters
scheduler_host (str) – IP address or hostname for the distributed.Scheduler. This node is responsible for coordinating the cluster. The scheduler does not perform any evaluations.
scheduler_port (int) – Port for the scheduler.
worker_hosts (List[str]) – List of IP addresses or hostnames for the distributed.Worker. These nodes will execute the evaluation tasks.
remote_python (str) – Path to the Python executable. If you set up all your nodes in the same way, the default is fine.
kwargs_overwrites (dict) – Use this option to overwrite any configuration options of the SSHCluster.
Warning
Only use if you know what you are doing!
dask_logging_file_level (str) – Logging level for the file-based Dask logger.
dask_logging_console_level (str) – Logging level for the console-based Dask logger.
dask_logging_filename (str) – Name of the Dask logging file without any parent paths. Each node will write its own logging file, and TimeEval will automatically suffix the filenames with the hostname and place the Dask logging files into the results_path.
Examples
Two-node cluster where the first node hosts the scheduler but also takes part in the evaluation:
>>> from timeeval import TimeEval, RemoteConfiguration
>>> config = RemoteConfiguration(scheduler_host="192.168.1.1", worker_hosts=["192.168.1.1", "192.168.1.2"])
>>> TimeEval(dataset_mgr=..., datasets=[], algorithms=[], distributed=True, remote_config=config)
timeeval.ResourceConstraints¶
- class timeeval.ResourceConstraints(tasks_per_host: int = 1, task_memory_limit: ~typing.Optional[int] = None, task_cpu_limit: ~typing.Optional[float] = None, train_timeout: ~durations.duration.Duration = <Duration 8 hours>, execute_timeout: ~durations.duration.Duration = <Duration 8 hours>, use_preliminary_model_on_train_timeout: bool = True, use_preliminary_scores_on_execute_timeout: bool = True)¶
Use this class to configure resource constraints and how TimeEval deals with preliminary results.
Warning
Resource constraints are only supported by the DockerAdapter!
For Docker: Swap is always disabled. Resource constraints are enforced using explicit resource limits on the Docker container.
- Parameters
tasks_per_host (int) – Specify how many evaluation tasks are executed on each host. This setting influences the default memory and CPU limits if task_memory_limit and task_cpu_limit are None: the available resources of the node are shared equally between the tasks.
Because each task, in effect, trains or executes a time series anomaly detection algorithm, the tasks are resource-intensive, which means that over-provisioning is not useful and could decrease overall performance. If runtime measurements are taken, make sure that no resources are shared between the tasks!
task_memory_limit (Optional[int]) – Specify the maximum allowed memory in bytes. You can use MB and GB for better readability. This setting limits the available main memory per task to a fixed value.
task_cpu_limit (Optional[float]) – Specify the maximum allowed CPU usage in fractions of CPUs (e.g. 0.25 means: only use 1/4 of a single CPU core). Usually, it is advisable to use whole CPU cores (e.g. 1.0 for 1 CPU core, 2.0 for 2 CPU cores, etc.).
train_timeout (Duration) – Default timeout for training an algorithm. This value can be overridden for each algorithm in its DockerAdapter.
execute_timeout (Duration) – Default timeout for executing an algorithm. This value can be overridden for each algorithm in its DockerAdapter.
use_preliminary_model_on_train_timeout (bool) – If this option is enabled (default), then algorithms can save preliminary models (model checkpoints) to disk and TimeEval will use the last preliminary model if the training step runs into the training timeout. This is especially useful for machine learning algorithms that use an iterative training process (e.g. using SGD). As long as the algorithm implementation stores the best-so-far model after each training epoch, the training need not be limited by the number of epochs but only by the training time.
use_preliminary_scores_on_execute_timeout (bool) – If this option is enabled (default) and an algorithm exceeds the execution timeout, TimeEval will look for any preliminary result. This allows the evaluation of progressive algorithms that output a rough result, refine it over time, and would otherwise run into the execution timeout.
- static default_constraints() ResourceConstraints¶
Creates a configuration object with the default resource constraints.
- execute_timeout: Duration = <Duration 8 hours>¶
- get_compute_resource_limits(memory_overwrite: Optional[int] = None, cpu_overwrite: Optional[float] = None) Tuple[int, float]¶
Calculates the resource constraints for a single task.
There are three sources for resource limits (in decreasing priority):
Overwrites (passed to this function as arguments)
Explicitly set resource limits (on this object using task_memory_limit and task_cpu_limit)
Default resource constraints
Overall default:
1 task per node using all available cores and RAM (except small margin for OS).
When multiple tasks are specified, the resources are equally shared between all concurrent tasks. This means that CPU limit is set based on node CPU count divided by the number of tasks and memory limit is set based on total memory of node minus 1 GB (for OS) divided by the number of tasks.
Attention
Must be called on the node that will execute the task!
- Parameters
- Returns
memory_limit, cpu_limit – Tuple of memory and CPU limit. Memory limit is expressed in Bytes and CPU limit is expressed in fractions of CPUs (e.g. 0.25 means: only use 1/4 of a single CPU core).
- Return type
Tuple[int,float]
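The priority order and the default sharing rule described for get_compute_resource_limits can be sketched as a plain function. Here the node's memory and CPU count are passed in explicitly, whereas the real method determines them on the executing node. The function name sketch_resource_limits, the helper first_set, and the use of integer division for the memory limit are assumptions of this example.

```python
from typing import Optional, Tuple

GB = 2 ** 30  # same value as timeeval.resource_constraints.GB


def first_set(*values):
    """Return the first value that is not None."""
    return next(v for v in values if v is not None)


def sketch_resource_limits(node_memory: int,
                           node_cpus: int,
                           tasks_per_host: int = 1,
                           task_memory_limit: Optional[int] = None,
                           task_cpu_limit: Optional[float] = None,
                           memory_overwrite: Optional[int] = None,
                           cpu_overwrite: Optional[float] = None) -> Tuple[int, float]:
    """Hypothetical re-statement of the documented rules; not TimeEval's code."""
    # Defaults: reserve 1 GB for the OS, then share equally between the tasks.
    default_memory = (node_memory - 1 * GB) // tasks_per_host
    default_cpu = node_cpus / tasks_per_host

    # Decreasing priority: overwrite, explicit limit, default.
    memory = first_set(memory_overwrite, task_memory_limit, default_memory)
    cpu = first_set(cpu_overwrite, task_cpu_limit, default_cpu)
    return memory, cpu
```

For a node with 17 GB of memory and 8 cores running 4 tasks per host, the sketch yields 4 GB and 2.0 CPUs per task unless explicit limits or overwrites are given.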
- get_execute_timeout(timeout_overwrite: Optional[Duration] = None) Duration¶
Returns the maximum runtime of an execution task in seconds.
- Parameters
timeout_overwrite (Duration) – If this is set, it will overwrite the global timeout.
- Returns
execute_timeout – The execution timeout with the highest precedence (method overwrite, then global configuration).
- Return type
Duration
- get_train_timeout(timeout_overwrite: Optional[Duration] = None) Duration¶
Returns the maximum runtime of a training task in seconds.
- Parameters
timeout_overwrite (Duration) – If this is set, it will overwrite the global timeout.
- Returns
train_timeout – The training timeout with the highest precedence (method overwrite, then global configuration).
- Return type
Duration
- train_timeout: Duration = <Duration 8 hours>¶
- timeeval.resource_constraints.GB = 1073741824¶
\(1 GB = 2^{30} \text{Bytes}\)
Can be used to set the memory limit.
Examples
>>> from timeeval.resource_constraints import ResourceConstraints, GB
>>> ResourceConstraints(task_memory_limit=1 * GB)
- timeeval.resource_constraints.MB = 1048576¶
\(1 MB = 2^{20} \text{Bytes}\)
Can be used to set the memory limit.
Examples
>>> from timeeval.resource_constraints import ResourceConstraints, MB
>>> ResourceConstraints(task_memory_limit=500 * MB)
timeeval.constants¶
- class timeeval.constants.HPI_CLUSTER¶
Cluster constant for the HPI cluster.
These constants are applicable only for the HPI infrastructure and might not be useful for you.
- BENCHMARK = 'benchmark'¶
- CORRELATION_ANOMALIES = 'correlation-anomalies'¶
- MULTIVARIATE_ANOMALY_TEST_CASES = 'multivariate-anomaly-test-cases'¶
- MULTIVARIATE_TEST_CASES = 'multivariate-test-cases'¶
- UNIVARIATE_ANOMALY_TEST_CASES = 'univariate-anomaly-test-cases'¶
- VARIABLE_LENGTH_TEST_CASES = 'variable-length'¶
- akita_dataset_paths: Dict[str, Path] = {'benchmark': PosixPath('/home/projects/akita/data/benchmark-data/data-processed'), 'correlation-anomalies': PosixPath('/home/projects/akita/data/correlation-anomalies'), 'multivariate-anomaly-test-cases': PosixPath('/home/projects/akita/data/multivariate-anomaly-test-cases'), 'multivariate-test-cases': PosixPath('/home/projects/akita/data/multivariate-test-cases'), 'univariate-anomaly-test-cases': PosixPath('/home/projects/akita/data/univariate-anomaly-test-cases'), 'variable-length': PosixPath('/home/projects/akita/data/variable-length')}¶
This dictionary contains the paths to the dataset collection folders.
- nodes: List[str] = ['odin01', 'odin02', 'odin03', 'odin04', 'odin05', 'odin06', 'odin07', 'odin08', 'odin09', 'odin10', 'odin11', 'odin12', 'odin13', 'odin14']¶
All nodes of the homogeneous HPI cluster.
- nodes_ip: List[str] = ['172.20.11.101', '172.20.11.102', '172.20.11.103', '172.20.11.104', '172.20.11.105', '172.20.11.106', '172.20.11.107', '172.20.11.108', '172.20.11.109', '172.20.11.110', '172.20.11.111', '172.20.11.112', '172.20.11.113', '172.20.11.114']¶
All IP addresses of the nodes in the homogeneous HPI cluster.
timeeval.data_types.ExecutionType¶
- class timeeval.data_types.ExecutionType(value)¶
Bases: Enum
Enum used to indicate the execution type of algorithms.
TimeEval calls each algorithm up to two times with two different execution types and passes the current execution type as an object of this class to the algorithm adapter implementation.
Depending on its timeeval.TrainingType, an algorithm may require a training step. TimeEval will call these algorithms first with the execution type set to TRAIN. Then, for all algorithms, the algorithm is called with execution type EXECUTE.
- EXECUTE = 'execute'¶
- TRAIN = 'train'¶
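The call order described above can be sketched as follows. The enums are local stand-ins with the documented member values, and planned_calls is a hypothetical helper. It assumes, in line with the train_main_time description, that semi-supervised and supervised algorithms are the ones requiring a training step.

```python
from enum import Enum
from typing import List


class ExecutionType(Enum):
    """Local stand-in for timeeval.data_types.ExecutionType."""
    TRAIN = "train"
    EXECUTE = "execute"


class TrainingType(Enum):
    """Local stand-in for timeeval.TrainingType."""
    UNSUPERVISED = "unsupervised"
    SEMI_SUPERVISED = "semi-supervised"
    SUPERVISED = "supervised"


def planned_calls(training_type: TrainingType) -> List[ExecutionType]:
    """Hypothetical helper: execution types an algorithm is called with, in order."""
    calls = []
    if training_type in (TrainingType.SEMI_SUPERVISED, TrainingType.SUPERVISED):
        # Algorithms that need training data are called with TRAIN first.
        calls.append(ExecutionType.TRAIN)
    # Every algorithm is called with EXECUTE.
    calls.append(ExecutionType.EXECUTE)
    return calls
```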