TimeEval results
When you configure and execute TimeEval, it applies the algorithms with their configured hyperparameter values to all datasets.
It measures the algorithms' runtimes and assesses their effectiveness using evaluation measures (metrics).
The results are stored in a summary file called results.csv and in a nested folder structure inside the results folder (./results/<timestamp> by default).
The output directory has the following structure:
results/<timestamp>/
├── results.csv
├── <algorithm_1>/<hyper_params_id>/
│   ├── <collection_name>/<dataset_name_1>/<repetition_number>/
│   │   ├── raw_anomaly_scores.ts
│   │   ├── anomaly_scores.ts
│   │   ├── docker-algorithm-scores.csv
│   │   ├── execution.log
│   │   ├── hyper_params.json
│   │   └── metrics.csv
│   └── <collection_name>/<dataset_name_2>/<repetition_number>/
│       ├── raw_anomaly_scores.ts
│       ├── anomaly_scores.ts
│       ├── docker-algorithm-scores.csv
│       ├── execution.log
│       ├── model.pkl
│       ├── hyper_params.json
│       └── metrics.csv
└── <algorithm_2>/<hyper_params_id>/
    └── <collection_name>/<dataset_name_1>/<repetition_number>/
        ├── raw_anomaly_scores.ts
        ├── anomaly_scores.ts
        ├── docker-algorithm-scores.csv
        ├── execution.log
        ├── hyper_params.json
        └── metrics.csv
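The nesting shown above can be walked with the standard library alone. The following is a minimal sketch, not part of the TimeEval API: the names `Experiment` and `iter_experiments` are hypothetical helpers introduced here for illustration, and the code assumes that every experiment directory sits exactly five levels below the results folder.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class Experiment(NamedTuple):
    """One experiment directory, split into its five path components."""

    algorithm: str
    hyper_params_id: str
    collection: str
    dataset: str
    repetition: str
    path: Path


def iter_experiments(results_dir: Path) -> Iterator[Experiment]:
    """Yield each <algorithm>/<hyper_params_id>/<collection>/<dataset>/<repetition>/ folder."""
    # Assumption: experiment directories are exactly five levels deep.
    # results.csv lives at the top level and is never matched by this pattern.
    for path in sorted(results_dir.glob("*/*/*/*/*")):
        if path.is_dir():
            parts = path.relative_to(results_dir).parts
            yield Experiment(*parts, path=path)
```

Calling `list(iter_experiments(Path("results/<timestamp>")))` with the actual timestamp folder substituted would then give one entry per experiment, which is convenient for locating the per-experiment files.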
We provide a description of each file below.
Summary file (results.csv)
For a given dataset, different algorithms with varying hyperparameters yield distinct results.
The file results.csv provides an overview of the evaluation run and contains the following attributes:
| Column Name | Datatype | Description |
|---|---|---|
| algorithm | str | name of the algorithm as defined in |
| collection | str | name of the dataset collection. A collection contains similar datasets. |
| dataset | str | name of the dataset |
| algo_training_type | str | specifies whether the algorithm requires training data with anomalies (supervised), with normal data only (semi-supervised), or no training data at all (unsupervised) |
| algo_input_dimensionality | str | specifies whether the algorithm handles multiple channels (multivariate) or a single channel (univariate) |
| dataset_training_type | str | specifies whether the dataset has a training time series with anomaly labels (supervised), with normal data only (semi-supervised), or no training time series at all (unsupervised) |
| dataset_input_dimensionality | str | specifies whether the dataset has multiple channels (multivariate) or not (univariate) |
| train_preprocess_time | float | runtime of the preprocessing step during training in seconds |
| train_main_time | float | runtime of the training in seconds (does not include preprocessing time) |
| execute_preprocess_time | float | runtime of the preprocessing step during execution in seconds |
| execute_main_time | float | runtime of the execution of the algorithm on the test time series in seconds (does not include pre- or post-processing times) |
| execute_postprocess_time | float | runtime of the post-processing step during execution in seconds |
| status | str | specifies whether the algorithm executed successfully ( |
| error_message | str | optional detailed error message |
| repetition | int | repetition number if an algorithm-hyperparameter-dataset combination was executed multiple times |
| hyper_params | str | actual hyperparameter values for this execution |
| hyper_params_id | str | alphanumerical hash of the hyperparameter configuration |
| metric_1 | float | value of the first performance metric |
| … | … | … |
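Because the summary file is a plain CSV, the runtime columns can be combined without TimeEval itself. The sketch below uses only the Python standard library; `load_summary` and `total_runtime` are hypothetical helper names, and treating empty cells as zero (for example, when a step did not run) is an assumption rather than documented behavior.

```python
import csv
from pathlib import Path
from typing import Dict, List

# Runtime columns from the table above; all values are in seconds.
TIME_COLUMNS = [
    "train_preprocess_time",
    "train_main_time",
    "execute_preprocess_time",
    "execute_main_time",
    "execute_postprocess_time",
]


def load_summary(path: Path) -> List[Dict[str, str]]:
    """Read results.csv into a list of dicts keyed by column name."""
    with path.open(newline="") as handle:
        return list(csv.DictReader(handle))


def total_runtime(row: Dict[str, str]) -> float:
    """Sum the runtime columns of one row.

    Assumption: empty or missing cells mean the step did not run and count as 0.
    """
    total = 0.0
    for column in TIME_COLUMNS:
        value = row.get(column) or ""
        if value.strip():
            total += float(value)
    return total
```

Applying `total_runtime` to every row returned by `load_summary` gives the end-to-end runtime per experiment, which can then be grouped by the algorithm or dataset columns.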
Directory (<algorithm_1>/<hyper_params_id>/<collection_name>/<dataset_name_1>/<repetition_number>/)
For every experiment in the configured evaluation run, TimeEval creates a new directory in the results folder. This directory stores all results and temporary files for a single combination of dataset, algorithm, algorithm hyperparameter values, and repetition number. The directories are nested: the first level is named after the algorithm, followed by the ID of the hyperparameter settings, the dataset collection name, the dataset name, and finally the repetition number. Each experiment directory contains at least the following files:
- raw_anomaly_scores.ts: The raw anomaly scores produced by the algorithm after the post-processing function was executed. The file contains no header and a single floating point value in each row for each time step of the input time series. The value range depends on the algorithm.
- anomaly_scores.ts: Normalized anomaly scores. The value range is from 0 (normal) to 1 (most anomalous).
- execution.log: Unstructured log file of the experiment execution. Contains debugging information from the Adapter, the algorithm, and the metric calculation. If an algorithm fails, its error message usually appears in this log.
- metrics.csv: This file lists the metric and runtime measurements for the corresponding experiment. The metrics used are defined by the user. Find more information in the API documentation: timeeval.metrics package
- hyper_params.json: Contains a JSON object with the hyperparameter values used to execute the algorithm on the dataset. If hyperparameter heuristics were defined, the heuristics' values are already resolved.
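The file formats described above are simple enough to read with the standard library. The following sketch assumes only what the descriptions state (a headerless file with one floating point value per row, and a JSON object); the function names `load_scores`, `load_hyper_params`, and `is_normalized` are hypothetical and not part of TimeEval.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_scores(path: Path) -> List[float]:
    """Read a headerless .ts file with one floating point value per row."""
    with path.open() as handle:
        # Blank lines (e.g. a trailing newline) are skipped.
        return [float(line) for line in handle if line.strip()]


def load_hyper_params(path: Path) -> Dict[str, Any]:
    """Read the JSON object stored in hyper_params.json."""
    with path.open() as handle:
        return json.load(handle)


def is_normalized(scores: List[float]) -> bool:
    """Return True if every score lies in the range [0, 1]."""
    return all(0.0 <= score <= 1.0 for score in scores)
```

A quick sanity check is that `is_normalized` holds for anomaly_scores.ts, while raw_anomaly_scores.ts may fall outside that range because its value range depends on the algorithm.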
All other files are optional and depend on the algorithm Adapter used.
For example, the DockerAdapter usually produces a temporary file called docker-algorithm-scores.csv
to pass the algorithm result from the Docker container to TimeEval, and (semi-)supervised algorithms store their trained model in model.pkl files.
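To tell the always-present files apart from the adapter-specific ones, an experiment directory can be checked against the list given earlier. This is a minimal sketch with hypothetical helper names (`missing_files`, `extra_files`); whether a failed experiment still contains every listed file is an assumption that this check can help verify.

```python
from pathlib import Path
from typing import List

# Files that the description above lists for every experiment directory.
REQUIRED_FILES = [
    "raw_anomaly_scores.ts",
    "anomaly_scores.ts",
    "execution.log",
    "metrics.csv",
    "hyper_params.json",
]


def missing_files(experiment_dir: Path) -> List[str]:
    """Return the listed files that are absent from the directory."""
    return [name for name in REQUIRED_FILES if not (experiment_dir / name).is_file()]


def extra_files(experiment_dir: Path) -> List[str]:
    """Return optional, adapter-specific files such as model.pkl, sorted by name."""
    return sorted(
        entry.name
        for entry in experiment_dir.iterdir()
        if entry.is_file() and entry.name not in REQUIRED_FILES
    )
```

For a run that used the DockerAdapter with a semi-supervised algorithm, `extra_files` would be expected to report docker-algorithm-scores.csv and model.pkl.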