TimeEval results

When you configure and run TimeEval, it applies each algorithm with its configured hyperparameter values to all datasets. It measures the algorithms’ runtimes and assesses their effectiveness using evaluation measures (metrics). The results are stored in a summary file called results.csv and in a nested folder structure within the results folder (./results/<timestamp> by default). The output directory has the following structure:

results/<timestamp>/
├── results.csv
├── <algorithm_1>/<hyper_params_id>/
│   ├── <collection_name>/<dataset_name_1>/<repetition_number>/
│   │   ├── raw_anomaly_scores.ts
│   │   ├── anomaly_scores.ts
│   │   ├── docker-algorithm-scores.csv
│   │   ├── execution.log
│   │   ├── hyper_params.json
│   │   └── metrics.csv
│   └── <collection_name>/<dataset_name_2>/<repetition_number>/
│       ├── raw_anomaly_scores.ts
│       ├── anomaly_scores.ts
│       ├── docker-algorithm-scores.csv
│       ├── execution.log
│       ├── model.pkl
│       ├── hyper_params.json
│       └── metrics.csv
└── <algorithm_2>/<hyper_params_id>/
    └── <collection_name>/<dataset_name_1>/<repetition_number>/
        ├── raw_anomaly_scores.ts
        ├── anomaly_scores.ts
        ├── docker-algorithm-scores.csv
        ├── execution.log
        ├── hyper_params.json
        └── metrics.csv
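
The layout above can be traversed programmatically. The following is a minimal sketch, not part of TimeEval itself: the `Experiment` tuple and the `iter_experiments` helper are hypothetical names, and the sketch assumes that every experiment directory sits exactly five levels below the results folder, as shown in the tree. The demo builds a throwaway folder with invented names so that it runs without a real evaluation run.

```python
import tempfile
from pathlib import Path
from typing import Iterator, NamedTuple


class Experiment(NamedTuple):
    """Hypothetical record describing one experiment directory."""
    algorithm: str
    hyper_params_id: str
    collection: str
    dataset: str
    repetition: str
    path: Path


def iter_experiments(results_dir: Path) -> Iterator[Experiment]:
    """Yield one entry per directory found five levels below results_dir."""
    for path in sorted(results_dir.glob("*/*/*/*/*")):
        if path.is_dir():
            # The five path components mirror the nesting shown in the tree.
            parts = path.relative_to(results_dir).parts
            yield Experiment(*parts, path=path)


# Demo on a temporary folder that imitates the documented layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "results.csv").touch()
    (root / "algo_a" / "abc123" / "demo" / "ds1" / "1").mkdir(parents=True)
    (root / "algo_a" / "abc123" / "demo" / "ds2" / "1").mkdir(parents=True)
    experiments = list(iter_experiments(root))

for exp in experiments:
    print(exp.algorithm, exp.hyper_params_id, exp.dataset, exp.repetition)
```

For a real run, pass the `./results/<timestamp>` folder to `iter_experiments` instead of the temporary directory.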

We provide a description of each file below.

Summary file (`results.csv`)

For a given dataset, different algorithms with varying hyperparameters yield distinct results. The file results.csv provides an overview of the evaluation run and contains the following attributes:

Column Name	Datatype	Description
algorithm	str	name of the algorithm as defined in `Algorithm`#name attribute
collection	str	name of the dataset collection. A collection contains similar datasets.
dataset	str	name of the dataset
algo_training_type	str	specifies whether the algorithm requires training data with anomalies (supervised), with normal data only (semi-supervised), or does not require training data (unsupervised)
algo_input_dimensionality	str	specifies whether the algorithm supports time series with multiple channels (multivariate) or only a single channel (univariate)
dataset_training_type	str	specifies whether the dataset has a training time series with anomaly labels (supervised), with normal data only (semi-supervised), or no training time series at all (unsupervised)
dataset_input_dimensionality	str	specifies whether the dataset has multiple channels (multivariate) or not (univariate)
train_preprocess_time	float	runtime of the preprocessing step during training in seconds
train_main_time	float	runtime of the training in seconds (does not include pre-processing time)
execute_preprocess_time	float	runtime of the preprocessing step during execution in seconds
execute_main_time	float	runtime of the execution of the algorithm on the test time series in seconds (does not include pre- or post-processing times)
execute_postprocess_time	float	runtime of the post-processing step during execution in seconds
status	str	specifies whether the algorithm executed successfully (`OK`), exceeded the time limit (`TIMEOUT`), exceeded the memory limit (`OOM`), or failed (`ERROR`)
error_message	str	optional detailed error message
repetition	int	repetition number if an algorithm-hyperparameter-dataset combination was executed multiple times
hyper_params	str	actual hyperparameter values for this execution
hyper_params_id	str	alphanumerical hash of the hyperparameter configuration
metric_1	float	value of the first performance metric
…	…	…
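
As an illustration of how the summary file can be consumed, the sketch below averages one metric per algorithm over all successful experiments, using only the Python standard library. It rests on several assumptions: `mean_metric_per_algorithm` is a hypothetical helper, `ROC_AUC` is a placeholder metric column name (the real column names depend on the metrics configured for the run), the `status` column is assumed to hold the literal strings listed in the table, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_metric_per_algorithm(csv_file, metric: str) -> dict[str, float]:
    """Average `metric` per algorithm, skipping experiments that did not succeed."""
    scores = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Assumption: successful experiments are marked with the literal "OK".
        if row["status"] != "OK":
            continue
        scores[row["algorithm"]].append(float(row[metric]))
    return {algo: mean(values) for algo, values in scores.items()}


# Invented sample with a subset of the documented columns.
SAMPLE = """algorithm,collection,dataset,status,repetition,ROC_AUC
algo_a,demo,ds1,OK,1,0.80
algo_a,demo,ds2,OK,1,0.60
algo_b,demo,ds1,TIMEOUT,1,
algo_b,demo,ds2,OK,1,0.90
"""

summary = mean_metric_per_algorithm(io.StringIO(SAMPLE), "ROC_AUC")
for algorithm, value in summary.items():
    print(f"{algorithm}: {value:.2f}")
```

To use it on a real run, open `results.csv` with `open(path, newline="")` and pass the file object instead of the in-memory sample.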

Directory (`<algorithm_1>/<hyper_params_id>/<collection_name>/<dataset_name_1>/<repetition_number>/`)

For every experiment in the configured evaluation run, TimeEval creates a new directory in the result folder. It stores all the results and temporary files for this single combination of dataset, algorithm, algorithm hyperparameter values, and repetition number. The nested folders are named, in order, after the algorithm, the ID of the hyperparameter setting, the dataset collection, the dataset, and the repetition number. Each experiment directory contains at least the following files:

raw_anomaly_scores.ts: The raw anomaly scores produced by the algorithm after the post-processing function was executed. The file contains no header and a single floating point value in each row for each time step of the input time series. The value range depends on the algorithm.
anomaly_scores.ts: Normalized anomaly scores. The value range is from 0 (normal) to 1 (most anomalous).
execution.log: Unstructured log file of the experiment execution. Contains debugging information from the Adapter, the algorithm, and the metric calculation. If an algorithm fails, its error message usually appears in this log.
metrics.csv: Lists the metric and runtime measurements for the corresponding experiment. The metrics used are defined by the user. See the API documentation of the timeeval.metrics package for more information.
hyper_params.json: Contains a JSON object with the hyperparameter values used to execute the algorithm on the dataset. If hyperparameter heuristics were defined, the heuristics’ values are already resolved.
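
The files listed above can be read with a few lines of standard-library Python. This is a minimal sketch under stated assumptions: `load_scores` and `load_experiment` are hypothetical helpers, anomaly_scores.ts is assumed to use the same headerless one-value-per-row layout described for raw_anomaly_scores.ts, and the demo files (including the `window_size` parameter) are invented.

```python
import json
import tempfile
from pathlib import Path


def load_scores(path: Path) -> list[float]:
    """Read a headerless score file with one floating point value per row."""
    with path.open() as f:
        return [float(line) for line in f if line.strip()]


def load_experiment(exp_dir: Path) -> tuple[list[float], dict]:
    """Return the normalized scores and the resolved hyperparameter values."""
    scores = load_scores(exp_dir / "anomaly_scores.ts")
    hyper_params = json.loads((exp_dir / "hyper_params.json").read_text())
    return scores, hyper_params


# Demo with invented files in a temporary experiment directory.
with tempfile.TemporaryDirectory() as tmp:
    exp_dir = Path(tmp)
    (exp_dir / "anomaly_scores.ts").write_text("0.0\n0.5\n1.0\n")
    (exp_dir / "hyper_params.json").write_text('{"window_size": 20}')
    scores, hyper_params = load_experiment(exp_dir)

# Normalized scores are documented to lie between 0 (normal) and 1 (most anomalous).
in_range = all(0.0 <= s <= 1.0 for s in scores)
print(len(scores), hyper_params, in_range)
```

The same `load_scores` helper should work for raw_anomaly_scores.ts, although the range check does not apply there because the raw value range depends on the algorithm.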

All other files are optional and depend on the algorithm Adapter used. For example, the DockerAdapter usually produces a temporary file called docker-algorithm-scores.csv to pass the algorithm result from the Docker container to TimeEval, and (semi-)supervised algorithms store their trained model in model.pkl files.
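
Because the folder nesting mirrors columns of the summary file, a row of results.csv can be mapped to its experiment directory, for example to find the execution.log of a failed experiment. The sketch below is hypothetical: `experiment_dir` and `failed_logs` are illustrative helpers, the rows and the `results/run` folder are invented, and it assumes that the folder names equal the column values verbatim, which may not hold if TimeEval sanitizes names.

```python
from pathlib import Path


def experiment_dir(results_dir: Path, row: dict) -> Path:
    """Build the experiment directory for one summary row (assumed 1:1 naming)."""
    return (
        results_dir
        / row["algorithm"]
        / row["hyper_params_id"]
        / row["collection"]
        / row["dataset"]
        / str(row["repetition"])
    )


def failed_logs(results_dir: Path, rows: list[dict]) -> list[Path]:
    """Return the execution.log paths of all experiments whose status is not OK."""
    return [
        experiment_dir(results_dir, row) / "execution.log"
        for row in rows
        if row["status"] != "OK"
    ]


# Invented rows, shaped like those produced by csv.DictReader on results.csv.
rows = [
    {"algorithm": "algo_a", "hyper_params_id": "abc123", "collection": "demo",
     "dataset": "ds1", "repetition": "1", "status": "OK"},
    {"algorithm": "algo_b", "hyper_params_id": "def456", "collection": "demo",
     "dataset": "ds1", "repetition": "1", "status": "ERROR"},
]

logs = failed_logs(Path("results") / "run", rows)
for log in logs:
    print(log)
```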
