🚧 This website is currently under construction
Open for contributions

TimeNet: Standardized Benchmarks for TSLMs

An initiative to build a standardized evaluation benchmark and an initial corpus of training datasets for Time-Series Language Models (TSLMs), inspired by ImageNet's role in computer vision. Currently, only a few time-series datasets with textual annotations for interpretation or reasoning are openly available. TimeNet provides the standards, tools, and initial datasets that empower the community to fill this gap.

What TimeNet Provides

📐

Standardized Format

A universal schema for TSLM-compatible datasets

Data Loading SDK

Python infrastructure for loading and filtering data

📊

Initial Benchmark

Curated datasets converted to our format

🔬

Evaluation Suite

HuggingFace-compatible model evaluation

Standardized Data Format

A universal schema that separates samples (raw time series + metadata) from records (grouped samples with annotations). Supports multiple annotation tasks per sample.

✓ Multi-sample records

✓ Multiple annotation types

✓ Domain & task filtering

✓ Reproducible splits

dataset.yaml
samples:
  ecg_001:
    sample_id: ecg_001
    data: [[0.08, 0.21, 0.40, ...]]
    meta_data:
      length: 1000
      sampling_rate: 250.0
      duration: 4
      duration_type: "SECONDS"
      description: "Lead 1 ECG recording"
records:
  record001:
    record_id: record001
    samples: [ecg_001, spo2_001]

    annotations:
      - task: Reasoning
        domain: Health
        text: >
          Regular sinus rhythm without
          ectopic beats. SpO₂ stable at 98%.

      - sample_id: ecg_001
        task: Classification
        domain: Health
        label: "normal"
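
To make the sample/record separation concrete, here is a minimal Python sketch of how a record might be resolved against the sample table. It is an illustration only: the dictionary mirrors the dataset.yaml example (with the elided data values shortened), `resolve_record` is a hypothetical helper rather than part of any released TimeNet SDK, and treating an annotation without a `sample_id` as applying to the whole record is an assumption. Because `spo2_001` is referenced by the record but not defined in the excerpt, the helper reports it as missing instead of failing.

```python
from typing import Any

# In-memory mirror of the dataset.yaml excerpt. The elided data values ("...")
# are simply left out here; nothing about them is assumed.
dataset: dict[str, Any] = {
    "samples": {
        "ecg_001": {
            "sample_id": "ecg_001",
            "data": [[0.08, 0.21, 0.40]],
            "meta_data": {
                "length": 1000,
                "sampling_rate": 250.0,
                "duration": 4,
                "duration_type": "SECONDS",
                "description": "Lead 1 ECG recording",
            },
        },
    },
    "records": {
        "record001": {
            "record_id": "record001",
            "samples": ["ecg_001", "spo2_001"],
            "annotations": [
                {"task": "Reasoning", "domain": "Health",
                 "text": "Regular sinus rhythm without ectopic beats. "
                         "SpO₂ stable at 98%."},
                {"sample_id": "ecg_001", "task": "Classification",
                 "domain": "Health", "label": "normal"},
            ],
        },
    },
}


def resolve_record(dataset: dict[str, Any], record_id: str):
    """Hypothetical helper: look up a record's samples and group its annotations.

    Returns (found, missing, by_scope), where by_scope maps a sample ID, or
    "<record>" for annotations without a sample_id (assumed record-level),
    to the list of annotation tasks attached to it.
    """
    record = dataset["records"][record_id]
    samples = dataset["samples"]
    found = {sid: samples[sid] for sid in record["samples"] if sid in samples}
    missing = [sid for sid in record["samples"] if sid not in samples]
    by_scope: dict[str, list[str]] = {}
    for ann in record["annotations"]:
        scope = ann.get("sample_id", "<record>")
        by_scope.setdefault(scope, []).append(ann["task"])
    return found, missing, by_scope


found, missing, by_scope = resolve_record(dataset, "record001")
print(sorted(found))  # ['ecg_001']
print(missing)        # ['spo2_001']
print(by_scope)       # {'<record>': ['Reasoning'], 'ecg_001': ['Classification']}
```

Keeping samples in their own table means the same time series can be referenced by several records and annotated for several tasks without duplicating the raw data.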

Supported Tasks

TimeNet supports multiple annotation tasks, enabling diverse research directions from classification to complex temporal reasoning.

Reasoning

Complex temporal pattern analysis and causal inference from time series data

Example output

In the ECG over the last 4 seconds we see a regular sinus rhythm without ectopic beats...

Python SDK

Infrastructure code to load data in the standardized format, filter by task and domain, and get reproducible train/val/test splits.

🚧 Currently under development

  • Load datasets in standardized format
  • Filter by task (classification, reasoning, etc.)
  • Filter by domain (healthcare, manufacturing, etc.)
  • Reproducible train/val/test splits
  • HuggingFace-compatible evaluation
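
Since the SDK is still under development, the following is only a sketch of how task/domain filtering and reproducible splits could work, not the SDK's actual API. The function names `filter_annotations` and `assign_split`, the `seed` parameter, and the default split fractions are all assumptions made for illustration. The split is made reproducible by hashing each record ID together with a seed, so a record always lands in the same split regardless of load order or dataset size.

```python
import hashlib


def filter_annotations(records, task=None, domain=None):
    """Yield (record_id, annotation) pairs matching the given task and/or domain.

    Hypothetical helper: `records` follows the layout of the dataset.yaml example.
    """
    for record_id, record in records.items():
        for ann in record["annotations"]:
            if task is not None and ann["task"] != task:
                continue
            if domain is not None and ann["domain"] != domain:
                continue
            yield record_id, ann


def assign_split(record_id, seed=0, val_frac=0.1, test_frac=0.1):
    """Deterministically map a record ID to 'train', 'val', or 'test'.

    The fractions are assumed defaults, not values prescribed by TimeNet.
    """
    digest = hashlib.sha256(f"{seed}:{record_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it into [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


# Toy records for illustration only.
records = {
    "record001": {"annotations": [
        {"task": "Reasoning", "domain": "Health"},
        {"task": "Classification", "domain": "Health"},
    ]},
    "record002": {"annotations": [
        {"task": "Classification", "domain": "Manufacturing"},
    ]},
}

matches = list(filter_annotations(records, task="Classification", domain="Health"))
print([record_id for record_id, _ in matches])  # ['record001']

# The same ID and seed always produce the same split.
print(assign_split("record001") == assign_split("record001"))  # True
```

Hashing by record rather than by sample keeps all samples of one record in the same split, which avoids leakage between train and test when a record groups several related signals.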

Initial Benchmark Suite

We provide an initial dataset and benchmark based on existing data, converted to our format.

🚧 Currently in preparation

Dataset               | Task                           | Status
Synthetic (OpenTSLM)  | Reasoning, Captioning          | Available
TSQA                  | Multiple Choice Classification | Available
TimeSeriesExam        | Multi-task Evaluation          | In Progress
Clinical (Stanford)   | Healthcare Reasoning           | Planned

Open-Source Dataset Repository

A community-curated collection of text-annotated time series datasets across domains. Click any domain to explore available datasets.

🚧 Currently sourcing datasets — contributions welcome!

Contribute Your Data

Have time series data with annotations? Help build the largest open repository of text-annotated time series data for research.

Who We Are

We are a team of scientists and engineers from ETH, Stanford, Harvard, Cambridge, TUM, CDTM, and Meta, including the original authors of the OpenTSLM paper.

Robert Jakob, Kevin O'Sullivan, Markus Kreft, Patrick Langer, Thomas Kaar, Max Rosenblattl, Juncheng Liu, Arvind Pillai, Ning Wang, Paul Schmiedmayer, Maxwell Xu, and more joining...