TimeNet: Standardized Benchmarks for TSLMs

TimeNet is an initiative to build a standardized evaluation benchmark and an initial corpus of training datasets for Time-Series Language Models (TSLMs), inspired by ImageNet's role in computer vision. Currently, only a few time-series datasets with textual annotations for interpretation or reasoning are openly available. TimeNet provides the standards, tools, and initial datasets that empower the community to fill this gap.

pip install timenet


What TimeNet Provides

📐

Standardized Format

A universal schema for TSLM-compatible datasets

⚡

Data Loading SDK

Python infrastructure for loading and filtering data

📊

Initial Benchmark

Curated datasets converted to our format

🔬

Evaluation Suite

HuggingFace-compatible model evaluation

Standardized Data Format

A universal schema that separates samples (raw time series + metadata) from records (grouped samples with annotations). Supports multiple annotation tasks per sample.

✓ Multi-sample records

✓ Multiple annotation types

✓ Domain & task filtering

✓ Reproducible splits

dataset.yaml

samples:
  ecg_001:
    sample_id: ecg_001
    data: [[0.08, 0.21, 0.40, ...]]
    meta_data:
      length: 1000
      sampling_rate: 250.0
      duration: 4
      duration_type: "SECONDS"
      description: "Lead 1 ECG recording"
records:
  record001:
    record_id: record001
    samples: [ecg_001, spo2_001]
    annotations:
      - task: Reasoning
        domain: Health
        text: >
          Regular sinus rhythm without
          ectopic beats. SpO₂ stable at 98%.
      - sample_id: ecg_001
        task: Classification
        domain: Health
        label: "normal"
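To make the sample/record split in dataset.yaml concrete, here is a minimal sketch of how the structure could be mirrored in plain Python once the YAML has been parsed into a dictionary. This is not the TimeNet SDK: the `Annotation` and `Record` classes and the `parse_records` helper are hypothetical names introduced for illustration, and the `spo2_001` values are placeholders because that sample is not shown in the excerpt.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Annotation:
    """One annotation entry; hypothetical class mirroring the YAML keys."""
    task: str
    domain: str
    text: Optional[str] = None       # free-text annotation (e.g. Reasoning)
    label: Optional[str] = None      # categorical annotation (e.g. Classification)
    sample_id: Optional[str] = None  # set when the annotation targets one sample


@dataclass
class Record:
    """A group of sample IDs plus the annotations attached to them."""
    record_id: str
    samples: list[str]
    annotations: list[Annotation] = field(default_factory=list)


def parse_records(dataset: dict[str, Any]) -> list[Record]:
    """Build Record objects and check that every sample reference resolves."""
    known = set(dataset.get("samples", {}))
    records = []
    for record_id, raw in dataset.get("records", {}).items():
        missing = [s for s in raw["samples"] if s not in known]
        if missing:
            raise ValueError(f"{record_id}: unknown samples {missing}")
        annotations = [Annotation(**a) for a in raw.get("annotations", [])]
        for ann in annotations:
            # A per-sample annotation must point at a sample of this record.
            if ann.sample_id is not None and ann.sample_id not in raw["samples"]:
                raise ValueError(f"{record_id}: annotation targets {ann.sample_id}")
        records.append(Record(record_id, list(raw["samples"]), annotations))
    return records


# Dictionary equivalent of dataset.yaml (series shortened, spo2_001 is a placeholder).
dataset = {
    "samples": {
        "ecg_001": {"sample_id": "ecg_001", "data": [[0.08, 0.21, 0.40]]},
        "spo2_001": {"sample_id": "spo2_001", "data": [[98.0, 98.0, 98.0]]},
    },
    "records": {
        "record001": {
            "record_id": "record001",
            "samples": ["ecg_001", "spo2_001"],
            "annotations": [
                {"task": "Reasoning", "domain": "Health",
                 "text": "Regular sinus rhythm without ectopic beats. "
                         "SpO₂ stable at 98%."},
                {"sample_id": "ecg_001", "task": "Classification",
                 "domain": "Health", "label": "normal"},
            ],
        }
    },
}

records = parse_records(dataset)
for record in records:
    print(record.record_id, [a.task for a in record.annotations])
```

Keeping `sample_id` optional reflects the example above, where the Reasoning annotation applies to the whole record while the Classification annotation targets a single sample.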

Supported Tasks

TimeNet supports multiple annotation tasks, enabling diverse research directions from classification to complex temporal reasoning.

Reasoning

Complex temporal pattern analysis and causal inference from time series data

Example output

In the ECG over the last 4 seconds we see a regular sinus rhythm without ectopic beats...

Python SDK

Infrastructure code to load data in the standardized format. Filter by task, domain, and get reproducible train/val/test splits.

🚧 Currently under development

✓ Load datasets in standardized format
✓ Filter by task (classification, reasoning, etc.)
✓ Filter by domain (healthcare, manufacturing, etc.)
✓ Reproducible train/val/test splits
✓ HuggingFace-compatible evaluation
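Since the SDK is still under development, its API is not documented here. The sketch below only illustrates the two ideas in the list, filtering by task and domain and producing reproducible splits, using plain dictionaries. The function names `filter_records` and `assign_split`, the split fractions, and the `record002` entry are assumptions made for this example, not part of the timenet package.

```python
import hashlib


def filter_records(records, task=None, domain=None):
    """Keep records with at least one annotation matching both task and domain."""
    def matches(annotation):
        return ((task is None or annotation["task"] == task)
                and (domain is None or annotation["domain"] == domain))
    return [r for r in records if any(matches(a) for a in r["annotations"])]


def assign_split(record_id, val_frac=0.1, test_frac=0.1):
    """Map a record ID to 'train', 'val', or 'test' deterministically.

    Hashing the ID (instead of shuffling with a random seed) means the
    assignment does not depend on record order or on which other records
    are present, so the split stays stable as the dataset grows.
    """
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it to [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    if bucket < test_frac:
        return "test"
    if bucket < test_frac + val_frac:
        return "val"
    return "train"


# Toy records; record002 is a made-up entry for illustration only.
records = [
    {"record_id": "record001", "annotations": [
        {"task": "Reasoning", "domain": "Health"},
        {"task": "Classification", "domain": "Health"},
    ]},
    {"record_id": "record002", "annotations": [
        {"task": "Classification", "domain": "Manufacturing"},
    ]},
]

health_reasoning = filter_records(records, task="Reasoning", domain="Health")
splits = {r["record_id"]: assign_split(r["record_id"]) for r in records}
print([r["record_id"] for r in health_reasoning], splits)
```

Note that the filter requires a single annotation to satisfy both conditions, so a record with a Health reasoning annotation and a Manufacturing classification annotation would not match `task="Reasoning", domain="Manufacturing"`.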


Initial Benchmark Suite

We provide an initial dataset and benchmark based on existing data, converted to our format.

🚧 Currently in preparation

Dataset	Task	Status
Synthetic (OpenTSLM)	Reasoning, Captioning	Available
TSQA	Multiple Choice Classification	Available
TimeSeriesExam	Multi-task Evaluation	In Progress
Clinical (Stanford)	Healthcare Reasoning	Planned
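For the classification-style entries in the table, one simple way to score a model is exact-match accuracy against the reference labels. The sketch below is an assumption about how such scoring could look, not the TimeNet evaluation suite: `exact_match_accuracy`, the `series`, `question`, and `label` fields, and the toy examples are all hypothetical. The model is passed in as a plain callable so that any backend can be wrapped.

```python
def exact_match_accuracy(examples, predict):
    """Fraction of examples whose predicted label equals the reference label.

    `predict` is any callable taking (series, question) and returning a string.
    Comparison ignores case and surrounding whitespace.
    """
    if not examples:
        raise ValueError("no examples to score")
    correct = 0
    for example in examples:
        prediction = predict(example["series"], example["question"])
        if prediction.strip().lower() == example["label"].strip().lower():
            correct += 1
    return correct / len(examples)


def always_normal(series, question):
    """Trivial baseline that ignores its input; stands in for a real model."""
    return "Normal"


# Made-up evaluation examples for illustration only.
examples = [
    {"series": [0.08, 0.21, 0.40], "question": "Is the rhythm normal?",
     "label": "normal"},
    {"series": [0.90, 0.10, 0.95], "question": "Is the rhythm normal?",
     "label": "abnormal"},
]

accuracy = exact_match_accuracy(examples, always_normal)
print(f"accuracy = {accuracy:.2f}")
```

Free-text tasks such as Reasoning or Captioning need a different metric, since exact string matching is too strict for open-ended answers.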

Open-Source Dataset Repository

A community-curated collection of text-annotated time series datasets across domains.

🚧 Currently sourcing datasets — contributions welcome!


Contribute Your Data

Have time series data with annotations? Help build the largest open repository of text-annotated time series data for research.


Who We Are

We are a team of scientists and engineers from ETH, Stanford, Harvard, Cambridge, TUM, CDTM, and Meta, including the original authors of the OpenTSLM paper.

and more joining...