TimeNet: Standardized Benchmarks for TSLMs
An initiative to build a standardized evaluation benchmark and an initial corpus of training datasets for Time-Series Language Models (TSLMs), inspired by ImageNet's role in computer vision. Currently, only a few time-series datasets with textual annotations for interpretation or reasoning are openly available. TimeNet provides the standards, tools, and initial datasets that empower the community to fill this gap.
What TimeNet Provides
Standardized Format
A universal schema for TSLM-compatible datasets
Data Loading SDK
Python infrastructure for loading and filtering data
Initial Benchmark
Curated datasets converted to our format
Evaluation Suite
HuggingFace-compatible model evaluation
Standardized Data Format
A universal schema that separates samples (raw time series + metadata) from records (grouped samples with annotations). Supports multiple annotation tasks per sample.
✓ Multi-sample records
✓ Multiple annotation types
✓ Domain & task filtering
✓ Reproducible splits
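To make the sample/record separation concrete, here is a minimal Python sketch that models it with dataclasses and checks that every sample a record refers to is actually defined. The field names follow the YAML example in this section. The class names, the `missing_samples` helper, and the reading of `data` as one inner list per channel are assumptions made for illustration, not part of the TimeNet SDK.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sample:
    sample_id: str
    data: list[list[float]]          # assumption: one inner list per channel
    meta_data: dict = field(default_factory=dict)


@dataclass
class Annotation:
    task: str
    domain: str
    text: Optional[str] = None       # free-text annotation (e.g. Reasoning)
    label: Optional[str] = None      # categorical annotation (e.g. Classification)
    sample_id: Optional[str] = None  # set when the annotation targets one sample


@dataclass
class Record:
    record_id: str
    samples: list[str]               # sample_ids grouped by this record
    annotations: list[Annotation] = field(default_factory=list)


def missing_samples(record, samples):
    """Return sample_ids the record references that are not defined (hypothetical helper)."""
    referenced = list(record.samples)
    referenced += [a.sample_id for a in record.annotations if a.sample_id]
    return sorted({sid for sid in referenced if sid not in samples})


samples = {
    "ecg_001": Sample("ecg_001", [[0.08, 0.21, 0.40]], {"sampling_rate": 250.0}),
}
record = Record(
    record_id="record001",
    samples=["ecg_001", "spo2_001"],
    annotations=[
        Annotation(task="Reasoning", domain="Health", text="Regular sinus rhythm."),
        Annotation(task="Classification", domain="Health", label="normal",
                   sample_id="ecg_001"),
    ],
)

# spo2_001 is referenced by the record but was never defined above.
print(missing_samples(record, samples))
```

Keeping annotations on the record, with an optional `sample_id`, is what lets one record carry both a record-level annotation (the reasoning text) and a sample-level one (the classification label).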
samples:
  ecg_001:
    sample_id: ecg_001
    data: [[0.08, 0.21, 0.40, ...]]
    meta_data:
      length: 1000
      sampling_rate: 250.0
      duration: 4
      duration_type: "SECONDS"
      description: "Lead 1 ECG recording"
records:
  record001:
    record_id: record001
    samples: [ecg_001, spo2_001]
    annotations:
      - task: Reasoning
        domain: Health
        text: >
          Regular sinus rhythm without ectopic beats.
          SpO₂ stable at 98%.
      - sample_id: ecg_001
        task: Classification
        domain: Health
        label: "normal"

Supported Tasks
TimeNet supports multiple annotation tasks, enabling diverse research directions from classification to complex temporal reasoning.
Reasoning
Complex temporal pattern analysis and causal inference from time series data
In the ECG over the last 4 seconds we see a regular sinus rhythm without ectopic beats...
Python SDK
Infrastructure code to load data in the standardized format. Filter by task, domain, and get reproducible train/val/test splits.
🚧 Currently under development
- ✓ Load datasets in standardized format
- ✓ Filter by task (classification, reasoning, etc.)
- ✓ Filter by domain (healthcare, manufacturing, etc.)
- ✓ Reproducible train/val/test splits
- ✓ HuggingFace-compatible evaluation
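Since the SDK is still under development, the following is only a sketch of how task/domain filtering and reproducible splits could work, not the SDK's actual API. The function names `filter_records` and `assign_split`, the dict-based record layout, and the split fractions are all hypothetical. The split is derived from a hash of the `record_id`, so a record lands in the same split on every run and on every machine, regardless of load order.

```python
import hashlib


def filter_records(records, task=None, domain=None):
    """Keep records with at least one annotation matching both task and domain."""
    def matches(ann):
        return (task is None or ann["task"] == task) and (
            domain is None or ann["domain"] == domain
        )
    return [r for r in records if any(matches(a) for a in r["annotations"])]


def assign_split(record_id, seed=0, val_frac=0.1, test_frac=0.1):
    """Map a record_id to 'train', 'val', or 'test' deterministically."""
    digest = hashlib.sha256(f"{seed}:{record_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes of the hash as a number in [0, 1).
    u = int.from_bytes(digest[:8], "big") / 2**64
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


# Toy records for illustration only; not real TimeNet data.
records = [
    {"record_id": "record001",
     "annotations": [{"task": "Reasoning", "domain": "Health"}]},
    {"record_id": "record002",
     "annotations": [{"task": "Classification", "domain": "Manufacturing"}]},
]

health = filter_records(records, domain="Health")
print([r["record_id"] for r in health])

# The same id and seed always yield the same split.
print(assign_split("record001") == assign_split("record001"))
```

Hashing the identifier, rather than shuffling with a seeded random generator, keeps existing assignments stable when new records are added to a dataset.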
Initial Benchmark Suite
We provide an initial dataset and benchmark based on existing data, converted to our format.
🚧 Currently in preparation
| Dataset | Task | Status |
|---|---|---|
| Synthetic (OpenTSLM) | Reasoning, Captioning | Available |
| TSQA | Multiple Choice Classification | Available |
| TimeSeriesExam | Multi-task Evaluation | In Progress |
| Clinical (Stanford) | Healthcare Reasoning | Planned |
Open-Source Dataset Repository
A community-curated collection of text-annotated time series datasets across domains.
🚧 Currently sourcing datasets — contributions welcome!
📚 Major Time Series Repositories
Contribute Your Data
Have time series data with annotations? Help build the largest open repository of text-annotated time series data for research.
Who We Are
We are a team of scientists and engineers from ETH, Stanford, Harvard, Cambridge, TUM, CDTM, and Meta, including the original authors of the OpenTSLM paper.