CompareHub Documentation
CompareHub helps you pick the right LLM for your task by running blind, reproducible comparisons with multiple judges. Learn how to get started and make the most of the platform.
- Getting Started - Learn the basics and run your first comparison
- How Scoring Works - Understand judge ensembles and agreement metrics
- Privacy & Keys - Use your own API keys and control data storage
- API Reference - Integrate CompareHub into your workflow
What is CompareHub?
CompareHub is a platform for running blind, reproducible LLM comparisons. Instead of relying on vibes or brand names, you run the same task across multiple models and let an ensemble of judges evaluate the outputs based on clear criteria.
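To make the ensemble-judging idea concrete, here is a minimal sketch of how several judge scores for one anonymised output can be combined into a headline score plus a simple agreement measure. The judge names and scores below are made up for illustration; they are not CompareHub's actual API or scoring formula.

```python
from statistics import mean, stdev

# Hypothetical judge scores (1-10) for one anonymised model output.
# In CompareHub the judges are themselves models; these values are illustrative.
judge_scores = {"judge_a": 8, "judge_b": 7, "judge_c": 9}

scores = list(judge_scores.values())
ensemble_score = mean(scores)       # headline score for this output
agreement_spread = stdev(scores)    # lower spread = judges agree more

print(f"ensemble score: {ensemble_score:.1f}")
print(f"judge spread (std dev): {agreement_spread:.2f}")
```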
Key Features
- Blind mode - Model names hidden until you reveal them to prevent brand bias
- Multiple judges - Ensemble scoring with visible agreement metrics and rationales
- Reproducible - Every run pins dataset, prompt, and model versions in a shareable permalink
- Cost & speed tracking - See real-time cost (€) and latency (ms) for every model
- Export & embed - Download CSV/JSON or embed reports in your docs (see the sample export record after this list)
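As an illustration of what one exported row might contain, the sketch below shows the kind of fields a JSON export could carry. The field names and values are assumptions for illustration only, not CompareHub's documented export schema.

```python
# Hypothetical shape of one row in a JSON export; field names are
# illustrative and may not match CompareHub's real schema.
export_row = {
    "run_permalink": "https://comparehub.example/r/abc123",  # placeholder URL
    "model": "model-A",              # blind label until you reveal names
    "ensemble_score": 8.0,           # mean of judge scores
    "judge_agreement": 0.82,         # example agreement metric (0-1)
    "cost_eur": 0.0042,              # cost in euros
    "latency_ms": 1870,              # end-to-end latency in milliseconds
}
```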
How It Works
1. Compose your prompt - Pick a task template or paste your own prompt with variables
2. Select models - Choose multiple models to compare (names hidden by default)
3. Run & judge - Models respond, judges evaluate, you get scores + permalink
4. Share & export - Share the report, export data, or re-run under the same conditions (a scripted version of this flow is sketched below)
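If you prefer to drive this flow from code, the API Reference covers the real endpoints. The sketch below only illustrates what such an integration could look like; the base URL, routes, payload fields, and the COMPAREHUB_API_KEY variable are assumptions, not the documented API.

```python
import os
import time
import requests

# Hypothetical endpoints and payloads; consult the API Reference for the
# real routes, parameters, and response shapes.
BASE_URL = "https://api.comparehub.example"  # placeholder base URL
HEADERS = {"Authorization": f"Bearer {os.environ['COMPAREHUB_API_KEY']}"}

# 1. Compose the prompt and select models (names stay hidden in blind mode).
run = requests.post(
    f"{BASE_URL}/v1/comparisons",
    headers=HEADERS,
    json={
        "prompt": "Summarise the ticket in two sentences: {{ticket}}",
        "variables": {"ticket": "Customer reports a login loop on mobile."},
        "models": ["model-A", "model-B", "model-C"],
        "blind": True,
    },
    timeout=30,
).json()

# 2. Poll until the models have responded and the judges have scored them.
while True:
    report = requests.get(
        f"{BASE_URL}/v1/comparisons/{run['id']}", headers=HEADERS, timeout=30
    ).json()
    if report["status"] == "complete":
        break
    time.sleep(2)

# 3. Read scores and share the reproducible permalink.
for row in report["results"]:
    print(row["model"], row["ensemble_score"], row["cost_eur"], row["latency_ms"])
print("permalink:", report["permalink"])
```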
New to CompareHub? Start with the Getting Started guide to run your first comparison in under 5 minutes.