Test AI models on your actual prompts, not generic benchmarks
Compare quality, speed, and cost across 300+ models — in Cursor, VS Code, or Claude Code.
Example
Mobile Safari checkout failed — expired SSL cert on payment subdomain. Renewed; customer confirmed.
Checkout failures on mobile Safari. Root cause: expired SSL cert on payment subdomain. Resolved; fix confirmed.
Repeated checkout failures on iPhone Safari traced to expired payment subdomain SSL cert. Renewed; customer verified.
Tools
Search, test, and compare
find_models: Search 300+ models by what you need — price, speed, context window, or capabilities like vision and function calling.
get_models: Get current pricing, limits, and capabilities for any model. Updated from OpenRouter every 30 minutes.
test_model: Send your prompt to multiple models. Compare outputs, latency, and cost — measured, not estimated.
Comparison
Evidence over intuition
Intuition:
- Model pricing and specs may be weeks old
- Quality based on generic benchmarks, not your task
- Testing a model means a throwaway script or manual switching

Evidence:
- Models synced from OpenRouter every 30 minutes
- Quality measured on prompts for your specific task
- Test and compare in your editor — no scripts, no switching
Setup
Add to your MCP config
{
  "mcpServers": {
    "index9": {
      "command": "npx",
      "args": ["-y", "@index9/mcp"]
    }
  }
}

Recommended workflow
Add these to your assistant rules to guide model selection:
1. Use find_models to shortlist candidates based on task requirements (cost, speed, context window, capabilities).
2. Use get_models to confirm pricing, limits, and capabilities for the shortlist.
3. Use test_model with dryRun=true to estimate cost, then run live tests with a task-representative prompt. Compare outputs first, then optimize for speed/cost.
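As a sketch, the dry-run step might look like the following tool call. Only the tool name and the dryRun flag come from this page; the other argument names ("models", "prompt") and the model IDs are illustrative assumptions, not the tool's documented schema:

```json
{
  "name": "test_model",
  "arguments": {
    "models": ["openai/gpt-4o-mini", "anthropic/claude-3.5-haiku"],
    "prompt": "Summarize this support ticket in one sentence.",
    "dryRun": true
  }
}
```

With dryRun=true the call only estimates cost, so it is a cheap way to sanity-check a shortlist before spending on live comparisons.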
Live tests with test_model require an OpenRouter API key: add OPENROUTER_API_KEY to the env block of your MCP config. dryRun=true does not require a key. Keys are passed per-request and never stored or logged.
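Putting that together, the server entry from the setup section can carry the key via env. A minimal sketch, with a placeholder key value:

```json
{
  "mcpServers": {
    "index9": {
      "command": "npx",
      "args": ["-y", "@index9/mcp"],
      "env": {
        "OPENROUTER_API_KEY": "sk-or-your-key-here"
      }
    }
  }
}
```

Because the key lives in the MCP config rather than in code, it stays out of prompts and project files, consistent with the per-request handling described above.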
FAQ