How SubChoice Rates AI Plans
The exact scoring methodology behind every SubChoice rating: 5 criteria, 8 dimensions, and calibrated anchors.
Scoring Criteria
Each dimension score is derived from five weighted criteria:
| Criterion | Weight | What It Measures |
|---|---|---|
| Feature coverage | 30% | How many relevant features for this use case the plan includes |
| Model quality | 25% | Quality/capability of the AI models available in the plan |
| Usage limits | 20% | How generous the plan's usage allowances are for this use case |
| Value | 15% | Price relative to what you get for this specific use case |
| Bundled tools | 10% | Relevant bundled tools/integrations included in the plan |
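The weighted combination above can be sketched in a few lines. This is a minimal illustration, assuming each criterion is itself rated 1–10 before weighting; the dictionary keys, the `dimension_score` helper, and the rounding/clamping step are illustrative assumptions, since only the weights come from the table.

```python
# Criterion weights from the table above (they sum to 1.0).
WEIGHTS = {
    "feature_coverage": 0.30,
    "model_quality": 0.25,
    "usage_limits": 0.20,
    "value": 0.15,
    "bundled_tools": 0.10,
}

def dimension_score(ratings: dict[str, float]) -> int:
    """Combine per-criterion ratings (1-10) into a single 1-10 dimension score."""
    raw = sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)
    # Round to the published integer scale and clamp to the 1-10 range.
    return max(1, min(10, round(raw)))

# Example: strong features and models, middling limits, value, and tooling.
dimension_score({
    "feature_coverage": 9,
    "model_quality": 8,
    "usage_limits": 6,
    "value": 5,
    "bundled_tools": 5,
})  # -> 7
```

Because the weights sum to 1.0, a plan rated 10 on every criterion scores 10, and one rated 1 on every criterion scores 1.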
Dimension Scale Definitions
Plans are scored 1–10 per dimension. Each tier has concrete, observable criteria — not subjective impressions.
| Score | Label | Meaning |
|---|---|---|
| 9–10 | Excellent | Purpose-built for this use case |
| 7–8 | Strong | Very capable, minor gaps |
| 5–6 | Adequate | Usable but not optimized |
| 3–4 | Limited | Can technically do it, significant limitations |
| 1–2 | Not designed | Not intended for this use case |
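The scale table maps directly to a lookup. A minimal sketch, with the function name `tier_label` as an illustrative assumption:

```python
def tier_label(score: int) -> str:
    """Map a 1-10 dimension score to its published tier label."""
    if not 1 <= score <= 10:
        raise ValueError("scores are integers from 1 to 10")
    if score >= 9:
        return "Excellent"
    if score >= 7:
        return "Strong"
    if score >= 5:
        return "Adequate"
    if score >= 3:
        return "Limited"
    return "Not designed"

tier_label(8)  # -> "Strong"
```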
Coding
Software development, debugging, code review, and code generation — evaluated on how well the plan supports a developer's day-to-day workflow.
| Score | Tier | Criteria |
|---|---|---|
| 9–10 | Excellent | Dedicated AI IDE or deep IDE extension with inline completions, multi-file context editing, code agents with autonomous execution, terminal integration, and support for all major languages and frameworks |
| 7–8 | Strong | Excellent code generation via chat, large context window enabling multi-file review, some agentic coding capability or IDE integration, supports all major languages |
| 5–6 | Adequate | Can write and explain code in chat but lacks IDE integration, limited context window constrains multi-file work, no autonomous code execution |
| 3–4 | Limited | Basic code output via general-purpose chat, no specialized coding tools, no IDE integration, struggles with complex multi-file codebases |
| 1–2 | Not designed | Writing, SEO, or creative tool with no dedicated code features — incidental code output only |
Writing
Blog posts, marketing copy, long-form articles, creative writing, and content optimization — evaluated on language model quality for text generation and any writing-specific tooling.
| Score | Tier | Criteria |
|---|---|---|
| 9–10 | Excellent | Purpose-built writing platform with brand voice customization, SEO integration, template library (50+), bulk content generation, plagiarism detection, tone analysis, and top-tier language models |
| 7–8 | Strong | Top-tier language model with exceptional prose quality, long-form document support, revision workflow, voice consistency; possibly missing specialized writing features but output quality is excellent |
| 5–6 | Adequate | Good language model capable of writing assistance, but limited document context, no brand voice, no writing-specific templates or workflows |
| 3–4 | Limited | Can produce written content but not optimized for it — designed for another purpose (coding, image gen), limited text context, basic prose |
| 1–2 | Not designed | Coding IDE or specialized tool where writing is a side effect, not a feature |
Research
Deep research, information synthesis, multi-source analysis, and knowledge retrieval — evaluated on web search quality, context capacity for document analysis, and ability to synthesize complex information.
| Score | Tier | Criteria |
|---|---|---|
| 9–10 | Excellent | Purpose-built research platform with multi-source web search, automatic citation, academic database access, multi-document synthesis in a single session, and structured report generation |
| 7–8 | Strong | High-quality web search with citation, large context window enabling full-document analysis, strong synthesis capability, can handle multi-step research questions accurately |
| 5–6 | Adequate | Has web search but limited depth, moderate context window, can answer factual questions but struggles with complex multi-source synthesis or long document analysis |
| 3–4 | Limited | Primarily uses training data, limited or no web search, cannot analyze uploaded documents thoroughly, struggles with questions requiring current information |
| 1–2 | Not designed | Coding IDE or creative tool where research is not a designed capability; provides no web search or document analysis |
Creative
Image generation, video creation, graphic design, and visual creative projects — evaluated on native image/video generation capability, quality of creative output, and breadth of visual tools.
| Score | Tier | Criteria |
|---|---|---|
| 9–10 | Excellent | Purpose-built creative platform with native high-quality image or video generation, multiple style options, editing/inpainting tools, commercial license, high output volume |
| 7–8 | Strong | Native image generation with good quality and style diversity, sufficient for most creative tasks; may lack video or advanced editing features |
| 5–6 | Adequate | Has image generation but limited quality, style range, or generation quota; not the primary use case of the platform |
| 3–4 | Limited | Basic image generation as a side feature, very limited quota, lower quality relative to dedicated creative tools |
| 1–2 | Not designed | No native image or video generation; tool is built for text, code, or research — creative output is not a designed capability |
Business
Project management, business documentation, team productivity, meeting notes, and organizational workflows — evaluated on team collaboration features, document management, workflow automation, and integrations with business tools.
| Score | Tier | Criteria |
|---|---|---|
| 9–10 | Excellent | Purpose-built business platform with project management, databases, shared knowledge base, workflow automation, SSO/SCIM, admin controls, deep Slack/Jira/Google Workspace integration |
| 7–8 | Strong | Strong document creation and summarization, good business writing, some workflow automation; team features present but not the primary focus; works well in business settings |
| 5–6 | Adequate | Useful for business writing and document review, but minimal native collaboration, no project management, limited integrations |
| 3–4 | Limited | Can generate business documents in chat but lacks any native business tooling — no PM features, no integrations, no team collaboration |
| 1–2 | Not designed | Coding IDE or creative tool — business productivity is incidental, no collaboration or document management features |
Learning
Education, tutoring, skill development, and structured knowledge acquisition — evaluated on ability to explain concepts at varying levels, generate quizzes/exercises, provide Socratic dialogue, and support a learner's comprehension arc.
| Score | Tier | Criteria |
|---|---|---|
| 9–10 | Excellent | Purpose-built tutoring platform with structured curriculum, adaptive difficulty, spaced repetition, quiz generation, progress tracking, and expert tutors for specific domains |
| 7–8 | Strong | Excellent at explaining complex concepts at any level, generates practice problems and quizzes on demand, engages in Socratic dialogue, large context for extended learning sessions |
| 5–6 | Adequate | Can explain concepts and answer follow-up questions, but limited session context, no structured curriculum, does not adapt to learner level proactively |
| 3–4 | Limited | Can answer factual questions about a topic but not optimized for teaching — no quiz generation, no adaptive explanation depth, no structured pedagogy |
| 1–2 | Not designed | Coding IDE or narrow-domain tool with no teaching capability; explanations are incidental to primary function |
General
Daily assistant tasks — Q&A, casual chat, task management, scheduling assistance, general productivity, and anything that doesn't fit a specialized category — evaluated on versatility, response quality, and breadth of handled task types.
| Score | Tier | Criteria |
|---|---|---|
| 9–10 | Excellent | Highly versatile assistant that handles any daily task well; fast, accurate, multi-modal, persistent memory, proactive suggestions, handles ambiguous or casual requests gracefully |
| 7–8 | Strong | Very capable general assistant; handles most daily tasks reliably; good conversation quality; may lack memory or be slightly slower; minimal refusals on non-sensitive topics |
| 5–6 | Adequate | Useful for general questions and casual chat but has noticeable gaps — limited memory, topic restrictions, slower, or less accurate on off-the-wall requests |
| 3–4 | Limited | Can answer basic questions but designed for a specific context; feels awkward for casual or unrelated daily tasks; limited instruction-following for varied requests |
| 1–2 | Not designed | Purpose-built tool where general chat is actively off-scope (coding IDE, SEO tool) — general queries are tolerated but not supported |
Automation
Workflow automation, AI agents, multi-step automated tasks, pipelines, and autonomous task execution — evaluated on native agent capabilities, API/integration access, multi-step reasoning reliability, and ability to complete tasks with minimal human supervision.
| Score | Tier | Criteria |
|---|---|---|
| 9–10 | Excellent | Purpose-built automation platform or agent framework with visual workflow builder, 100+ integrations, reliable multi-step autonomous execution, error handling, and scheduling |
| 7–8 | Strong | Native agent mode with multi-step task execution, tool use (web browsing, code execution, file management), API access, and reliable task completion on complex workflows |
| 5–6 | Adequate | Some agentic capability but limited reliability on complex multi-step tasks, limited integrations, requires more human oversight than a dedicated automation tool |
| 3–4 | Limited | Basic task chaining in chat, no true autonomous execution, API available but not automation-native; can describe workflows but cannot reliably execute them |
| 1–2 | Not designed | No agent or automation features; tool is designed for synchronous interactive use only |
Calibration Anchor Table
All scores are calibrated against these five anchor vendors (Pro tier). Non-anchor plans derive scores relative to their anchor using documented tier delta rules. Source: Board Round 4 (2026-03-23) — 7-member consensus.
| Plan | Coding | Writing | Research | Creative | Business | Learning | General | Automation |
|---|---|---|---|---|---|---|---|---|
| ChatGPT Plus | 7 | 8 | 8 | 7 | 7 | 8 | 9 | 4 |
| Claude Pro | 8 | 9 | 8 | 5 | 7 | 8 | 8 | 5 |
| Gemini Pro | 6 | 7 | 7 | 6 | 6 | 7 | 8 | 3 |
| Cursor Pro | 9 | 2 | 2 | 1 | 2 | 3 | 3 | 5 |
| Windsurf Pro | 8 | 2 | 2 | 1 | 2 | 3 | 3 | 4 |
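The anchor table above can be represented as data, with non-anchor scores derived from it. Note the delta rule shown here is a hypothetical flat-shift illustration; the actual tier delta rules are documented in the full specification, not on this page.

```python
DIMENSIONS = ["coding", "writing", "research", "creative",
              "business", "learning", "general", "automation"]

# Anchor scores from the calibration table (Board Round 4).
ANCHORS = {
    "ChatGPT Plus": [7, 8, 8, 7, 7, 8, 9, 4],
    "Claude Pro":   [8, 9, 8, 5, 7, 8, 8, 5],
    "Gemini Pro":   [6, 7, 7, 6, 6, 7, 8, 3],
    "Cursor Pro":   [9, 2, 2, 1, 2, 3, 3, 5],
    "Windsurf Pro": [8, 2, 2, 1, 2, 3, 3, 4],
}

def apply_delta(anchor: str, delta: int) -> dict[str, int]:
    """Shift every dimension of an anchor by a flat delta, clamped to 1-10.

    Purely illustrative: the real per-tier delta rules are more granular.
    """
    return {dim: max(1, min(10, s + delta))
            for dim, s in zip(DIMENSIONS, ANCHORS[anchor])}
```

For example, a hypothetical tier one step below Claude Pro would score 7 on coding under this flat rule, while clamping keeps every derived score inside the 1–10 scale.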
See these scores in action
The AI Stack Optimizer uses this scoring methodology to recommend the best AI tool combination for your workflow — with real savings calculations.
Try the Stack Optimizer →
Frequently Asked Questions
How does SubChoice score AI tools?
SubChoice rates each AI tool plan across 8 use-case dimensions on a 1–10 integer scale. Each score is derived from 5 weighted criteria: Feature coverage (30%), Model quality (25%), Usage limits (20%), Value (15%), and Bundled tools (10%). Scores are calibrated against anchor vendors to ensure consistency.
What does a score of 9 or 10 mean?
A score of 9–10 means best-in-class for that use case. The plan offers comprehensive, specialized features with minimal limitations. For example, a coding score of 10 indicates a dedicated code-first platform with advanced IDE integration, agent capabilities, and generous usage limits.
How often are scores updated?
Scores are reviewed whenever a vendor updates their pricing, features, or model lineup. Each vendor file includes a last_verified date showing when the data was last confirmed against the vendor's live pricing page.
Who decides the scores?
Scores are determined by a documented algorithm based on observable criteria, not subjective opinion. The algorithm was reviewed and approved by SubChoice's advisory board (CTO, CDO, and Skeptic roles). Every score has a written rationale in the scores-rationale document.
Can I see the full scoring algorithm?
Yes. The complete scoring algorithm, including per-dimension field mappings, tier definitions, and delta rules for pricing tiers, is published in our documentation. This page summarizes the key elements; the full technical specification is available in our open-source repository.
Ready to decide? Compare AI plans side by side.