WealthAI Tokenomics

Right model. Right task.
Across the whole lifecycle.

A tiered architecture with Claude Opus for reasoning and Gemini Flash for volume delivers enterprise-grade quality at a fraction of the cost — from code generation to nightly agentic workflows.

Build: AI-Assisted Dev
Run: In-App Intelligence
Run: Multimodal & Latency
Extend: Agentic Orchestration

Complex tasks: Claude Opus 4.8 · Frontier reasoning · $5 / $25 per 1M tokens
Standard tasks: Gemini 3.5 Flash · Fast, native multimodal · $1.50 / $9 per 1M tokens
Simple lookups: Gemini 3.1 Flash-Lite · High-volume routing · $0.25 / $1.50 per 1M tokens
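To make the tier pricing concrete, here is a minimal Python sketch of a per-call cost lookup. It assumes each price pair reads as input / output (the page confirms this for Opus and Flash; for Flash-Lite it is assumed by analogy). The task-class keys, the `Tier` type, and the `call_cost` function are hypothetical names introduced for illustration, and the token counts in the demo loop are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    model: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Prices copied from the tier cards; the dictionary keys are illustrative labels.
# ASSUMPTION: each pair is (input, output), including for Flash-Lite.
TIERS = {
    "complex": Tier("Claude Opus 4.8", 5.00, 25.00),
    "standard": Tier("Gemini 3.5 Flash", 1.50, 9.00),
    "simple": Tier("Gemini 3.1 Flash-Lite", 0.25, 1.50),
}


def call_cost(task_class: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call routed to the tier for task_class."""
    tier = TIERS[task_class]
    return (
        input_tokens * tier.input_per_m + output_tokens * tier.output_per_m
    ) / 1_000_000


if __name__ == "__main__":
    # Hypothetical call size, chosen only to show the relative cost per tier.
    for task_class, tier in TIERS.items():
        cost = call_cost(task_class, input_tokens=1_000, output_tokens=1_000)
        print(f"{task_class:<9} {tier.model:<22} ${cost:.5f}")
```

Keeping the price table in one place means a price change or a model swap touches a single dictionary rather than every call site.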

Scenario 1 · Build Phase

AI-Assisted Software Development

Your engineers don't need frontier intelligence for every task.

75% of coding tasks are routine: tests, scaffolds, docs, CRUD. A tiered strategy routes them to Flash at roughly ⅓ the per-token price, reserving Opus for architecture and hard debugging.

Routing Strategy

Team Configuration

Team Size: 12 devs (range 5 to 50)
Sprint Count: 13 sprints (range 1 to 26)
Tasks per dev per sprint: 50
Total tasks over period: 7,800

Tiered Cost

$2,632.50

over 13 sprints · 12 developers

You save

$1,755.00

40% less

All-Frontier / Sprint

$337.50

100% Claude Opus 4.8

Tiered / Sprint

$202.50

Flash + Opus mix

Task Board — 20 tasks

Design RAG architecture for fund-prospectus search → Opus
Write compliance rules engine for SEC 17a-4 → Opus
Debug portfolio dashboard state race condition → Opus
Design multi-account aggregation schema → Opus
Plan agent orchestration graph for rebalancing → Opus
Generate unit tests for 40 API endpoints → Flash
Scaffold 15 React components from Figma spec → Flash
Write CRUD for client profiles → Flash
Generate OpenAPI docs from route handlers → Flash
Write CSV statement import transforms → Flash
Add input validation to all form fields → Flash
Write seed/fixture data for 12 account types → Flash
Create Storybook stories for UI components → Flash
Write SQL migration for positions table → Flash
Add error boundary wrappers to all routes → Flash
Generate i18n string keys for settings pages → Flash
Write Dockerfile and docker-compose for local dev → Flash
Add TypeScript strict-mode fixes to utils/ → Flash
Write E2E test flows for onboarding wizard → Flash
Convert REST endpoints to tRPC procedures → Flash

15 Flash · 5 Opus

Task Mix

Routine (15) → Flash
Complex (5) → Opus

Cost Comparison (chart): All-Frontier vs. Tiered

Model Pricing (per 1M tokens)

Claude Opus 4.8: $5 in / $25 out
Gemini 3.5 Flash: $1.50 in / $9 out

Bottom line: A 12-person team saves $1.75K over 13 sprints by routing routine tasks to Flash — with zero quality trade-off on complex work.
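The arithmetic behind this bottom line can be reproduced with a short Python sketch. The team settings and the 75% routine share come from the page. The two per-task costs are not stated anywhere on the page: they are assumptions back-computed from the per-sprint figures ($337.50 all-frontier and $202.50 tiered across 600 tasks), and the function name is hypothetical.

```python
DEVS = 12
SPRINTS = 13
TASKS_PER_DEV_PER_SPRINT = 50
ROUTINE_SHARE = 0.75  # share of tasks routed to Flash

# ASSUMPTION: per-task costs are back-computed from the per-sprint totals,
# not published figures.
OPUS_COST_PER_TASK = 0.5625   # 337.50 / 600 tasks
FLASH_COST_PER_TASK = 0.2625  # (202.50 - 150 * 0.5625) / 450 tasks


def sprint_costs(devs: int, tasks_per_dev: int, routine_share: float):
    """Return (all_frontier, tiered) cost in USD for a single sprint."""
    tasks = devs * tasks_per_dev
    routine = round(tasks * routine_share)
    complex_tasks = tasks - routine
    all_frontier = tasks * OPUS_COST_PER_TASK
    tiered = routine * FLASH_COST_PER_TASK + complex_tasks * OPUS_COST_PER_TASK
    return all_frontier, tiered


if __name__ == "__main__":
    all_frontier, tiered = sprint_costs(DEVS, TASKS_PER_DEV_PER_SPRINT, ROUTINE_SHARE)
    saved = (all_frontier - tiered) * SPRINTS
    print(f"All-frontier per sprint: ${all_frontier:,.2f}")
    print(f"Tiered per sprint:       ${tiered:,.2f}")
    print(f"Saved over {SPRINTS} sprints:  ${saved:,.2f}")
    print(f"Reduction:               {1 - tiered / all_frontier:.0%}")
```

Changing `ROUTINE_SHARE` shows how sensitive the saving is to the routing split: every task moved from Opus to Flash saves the difference between the two per-task costs.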

Scenario 2 — Run Phase

In-App Intelligence

Same question. One answer costs 10–20× less. Can you tell?

Monthly Query Volume

5.0M queries/mo (adjustable from 100K to 10M to match your expected in-app traffic)

Monthly COGS Comparison

All-Frontier (100% Opus)

$100K/mo

$1.2M/yr

Tiered Routing

$30K/mo

$359K/yr

Tiered routing mix

Flash-Lite 60% · Flash 20% · Opus 20%

Annual Savings

$841K/yr

70.1% reduction vs. all-frontier at 5.0M queries/mo

Monthly: $70K saved  ·  Tiered: $30K/mo vs All-Frontier: $100K/mo

Honesty note: The 60/20/20 tiered split (Flash-Lite / Flash / Opus) reflects real query distribution in wealth management. Routine lookups and FAQs (~80%) achieve parity across models. Complex planning questions (~20%) show measurably better output from Opus, which is why they still route to frontier. The savings come from volume, not from cutting quality where it matters.
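As a rough check on the blended figures, the following Python sketch computes monthly COGS for the 60/20/20 mix. The prices and the mix come from the page. The per-query token counts do not: 2,000 input and 400 output tokens are an assumption, chosen because they reproduce the stated $100K/mo all-frontier and roughly $30K/mo tiered figures. The function names are hypothetical.

```python
# USD per 1M tokens as (input, output), from the page's pricing cards.
PRICES = {
    "opus": (5.00, 25.00),
    "flash": (1.50, 9.00),
    "flash_lite": (0.25, 1.50),
}

TIERED_MIX = {"flash_lite": 0.60, "flash": 0.20, "opus": 0.20}
ALL_FRONTIER_MIX = {"opus": 1.00}

# ASSUMPTION: the page does not state tokens per query. These values are
# picked because they reproduce the page's monthly and annual totals.
INPUT_TOKENS = 2_000
OUTPUT_TOKENS = 400


def query_cost(model: str) -> float:
    """USD cost of one query on the given model."""
    price_in, price_out = PRICES[model]
    return (INPUT_TOKENS * price_in + OUTPUT_TOKENS * price_out) / 1_000_000


def monthly_cogs(queries: int, mix: dict) -> float:
    """Blended monthly cost for a routing mix whose shares sum to 1."""
    blended = sum(share * query_cost(model) for model, share in mix.items())
    return queries * blended


if __name__ == "__main__":
    queries = 5_000_000
    frontier = monthly_cogs(queries, ALL_FRONTIER_MIX)
    tiered = monthly_cogs(queries, TIERED_MIX)
    print(f"All-frontier: ${frontier:,.0f}/mo  ${frontier * 12:,.0f}/yr")
    print(f"Tiered:       ${tiered:,.0f}/mo  ${tiered * 12:,.0f}/yr")
    print(f"Reduction:    {1 - tiered / frontier:.1%}")
```

Under these assumptions the Opus share still accounts for about two thirds of the tiered bill, so the 20% frontier allocation is the lever that matters most if the mix changes.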

Scenario 3 — Run Phase

Multimodal & Latency

A client photographs a statement and asks “what changed?” This scenario compares pure-Claude, pure-Gemini, and hybrid pipelines on speed, cost, and quality.

Pipeline Race — Text Mode

Pure Claude (Opus vision) · 3,800 ms target
Opus OCR → Opus Analysis → Opus Structured

Pure Gemini (3.5 Flash, native) · 900 ms target
Flash Vision → Flash Analysis → Flash Structured

Hybrid (Flash ingest → Opus analyze) · 2,700 ms target
Flash Vision → Opus Analysis → Opus Structured

Timeline axis: 0 ms to 3,800 ms

Extend · Scenario 4

Agentic Orchestration

Every night an agent reviews 50,000 portfolios. It decomposes each rebalancing job into 18 steps—only the planner and reviewer need frontier reasoning. The 16 executor steps run on Flash.

Per run: $0.3925

Agent Execution Trace

Planner: $0.0750 · Claude Opus 4.8 (Frontier)
Executors × 16: $0.2400 total · Gemini 3.5 Flash (Fast)
Reviewer: $0.0775 · Claude Opus 4.8 (Frontier)

2 steps on Opus, 16 on Flash: tiered savings

Per-Run Cost Breakdown

Executor volume dominates total cost — this is where tiering shines.

Scale Simulator

Nightly portfolio volume is adjustable from 1K to 100K. Costs scale linearly.

Portfolios / night: 50,000

Nightly (All-Opus)

$44K

Nightly (Hybrid)

$20K

Annual Totals (50,000 × 365 nights)

All-Opus

$15.9M

Hybrid

$7.2M

Annual Savings

$8.8M

55% reduction

The planner thinks with Opus. The executors run with Flash.

Only 2 of 18 agent steps require frontier reasoning — the planner that decomposes the rebalancing task and the compliance reviewer that signs off. The 16 executor steps (data retrieval, calculations, formatting) are pattern-following work. Routing them to Flash cuts per-run cost from $0.8725 to $0.3925. At 50K portfolios nightly, that's $8.8M/yr saved.
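The per-run and annual figures above can be recomputed with a small Python sketch. The planner, reviewer, and Flash executor costs are taken from the execution trace. The Opus executor cost is an assumption back-computed from the stated $0.8725 all-Opus run cost, and the function names are hypothetical.

```python
EXECUTOR_STEPS = 16
PLANNER_COST = 0.0750           # Opus, from the trace
REVIEWER_COST = 0.0775          # Opus, from the trace
EXECUTOR_COST_FLASH = 0.2400 / EXECUTOR_STEPS  # $0.015 per step

# ASSUMPTION: not stated on the page. Derived from the $0.8725 all-Opus
# run cost as (0.8725 - 0.0750 - 0.0775) / 16.
EXECUTOR_COST_OPUS = 0.045


def per_run_cost(executor_cost: float) -> float:
    """Cost of one 18-step run: planner + 16 executors + reviewer."""
    return PLANNER_COST + EXECUTOR_STEPS * executor_cost + REVIEWER_COST


def annual_cost(run_cost: float, portfolios_per_night: int, nights: int = 365) -> float:
    """Costs scale linearly with nightly volume and number of nights."""
    return run_cost * portfolios_per_night * nights


if __name__ == "__main__":
    hybrid = per_run_cost(EXECUTOR_COST_FLASH)
    all_opus = per_run_cost(EXECUTOR_COST_OPUS)
    portfolios = 50_000
    saved = annual_cost(all_opus, portfolios) - annual_cost(hybrid, portfolios)
    print(f"Per run:   all-Opus ${all_opus:.4f}, hybrid ${hybrid:.4f}")
    print(f"Nightly:   all-Opus ${all_opus * portfolios:,.0f}, hybrid ${hybrid * portfolios:,.0f}")
    print(f"Annual savings: ${saved:,.0f} ({1 - hybrid / all_opus:.0%} reduction)")
```

Because the planner and reviewer costs are identical in both strategies, the entire saving comes from the 16 executor steps.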

Lifecycle Summary

Accumulated savings across Build → Run → Extend, driven by your current slider configurations.

Charts: Total Annual Savings · Cost by Lifecycle Phase · Total Annual Comparison

Phases: Build (SDLC) · Run: In-App · Run: Multimodal · Extend: Agent