Agency owners and freelance AI consultants get burned by underquoted projects. The build looks profitable, but six months in the client's monthly token bill is eating the maintenance retainer. Here is how to quote AI projects so you actually make money — and how to talk to clients about ongoing API costs without losing the deal.
Every AI project quote should have three line items:
If you bundle all three into a single fixed monthly fee, you take on the variance risk. Clients love it. You hate it the first time a model upgrade or a usage spike eats your margin.
The agency-specific estimation method:
Client: SaaS company with 5,000 active customers. Wants to add a tier-1 support chatbot.
Estimated usage: 30% of customers use the chatbot monthly = 1,500 customers. Average 4 conversations per active user per month = 6,000 conversations. Each conversation ~3,000 input + 800 output tokens.
Token cost on Claude Haiku 3.5 (good for support quality):
Quote structure:
| Item | Amount | Notes |
|---|---|---|
| Build (one-time) | $12,500 | Discovery, integration, prompt engineering, testing |
| API usage (monthly) | $126 | Pass-through with markup, capped at 8,000 conversations/mo |
| Maintenance retainer | $1,500/mo | 15 hours/mo for prompt tuning, monitoring, model updates |
| Total month 1 | $14,126 | build + 1 month services |
| Ongoing months | $1,626 | Recurring services + API |
Build accurate AI project quotes in 60 seconds.
Open AI Cost Calculator →Client: legal firm with ~500 case files per month, each averaging 50 pages.
Token cost estimation (Claude Sonnet 4 for legal accuracy):
Quote:
| Item | Amount |
|---|---|
| Build (one-time) | $18,500 |
| API usage (monthly) | $202 — capped at 700 documents |
| Maintenance retainer | $2,400/mo (24 hrs) |
| Total ongoing | $2,602/mo |
Model A — Pass-through with markup. You bill the client for actual API usage at 2-3x markup. They see exact volume monthly. Most transparent. Best for clients who want predictable per-unit pricing.
Model B — Fixed monthly with usage cap. Bundle all costs into a flat monthly fee. Cap usage at a defined level. Charge overages. Cleanest billing. Works when usage is predictable.
Model C — Volume tiers. Define 3-4 tiers ($X/mo for up to N requests, $Y/mo for up to M requests). Client picks a tier. Upgrade when they outgrow it. Easiest for clients to understand.
Most clients have no idea what AI costs. Education is part of the sale. Use these talking points:
1. Quoting "unlimited" anything. Never. You will lose. Always cap usage in the contract.
2. Absorbing token cost without monitoring. Set up alerts at 50%, 75%, 90% of budget. Pause processing at 100%.
3. Using a flagship model when a cheap model would work. The client doesn't know the difference. Use the cheap model and pocket the margin.
4. Forgetting prompt iteration costs in the build. You will iterate on the prompt 50-200 times during build. That costs money. Add 20% to your build estimate to cover it.
5. No model upgrade clause. When OpenAI deprecates a model, you have to migrate. Build a clause that lets you charge for re-tuning prompts on the new model.
For every AI project quote, include:
Run your specific project numbers through the AI Cost Calculator before sending any quote.
Estimate API costs for any client project across every model.
Open AI Cost Calculator →