Qorven enforces two kinds of limits: gateway-level limits (per IP and per tenant), which protect the server, and channel-level limits (per binding), which respect provider limits.
Gateway-level
From the default config.toml:
Requests over a gateway limit receive 429 Too Many Requests with a Retry-After header. The CLI and web UI honour it.
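A custom client can honour the header in the same way. The following is a minimal sketch, not Qorven's own client code: `retry_after_seconds` and `call_with_retry` are hypothetical helper names, and `send` stands in for whatever function performs the request and returns a status code plus a headers dictionary. Retry-After may carry either a number of seconds or an HTTP date, so both forms are handled.

```python
import email.utils
import time
from typing import Callable, Dict, Optional, Tuple


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[float] = None,
                        default: float = 1.0) -> float:
    """Convert a Retry-After header value into a non-negative wait in seconds."""
    if not header_value:
        return default
    value = header_value.strip()
    if value.isdigit():
        # Delta-seconds form, e.g. "Retry-After: 30"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = email.utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    current = time.time() if now is None else now
    return max(0.0, when.timestamp() - current)


def call_with_retry(send: Callable[[], Tuple[int, Dict[str, str]]],
                    max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `send` (hypothetical) until it stops returning 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Wait as long as the server asked before trying again.
            sleep(retry_after_seconds(headers.get("Retry-After")))
    return 429
```

Capping the number of attempts matters here: an uncapped retry loop is exactly the kind of client bug that trips the IP rate limit in the first place.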
Channel-level
Each channel binding has its own outbound quota. Telegram’s default:
Per-provider LLM quotas
The LLM side has its own limits, enforced by the provider (OpenAI, Anthropic, etc.). Qorven handles them with automatic failover:
- Try provider A’s first key
- On 429 → try A’s next key
- All of A’s keys exhausted → try provider B
- All providers exhausted → report error to the user
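The failover order above can be sketched as two nested loops. This is an illustration of the ordering only, not Qorven's implementation: `complete_with_failover`, `RateLimited`, `AllProvidersExhausted`, and the `call` callable are all hypothetical names, and `call` is assumed to raise `RateLimited` when the provider answers 429.

```python
from typing import Callable, Dict, List


class RateLimited(Exception):
    """Raised by the hypothetical `call` when a provider answers 429."""


class AllProvidersExhausted(Exception):
    """Every key of every provider was rate limited."""


def complete_with_failover(providers: Dict[str, List[str]],
                           call: Callable[[str, str], str]) -> str:
    """Try each key of each provider in order; return the first successful reply."""
    # Dicts preserve insertion order, so providers are tried as listed.
    for provider, keys in providers.items():
        for key in keys:
            try:
                return call(provider, key)
            except RateLimited:
                # This key is out of quota: move to the next key,
                # or to the next provider once the keys run out.
                continue
    # Nothing left to try: surface the error to the user.
    raise AllProvidersExhausted("all providers exhausted")
```

Only rate-limit errors trigger a move to the next key in this sketch; any other exception propagates immediately, since retrying a malformed request against another key would not help.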
Quota dashboard
Web UI → Usage → Quotas shows:
- IP-level: current vs. limit for connected clients
- Tenant-level: concurrent runs, req/min
- Channel-level: messages/min per binding
- Provider-level: tokens/min per key
Raising limits
For self-hosted, edit config.toml:
Telegram — unlimited for premium bots
WhatsApp — tier 1/2/3/4/5
Slack — per-app rate buckets
Email (SMTP) — provider-specific
When quotas fire
User gets '429 try again'
Gateway hit the IP rate limit. Usually a client bug (retry loop). Check qorven logs --filter rate_limit.exceeded.
Channel silently drops messages
Channel-level outbound limit. The Qor attempted to reply, the quota said no, the reply is retried on backoff. Eventually delivers unless the backlog overflows.
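To make that behaviour concrete, here is a minimal sketch of a per-binding outbound quota with a bounded backlog. It is an assumption-laden model, not Qorven's code: the class `OutboundChannel`, the parameters `per_minute` and `max_backlog`, and the sliding 60-second window are all illustrative choices. The caller is assumed to invoke `drain` on each retry tick, with whatever backoff schedule it uses.

```python
from collections import deque
from typing import Deque, List


class OutboundChannel:
    """Hypothetical model of one channel binding's outbound quota."""

    def __init__(self, per_minute: int, max_backlog: int) -> None:
        self.per_minute = per_minute
        self.max_backlog = max_backlog
        self.sent_at: Deque[float] = deque()  # send times within the last 60 s
        self.backlog: Deque[str] = deque()    # replies waiting for quota

    def _has_quota(self, now: float) -> bool:
        # Forget sends that have left the 60-second window.
        while self.sent_at and now - self.sent_at[0] >= 60.0:
            self.sent_at.popleft()
        return len(self.sent_at) < self.per_minute

    def submit(self, message: str) -> bool:
        """Queue a reply. Returns False if the backlog is full and it is dropped."""
        if len(self.backlog) >= self.max_backlog:
            return False
        self.backlog.append(message)
        return True

    def drain(self, now: float) -> List[str]:
        """Deliver queued replies while quota allows; the rest wait for the next retry."""
        delivered = []
        while self.backlog and self._has_quota(now):
            delivered.append(self.backlog.popleft())
            self.sent_at.append(now)
        return delivered
```

In this model a reply is lost only when `submit` returns False, which corresponds to the backlog overflowing; everything else is delivered late rather than dropped.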
LLM replies start failing
Provider-level. Failover kicks in; if all providers are exhausted you’ll see ERROR agent.loop.all_models_failed. Time to add more keys or a different provider.
Where next
Failover
How LLM calls rotate across keys and providers.
Debounce
The inbound side: merging rapid messages.
Usage page
Monitoring tokens + cost + throughput.