Rate Limits
Default quotas · Headers · Backoff algorithm
Rate limits are enforced to prevent abuse and protect downstream models.
Default Quotas
| Resource | Limit |
|---|---|
| API overall | 20 req/sec per API key |
| Agent chat | 5 req/sec per user_id |
| Workflow run | 10 triggers/sec per API key |
| Knowledge base upload | 60 docs/min per API key |
| Management (create / delete) | 2 req/sec per API key |
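Throttling on the client side helps keep traffic under these quotas. The following is a minimal token-bucket sketch, assuming the 20 req/sec overall limit from the table. The `TokenBucket` class and its parameters are hypothetical and not part of any official SDK, and the server's exact windowing behavior is not specified here, so treat the numbers as a starting point.

```python
import time


class TokenBucket:
    """Client-side throttle (hypothetical helper, not an official SDK class)."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self._clock = clock         # injectable clock, useful for testing
        self._tokens = capacity     # start with a full bucket
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, capped at capacity.
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the caller should wait."""
        self._refill()
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until at least one token is available."""
        self._refill()
        return max(0.0, (1.0 - self._tokens) / self.rate)


# Assumed to mirror the "API overall" quota: 20 req/sec per API key.
api_bucket = TokenBucket(rate=20, capacity=20)
```

Before each request, call `api_bucket.try_acquire()`; if it returns `False`, sleep for `api_bucket.wait_time()` seconds and try again. A separate bucket per resource (for example, `rate=2` for management calls) would follow the same pattern.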
Response Headers
Every response includes rate-limit status:
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Total quota in the current window |
| X-RateLimit-Remaining | Remaining quota |
| X-RateLimit-Reset | Reset time (Unix seconds) |
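As an illustration, the three headers can be read into a small structure that tells a client how long to wait before the window resets. This is a sketch only: `RateLimitStatus` and `parse_rate_limit` are hypothetical names, and the code assumes the header values are plain integers as described in the table.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset_at: int    # X-RateLimit-Reset, Unix seconds

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Time left in the current window; never negative."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Return the parsed status, or None if any header is missing or malformed."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
```

A client might pause for `status.seconds_until_reset()` whenever `status.remaining` reaches 0, rather than sending a request that is likely to be rejected.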
Over-Limit Response
Recommended Backoff
Honor Retry-After if present; otherwise, use the defaults above.
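A retry loop along these lines might look like the sketch below. Several things are assumptions: the over-limit status code is taken to be 429 (the usual HTTP convention), `Retry-After` is assumed to carry a number of seconds, and the fallback is exponential backoff with full jitter using placeholder values for `base` and `cap`, which should be replaced with the documented defaults. `send` is a hypothetical callable standing in for whatever HTTP client is in use.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def retry_delay(
    headers: Mapping[str, str],
    attempt: int,
    base: float = 0.5,   # placeholder, not a documented default
    cap: float = 30.0,   # placeholder, not a documented default
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before the next attempt (attempt is 0-based)."""
    lowered = {k.lower(): v for k, v in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            # Honor the server's hint when it is a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not numeric (e.g. an HTTP date); fall through to the fallback
    # Fallback: exponential backoff with full jitter, capped at `cap`.
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it stops returning 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:  # assumed over-limit status code
            return status, headers
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    return status, headers  # still rate-limited after the final attempt
```

Jitter spreads retries from many clients over time, so they do not all hit the API again at the same instant once the window resets.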
High-Concurrency Scenarios
If your business truly needs higher concurrency:
- Contact sales to raise the quota
- Use a private deployment, which is not subject to SaaS quotas
- Adjust your architecture: prefer webhooks with asynchronous processing over aggressive polling