Why this happens
W&B Inference enforces concurrency limits to ensure fair usage and service stability. When the number of simultaneous requests from your account exceeds the allowed limit, additional requests are rejected with a 429 status code.
What you can do
- Reduce concurrent requests
  - Implement request queuing or throttling in your application
  - Use exponential backoff when retrying failed requests
- Increase your limits
  - Review your plan’s concurrency limits and upgrade if needed
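The throttling and backoff suggestions above can be combined in one small helper. The following is a minimal sketch, not an official W&B client: `send` is a hypothetical zero-argument callable that performs one request and returns `(status_code, body)`, and `MAX_CONCURRENT`, `base_delay`, and `max_delay` are placeholder values you would set from your own plan's limits.

```python
import random
import threading
import time

# Hypothetical cap; set this at or below your plan's concurrency limit.
MAX_CONCURRENT = 4
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)


def call_with_backoff(send, max_retries=5, base_delay=0.5,
                      max_delay=30.0, sleep=time.sleep):
    """Call send() under a concurrency cap, retrying on HTTP 429.

    send is a hypothetical zero-argument callable that performs one
    request and returns (status_code, body).
    """
    for attempt in range(max_retries + 1):
        # Hold a slot only while the request is in flight, so a caller
        # that is sleeping before a retry does not block other callers.
        with _slots:
            status, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # Exponential backoff with full jitter: wait a random time in
        # [0, min(max_delay, base_delay * 2**attempt)].
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, delay))
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

Wrap each request in a function or lambda and pass it as `send`. The semaphore keeps the number of in-flight requests from this process at or below `MAX_CONCURRENT`, and the jittered delay spreads retries out so that rejected requests do not all return at the same moment.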