Rate limits control how often users and applications can call our API within a given time window. They help keep the service stable, give everyone fair access, and protect against misuse, so that we can serve fast, reliable inference for all. We offer a generous Free Tier, a Developer Tier for higher token consumption, and Enterprise plans for dedicated or multi-tenant instances.
Rate limits apply at the organization level, not per individual user, so every user in an organization draws on the same limits. They also apply per model.
More information on rate limits can be found here: https://console.groq.com/docs/rate-limits
You can also view the current, exact rate limits for your organization on the limits page in your account settings.
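
To make the scoping concrete, here is a minimal client-side sketch in Python of a sliding-window counter keyed by organization and model. It only illustrates the idea that requests are counted per (organization, model) pair; it is not how the service enforces limits. The class name SlidingWindowLimiter, the org_id and model arguments, and the example limit of 2 requests per 60 seconds are all hypothetical. Use the real values from your limits page.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical client-side helper: allow at most `max_requests`
    per `window_seconds` for each (organization, model) pair."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        # One queue of request timestamps per (org_id, model) key.
        self._history = defaultdict(deque)

    def allow(self, org_id, model):
        """Return True and record the request if it fits in the window."""
        now = self.clock()
        timestamps = self._history[(org_id, model)]
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) < self.max_requests:
            timestamps.append(now)
            return True
        return False


if __name__ == "__main__":
    # Assumed example values, not real limits.
    limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60)
    for model in ("model-a", "model-a", "model-a", "model-b"):
        print(model, limiter.allow("my-org", model))
```

In this sketch the third call for "model-a" is refused while the call for "model-b" still goes through, because each model has its own counter. Any user calling with the same org_id shares those counters.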