
Rate limits control how often users and applications can call our API within a given time window. They help ensure service stability, fair access, and protection against misuse, so that we can serve reliable and fast inference for everyone. We offer a generous Free Tier, a Developer Tier for higher token consumption, and Enterprise plans for dedicated or multi-tenant instances.

Rate limits apply at the organization level, not to individual users, and they are set per model.
More information on rate limits is available here: https://console.groq.com/docs/rate-limits

You can also view the current, exact rate limits for your organization on the limits page in your account settings.
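
If your application does hit a limit, a common way to handle it is to wait and retry. The sketch below is only an illustration, not an official client: it assumes a rate-limited request is reported as HTTP 429, optionally with a Retry-After header (the usual HTTP convention; please confirm the details on the docs page linked above). `RateLimited` and `call_with_backoff` are hypothetical names made up for this example, and you would raise `RateLimited` yourself from whatever HTTP client or SDK you use.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error: raise this from your own client code on HTTP 429."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        # Seconds to wait, taken from a Retry-After header if one was sent.
        self.retry_after = retry_after


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call request() and retry on RateLimited, pausing between attempts."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            if err.retry_after is not None:
                # Prefer the server's own hint when it gives one.
                delay = err.retry_after
            else:
                # Otherwise use capped exponential backoff with random jitter.
                cap = min(max_delay, base_delay * 2 ** attempt)
                delay = random.uniform(0, cap)
            sleep(delay)
```

Because limits apply to the whole organization and per model, every process that shares your organization's keys draws from the same budget for a given model. The jitter helps keep several workers from retrying at the same instant.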
