Groq team,
I think enabling self-service upload of models, and inference against them, is critical to the company’s success. Specifically, I am immediately interested in uploading and using custom model weights for public models already hosted by Groq. However, I think Groq should support both public and private model uploads in the long term:
- Private models could be billed to the customer directly based on bandwidth, size, and inference costs. Groq can schedule these models on hardware based on pre-defined service quotas that can be negotiated with customers. Private model uploads could be fully self-service, aside from quota negotiations that may occur.
- Public models could be mostly self-service, but still require an application and an audit by Groq employees. A robust application and audit process can help Groq ensure that duplicate models aren’t introduced into its ecosystem. Additionally, customers could vote on pending applications for public models, helping Groq prioritize those most critical to its user base.
I think this feature alone represents a 10-100x market opportunity for Groq:
- A substantial share of the customers who would generate significant revenue for Groq need a platform that can run inference on fine-tuned models. Customers with fine-tuned models must run inference on other platforms until Groq provides a mechanism for this.
- Groq supplies significant labor to onboard standard open-source models today, but customers could carry out most of the work required to onboard new models themselves.
In my opinion, providing a platform for self-service upload of model weights and inference against them is the most critical step for Groq’s flywheel at this time. Standardizing the onboarding process, such that anyone could upload their own weights (and eventually models), will grow your customer base to critical mass. Everyone requiring AI inference would flock to Groq, given the tokens/sec, latency, and onboarding support only you could provide.
Note: This request is a continuation of an existing feature request from Discord.