Is Fine-Tuning Supported on the Groq LPU? How Can We Leverage the LPU for Training?

Hi Groq team :waving_hand:

I’m exploring whether fine-tuning (or lightweight adaptation) is supported for models intended to run on the Groq LPU.

My primary use case is high-throughput, low-latency inference, and I’d like to understand:

- Whether Groq currently supports full fine-tuning, PEFT methods (LoRA / adapters), or any training workflows that can directly leverage the LPU
- If fine-tuning must happen off-platform (on GPUs) and the result is then compiled and deployed for Groq inference, what constraints or best practices apply (I’ve sketched the workflow I have in mind after this list)
- How model architecture, quantization, and weight formats affect deployability on Groq after fine-tuning
- Any known limitations or roadmap plans around training or adaptation support on Groq hardware
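
For concreteness, here is the off-platform workflow I’m imagining: a minimal sketch assuming a standard Hugging Face PEFT/LoRA flow on GPU, where the base model name, adapter hyperparameters, and output path are all placeholders, and the elided training loop would be whatever you normally run:

```python
# Sketch of an off-platform LoRA fine-tune, with adapters merged back into the
# base weights so the artifact is a standard checkpoint. Assumption: merged
# fp16 safetensors weights are what a Groq deployment pipeline would ingest.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

BASE_MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)

# Lightweight adaptation: train small low-rank adapters instead of all weights.
lora_config = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attach adapters to attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# ... run a normal GPU training loop here (e.g. with transformers.Trainer) ...

# Merge adapters into the base weights so the artifact has no PEFT dependency
# and is architecturally identical to the original checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("adapted-model", safe_serialization=True)  # safetensors
tokenizer.save_pretrained("adapted-model")
```

My working assumption is that merging adapters into the base weights is the cleanest path to a deployable artifact, since the merged model has the same architecture and weight shapes as the original checkpoint, but please correct me if Groq handles adapters differently.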

The goal is to adapt a model’s behavior (domain/language/style) while still fully benefiting from Groq’s deterministic performance and throughput at inference time.
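
On the inference side, I’m assuming the consumption pattern stays the standard Groq chat completions call via the Groq Python SDK, as in the sketch below, where the model name is a placeholder for whatever adapted model would actually be deployed:

```python
# Minimal inference call via the Groq Python SDK (OpenAI-compatible interface).
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # placeholder: would be the adapted model
    messages=[
        {"role": "system", "content": "You are a domain-specialized assistant."},
        {"role": "user", "content": "Summarize this report in three sentences."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```

If serving a custom fine-tuned model works differently from calling the hosted catalog, pointers to that deployment process would be especially helpful.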

Would appreciate guidance, documentation pointers, or examples from the community or Groq team.

Thanks!