Frequently Asked Questions

When will Synthetic get model X?

Before you ask this, make sure the model:

Is open weight or open source.

Has weights available on HuggingFace. Being open weight and being published are not always the same thing: sometimes a model is planned to be open weight, but its weights have not been released yet.

Has an NVFP4 quantization available, so that Synthetic can run it on their GPUs at optimal speed. There are some exceptions to this: if a model is in high enough demand, they may make their own quant.

Is solidly supported, either directly or through its general architecture, in sglang, the inference engine Synthetic uses to actually run the models.
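The checklist above can be sketched as a small Python helper, for anyone who wants to track candidate models. This is only an illustration: the ModelCandidate class, its field names, and the unmet_criteria function are all hypothetical, and the values have to be filled in by hand from what is publicly known. Nothing here queries HuggingFace, sglang, or Synthetic.

```python
from dataclasses import dataclass


@dataclass
class ModelCandidate:
    """Hypothetical record of what is publicly known about a model."""
    name: str
    open_weight: bool            # open weight or open source
    weights_on_hub: bool         # weights are actually downloadable from HuggingFace
    nvfp4_quant_available: bool  # an NVFP4 quantization already exists
    sglang_support: bool         # model or its architecture runs reliably in sglang


def unmet_criteria(model: ModelCandidate) -> list[str]:
    """Return the checklist items the model does not yet satisfy."""
    checks = [
        (model.open_weight, "not open weight or open source"),
        (model.weights_on_hub, "weights not published on HuggingFace"),
        (model.nvfp4_quant_available, "no NVFP4 quantization available"),
        (model.sglang_support, "no solid sglang support"),
    ]
    return [reason for ok, reason in checks if not ok]


# Illustrative values only, not a statement about any real model.
candidate = ModelCandidate(
    name="example-model",
    open_weight=True,
    weights_on_hub=True,
    nvfp4_quant_available=False,
    sglang_support=True,
)
print(unmet_criteria(candidate))  # ['no NVFP4 quantization available']
```

An empty list only means the model passes the checklist; it does not mean Synthetic will host it. Likewise, a missing NVFP4 quant is not always a hard blocker, since Synthetic may make their own for a model in high demand.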

Factors that can delay Synthetic getting a model:

If it is unusually large, it may take time for them to acquire or free up GPU space to host it.

If it has a novel or unusual architecture (such as DeepSeek Sparse Attention for GLM 5), it will take time for inference engines like sglang to get reliable support for the model.

If the model has not yet been quantized to NVFP4, Synthetic will have to wait for NVIDIA to do that, or make a quant themselves, either of which can take some time.

Additionally, keep in mind that labs known for making open weight models do not release every model openly. Some of their models are closed source, such as Qwen 3.6-Plus. Others are first offered only on the lab’s API for user testing and feedback (and to give the lab a profitable head start) before the weights are released. This was true of GLM 5.1 for a few weeks, and as of April 9th, 2026 it is still true of MiniMax M2.7.