DeepSeek's Qwen-distilled models are compact reasoning models derived from DeepSeek-R1, distilling the reasoning patterns of the larger model into smaller architectures through supervised fine-tuning (SFT) on data generated by DeepSeek-R1, which was itself trained with large-scale reinforcement learning (RL). The distilled family spans 1.5B to 70B parameters and is built on Qwen2.5 and Llama3 bases, with the Qwen-distilled variants available at 1.5B, 7B, 14B, and 32B; the standout DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini and sets a new state of the art for dense models. Released as open source, these models provide a powerful resource for advancing research and practical applications.
For instructions on accessing this model or initializing it via API, please refer to our docs.
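As a quick illustration alongside the docs, the distilled checkpoints are also published on Hugging Face and can be loaded locally with the transformers library. The sketch below assumes the `deepseek-ai/DeepSeek-R1-Distill-Qwen-32B` repository ID, a GPU setup with enough memory for the 32B weights, and generation settings chosen for illustration; adapt it to your environment or use the platform API described in the docs.

```python
# Minimal sketch: load a DeepSeek-R1 distilled checkpoint with Hugging Face
# transformers and run a short reasoning prompt. Repo ID and settings are
# assumptions for illustration; see the docs for API-based access.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"  # assumed HF repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick bf16/fp16 automatically where supported
    device_map="auto",    # shard across available GPUs
)

messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models need a generous token budget for the chain of thought.
outputs = model.generate(
    inputs, max_new_tokens=1024, do_sample=True, temperature=0.6
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For smaller hardware, the same code works with the lighter Qwen-distilled variants (for example, the assumed `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` repository ID) by swapping the model ID.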