Qwen 3.5 Medium Model Series Released
On February 24, 2026, Alibaba’s Qwen team announced the release of the Qwen 3.5 Medium Model Series, marking a significant shift in their approach to AI model development. The new lineup prioritizes architectural efficiency and high-quality data over traditional scaling methods.
New Models Available
The release includes three new models:
- Qwen3.5-122B-A10B - A large-scale Mixture-of-Experts (MoE) model with 122B total parameters and 10B active parameters
- Qwen3.5-27B - A dense mid-sized model for balanced performance and efficiency
- Qwen3.5-35B-A3B - A highly efficient MoE model with 35B total parameters and only 3B active parameters
The series also includes Qwen3.5-Flash, a lightweight model aimed at fast inference.
Key Features
The Qwen 3.5 medium models introduce several architectural improvements:
- Unified Vision-Language Foundation: Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks.
- Architectural Efficiency: The series demonstrates that smaller AI models can be smarter through optimized architecture rather than pure parameter scaling.
- Production Ready: These models are designed for real-world deployment with lower computational requirements while maintaining high intelligence levels.
Comprehensive Benchmark Comparison
All Qwen 3.5 Models
| Model | Total Params | Active Params | Architecture | MMLU-Pro | LiveCodeBench | AIME26 | GPQA | Context | Best For |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.5-397B-A17B | 397B | 17B | MoE | 87.8 | 83.6 | 91.3 | 78.2 | 256K | Flagship reference |
| Qwen3.5-122B-A10B | 122B | 10B | MoE | 83.5 | 78.4 | 86.7 | 74.5 | 128K | Large-scale tasks |
| Qwen3.5-35B-A3B | 35B | 3B | MoE | 79.2 | 72.1 | 81.4 | 69.8 | 64K | Ultra-efficient single-GPU |
| Qwen3.5-27B | 27B | 27B | Dense | 77.8 | 70.5 | 79.2 | 68.3 | 64K | Balanced performance |
| Qwen3.5-Flash | ~5B | ~5B | Dense | 72.4 | 65.8 | 74.1 | 62.5 | 32K | Fast inference |
Qwen 3.5 vs. GPT-5.2
| Model | MMLU | GSM8K | AIME26 | LiveCodeBench | SWE-bench | Context |
|---|---|---|---|---|---|---|
| GPT-5.2 | 91.2 | 95.8 | 89.4 | 82.1 | 86.4 | 1M |
| Qwen3.5-397B | 87.8 | 92.4 | 91.3 | 83.6 | 79.2 | 256K |
| Qwen3.5-122B | 83.5 | 88.7 | 86.7 | 78.4 | 74.1 | 128K |
| Qwen3.5-35B | 79.2 | 84.2 | 81.4 | 72.1 | 68.5 | 64K |
Qwen 3.5 vs. Claude 4.5 Opus
| Model | MMLU | GSM8K | AIME26 | LiveCodeBench | SWE-bench | Context |
|---|---|---|---|---|---|---|
| Claude 4.5 Opus | 89.8 | 94.2 | 87.6 | 80.4 | 84.7 | 200K |
| Qwen3.5-397B | 87.8 | 92.4 | 91.3 | 83.6 | 79.2 | 256K |
| Qwen3.5-122B | 83.5 | 88.7 | 86.7 | 78.4 | 74.1 | 128K |
| Qwen3.5-35B | 79.2 | 84.2 | 81.4 | 72.1 | 68.5 | 64K |
Qwen 3.5 vs. Gemini 3 Pro
| Model | MMLU | GSM8K | AIME26 | LiveCodeBench | SWE-bench | Context |
|---|---|---|---|---|---|---|
| Gemini 3 Pro | 90.4 | 93.8 | 88.2 | 81.7 | 85.1 | 1M |
| Qwen3.5-397B | 87.8 | 92.4 | 91.3 | 83.6 | 79.2 | 256K |
| Qwen3.5-122B | 83.5 | 88.7 | 86.7 | 78.4 | 74.1 | 128K |
| Qwen3.5-35B | 79.2 | 84.2 | 81.4 | 72.1 | 68.5 | 64K |
Qwen 3.5 vs. MiniMax M2.5
| Model | MMLU | GSM8K | AIME26 | LiveCodeBench | SWE-bench | Context |
|---|---|---|---|---|---|---|
| MiniMax M2.5 | 88.4 | 93.6 | 89.8 | 84.2 | 82.7 | 256K |
| Qwen3.5-397B | 87.8 | 92.4 | 91.3 | 83.6 | 79.2 | 256K |
| Qwen3.5-122B | 83.5 | 88.7 | 86.7 | 78.4 | 74.1 | 128K |
| Qwen3.5-35B | 79.2 | 84.2 | 81.4 | 72.1 | 68.5 | 64K |
Qwen 3.5 vs. GPT-OSS 120B
| Model | Total Params | Active Params | MMLU | AIME26 | LiveCodeBench | Context |
|---|---|---|---|---|---|---|
| GPT-OSS 120B | 116.8B | 5.1B | 90.0 | 90.2 | 81.5 | 128K |
| Qwen3.5-122B | 122B | 10B | 83.5 | 86.7 | 78.4 | 128K |
| Qwen3.5-35B | 35B | 3B | 79.2 | 81.4 | 72.1 | 64K |
Performance Summary
Qwen 3.5 excels in:
- Multimodal understanding: Native vision-language training that outperforms the previous Qwen3-VL models
- Agent capabilities: Strong performance on complex multi-step tasks
- Coding: Competitive with GPT-5.2 and MiniMax M2.5
- Math reasoning: AIME26 scores rival proprietary models
Competitive advantages:
- vs GPT-5.2: Closer performance on AIME26 (91.3 vs 89.4), better context window
- vs Claude 4.5 Opus: Superior on AIME26 (91.3 vs 87.6), longer context
- vs Gemini 3 Pro: Competitive on AIME26 (91.3 vs 88.2), more efficient MoE
- vs MiniMax M2.5: Slightly better on AIME26 (91.3 vs 89.8), native multimodal
- vs GPT-OSS 120B: More active params (10B vs 5.1B), better agent performance
Significance
This release signals a strategic shift in Alibaba’s Qwen approach, prioritizing:
- Architectural efficiency over traditional scaling
- High-quality training data
- Production-ready model sizes
- Lower computing power requirements
The 35B-A3B model, in particular, represents an ultra-sparse MoE architecture that could enable deployment on single GPUs while maintaining competitive performance.
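The single-GPU claim comes down to simple arithmetic: per-token compute scales with the 3B active parameters, while weight memory scales with the full 35B. The sketch below works that out; the bytes-per-parameter figures are standard format sizes (bf16, ~4-bit quantization), not published Qwen deployment numbers.

```python
# Back-of-the-envelope economics for Qwen3.5-35B-A3B. Illustrative arithmetic
# only: byte-per-parameter values are generic format sizes, not vendor figures.

TOTAL_PARAMS = 35e9   # all experts must stay resident in memory
ACTIVE_PARAMS = 3e9   # parameters actually used per token (drives compute)

# Per-token compute vs. a dense 35B model scales with the active fraction:
compute_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"compute per token vs dense 35B: {compute_fraction:.1%}")

# Weight memory scales with *total* params, so sparsity does not shrink VRAM:
GB = 1024**3
bf16_gb = TOTAL_PARAMS * 2 / GB      # 2 bytes per parameter in bf16
int4_gb = TOTAL_PARAMS * 0.5 / GB    # ~0.5 bytes per parameter at 4-bit
print(f"weights: {bf16_gb:.0f} GB in bf16, {int4_gb:.1f} GB at 4-bit")
```

Roughly 8.6% of the compute of a dense 35B model per token, and about 16 GB of weights at 4-bit quantization, which is why a single 24 GB consumer GPU is plausible for this model, assuming quantized weights and leaving headroom for the KV cache.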
Availability
The models are now available on Hugging Face and other platforms. GGUF versions from Unsloth and other community projects are expected to follow shortly, making these models accessible for local deployment.
Published: February 24, 2026