Alibaba Open-Sources Qwen3.6-35B-A3B, A Sparse MoE With Only 3B Active Params

Alibaba's Qwen team released Qwen3.6-35B-A3B on April 16, 2026 under Apache 2.0, which allows commercial use. It uses a sparse MoE architecture: 35B total parameters, with only 3B activated per token via 8+1 experts out of 256. Context is 262k native, extensible to 1M via YaRN. Aggregate benchmarks trail Claude Opus 4.7 by 17 points (77 vs 94). The gap nearly closes on knowledge tasks and widens on agentic and MCP tool-use tasks, where Opus still leads.

Alibaba’s Qwen team shipped Qwen3.6-35B-A3B on April 16, 2026 under Apache 2.0. It’s a sparse Mixture-of-Experts vision-language model with unusually aggressive expert routing: only 3B parameters activate per token, even though the full model holds 35B in weights.

What’s actually in it

Architecture:

  • Total parameters: 35B
  • Active per token: ~3B (via 256 experts, 8 routed + 1 shared per forward pass)
  • Block pattern: 10 blocks of (Gated DeltaNet → MoE) × 1
  • Context: 262,144 native, 1,010,000 extensible via YaRN
  • License: Apache 2.0 (full commercial use permitted)
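
To make the routing numbers above concrete, here is a minimal sketch of top-k expert selection. It assumes a plain softmax-over-top-k router and treats the shared expert as separate from the 256-expert pool; the sources do not describe Qwen's actual gating function, so `route_token` and the random logits are purely illustrative.

```python
import math
import random

NUM_EXPERTS = 256   # routed expert pool (from the spec list)
TOP_K = 8           # routed experts selected per token
NUM_SHARED = 1      # shared expert, assumed to run for every token

def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# Stand-in router scores for one token (random, not real model output).
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route_token(logits)
experts_run = len(selected) + NUM_SHARED   # 8 routed + 1 shared = 9
active_param_share = 3 / 35                # ~3B active out of 35B total
yarn_stretch = 1_010_000 / 262_144         # extended context / native context

print(f"experts run per token: {experts_run} of {NUM_EXPERTS} routed + {NUM_SHARED} shared")
print(f"active parameter share: {active_param_share:.1%}")
print(f"context stretch via YaRN: {yarn_stretch:.2f}x")
```

The point of the sketch is the ratio: every token touches only a small slice of the expert pool, which is how a 35B-weight model ends up with roughly 3B active parameters per token.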

Practical economics:

  • Zero licensing cost
  • Runs on a single consumer GPU only if its VRAM can hold all 35B weights (an MoE model loads every expert but activates only a few per token)
  • On Apple Silicon with unified memory, practical for 32GB+ machines
  • Ollama, LM Studio, and vLLM have Day-0 support; AMD also shipped Day-0 kernels for Instinct GPUs
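
A rough back-of-the-envelope estimate shows why memory, not compute, is the constraint here. The sketch below assumes three common weight precisions (16-bit, 8-bit, 4-bit), which the sources do not specify, and it counts weights only: KV cache, activations, and runtime overhead are ignored. `weight_memory_gb` is a hypothetical helper.

```python
TOTAL_PARAMS = 35e9  # all experts must be resident, not just the ~3B active ones

def weight_memory_gb(total_params, bits_per_param):
    """Memory needed for the weights alone, in decimal gigabytes."""
    return total_params * bits_per_param / 8 / 1e9

# Precisions are assumptions for illustration, not published configurations.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(TOTAL_PARAMS, bits):.1f} GB")
# 16-bit: 70.0 GB
# 8-bit: 35.0 GB
# 4-bit: 17.5 GB
```

Under these assumptions, only the 4-bit figure leaves real headroom on a 32GB unified-memory machine, which is consistent with the 32GB+ guidance above.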

Benchmark reality check

A viral claim circulating says Qwen 3.6 “delivers 80% of Opus 4.7’s performance.” That’s approximately correct in aggregate but hides where the gap matters.

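| Category             | Claude Opus 4.7 | Qwen 3.6 Plus | Qwen as % of Opus |
|----------------------|-----------------|---------------|-------------------|
| Aggregate            | 94              | 77            | 82%               |
| Agentic tasks avg    | 74.9            | 61.6          | 82%               |
| Coding avg           | 72.9            | 64.8          | 89%               |
| Knowledge tasks      | 68.2            | 66            | 97%               |
| MCP Atlas (tool use) | 77.3%           | 48.2%         | 62%               |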

The honest read: Qwen 3.6 is close on raw knowledge and not too far behind on coding, but Opus 4.7 maintains a real lead on agentic workflows and tool-use-heavy tasks. The 80% headline flattens that spread: per-category ratios run from 97% on knowledge tasks down to 62% on MCP Atlas.
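
The ratio column is easy to recompute from the raw scores. This small check uses only the figures in the benchmark table; the dictionary layout and the `pct_of_opus` name are just one way to arrange them.

```python
# (Claude Opus 4.7 score, Qwen 3.6 Plus score), as reported in the table.
SCORES = {
    "Aggregate": (94, 77),
    "Agentic tasks avg": (74.9, 61.6),
    "Coding avg": (72.9, 64.8),
    "Knowledge tasks": (68.2, 66),
    "MCP Atlas (tool use)": (77.3, 48.2),
}

def pct_of_opus(opus, qwen):
    """Qwen's score as a whole-number percentage of Opus's score."""
    return round(qwen / opus * 100)

ratios = {name: pct_of_opus(opus, qwen) for name, (opus, qwen) in SCORES.items()}
for name, pct in ratios.items():
    print(f"{name}: {pct}%")

# Gap between the best and worst category, in percentage points.
spread = max(ratios.values()) - min(ratios.values())
print(f"spread across categories: {spread} points")
```

A single aggregate number cannot show that 35-point spread between the strongest and weakest category, which is the real story in the table.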

Where Qwen wins: speed (Qwen 3.6 Plus is roughly 1.7× faster than Claude), cost (~15× cheaper per coding-agent conversation, around $0.05 vs $0.75), and openness (Apache 2.0 avoids Anthropic API lock-in for regulated or on-prem workloads).
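
For a sense of scale, the per-conversation prices can be turned into a ratio and a volume estimate. The prices come from the paragraph above; the 1,000-conversation volume and the helper names are illustrative assumptions. Amounts are kept in whole cents to avoid floating-point noise.

```python
QWEN_CENTS = 5    # ~$0.05 per coding-agent conversation
OPUS_CENTS = 75   # ~$0.75 per coding-agent conversation

def cost_ratio(expensive_cents, cheap_cents):
    """How many times more the expensive option costs per conversation."""
    return expensive_cents / cheap_cents

def savings_dollars(conversations, expensive_cents=OPUS_CENTS, cheap_cents=QWEN_CENTS):
    """Dollars saved by running the given number of conversations on the cheaper model."""
    return conversations * (expensive_cents - cheap_cents) / 100

print(cost_ratio(OPUS_CENTS, QWEN_CENTS))  # 15.0
print(savings_dollars(1_000))              # 700.0
```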

Why this matters for 2026

Open-weight flagship parity with proprietary frontier models was the theme we flagged in the open-source-parity trend. Qwen3.6-35B-A3B, GLM-5.1, Llama 4 Scout, and Gemma 4 together close the raw-capability gap that existed through 2024. What proprietary labs still own is agentic depth, tool-use reliability, and multi-step reasoning under pressure. On those dimensions, Claude Opus 4.7 and GPT-5.4 still lead.

For teams building production AI products in April 2026: Qwen 3.6 is now a credible drop-in for a meaningful slice of LLM workloads at much lower cost, with the clear caveat that agentic workflows should still route to Opus, Mythos, or GPT-5.4 until the open-weight gap closes further.
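
That routing advice could be sketched as a simple category-based dispatcher. Everything here is hypothetical: the category labels, the model name strings (plain labels, not real API model identifiers), and the `pick_model` function are assumptions for illustration, not a recommended production design.

```python
# Hypothetical labels for the two tiers; not real API model identifiers.
OPEN_WEIGHT_MODEL = "qwen3.6-35b-a3b"
PROPRIETARY_MODEL = "claude-opus-4.7"

# Categories where the benchmark table shows the widest gap.
PROPRIETARY_CATEGORIES = {"agentic", "tool_use"}

def pick_model(task_category: str) -> str:
    """Send agentic and tool-use work to the proprietary tier, everything else to open weights."""
    if task_category in PROPRIETARY_CATEGORIES:
        return PROPRIETARY_MODEL
    return OPEN_WEIGHT_MODEL

for category in ["knowledge", "coding", "agentic", "tool_use"]:
    print(f"{category} -> {pick_model(category)}")
```

In practice the split would depend on a team's own evaluations; the table's categories are only a starting point for deciding which workloads belong in each tier.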

Sources

Primary and corroborating references used for this news item.

3 cited sources
  1. Qwen3.6 GitHub - QwenLM/Qwen3.6
  2. Qwen Team Open-Sources Qwen3.6-35B-A3B - MarkTechPost
  3. Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7 - Simon Willison
