Tool: Chatbots · Freemium · Active · Editorial score 8-8.9
Verified May 2026 · Editorial only, no paid placements

Qwen

Active

Alibaba Cloud's open-weight LLM family. Qwen3.6 Plus (Apr 2, 2026) is the 1M-context proprietary flagship; Qwen3.6-35B-A3B (Apr 16, 2026) is the open-source sparse MoE with 3B active params under Apache 2.0.

Best plan: Free (open weights) / API from ~$0.15/M tokens
Best for: Multilingual products across 119 languages
Watch: Users wanting a polished consumer chat app
Pricing: Free (open weights) / API from ~$0.15/M tokens
Launched: 2023

Decision badges (readiness signals): Active product · Free tier · No public repo listed · Verified this month · Monthly review cycle · Strong editorial score
Fact ledger (verified fields)
Company: Alibaba Cloud
Category: Chatbots
Pricing model: Free tier
Price range: Free (open weights) / API from ~$0.15/M tokens
Status: Active
Last verified: May 3, 2026
Pricing anchor: Hosted API pricing is published through Alibaba Cloud Model Studio and depends on the selected model and token usage. (Source: Alibaba Cloud Model Studio pricing)
Best for: Developers who want strong open-weight models and Alibaba Cloud hosted inference options, especially for multilingual and agentic workloads. (Source: Qwen official site)
Watch out for: Do not generalize from one Qwen checkpoint to the whole family. Benchmark the exact model, quantization, serving stack, and language mix you plan to use. (Source: Qwen3 blog)
Model surface: Qwen should be evaluated as a model family with open releases, hosted APIs, and fast-moving version changes rather than a single chatbot product. (Source: Qwen3 blog)
Deployment surface: Choose Qwen when open-weight deployment, regional availability, or Alibaba Cloud integration matters; compare license, context, and tool-use behavior per model. (Source: Qwen official site)
Change timeline What moved recently
  1. Verified
    Core pricing and product facts checked May 3, 2026 | Monthly cadence
  2. Updated
    Editorial page changed May 3, 2026
Best for
  • Multilingual products across 119 languages
  • Developers wanting open weights for self-hosting
  • Coding, math, and agentic workloads
  • Cost-sensitive high-volume API use
Not ideal for
  • Users wanting a polished consumer chat app
  • Teams needing strict Western data residency on hosted API
  • Workloads sensitive to Alibaba cloud exposure

Alibaba Cloud’s open-weight LLM family, developed by the Qwen team and spanning text, code, vision-language, math, and reasoning. Model sizes run from 0.6B up through the 235B-parameter MoE flagship and the trillion-parameter Qwen3-Max.

The current surface includes Qwen3.6 Plus (released April 2, 2026) as the flagship proprietary model with 1M native context and always-on chain-of-thought. Qwen3-Max is Alibaba’s trillion-parameter closed model. Open-weight releases from the Qwen3 line ship under Apache 2.0 on Hugging Face. Qwen3-Coder (480B MoE with 35B active) leads coding tasks among Alibaba releases.

Qwen3.6-35B-A3B is the newest open-source MoE: 35B total params but only ~3B active per token via 256 experts (8 routed + 1 shared per forward pass). Native context is 262,144 tokens, extensible to ~1M via YaRN, under Apache 2.0. Aggregate benchmarks put it at roughly 82% of Claude Opus 4.7, with a meaningful gap on agentic tool use (62% on MCP Atlas) but near parity on knowledge tasks. See the full coverage.
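A hedged back-of-envelope shows why the sparse-activation design matters for serving cost. It assumes per-token forward-pass compute scales roughly with active parameters at ~2 FLOPs per parameter per token (a common first-order approximation, not a vendor figure):

```python
# Why sparse MoE inference is cheap: compute tracks *active* params.
# Illustrative approximation, not a published vendor number.

def flops_per_token(active_params: float) -> float:
    """Rough forward-pass FLOPs per generated token (~2 per active param)."""
    return 2.0 * active_params

dense_equivalent = flops_per_token(35e9)  # if all 35B params were active
moe_actual = flops_per_token(3e9)         # only ~3B active per token
speedup = dense_equivalent / moe_actual   # roughly 12x less compute per token
```

Under this approximation the MoE serves each token with roughly one twelfth of the compute of a dense 35B model, which is the lever behind the aggressive hosted pricing.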


System Verdict

Pick Qwen if you need open-weight frontier models with multilingual reach. Apache 2.0 across most sizes gives real commercial flexibility. 119-language coverage, strongest among major open-weight families. Qwen3-Coder handles agentic coding at frontier-adjacent quality. Cost-per-token on Alibaba Cloud undercuts OpenAI and Anthropic by 5-10x for equivalent capability.

Skip it if you want a polished consumer chat product or strict Western data residency. qwen.ai is functional but developer-first, not ChatGPT-grade. Alibaba Cloud is a Chinese provider, which matters for regulated enterprise buyers. Competing open-weight families like DeepSeek publish stronger reasoning benchmarks on specific tasks.

Who uses which surface: Hugging Face downloads for self-hosters, Alibaba Cloud Model Studio API for hosted use, OpenRouter or DeepInfra for neutral gateways, Qwen3-Coder for IDE coding backends, Qwen3.6 Plus for agentic 1M-context workloads.

Key Facts

Flagship (proprietary): Qwen3.6 Plus (released April 2, 2026, 1M context)
Trillion-parameter model: Qwen3-Max (pricing cut up to 50% in 2026 price war)
Open-weight line: Qwen3 series on Apache 2.0 (0.6B through 235B MoE)
Newest open-source MoE: Qwen3.6-35B-A3B (April 16, 2026): 35B total, ~3B active, 262k native / 1M YaRN context, Apache 2.0
Coding flagship: Qwen3-Coder (480B MoE, 35B active)
Vision flagship: Qwen3.5-Omni (multimodal)
Language coverage: 119 languages, pre-trained on ~36T tokens
Architecture: Hybrid thinking / non-thinking mode, switchable
Context window: Up to 1M tokens on Qwen3.6 Plus
Hosted API pricing: From ~$0.15/M input (Qwen3-32B) to ~$0.70/M (Qwen3-235B Thinking)
Qwen3-Max pricing: ~$0.861/M input, ~$3.441/M output at launch, now reduced
Batch invocation: 50% off real-time pricing on supported models

Every data point above was verified on 2026-04-17. See Sources.

What it actually is

A multi-pronged model family covering four surfaces. Chat at qwen.ai and tongyi.aliyun.com. Hosted API through Alibaba Cloud Model Studio. Open-weight downloads on Hugging Face. Third-party gateway access through OpenRouter, Together AI, and DeepInfra.

The family splits into specialists. Core text models (Qwen3, Qwen3.5, Qwen3.6) handle general chat and reasoning. Qwen3-Coder is the coding-optimized variant. Qwen-VL and Qwen3.5-Omni handle vision and multimodal. QwQ-32B is a reasoning-first model in the chain-of-thought style.

The real moats are Apache 2.0 licensing on most open sizes, 119-language coverage no other major family matches, and Alibaba’s willingness to run aggressive API pricing. Thin-margin cloud pricing combined with open weights gives teams a self-host escape valve most closed-model providers cannot offer.
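One way to reason about that self-host escape valve is a break-even sketch. The numbers below are illustrative assumptions (a blended API rate and a flat monthly GPU-server cost), not quoted figures, and the model ignores ops effort and utilization headroom:

```python
# Break-even sketch: hosted API vs self-hosting open weights.
# Inputs are assumptions for illustration, not quoted rates.

def breakeven_tokens_per_month(api_cost_per_m_usd: float,
                               server_cost_per_month_usd: float) -> float:
    """Monthly token volume above which a flat-cost self-hosted server
    beats per-token API billing (ignoring ops effort and idle capacity)."""
    return server_cost_per_month_usd / api_cost_per_m_usd * 1e6

# e.g. a $0.50/M blended API rate vs a ~$2,000/month GPU node:
tokens = breakeven_tokens_per_month(0.50, 2000.0)  # 4e9 tokens/month
```

Below the break-even volume the hosted API wins on cost; above it, Apache 2.0 weights let a team move the same model in-house without renegotiating anything.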

When to pick Qwen

  • Multilingual products. 119-language training covers Chinese, Japanese, Korean, Arabic, and European languages at higher quality than English-centric families.
  • Self-hosted deployment. Apache 2.0 weights run from single-CPU (0.6B) to 4x A100 (72B dense) to MoE clusters (235B, 480B Coder). No licensing fees.
  • Cost-sensitive API at frontier-adjacent quality. Qwen3-32B at ~$0.15/M input undercuts closed-model alternatives 5-10x.
  • Agentic coding with 1M context. Qwen3.6 Plus supports entire-codebase workflows without chunking, with always-on chain-of-thought.
  • Vision and video input. Qwen3.5 Plus and Qwen3.6 Plus accept image and video in addition to text.
  • IDE backends via OpenAI-compatible API. Drop Qwen3-Coder into Cursor, Continue.dev, or Cline.
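For the OpenAI-compatible integration path, a minimal request sketch follows. The base URL and model ID are assumptions for illustration; substitute the endpoint and model name from your provider's documentation, and note that the request is only built here, not sent:

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint; verify against your provider's docs.
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat-completions request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# req = build_request("qwen3-32b", "Summarize this diff.", api_key="sk-...")
# resp = json.load(urllib.request.urlopen(req))  # network call, not run here
```

Because the wire format matches the OpenAI chat-completions shape, the same payload works through OpenRouter, DeepInfra, or an IDE backend that accepts a custom base URL.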

When to pick something else

  • Polished consumer chat product: ChatGPT or Claude. qwen.ai is developer-first.
  • Strongest open-weight reasoning: DeepSeek R1 still leads on specific reasoning benchmarks.
  • Best-in-class English writing: Claude Opus 4.7. Qwen handles English well but trails Claude on nuance.
  • Google Workspace integration: Gemini. Qwen has no Workspace hooks.
  • Open-weight with Huawei Ascend training stack: GLM GLM-5.1 is the closest alternative with domestic-silicon provenance.
  • Broadest plugin marketplace: ChatGPT. No Qwen equivalent to the GPT Store.

Pricing

Hosted pricing via Alibaba Cloud Model Studio. Self-host for free under Apache 2.0 via Hugging Face.

Plan / Model | Price | Notes
Open weights (Hugging Face) | Free | Apache 2.0 across most Qwen3 sizes
Qwen3-32B (dense) | $0.15/M input, $0.75/M output | Lightweight hosted tier
Qwen3-235B-A22B | $0.20-$1.20/M input, $1.00-$6.00/M output | Tiered by context length
Qwen3.5 Plus | $0.26/M input, $1.56/M output | Feb 2026, text + image + video input
Qwen3.6 Plus | $0.325/M input, $1.95/M output | 1M context, agentic coding
Qwen3-Max | From ~$0.861/M input after 50% cut | Trillion-parameter closed flagship
Qwen-Turbo | $0.0004/K input, $0.0012/K output | Fast, lightweight
Batch invocation | 50% off real-time | Supported models only

Prices verified 2026-04-17 via Alibaba Cloud Model Studio pricing, DeepInfra Qwen API pricing 2026, and OpenRouter Qwen3.6 Plus preview. Qwen3-Max rates dropped as much as 50% during the 2026 China AI price war.
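As a sketch of budgeting against the rates above (rates are passed in explicitly so stale numbers are easy to swap out; the 50% batch factor mirrors the batch-invocation row):

```python
# Estimate a hosted-API bill from per-million-token rates.

def estimate_cost_usd(input_tokens: float, output_tokens: float,
                      in_rate_per_m: float, out_rate_per_m: float,
                      batch: bool = False) -> float:
    """Cost in USD; batch=True applies the 50% batch-invocation discount."""
    cost = (input_tokens / 1e6) * in_rate_per_m + (output_tokens / 1e6) * out_rate_per_m
    return cost * 0.5 if batch else cost

# Qwen3.6 Plus at the listed $0.325/M in, $1.95/M out:
# 10M input + 2M output tokens -> $3.25 + $3.90 = $7.15 real-time, $3.575 batched.
realtime = estimate_cost_usd(10e6, 2e6, 0.325, 1.95)
batched = estimate_cost_usd(10e6, 2e6, 0.325, 1.95, batch=True)
```

Tiered-by-context models such as Qwen3-235B-A22B need the rate for your actual context bracket before this arithmetic is meaningful.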

Against the alternatives

 | Qwen3.6 Plus | DeepSeek V3 | Claude Opus 4.7 | GLM-5.1
Open weights | Apache 2.0 on open line | V3 open | Closed | MIT on GLM-5.1
Context window | 1M | 64K | 1M | 200K
Language coverage | 119 | Chinese + English focus | Broad, English-strongest | Chinese + English
API input price | ~$0.325/M | ~$0.28/M | $5.00/M | $1.00/M
Coding | Qwen3-Coder 480B MoE | Strong | Claude Code CLI | SWE-Bench Pro leader
Multimodal | Text, vision, video input | Limited vision | Text + vision | Text-first
Best viewed as | Open-weight multilingual | Cheap capable API | Reasoning specialist | Open-weight coding leader

Failure modes

  • Consumer chat product is minimal. qwen.ai is functional for testing but lacks ChatGPT-grade onboarding, memory, or ecosystem.
  • Data residency on Alibaba Cloud. Enterprise buyers in regulated industries need to evaluate the Chinese-cloud posture. Self-hosting the Apache 2.0 weights is the workaround.
  • Thin moat on open-weight leaderboard. DeepSeek, Kimi, GLM, and Qwen all iterate monthly. Leadership positions shift fast.
  • English documentation lag. Official docs translate from Chinese first. Some resources trail the Chinese original by weeks.
  • Vision models lag best-in-class. Qwen-VL and Qwen3.5-Omni are capable but trail the strongest closed vision models on independent evaluations.
  • Hosted API rate limits vary by region. Alibaba Cloud tier and regional load affect throughput. Production deployments should load-test.
  • Qwen3-Max tier pricing is complex. Tiered-by-context pricing is harder to budget than flat rates. Batch discounts help.

Methodology

This page was produced by the aipedia.wiki editorial pipeline, an automated system that ingests vendor documentation, verifies pricing and model details against primary sources, and generates the editorial analysis you are reading. No individual human wrote this review. Scoring follows the four-dimension rubric at /about/scoring/ (Utility, Value, Moat, Longevity; unweighted average). Last verified 2026-04-17 against Alibaba Cloud Model Studio pricing, Qwen3 blog, Constellation Research Qwen 3.6 Plus coverage, and DeepInfra Qwen API pricing 2026 guide.

FAQ

Is Qwen open source? Largely yes. The Qwen3 open-weight line ships under Apache 2.0 on Hugging Face, covering sizes from 0.6B to 235B MoE. Download, self-host, fine-tune, and deploy commercially without licensing fees. Qwen3-Max and Qwen3.5/3.6 Plus are proprietary hosted models.

What is Qwen3.6 Plus? Alibaba’s current proprietary flagship, released April 2, 2026. Supports 1M native context, always-on chain-of-thought, and agentic coding. Priced at $0.325/M input and $1.95/M output through Alibaba Cloud Model Studio.

What is Qwen3.6-35B-A3B (the April 16 open-source release)? The newest open-source addition. Sparse Mixture-of-Experts architecture: 35B total parameters, ~3B active per token via 256 experts (8 routed + 1 shared activated per forward pass). Native 262,144-token context, extensible to ~1M via YaRN. The Apache 2.0 license permits full commercial use. Aggregate benchmarks put it around 82% of Claude Opus 4.7 performance at zero license cost; the gap widens on agentic tool use (62% on MCP Atlas) but closes on knowledge tasks (97%). Runs locally via Ollama, LM Studio, Jan.ai, llama.cpp, and vLLM, with day-0 support on AMD Instinct GPUs.

How does Qwen3 compare to Claude Opus 4.7? Qwen3-235B-A22B and Qwen3.6 Plus are competitive on coding and math benchmarks but trail Claude Opus 4.7 on long-form English reasoning. At roughly 10-15x lower API cost, Qwen wins on value for multilingual and coding workloads.

What is Qwen3-Coder? The coding-optimized branch, a 480B-parameter MoE with 35B active. Released as open-weight under Apache 2.0. Handles long-context agentic coding, competitive on HumanEval and SWE-bench against closed frontier models.

Can I run Qwen locally? Yes. Sizes start at 0.6B for single CPU, with 7B and 14B practical on consumer GPUs. The 72B dense model runs on 4x A100. MoE variants require larger clusters. Quantized versions extend accessibility further.
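As a minimal local-run sketch, assuming the Ollama CLI is installed; the model tag `qwen3:8b` is an assumption, so check the Ollama model library for the tags actually published and `ollama list` for what you have pulled:

```python
import shutil
import subprocess

def build_ollama_command(prompt: str, model: str = "qwen3:8b") -> list[str]:
    """Command line for a one-shot, non-interactive generation."""
    return ["ollama", "run", model, prompt]

def ask_local_qwen(prompt: str, model: str = "qwen3:8b") -> str:
    """Run a local Qwen model via Ollama and return its text output."""
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama CLI not found on PATH")
    result = subprocess.run(
        build_ollama_command(prompt, model),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# ask_local_qwen("In one sentence, what is a mixture-of-experts model?")
```

The same script works unchanged against any quantized size you pull, which is the practical upside of the Apache 2.0 release line.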

Sources

Embed this score on your site Free. Links back.
Qwen editorial score badge
<a href="https://aipedia.wiki/tools/qwen/" target="_blank" rel="noopener"><img src="https://aipedia.wiki/badges/qwen.svg" alt="Qwen on aipedia.wiki" width="260" height="72" /></a>
[![Qwen on aipedia.wiki](https://aipedia.wiki/badges/qwen.svg)](https://aipedia.wiki/tools/qwen/)

Badge value auto-updates if the editorial score changes. Attribution via the link is required.

Cite this page For journalists, researchers, and bloggers
According to aipedia.wiki Editorial at aipedia.wiki (https://aipedia.wiki/tools/qwen/)
aipedia.wiki Editorial. (2026). Qwen — Editorial Review. aipedia.wiki. Retrieved May 8, 2026, from https://aipedia.wiki/tools/qwen/
aipedia.wiki Editorial. "Qwen — Editorial Review." aipedia.wiki, 2026, https://aipedia.wiki/tools/qwen/. Accessed May 8, 2026.
aipedia.wiki Editorial. 2026. "Qwen — Editorial Review." aipedia.wiki. https://aipedia.wiki/tools/qwen/.
@misc{qwen-editorial-review-2026, author = {{aipedia.wiki Editorial}}, title = {Qwen — Editorial Review}, year = {2026}, publisher = {aipedia.wiki}, url = {https://aipedia.wiki/tools/qwen/}, note = {Accessed: 2026-05-08} }
Spotted an error or want to share your experience with Qwen?

Every tool page is re-verified on a recurring cycle, and corrections land faster when readers flag them directly. If you spot a stale fact, a missing capability, or have used Qwen and want to share what worked or didn't, the editorial desk reviews every message sent through this form.

Email editorial@aipedia.wiki
Report outdated info Help us keep this page accurate