File photo / Chen Ting (NBD)
On July 23, Alibaba open-sourced Qwen3-Coder, the newest member of its Qwen family of large language models and one purpose-built for software development. The model now leads every open-source rival and even surpasses closed-source heavyweights such as GPT-4.1, placing it on par with the industry-leading Claude 4.
Key breakthroughs span both pure code generation and agent tool use. With Qwen3-Coder, a junior developer can reportedly finish in one day what used to take a senior engineer a week, while an entire brand website can be built in as little as five minutes.
Architecture & Scale
• First MoE (Mixture-of-Experts) code model in the Qwen lineage
• 480B total parameters, 35B activated
• Native 256K token context, extendable to 1M tokens
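To put the parameter figures in perspective, the short Python sketch below works out what share of the model is active at any one time and shows, in toy form, how a Mixture-of-Experts router picks a few experts per token. Only the 480B and 35B figures come from the specifications above; the function names, the eight-expert setup, the router scores and the choice of two experts are hypothetical and are not Qwen3-Coder's actual configuration.

```python
# Figures quoted in the article; everything else here is illustrative.
TOTAL_PARAMS_B = 480   # total parameters, in billions
ACTIVE_PARAMS_B = 35   # activated parameters, in billions


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's weights that take part in one forward pass."""
    return active_b / total_b


def top_k_experts(router_scores: list[float], k: int) -> list[int]:
    """Toy MoE routing: return the indices of the k highest-scoring experts."""
    ranked = sorted(
        range(len(router_scores)),
        key=lambda i: router_scores[i],
        reverse=True,
    )
    return sorted(ranked[:k])


fraction = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
print(f"Active share of parameters: {fraction:.1%}")

# Hypothetical router scores for 8 experts, of which 2 are selected.
chosen = top_k_experts([0.1, 0.7, 0.05, 0.3, 0.9, 0.2, 0.0, 0.4], k=2)
print(f"Experts used for this token: {chosen}")
```

Roughly 7% of the weights are in play for any given token, which is the general appeal of the MoE design: capacity scales with the total parameter count, while the compute per token tracks the much smaller activated count.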
Benchmark Highlights
• WebArena & BFCL (agent tool-use leaderboards): new open-source record, exceeding GPT-4.1
• SWE-Bench: best open-source score to date, on par with Claude 4