Alibaba Open-Sources Qwen3-Coder, Performance Surpassing GPT-4.1

NBD

File photo / Chen Ting (NBD)

On July 23, Alibaba open-sourced Qwen3-Coder, the newest member of its Qwen family of large language models and one purpose-built for software development. The model now leads every open-source rival and even surpasses closed-source heavyweights such as GPT-4.1, placing it on par with the industry-leading Claude 4.

Key breakthroughs span both pure code generation and agent tool use. With Qwen3-Coder, a junior developer can reportedly finish in one day what used to take a senior engineer a week, while an entire brand website can be built in as little as five minutes.

Architecture & Scale

• First MoE (Mixture-of-Experts) code model in the Qwen lineage

• 480B total parameters, 35B activated

• Native 256K token context, extendable to 1M tokens
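To put these figures in proportion, the short Python sketch below works out what share of the model's parameters is activated and how much the context window grows when extended. It uses only the numbers quoted above. The variable and function names are illustrative, and it assumes "256K" and "1M" are binary units (1K = 1,024 tokens), which the article does not specify.

```python
# Figures quoted in the article, in billions of parameters.
TOTAL_PARAMS_B = 480
ACTIVE_PARAMS_B = 35

# Assumption (not stated in the article): "K" and "M" are binary units.
NATIVE_CONTEXT = 256 * 1024      # "256K" tokens
EXTENDED_CONTEXT = 1024 * 1024   # "1M" tokens


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are activated."""
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
context_ratio = EXTENDED_CONTEXT / NATIVE_CONTEXT

print(f"Activated share of parameters: {fraction:.1%}")
print(f"Context extension factor: {context_ratio:.0f}x")
```

Under these assumptions, roughly 7% of the parameters are activated, which is the usual appeal of a Mixture-of-Experts design: the model is far larger than the portion actually used for each step of computation.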

Benchmark Highlights

• WebArena & BFCL (Agent tool-use leaderboards): new open-source record, exceeds GPT-4.1

• SWE-Bench: best open-source score to date, on par with Claude 4

Editor: Gao Han
