
AI: China leads US in open source LLM AIs. RTZ #744
The Bigger Picture, Sunday, June 8, 2025
Much has been made of the headline-grabbing US vs China ‘AI Space Race’, allegedly for ‘national security’ reasons. Much of it is driven by an AI chip race, which, as we noted last Sunday, has Nvidia, our national AI chip champ, doing a delicate balancing act. Never mind that this race is an ‘infinite game’ with no near-term end in sight; the very mention of it gets media and political attention, along with curdled nationalistic responses. Well, now not one but two Chinese companies are lapping the US in ‘open source’ AI language models. It’s now Alibaba AND DeepSeek vs the US’s Meta, and that’s the Bigger Picture we’re discussing this Sunday.
The Information tees us up well here with “How Alibaba Helped China Take the Lead From the U.S. in Open-Source AI”:
“Alibaba is now in the lead in open-source AI globally, ahead of Meta Platforms’ Llama on several benchmarks. And while Alibaba’s biggest model is neck and neck with DeepSeek’s R1 model, business users say they prefer Alibaba’s because it offers a broader lineup of models, including smaller ones that cost less to run than DeepSeek’s most up-to-date R1 model. Alibaba’s own business units have switched over completely to Qwen. At the same time, Alibaba is winning over outside businesses as it establishes itself as China’s biggest provider of open-source AI models.”
DeepSeek was already running away with the global open source AI crown these past few months, with widespread praise for both its core LLM AI models and its AI Reasoning models. It has been going up against the US’s best open and closed source models from not just Meta, but OpenAI, Anthropic et al as well.
“The success of Qwen and DeepSeek demonstrates how Chinese firms are starting to take the lead from the U.S. in open-source AI, one major front in the international AI race. That has enormous implications, as the low cost of open-source AI software means businesses are more likely to adopt it. Chinese tech giants like Alibaba could reshape the global AI software ecosystem if more developers around the world use Chinese open-source models.”
And that gives China some near-term bragging rights, and market-grabbing might:
““Focusing on open-source AI models could enable Chinese companies to have a global impact. Popular open-source models can tap into the collective knowledge of developers and researchers around the world who use the models, and constant feedback from those communities can help accelerate improvements,” said Martin Saerbeck, co-founder and chief technology officer of Aiquris, a Singapore-based firm that helps global businesses adopt AI and manage potential risks.”
“Chinese open-source models like those from Alibaba and DeepSeek could also help accelerate the adoption of AI in China and trigger a proliferation of domestic AI applications, both for enterprises and consumers. The potential impact is huge, given China’s vast market and the growing acceptance of open-source AI solutions among state-owned enterprises and government agencies.”
These companies are also getting praise from the best US technologists:
“Last week, Nvidia CEO Jensen Huang said during the company’s earnings conference call that DeepSeek and Alibaba’s Qwen are “among the best open-source AI models.” Huang also talked about how the U.S. can benefit from those Chinese open-source models by deploying and optimizing them on U.S. platforms. “America wins when models like DeepSeek and Qwen run best on American infrastructure,” he said.”
“When Nvidia’s AI research team recently developed new AI models called Cosmos-Reason1 that could be used for robots, autonomous vehicles and other applications that require an ability to understand the physical world, the team used an Alibaba open-source model as the basis for one of the Cosmos-Reason1 models, according to a paper published by Nvidia last month.”
And Alibaba has a growing range of models as well:
“For Alibaba Cloud, China’s largest cloud service provider, a broad lineup of open-source Qwen models that come in all sizes and specifications could motivate more businesses to start using Alibaba’s cloud computing platform, according to employees.”
The ‘Who’ aspect of the story is well supplemented by the ‘How’:
“How Alibaba took the lead in open-source AI is a lesson for U.S. tech giants, including Amazon, Microsoft and Google, which operate in a more centralized fashion than the Chinese company. Alibaba made its decision to allow its different business units to operate autonomously as a prelude to a breakup of the company that didn’t end up happening. But it proved to be a lucky break for Alibaba, forcing its AI engineers to work harder at making the models more appealing.”
“The engineers realized that if they couldn’t convince Alibaba’s own business units Qwen models were the best, they wouldn’t be able to convince outside customers, either.”
And it’s quite the tale:
(From left: Eddie Wu, Jack Ma and Zhou Jingren, Photos by Getty)
“A Thousand Questions”
“Alibaba was one of China’s early movers in AI model development. In 2021, a year before OpenAI released ChatGPT, Alibaba’s research institute, Damo Academy, launched an AI model called M6. It was based on the transformer architecture that Google engineers had developed and that OpenAI used for its GPT generative AI models such as GPT-2, released in 2019.”
“In late 2022, when OpenAI released ChatGPT, sparking a wave of excitement in the tech industry around the world, Alibaba ramped up its efforts. It promoted Alibaba executive Zhou Jingren, a Microsoft veteran who had joined Alibaba in 2015 and had worked on M6, to be Alibaba Cloud’s chief technology officer.”
“Zhou set about developing a new generation of AI models under the name Tongyi Qianwen, or Qwen for short. In Mandarin, “Tongyi” means “extensive knowledge” and “Qianwen” means “a thousand questions.” Together, the moniker represents Alibaba’s ambition in the LLM space.”
“Alibaba Cloud unveiled the first version in April 2023 and the second, Qwen2, six months later.”
“At the time, China’s domestic race to develop LLMs was still in the early stages. Alibaba and other Chinese companies were trying to catch up with U.S. leaders like OpenAI, Anthropic, Google and Meta. Dozens of local players, tech giants and startups alike, were rushing to build their foundation models. The market was so crowded and the competition so intense that the Chinese media dubbed the phenomenon “the war of a hundred models.””
All this happened while the core company was itself going through some severe turbulence at home:
“While Alibaba was grappling with the intensifying AI race, the company went through a historic shake-up. In early 2023—in the wake of the Chinese government’s antitrust crackdown, which landed Alibaba a record $2.8 billion fine—the company announced it would split itself into six highly independent business groups under a holding company. It was both a response to Chinese regulators’ unhappiness with big tech conglomerates and an effort to rejuvenate growth within the company. Alibaba at the time said the split would allow each business unit to respond more quickly to market changes.”
“In September 2023, Alibaba’s then-CEO, Daniel Zhang, stepped down, replaced by Eddie Wu, one of the 18 founding members who built the company in 1999. Wu, an engineer who had served as chief technology officer of multiple Alibaba businesses, focused his attention primarily on AI strategy once he took the helm.”
“In the first half of 2024, Alibaba Cloud stepped up its efforts to persuade the other business units to use Qwen models for all of their AI products. Alibaba Cloud employees reached out to various units and tried to talk to teams that were working on AI applications and features. But after the 2023 reorganization, business units communicated less. Employees of one unit often had little knowledge of other units’ organizational structures or who was in charge of what.”
That regulatory pressure from Xi’s government, of course, saw some relief earlier this year, as I’ve discussed.
But there was more turbulence for Qwen WITHIN Alibaba.
Open source was not yet an internal commitment at Alibaba. There was fierce internal competition over the best approach:
“At that time, the company’s AI development work was focusing as heavily on proprietary versions of Qwen models as on open-source versions. But over the past year, Alibaba’s priorities gradually shifted toward open-source models, as Qwen’s open-source versions began to receive more feedback from AI developer communities in both China and in the U.S., and startups, academic researchers and doctoral students started using them to build their own custom AI models.”
“In contrast, the proprietary Qwen models, which were up against the best models from OpenAI, Anthropic and Google, as well as from Chinese competitors like ByteDance, didn’t attract as much attention.”
“The Qwen team’s first major breakthrough in terms of public recognition came in late 2024, after the release of its Qwen2.5 open-source models, which received positive feedback from developers in China and the U.S. and helped establish Alibaba as one of the leaders in open-source models. Inside Alibaba, many teams developing AI applications also adopted Qwen2.5.”
“The open-source versions of Qwen 2.5, released in September last year, “significantly outperformed” Llama 3, which had come out earlier in the year, said Tony Ren, founder of agentic AI startup ReOrc.”
And then an external competitor from China itself emerged to take the spotlight:
“But the success of DeepSeek quickly overshadowed the brief excitement over Qwen2.5. A two-year-old offshoot of a Chinese quantitative hedge fund, DeepSeek shot to global stardom in early February as its R1 open-source reasoning model shocked the global tech industry with its strong performance and low development cost.”
Alibaba’s core business units, commendably, were allowed to make their own choices, regardless of where the AI tech came from:
“Many of Alibaba’s cloud services customers asked to use the DeepSeek model, so Alibaba Cloud added R1 to its offerings of AI models. Some of Alibaba’s own AI applications and features also adopted DeepSeek. For example, Alibaba’s popular travel app, Fliggy, decided to use R1 to build its new AI travel assistant feature, AskMe, launched in April this year, according to an employee with knowledge of the matter.”
“Alibaba.com, which helps merchants outside China find products from Chinese suppliers, also integrated R1 into its AI search app, Accio. Some of Alibaba’s business intelligence teams also adopted R1 in their internal analytical tools.”
And then, Alibaba had its own ‘Founder Mode’ phase in this long tale:
“Jack Ma’s Attention”
“DeepSeek’s success put enormous pressure on the Qwen team. Even Jack Ma, Alibaba’s iconic founder, who had stepped down from executive and board roles six years ago, frequently asked Zhou, the Alibaba Cloud CTO, to provide updates on the progress of Qwen3 development, according to two people with knowledge of the matter. Ma’s attention reminded the Qwen team’s members that Qwen3 was the top priority not only for Alibaba Cloud but for Alibaba as a whole.”
“Adding to the pressure, Alibaba wanted the new models to come out before DeepSeek launched its highly anticipated successor to R1. In the office, Qwen team members sometimes took turns taking power naps at night on mattresses kept under their desks. During the final week before Qwen3’s April launch, some members only slept five or six hours in total for the whole week, according to an employee.”
Of course, in the US, another company was also competing in ‘Founder Mode’ on the global open source LLM AI front. Pushed on by ‘Zuck’, Meta was coming on strong:
“Meanwhile, Meta’s AI team, responsible for the company’s Llama models, was working just as hard to catch up to DeepSeek and other rivals. In early April, Meta unveiled Llama 4, the latest generation of its open-source AI models, which received a lukewarm reception from some critics who said improvements from the previous generation were too incremental. That was a relief for Alibaba’s Qwen team, which internally tested Llama 4, according to two employees. They became more confident that their upcoming Qwen3 models would receive positive feedback from global AI developer communities.”
But Alibaba was coming up hard on that important lap:
“In late April, Alibaba finally released Qwen3, a suite of eight models that come in various sizes and specifications. And all eight of them were open-source models, highlighting Alibaba’s strategic priority. The company said Qwen3 can switch between “thinking mode” for performing complex tasks like math and coding, and “nonthinking mode” for quick responses to simpler prompts, depending on users’ preferences. Wu, the Alibaba CEO, said during an earnings call last month that the company is firmly committed to open-source AI. “We believe the full open sourcing of Qwen3 will drive innovation and the new applications by developers, startups and enterprises,” he said.”
And they lapped Meta’s Llama and, at least briefly, DeepSeek:
“Several versions of Alibaba’s latest-generation Qwen3 models, released in late April, outperform Meta’s latest Llama 4 models, according to AI model leaderboards LiveBench and Artificial Analysis. The largest version of Qwen3 initially surpassed DeepSeek’s R1 on those leaderboards, but DeepSeek last week released an updated version of R1, which once again surpassed Qwen3.”
The models also displaced DeepSeek inside Alibaba’s own products, which must have been sweet internally:
“Alibaba’s own AI products that previously used DeepSeek are now relying on Qwen. Fliggy, the Alibaba travel app, is switching the foundation model for its AskMe AI travel assistant from R1 to Qwen3, according to the employee with knowledge of the matter. Accio, the AI search app for merchants, is also adopting Qwen3 while phasing out its usage of R1.”
And they’re also competing hard on the enterprise front, both inside and outside China:
“Ren, of ReOrc, which is building enterprise AI agents for customers both within and outside China, said he sees big potential to develop enterprise agents on Qwen3 for overseas customers.”
“Though Alibaba’s business units continue to operate independently, the growing importance of Qwen is helping to bring them closer. Many teams from various business units are now talking to Alibaba Cloud about their plans to develop more-capable AI agents powered by Qwen3. Employees of multiple units are also discussing potential future collaborations where units can access each others’ AI agents, so all of the agents can perform more diverse tasks for users, according to an employee with knowledge of such discussions.”
The whole riveting story is worth a read this Sunday, as it highlights the dynamic nature of LLM AI developments both within and across countries. And that is a Bigger Picture worth keeping in mind in these earliest of days in the AI Tech Wave.
This infinite race is not country vs country, but company vs company and product vs product, in a global, highly fluid tech marketplace. Stay tuned.
(NOTE: The discussions here are for information purposes only, and not meant as investment advice at any time. Thanks for joining us here)