Nvidia Unveils Nemotron 3 Ultra at Computex, Trails China's Kimi K2.6 in Intelligence Rankings

Nvidia unveiled Nemotron 3 Ultra on June 1 at Computex in Taipei, a 550-billion-parameter open-weight AI model that marks the company's largest open AI release to date. CEO Jensen Huang announced the model during his keynote address, positioning it as the highest-ranking U.S. open-weight model on intelligence benchmarks. The release intensifies competition in the open-weight AI space, where Chinese models including Moonshot AI's Kimi K2.6 currently lead global intelligence rankings despite Nvidia's speed advantages.

Nemotron 3 Ultra Scores 48 on Intelligence Index Benchmark

Artificial Analysis, which partnered with Nvidia on the pre-release assessment, placed Nemotron 3 Ultra at 48 on its Intelligence Index. The composite benchmark aggregates 10 evaluations spanning reasoning, coding, general knowledge, and agentic performance. The score establishes Nemotron 3 Ultra as the top U.S. open-weight model, surpassing Google's Gemma 4 31B at 39, Nvidia's own Nemotron 3 Super at 36, and OpenAI's gpt-oss-120b at 33.

The model uses a mixture-of-experts architecture with 550 billion total parameters but activates only 55 billion of them, or 10%, for any given token. This design reduces operational costs while maintaining performance across complex reasoning tasks.
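The idea behind sparse activation can be illustrated with a generic top-k router. This is a minimal sketch of the general mixture-of-experts technique, not Nvidia's implementation: the function names, the four toy experts, the gate scores, and the choice of k = 2 are all hypothetical. Only the 550 billion and 55 billion parameter counts come from the article.

```python
import math

def top_k_route(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k experts with the largest gate scores for this token.
    chosen = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over only the chosen experts so their weights sum to 1.
    peak = max(gate_scores[i] for i in chosen)
    exps = [math.exp(gate_scores[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and mix their outputs by routing weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_scores, k))

# Hypothetical toy setup: four "experts" that each scale their input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, 0.3, 2.0]  # assumed router output for one token

routes = top_k_route(gate_scores, k=2)
y = moe_forward(1.0, experts, gate_scores, k=2)

# Figures reported in the article, in billions of parameters.
TOTAL_PARAMS_B = 550
ACTIVE_PARAMS_B = 55
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(routes, y, f"{active_fraction:.0%} of parameters active per token")
```

In this toy case only two of the four experts run for the token, which is the same principle that lets a 550-billion-parameter model do roughly the per-token compute of a 55-billion-parameter one.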

Model Delivers 300+ Tokens Per Second on Pre-Release Endpoint

Nemotron 3 Ultra served more than 300 output tokens per second on a pre-release DeepInfra endpoint, according to Artificial Analysis testing. The Chinese models in the same intelligence class, DeepSeek V4 Pro and Kimi K2.6, currently run at 50–100 tokens per second through their commercial APIs. Nvidia claims the model runs five times faster than comparable open-weight alternatives at 30% lower cost.
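To put the throughput figures in practical terms, the short calculation below converts tokens per second into wait time for a single response. The 3,000-token response length is a hypothetical figure chosen for illustration, and 300 tokens per second is treated as a floor since the article reports "over 300". The resulting 3x to 6x ratio is simple arithmetic on the reported numbers and is not necessarily how Nvidia derived its five-times claim.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time to stream num_tokens at a steady output rate."""
    return num_tokens / tokens_per_second

RESPONSE_TOKENS = 3000   # hypothetical response length, not from the article

NEMOTRON_TPS = 300       # reported lower bound on the pre-release endpoint
RIVAL_TPS_LOW = 50       # low end of the reported 50-100 range
RIVAL_TPS_HIGH = 100     # high end of the reported 50-100 range

nemotron_s = generation_seconds(RESPONSE_TOKENS, NEMOTRON_TPS)
rival_slow_s = generation_seconds(RESPONSE_TOKENS, RIVAL_TPS_LOW)
rival_fast_s = generation_seconds(RESPONSE_TOKENS, RIVAL_TPS_HIGH)

# Ratio of output rates at each end of the rivals' reported range.
speedup_min = NEMOTRON_TPS / RIVAL_TPS_HIGH
speedup_max = NEMOTRON_TPS / RIVAL_TPS_LOW

print(f"Nemotron 3 Ultra: {nemotron_s:.0f} s")
print(f"Rivals: {rival_fast_s:.0f} s to {rival_slow_s:.0f} s")
print(f"Speedup: {speedup_min:.0f}x to {speedup_max:.0f}x")
```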

The architecture combines Mamba-2 layers, standard Transformer attention, and mixture-of-experts routing. The model supports a 1-million-token context window and incorporates multi-token prediction (MTP), which generates several future tokens simultaneously rather than sequentially.
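The effect of multi-token prediction on decoding can be sketched by counting model calls. This is a toy illustration under stated assumptions, not Nvidia's decoding loop: `toy_predict` is a hypothetical stand-in for a model, the four-tokens-per-call setting is arbitrary, and the sketch assumes every predicted token is accepted, whereas real systems often verify the extra tokens before keeping them.

```python
def toy_predict(context, num_future):
    """Hypothetical stand-in for a model: emits the next num_future tokens in one call."""
    last = context[-1]
    return [last + i for i in range(1, num_future + 1)]

def generate(prompt, num_new_tokens, tokens_per_call):
    """Generate num_new_tokens tokens, returning the sequence and the number of model calls."""
    tokens = list(prompt)
    calls = 0
    while len(tokens) - len(prompt) < num_new_tokens:
        remaining = num_new_tokens - (len(tokens) - len(prompt))
        # Ask for several future tokens at once, capped at what is still needed.
        tokens.extend(toy_predict(tokens, min(tokens_per_call, remaining)))
        calls += 1
    return tokens, calls

# One token per call (conventional decoding) versus four tokens per call.
seq_tokens, seq_calls = generate([0], num_new_tokens=12, tokens_per_call=1)
mtp_tokens, mtp_calls = generate([0], num_new_tokens=12, tokens_per_call=4)

print(seq_calls, mtp_calls, seq_tokens == mtp_tokens)
```

Both runs produce the same 12 new tokens, but the multi-token variant needs a quarter of the calls, which is where the throughput gain would come from in this simplified picture.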

Kimi K2.6 Leads Open-Weight Rankings at 54 Intelligence Score

Moonshot AI's Kimi K2.6 holds the top position among open-weight models with an Intelligence Index score of 54, six points above Nemotron 3 Ultra. Released in April, Kimi K2.6 ranks fourth globally among all AI models, sitting three points behind the proprietary flagships from Anthropic, Google, and OpenAI, which tie at 57.

Chinese open-source models increased their share of global open-model usage from approximately 1.2% in late 2024 to around 30% by end of 2025, as reported in March.

Nemotron Family Spans Three Model Sizes Since 2023

Nvidia released its first Nemotron-branded model in November 2023, with the third generation announced in December 2025. The family includes three sizes: Nano for lightweight tasks, Super for mid-range enterprise applications, and Ultra for complex reasoning workloads. All three models share the hybrid architecture combining Mamba-2 layers, Transformer attention, and mixture-of-experts routing.

Nemotron 3 Super, released in March at 120 billion parameters, scored 36 on the Intelligence Index. Nemotron 3 Ultra's score of 48 is a 12-point gain over Super within the same product line.

Nvidia Allocates $26 Billion to Open-Weight AI Development

Nvidia disclosed a five-year plan to spend $26 billion on open-weight AI development. The company formed the Nemotron Coalition in March, a group of eight AI labs including Mistral AI and Perplexity, to co-develop open frontier models on DGX Cloud infrastructure. Nvidia announced it is working on Nemotron 4, the next generation in the model family.

Model Ships June 4 Through Nvidia API and Cloud Providers

Nemotron 3 Ultra ships on June 4. The model's weights are public and training recipes are being released. Users can access the model through Nvidia's API or cloud providers without requiring dedicated datacenter hardware.

FAQ

What intelligence score did Nvidia's Nemotron 3 Ultra achieve on June 1?
Nemotron 3 Ultra scored 48 on the Artificial Analysis Intelligence Index, making it the highest-ranking U.S. open-weight model. The benchmark aggregates 10 evaluations covering reasoning, coding, general knowledge, and agentic performance.

How does Nemotron 3 Ultra's speed compare to Chinese models?
Nemotron 3 Ultra delivered over 300 output tokens per second on a pre-release DeepInfra endpoint, while Chinese models DeepSeek V4 Pro and Kimi K2.6 operate at 50–100 tokens per second through their commercial APIs.

When does Nvidia's Nemotron 3 Ultra become available?
Nemotron 3 Ultra ships on June 4. Users can access the model through Nvidia's API or cloud providers, with public weights and training recipes being released.
