Google Unveils Next-Gen TPU: 2 AI Chips Targeting Training and Inference
Google announced the eighth generation of its TPU AI chip in two distinct models, TPU 8t for training and TPU 8i for inference, to meet the growing computing demands of the AI agent era.
📋 Article Processing Timeline
- 📰 Published: April 22, 2026 at 21:11
Central News Agency
(CNA Reporter Chang Hsin-yu, Las Vegas, 22nd, Exclusive Dispatch) The era of AI agents has arrived. Seeing inference as the largest future computing demand, Google today released its 8th generation AI chip, the TPU. Unlike previous generations, the new generation consists of two products: TPU 8t, which focuses on training and significantly shortens model training time, and TPU 8i, which focuses on inference and reduces data access latency.
As artificial intelligence (AI) moves from the conversational era into the agentic era, market demand for inference continues to expand. As the market had anticipated, AI leader Google today unveiled its next-generation in-house chip, the TPU (Tensor Processing Unit), at the Google Cloud Next conference in Las Vegas.
The new generation comes in two models: the TPU 8t, built specifically for training, and the TPU 8i, built specifically for inference.
Compared with the previous-generation Ironwood TPU, both chips offer up to a 2x improvement in performance per watt.
Before the conference officially began, Google displayed its earlier TPU generations at a media-only event. The lineup ran from the first-generation chip launched in 2015 to the two custom chips unveiled this year for the AI agent era, and camera flashes were non-stop.
Amin Vahdat, Google's Chief Technology Officer for AI and Infrastructure, stated that Google's pace of innovation continues to accelerate, moving from a new generation every 3 years, to 2 years, to 1 year. He also noted: 'The Google team realized two years ago that one chip a year is not enough; this is our first attempt at introducing two high-performance, specialized AI chips.'
For large-scale training, the TPU 8t offers a 2.8x improvement in cost-performance. It carries 216 GB of high-bandwidth memory (HBM) and 128 MB of static random-access memory (SRAM).
A single TPU 8t Superpod can scale up to 9,600 chips.
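For a rough sense of scale, the per-chip memory figures and the Superpod chip count quoted above can be multiplied out. The short Python sketch below does only that arithmetic. It assumes every chip in a fully populated Superpod carries the full quoted capacity and that the figures use decimal units (1 TB = 1,000 GB); the function and constant names are illustrative and do not come from Google.

```python
# Figures quoted in the article for the training chip, TPU 8t.
HBM_PER_CHIP_GB = 216
SRAM_PER_CHIP_MB = 128
MAX_CHIPS_PER_SUPERPOD = 9_600


def superpod_memory(chips=MAX_CHIPS_PER_SUPERPOD):
    """Return (total HBM in TB, total SRAM in GB) for a pod of `chips` chips.

    Assumption: each chip has the full quoted capacity, and units are
    decimal (1 TB = 1,000 GB, 1 GB = 1,000 MB).
    """
    hbm_tb = chips * HBM_PER_CHIP_GB / 1_000
    sram_gb = chips * SRAM_PER_CHIP_MB / 1_000
    return hbm_tb, sram_gb


hbm_tb, sram_gb = superpod_memory()
print(f"Aggregate HBM:  {hbm_tb:,.1f} TB")
print(f"Aggregate SRAM: {sram_gb:,.1f} GB")
```

Under those assumptions, a full 9,600-chip Superpod would hold roughly 2,070 TB (about 2 PB) of HBM in aggregate. This is a back-of-the-envelope figure, not a specification published by Google.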
Google also announced a network architecture named Virgo, which is crucial for training ultra-large models using the TPU 8t.
The inference-focused TPU 8i has higher memory bandwidth, significantly reducing inference latency. It is equipped with 288 GB of HBM and 384 MB of SRAM, a configuration meant to break through the "memory wall": the latency and high energy consumption caused by frequent data movement.
Notably, the TPU 8i uses a new network topology called Boardfly to improve communication efficiency between chips.
Vahdat indicated that Google's two new chips will be available to cloud customers later this year.
Google TPUs have historically been co-developed with Broadcom, but rumors suggest MediaTek has secured a large order for the new-generation inference chip. In response to inquiries from CNA, Google said it would not publicly discuss details about its supply chain partners. (Editor: Chang Chih-hsuan) 1150422
The text, images, and audio/video on this website may not be reproduced, publicly broadcast, publicly transmitted, or utilized without authorization.