Google unveils 8th-gen TPUs, agent platform, and Workspace AI layer at Cloud Next '26
Key Points
- For the first time, Google is splitting its eighth-generation TPUs into separate chips for training and inference. Instead of chasing peak single-chip performance, the company is betting on scale, linking up to one million chips in massive clusters.
- The new Gemini Enterprise Agent Platform is built to simplify the creation and safe operation of autonomous AI agents. It gives agents long-term memory for multi-step processes and aims to keep systems secure through cryptographic identities and anomaly detection.
- Google is also introducing Workspace Intelligence, a layer that centrally connects information across apps like Gmail, Docs, and Drive, so AI models can understand relationships that span multiple apps.
Google used its Cloud Next '26 conference to unveil its eighth-generation TPUs, a revamped agent platform, and a new AI layer for Workspace. The company is pitching the whole package under the banner "Agentic Enterprise."
For the first time, Google is splitting its Tensor Processing Units into two variants: TPU 8t for training and TPU 8i for inference. According to Amin Vahdat, SVP and chief technologist for AI and infrastructure, the move is a response to rising inference demands from agents that plan, act, and learn in loops.
Compared to Nvidia, Google is betting less on raw single-chip performance and more on scale. As The Register notes, Nvidia's upcoming Rubin GPUs offer more compute and significantly more memory bandwidth per chip than the TPU 8t. But when training frontier models, what matters is how many chips you can efficiently link together.
That's where Google has the edge, according to The Register. Nvidia's latest GPUs connect up to 576 accelerators in a single NVLink domain before slower Ethernet or InfiniBand links kick in. Google, by contrast, uses optical circuit switches to link 9,600 TPUs in a single pod. Its new Virgo Network can tie multiple data centers together into clusters of up to one million TPUs. A managed Lustre storage system pushes data straight into accelerator memory. Google is targeting a "goodput" rate of around 97 percent - meaning the share of time chips spend actually training rather than waiting on checkpoints or recovering from errors.
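The goodput and cluster figures above come down to simple arithmetic. The following is a minimal sketch of that arithmetic, not anything Google has published: the function names and the 30-day training window are illustrative assumptions, and only the 97 percent, 9,600, and one million figures come from the reporting.

```python
import math


def goodput(productive_hours: float, total_hours: float) -> float:
    """Share of wall-clock time spent on useful training work."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return productive_hours / total_hours


def lost_hours(total_hours: float, goodput_rate: float) -> float:
    """Hours lost to checkpoints, restarts, and failures at a given goodput."""
    return total_hours * (1.0 - goodput_rate)


def pods_needed(total_chips: int, chips_per_pod: int) -> int:
    """Number of pods required to reach a target chip count."""
    return math.ceil(total_chips / chips_per_pod)


if __name__ == "__main__":
    window = 30 * 24.0  # hypothetical 30-day training run, in hours
    print(f"Hours lost at 97% goodput: {lost_hours(window, 0.97):.1f}")
    # One million TPUs at 9,600 per pod works out to 105 pods.
    print(f"Pods for one million TPUs: {pods_needed(1_000_000, 9_600)}")
```

At this scale a single percentage point of goodput is a large amount of capacity: on a one-million-chip cluster it is equivalent to roughly 10,000 chips sitting idle.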
The TPU 8i inference chip trades some compute for more on-chip SRAM and faster HBM. The larger SRAM keeps more of the key-value cache - the stored attention keys and values for tokens the model has already processed - directly on the chip, so cores don't sit idle waiting for data. A Collective Acceleration Engine is designed to speed up mixture-of-experts models. Google also developed a network topology called Boardfly to cut chip-to-chip latency.
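To see why on-chip memory matters for inference, it helps to estimate how large a key-value cache gets. The sketch below uses the standard sizing formula for transformer attention; the model dimensions are hypothetical and do not describe any Gemini model or TPU 8i specification.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,  # 2 bytes assumes 16-bit values such as bfloat16
) -> int:
    """Estimate key-value cache size for a transformer during inference.

    The leading factor of 2 accounts for storing both keys and values
    for every layer, KV head, and token position.
    """
    return (
        2
        * num_layers
        * num_kv_heads
        * head_dim
        * seq_len
        * batch_size
        * bytes_per_value
    )


if __name__ == "__main__":
    # Hypothetical model: 32 layers, 8 KV heads of dimension 128, 8,192-token context.
    size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192)
    print(f"{size / 1024**3:.2f} GiB per sequence")  # prints "1.00 GiB per sequence"
```

The cache grows linearly with context length and with the number of concurrent requests, which is why long-running agent workloads put pressure on memory capacity and bandwidth rather than raw compute.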
Both TPUs are paired with Google's Arm-based Axion CPUs as host processors for the first time.
A single platform for building and running agents
On the software side, Google is bundling its existing AI services into the Gemini Enterprise Agent Platform, which builds on Vertex AI. For building, there's a tool that lets developers map out how multiple agents work together as a flowchart, plus an interface called Agent Studio for creating agents through natural language. A central registry is meant to prevent organizations from ending up with dozens of nearly identical agents.
For running agents, Google is taking aim at well-known weak spots. Long-running agents can now handle multi-step processes on their own instead of pausing for human input at every step. Sandboxed test environments let agents execute their own code or browser automations without putting host systems at risk. A Memory Bank gives agents long-term memory so they don't start from scratch with every session.
Because autonomous agents open up new attack surfaces, Google is shipping controls to match: cryptographic identities for each agent, upstream filters against prompt injection, and anomaly detection for suspicious behavior like unauthorized data access or reasoning loops that never end. Simulation tools let teams test agents against synthetic user interactions before they ever meet a real customer. How effective these safeguards actually are remains to be seen.
Available models include Gemini 3.1 Pro, Nano Banana 2, and Lyria 3, along with Anthropic's Claude Opus, Sonnet, and Haiku models, including the newly added Claude Opus 4.7.
The accompanying Gemini Enterprise app targets end users: employees can assemble their own agents from building blocks, track running tasks in an inbox-style view, and edit documents directly in the app.
Workspace Intelligence as a shared knowledge layer
Alongside the platform, Google is rolling out Workspace Intelligence, a layer that connects content across Gmail, Docs, Drive, Meet, and Chat. The idea is that Gemini and the agents built on top of it can understand the relationships between emails, meetings, chats, and files instead of querying each app in isolation.
In Gmail, Gemini sorts incoming messages and summarizes topics. In Google Chat, users can create calendar events or documents directly from a conversation. In Docs, Gemini drafts content from emails and files; in Sheets, it builds dashboards; in Slides, it puts together presentations. Drive Projects groups files and emails into topic-based workspaces. For companies looking to switch, Google is offering a faster migration path from Microsoft 365.