Google is expanding its custom chip program with new designs aimed at handling AI workloads more efficiently. The company is preparing updated tensor processing units (TPUs) and is exploring chips built specifically for inference, the stage at which trained models generate outputs.
These efforts build on more than a decade of internal chip development tied closely to Google’s AI systems and infrastructure.
The company is also making TPUs more widely available through its cloud services. Interest has grown among major AI developers and large enterprises, many of which are testing or adopting the hardware for different training and inference use cases.
Why It Matters: Google’s investment in inference-focused chips changes how organizations evaluate infrastructure, since response speed and compute efficiency directly affect application performance and cost. Expanded access to TPUs through cloud services introduces another viable option alongside NVIDIA, though adopting it brings trade-offs around compatibility, tooling, and vendor dependence. Supply constraints and long chip development timelines add pressure to make early decisions about architecture and capacity planning as AI usage grows.
- Developing New Chips Aimed at Faster Inference Workloads: Google is working on a new class of chips focused on inference, the process of generating answers from trained models. The company has indicated that separating chips for training and inference is under active consideration as demand grows for faster response times. These chips are expected to improve performance for applications that rely on real-time outputs, including AI assistants and systems that manage multi-step tasks.
- Continuing to Evolve TPUs Through Close Integration With AI Models: TPUs remain central to Google’s infrastructure strategy. Over time, their design has been shaped by internal AI work, including large language models and reinforcement learning systems. Adjustments have been made to how chips are connected and how data moves between them to avoid underutilization. The company is also exploring trade-offs, such as reducing numerical precision to lower costs and deciding how many chips should be grouped together in a single system.
- Expanding TPU Availability to Major External Customers: Google is supplying TPUs to a growing list of companies through its cloud platform. Anthropic has secured access to large volumes of chips, while Meta has signed a multiyear agreement and is evaluating different use cases. Other organizations, including Citadel Securities and G42, are testing deployments. Google is also making its hardware easier to adopt by supporting tools like PyTorch and allowing customers to use their own scheduling software or run some systems in their own data centers.
- Improving Reliability and Large-Scale System Management: Operating large numbers of AI chips introduces challenges in fault detection and system stability. Google has built systems to quickly identify manufacturing or hardware issues that can disrupt AI workloads. Even small errors can propagate through computations and affect results, so detection and correction must happen quickly across large clusters of chips.
- Managing Supply Limits and Long Development Timelines: Chip development can take several years, while AI models continue to change during that time, making it difficult to predict future requirements. Demand for TPUs is rising, and supply constraints have led Google to prioritize certain high-profile customers. The company is also considering additional directions, such as deploying chips closer to end users to reduce delays and exploring multiple design paths in parallel in case needs change.
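The precision trade-off mentioned above can be made concrete with a small sketch. This is a hypothetical illustration (not Google's actual approach): storing the same weight matrix at 16-bit instead of 32-bit precision halves its memory and bandwidth footprint, at the cost of a small amount of numerical accuracy.

```python
import numpy as np

# Hypothetical illustration: the same 1,024 x 1,024 weight matrix stored
# at two precisions. Halving precision halves memory and bandwidth needs,
# which lowers cost, at the price of some rounding error.
full = np.random.rand(1024, 1024).astype(np.float32)
half = full.astype(np.float16)

print(full.nbytes)  # 4194304 bytes at 32-bit precision
print(half.nbytes)  # 2097152 bytes at 16-bit precision

# The rounding error introduced by the lower precision is small but nonzero,
# which is why precision reduction is a trade-off rather than a free win.
max_err = np.abs(full - half.astype(np.float32)).max()
print(max_err > 0)  # True
```

In practice, accelerator vendors use formats such as bfloat16 for this purpose; float16 stands in here only because it is available in stock NumPy.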
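The point about small errors propagating through computations can also be sketched. In this hypothetical example (illustrative only, not Google's detection system), a single corrupted value in one input matrix contaminates an entire row of a downstream matrix multiply, showing why faults must be caught before results spread across a cluster.

```python
import numpy as np

# Hypothetical illustration: one corrupted value in an upstream result
# spreads through a downstream matrix multiply.
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 4))
b = rng.standard_normal((4, 4))

corrupted = a.copy()
corrupted[0, 0] += 1e6  # simulate a hardware fault corrupting one value

clean = a @ b
faulty = corrupted @ b

# The single bad input has contaminated every entry in the affected
# output row, not just one value.
print(np.sum(~np.isclose(clean, faulty)))  # 4 wrong values from 1 bad input
```

At the scale of thousands of chips, the same effect means an undetected fault on one device can silently invalidate results far downstream, which is why detection and correction must be fast and automatic.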
Go Deeper -> Google challenges Nvidia with new chips to speed up AI – Los Angeles Times
Trusted insights for technology leaders
Our readers are CIOs, CTOs, and senior IT executives who rely on The National CIO Review for smart, curated takes on the trends shaping the enterprise, from GenAI to cybersecurity and beyond.
Subscribe to our 4x-a-week newsletter to keep up with the insights that matter.