
Next-Gen AI Chips Developed to Cut Energy Use

BrAIn power.
David Eberly
Contributing Writer

The advancement of AI is driving a steep rise in power demand, posing a major challenge for the future of computing. With AI energy requirements expected to grow by 50% annually through 2030, companies are racing to develop microchips that can handle inference tasks more cost-efficiently.

Inference, the process of generating outputs from trained AI models, is becoming the focal point of energy-efficiency efforts across the tech sector.

In fact, some startups are reimagining chip design from the ground up.

At the same time, tech giants including Google, Amazon, and Microsoft are pushing their own custom silicon to reduce dependence on Nvidia’s expensive, power-hungry systems. While promising gains have been made, experts caution that the ever-growing demands of newer AI models may still outpace improvements in hardware efficiency.

Why It Matters: AI’s energy consumption is quickly becoming one of the most daunting challenges in tech. Without breakthroughs in hardware efficiency or energy production, scaling AI could hit physical and economic limits with profound consequences.

  • Startups vs. Nvidia: Emerging chipmakers like Positron and Groq are positioning themselves as viable alternatives to Nvidia, which currently dominates the AI hardware market. These companies are focused on optimizing inference, claiming their chips offer better performance per watt and dollar. Positron, for instance, asserts that its upcoming generation of chips can deliver two to three times better performance per dollar and three to six times better energy efficiency than Nvidia’s next-gen Vera Rubin system.
  • Diverging Design Approaches: The new class of inference chips reflects different architectural strategies. Groq embeds memory directly into the chip, minimizing the energy and time costs of moving data between separate memory and processing units. This integration allows faster and more power-efficient processing, particularly for large-scale language and vision models. In contrast, Positron prioritizes simplicity and specialization: by narrowing the range of functions its chips can perform, it aims to accelerate specific AI tasks with far greater efficiency. The strategy echoes how early GPUs revolutionized graphics processing by focusing on a limited but critical set of operations, a focus that later proved invaluable for AI.
  • Big Tech Joins the Fray: Industry giants aren’t waiting on startups to solve the AI energy challenge. Google, Amazon, and Microsoft have invested heavily in proprietary chips tailored for inference, intending to reduce reliance on Nvidia and gain more control over their ecosystems. Google’s Ironwood TPU and Amazon’s Inferentia are prime examples, custom-built to optimize the execution of large AI workloads within their own cloud infrastructure. These chips allow for more predictable performance scaling while effectively reducing operational costs. In some cases, these in-house solutions are being offered to external cloud customers, broadening their reach.
  • The Energy Production Bottleneck: Experts and industry leaders warn that hardware improvements alone won’t offset the rising demands of generative AI systems. Instead, energy generation itself may become the most significant constraint on future AI expansion. Companies like Google are exploring long-term solutions such as nuclear and fusion power to meet projected demand, highlighting the growing intersection between tech innovation and energy infrastructure.
  • Cloudflare’s Strategic Bet on Positron: Cloudflare, a major internet infrastructure provider, has taken a bold step by piloting Positron’s chips in its data centers. The company’s hardware lead, Andrew Wee, who previously held senior roles at Apple and Meta, sees Positron’s technology as one of the few credible alternatives worth testing at scale. Initial results have been promising enough to warrant ongoing trials, and Cloudflare is prepared to deploy these chips globally if they continue to deliver as advertised. This move could validate Positron’s approach and accelerate a broader shift toward energy-efficient inference hardware.
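
The efficiency claims above reduce to simple arithmetic: performance per watt is throughput divided by power draw. A minimal Python sketch, using purely hypothetical throughput and power figures (not vendor-published data), shows how a "three to six times" efficiency ratio like the one Positron cites would be computed:

```python
# Illustrative only: hypothetical numbers showing how performance-per-watt
# comparisons of inference hardware are typically derived.

def tokens_per_joule(tokens_per_second: float, watts: float) -> float:
    """Energy efficiency of inference: output tokens produced per joule consumed."""
    return tokens_per_second / watts

# Hypothetical accelerator figures (assumed for illustration, not real benchmarks).
incumbent = tokens_per_joule(tokens_per_second=10_000, watts=1_000)   # 10 tokens/J
challenger = tokens_per_joule(tokens_per_second=12_000, watts=300)    # 40 tokens/J

ratio = challenger / incumbent
print(f"Efficiency advantage: {ratio:.1f}x")  # prints "Efficiency advantage: 4.0x"
```

Note that a chip can win on this metric even with lower raw throughput, which is why inference-focused startups emphasize tokens per joule and per dollar rather than peak speed alone.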

Go Deeper -> The New Chips Designed to Solve AI’s Energy Problem – The Wall Street Journal

Trusted insights for technology leaders

Our readers are CIOs, CTOs, and senior IT executives who rely on The National CIO Review for smart, curated takes on the trends shaping the enterprise, from GenAI to cybersecurity and beyond.

Subscribe to our newsletter, delivered four times a week, to keep up with the insights that matter.

☀️ Subscribe to the Early Morning Byte! Begin your day informed, engaged, and ready to lead with the latest in technology news and thought leadership.
