Making AI Way More Energy Efficient | Extropic CTO

📝 VIDEO INFORMATION

Title: Making AI Way More Energy Efficient | Extropic CTO
Creator/Author: Trevor (CTO of Extropic)
Publication/Channel: Extropic
Date: 2025
URL/Link: https://www.youtube.com/watch?v=dRuhl6MLC78
Duration: 47 minutes
E-E-A-T Assessment:
Experience: Exceptional - Trevor is CTO of Extropic with deep expertise in physics, machine learning, and hardware design, having led development of thermodynamic sampling units and probabilistic circuits
Expertise: World-class - Demonstrates profound understanding of probabilistic computing, energy-based models, and hardware physics with first-hand experience building working prototypes
Authoritativeness: Strong - Extropic has built functional prototypes, published research, and reported simulations suggesting up to 10,000x efficiency gains on simple generative benchmarks
Trust: High - Evidence-based approach with detailed technical explanations, acknowledges limitations, and presents verifiable physics principles; openly discusses challenges and early-stage nature of research


🎯 HOOK

What if the future of AI isn’t about bigger models and more powerful GPUs, but about fundamentally rethinking how computers process information at the most basic level?


💡 ONE-SENTENCE TAKEAWAY

Extropic is developing thermodynamic sampling units, specialized processors that leverage probabilistic circuits to run new generative AI algorithms that, in early simulations, could be up to 10,000 times more energy efficient than current GPU-based systems on simple benchmarks.


📖 SUMMARY

The Extropic CTO presents a compelling solution to AI’s growing energy crisis through a fundamentally new approach to computing hardware and algorithms. The talk begins by highlighting the massive energy demands of current AI systems, noting that today’s AI applications represent the first time average consumers are significant users of high-performance computing resources. The speaker argues that energy efficiency, not just speed and capability, will become the primary constraint on AI scaling in coming years.

Extropic’s solution consists of three interconnected innovations: a new type of integrated circuit processor called a thermodynamic sampling unit (TSU), novel probabilistic circuits that sample from mathematically defined probability distributions instead of computing deterministic functions, and new generative AI algorithms designed to leverage this hardware efficiently. The company has spent two years developing prototypes of all three components and has achieved promising early results, with simulations showing potential for 10,000x efficiency gains over GPUs on simple generative AI benchmarks.

The speaker presents a detailed analysis of why current approaches to AI scaling are unsustainable. Using calculations based on transformer energy requirements, they estimate that providing everyone with an advanced AI assistant would consume roughly 20% of current power grid capacity, while more sophisticated applications like video processing or expert-level reasoning would require 10-100x current grid capacity. These projections suggest infrastructure costs in the tens to hundreds of trillions of dollars.

The talk explains why traditional efficiency improvements in computing are reaching physical limits. Transistor miniaturization has plateaued in terms of energy efficiency, and voltage reduction is constrained by fundamental thermodynamic principles. While models are becoming more efficient at fixed performance levels, the computational requirements at the frontier of AI capabilities are growing exponentially faster.

The technical solution presented centers on thermodynamic sampling units: arrays of specialized sampling cells that implement ultra-efficient random number generation. Unlike GPUs, which excel at matrix multiplication, TSUs are designed to sample natively from complex probability distributions, which the speaker argues is more fundamental to machine learning than deterministic computation. The approach leverages Gibbs sampling to break a complex sampling task into simpler conditional updates that the sampling cells can perform in parallel.
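
To make the Gibbs-sampling decomposition concrete, here is a minimal, illustrative sketch (not Extropic's actual algorithm or hardware interface) of sampling an Ising-style distribution by repeatedly resampling each variable from its local conditional distribution, the kind of local update an array of sampling cells could carry out:

```python
import numpy as np

# Minimal Gibbs sampler for a 2D Ising-style energy model: a toy stand-in for the
# local, parallelizable updates described in the talk. All parameter values here
# are illustrative assumptions.

rng = np.random.default_rng(0)
L = 16                       # grid side length
coupling = 0.4               # interaction strength between neighboring sites
spins = rng.choice([-1, 1], size=(L, L))

def gibbs_sweep(spins):
    """Resample every site from p(s_i = +1 | neighbors) = sigmoid(2 * local field)."""
    for i in range(L):
        for j in range(L):
            # Sum the four nearest neighbors (periodic boundaries for simplicity).
            field = coupling * (
                spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                + spins[i, (j + 1) % L] + spins[i, (j - 1) % L]
            )
            p_up = 1.0 / (1.0 + np.exp(-2.0 * field))   # local conditional distribution
            spins[i, j] = 1 if rng.random() < p_up else -1
    return spins

for _ in range(100):         # repeated sweeps converge toward samples from the joint distribution
    spins = gibbs_sweep(spins)

print("mean magnetization:", spins.mean())
```

Because sites that are not neighbors have independent conditionals, these updates can be scheduled in a checkerboard pattern and run simultaneously, which is the kind of parallelism the talk attributes to arrays of sampling cells.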

Extropic has overcome key challenges in implementing this technology, particularly in building reliable probabilistic circuits using only conventional transistors. By operating transistors at very low voltages where thermal fluctuations dominate charge dynamics, they’ve created circuits that naturally sample from probability distributions without requiring exotic materials or manufacturing processes.

The talk concludes with a roadmap for scaling this technology, including plans for larger TSU chips and hybrid approaches that combine TSUs with conventional neural networks. The speaker acknowledges the early stage of the research but expresses confidence that this approach could dramatically reduce the energy requirements of AI systems, making truly ubiquitous AI more feasible.


🔍 INSIGHTS

Core Insights

AI Energy Crisis: Current AI scaling is fundamentally unsustainable due to energy constraints, with projections ranging from roughly 20% of current power grid capacity for ubiquitous assistants up to 10-100x that capacity for more demanding applications.

Physical Limits Reached: Traditional computing efficiency improvements are reaching physical limits as transistor miniaturization and voltage reduction both face fundamental thermodynamic barriers.

Sampling as Core ML Operation: Machine learning is fundamentally about fitting distributions and sampling from them, yet current hardware (GPUs) is optimized for deterministic computation rather than sampling.

Probabilistic Computing Advantage: Probabilistic circuits that sample from distributions rather than computing deterministic functions could be orders of magnitude more efficient for AI tasks.

Thermal Noise as Feature: By operating transistors at very low voltages where thermal fluctuations dominate, it’s possible to create natural sampling circuits without exotic materials.

How This Connects to Broader Trends/Topics

Sustainable AI Development: Energy efficiency in computing is becoming increasingly critical as AI adoption grows and environmental concerns mount.

Specialized Hardware Revolution: New AI workloads are driving the creation of specialized hardware architectures beyond traditional GPUs and CPUs.

Algorithm-Hardware Co-Evolution: Just as GPUs and neural networks co-evolved, new algorithms are being designed specifically for probabilistic hardware.

Physics-Based Computing: Fundamental physical principles (thermodynamics, noise) are being leveraged rather than fought in new computing paradigms.

AI Democratization Challenge: Making advanced AI accessible to everyone requires solving the energy scaling problem, not just the computational complexity problem.


🛠️ FRAMEWORKS & MODELS

Thermodynamic Sampling Unit (TSU)

Components: Arrays of sampling cells containing specialized sampling circuitry, parameter computation circuitry, and state registers
Purpose: Orchestrates sampling procedures to sample from computationally useful probability distributions
Benefits: Ultra-efficient random number generation using probabilistic circuits
Implementation: Built using conventional transistors at standard foundries like TSMC
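
As a purely conceptual illustration of the components listed above (sampling circuitry, parameter computation circuitry, and state registers), one might model a sampling cell and a TSU in software as below; the class names, interface, and update rule are invented for exposition and do not describe Extropic's actual hardware:

```python
import math
import random

# Conceptual software model of a TSU as an array of sampling cells.
# Illustrative only: not Extropic's hardware design.

class SamplingCell:
    def __init__(self):
        self.state = random.choice([0, 1])            # state register

    def resample(self, bias: float) -> None:
        """Sampling circuitry: draw a Bernoulli sample whose probability is set by the bias input."""
        p_one = 1.0 / (1.0 + math.exp(-bias))
        self.state = 1 if random.random() < p_one else 0

class TSU:
    """A ring of cells; 'parameter computation' turns neighbor states into each cell's bias."""
    def __init__(self, n_cells: int, coupling: float):
        self.cells = [SamplingCell() for _ in range(n_cells)]
        self.coupling = coupling

    def sweep(self) -> None:
        for i, cell in enumerate(self.cells):
            neighbors = self.cells[i - 1].state + self.cells[(i + 1) % len(self.cells)].state
            cell.resample(self.coupling * (2 * neighbors - 2))   # bias computed from neighbor states

tsu = TSU(n_cells=64, coupling=1.0)
for _ in range(200):
    tsu.sweep()
print("fraction of cells in state 1:", sum(c.state for c in tsu.cells) / len(tsu.cells))
```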

Probabilistic Circuits

Components: Circuits that sample from mathematically defined probability distributions rather than computing deterministic functions
Purpose: Takes voltages as inputs that set distribution parameters and outputs random voltages as samples
Benefits: Enables native sampling operations without energy overhead of digital random number generation
Implementation: Uses transistors operating at very low voltages where thermal fluctuations dominate
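
The following toy simulation sketches the physical intuition behind such a circuit: an input "voltage" tilts a double-well potential, and thermal noise drives hops between the wells, so the long-run occupancy tracks a Boltzmann distribution over the tilt. It is a physics cartoon under simple Langevin-dynamics assumptions, not a device-level model of Extropic's circuits:

```python
import numpy as np

# Overdamped Langevin dynamics in a tilted double-well potential:
#   E(x) = x^4 - x^2 - tilt * x
# The 'tilt' plays the role of an input voltage setting the distribution's parameter,
# and the fraction of time spent in the x > 0 well is the sampled output probability.

rng = np.random.default_rng(1)

def sample_pbit(tilt: float, steps: int = 100_000, dt: float = 1e-3, temperature: float = 0.25):
    """Return the fraction of time the noisy bistable element spends in its 'high' well."""
    x = 0.0
    high_count = 0
    for _ in range(steps):
        drift = -(4 * x**3 - 2 * x - tilt)            # -dE/dx
        x += drift * dt + np.sqrt(2 * temperature * dt) * rng.standard_normal()
        high_count += x > 0
    return high_count / steps

for tilt in (-0.5, 0.0, 0.5):
    print(f"tilt={tilt:+.1f}  P(high) ≈ {sample_pbit(tilt):.2f}")
```

Raising the tilt raises the probability of the high state, which is the sense in which input voltages "set distribution parameters" and the output itself is a random sample.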

Energy-Based Models with Denoising

Components: Energy functions that directly parameterize probability distributions, combined with denoising procedures
Purpose: Learns the shape of the data distribution directly rather than only implicitly through a long chain of denoising steps, as in standard diffusion models
Benefits: Requires fewer sampling steps than traditional diffusion models
Implementation: Uses TSUs to sample from complex distributions efficiently
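
A minimal sketch of the core idea, assuming nothing beyond what the talk states: an energy function directly defines the target distribution p(x) ∝ exp(−E(x)), and samples are produced by a short chain of stochastic updates that turns noise into data. The toy bimodal energy and Metropolis updates below are placeholders for illustration, not Extropic's actual training or sampling procedure:

```python
import numpy as np

# Energy-based model cartoon: p(x) ∝ exp(-E(x)) for a toy bimodal energy,
# sampled by a short "denoising" chain of Metropolis updates starting from noise.

rng = np.random.default_rng(2)

def energy(x):
    """Toy bimodal energy with minima near x = -2 and x = +2."""
    return 0.25 * (x**2 - 4.0) ** 2

def sample(n_steps=50, step=0.5):
    """Start from broad noise and run a few Metropolis steps toward low-energy regions."""
    x = rng.normal(scale=3.0)                        # noise initialization
    for _ in range(n_steps):
        proposal = x + step * rng.standard_normal()
        # Metropolis acceptance rule, which leaves exp(-E) invariant.
        if energy(proposal) <= energy(x) or rng.random() < np.exp(energy(x) - energy(proposal)):
            x = proposal
    return x

samples = np.array([sample() for _ in range(2000)])
print("fraction near +2:", np.mean(np.abs(samples - 2) < 1.0))
print("fraction near -2:", np.mean(np.abs(samples + 2) < 1.0))
```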

Energy Scaling Model

Components: The estimate E ≈ P × N × M × ε, where P is the parameter count, N the number of tokens processed, M a reasoning multiplier, and ε the energy per FLOP
Purpose: Predicts energy requirements for AI scenarios based on model characteristics
Benefits: Quantifies unsustainable energy requirements of current AI scaling approaches
Implementation: Uses transformer FLOP calculations and hardware energy specifications
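
As a rough illustration of how such a scaling estimate is assembled (every numeric value below is an assumption chosen for illustration, except the roughly 0.7 pJ/FLOP figure the talk cites for H100-class hardware; forward-pass FLOPs are approximated as 2 × parameters per token):

```python
# Back-of-envelope fleet-wide energy estimate in the spirit of E = P × N × M × ε.

params = 1e12                   # P: model parameters (assumed)
tokens_per_user_per_day = 1e6   # N: tokens processed per user per day (assumed, always-on assistant)
reasoning_multiplier = 30       # M: extra compute for chain-of-thought / agentic loops (assumed)
joules_per_flop = 0.7e-12       # ε: ~H100-class energy per FLOP (cited in the talk)
users = 1e9                     # assumed number of daily users

flops_per_token = 2 * params    # common transformer approximation: ~2 FLOPs per parameter per token
daily_energy_joules = (flops_per_token * tokens_per_user_per_day
                       * reasoning_multiplier * users * joules_per_flop)
average_power_watts = daily_energy_joules / 86_400        # seconds per day

grid_watts = 3e12               # rough average world electricity generation (~3 TW)
print(f"average power: {average_power_watts / 1e9:.0f} GW "
      f"({100 * average_power_watts / grid_watts:.0f}% of an assumed ~3 TW grid)")
```

Plugging in larger models, longer reasoning chains, or video-scale token counts pushes the result from a sizable fraction of the grid to several multiples of it, which is the scaling behavior the talk highlights.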


🎯 KEY THEMES

  • Energy-First AI Design: Shifting from performance and capability optimization to energy efficiency as the primary constraint
  • Probabilistic Computing Paradigm: Moving beyond deterministic computation to embrace probabilistic approaches more aligned with machine learning fundamentals
  • Hardware-Algorithm Co-Design: Creating new algorithms specifically designed to leverage novel hardware capabilities
  • Physical Limits Navigation: Working with fundamental physical constraints rather than against them
  • Sustainable AI Scaling: Making advanced AI accessible without requiring unsustainable energy infrastructure

⚖️ COMPARISON TO OTHER WORKS

| Comparison | Focus | Approach | Conclusion |
| --- | --- | --- | --- |
| vs. The Master Algorithm | Unified learning theory | Philosophical analysis | Pedro Domingos explores theoretical foundations of machine learning; Extropic focuses on practical hardware implementation of probabilistic approaches |
| vs. Deep Learning (Goodfellow et al.) | Neural network architectures | Comprehensive textbook | Standard deep learning reference; Extropic challenges the assumption that neural networks are the most efficient path for all AI tasks |
| vs. Neuromorphic Computing Research | Brain-inspired hardware | Biological modeling | Neuromorphic approaches mimic neural structures; Extropic leverages thermodynamic principles for probabilistic computing |
| vs. Energy-Efficient Computing Research | General computing efficiency | Hardware optimization | Traditional efficiency research focuses on deterministic computing; Extropic introduces a fundamentally new probabilistic paradigm |
| vs. Quantum Computing for ML | Quantum algorithms | Quantum mechanics | Quantum computing offers different computational advantages; Extropic pursues a classical probabilistic computing breakthrough |

💬 QUOTES

  1. “The tools that you’ve all started to use over the last few years really represent the first time that the average person is becoming a significant consumer of the world’s high performance computing resources.”

    Context: Introduction highlighting how AI has changed computing resource consumption patterns.
    Significance: Establishes why energy efficiency is becoming a critical constraint for AI development.

  2. “In the coming years it’s going to be how efficient is it, right? Whereas in the past the problems have more been centered around speed and capability.”

    Context: Discussing the shifting constraints in computing from speed to efficiency.
    Significance: Frames the fundamental problem that Extropic is addressing.

  3. “We’re building a new type of integrated circuit processor that leverages our novel probabilistic circuits to run new generative AI algorithms.”

    Context: Brief explanation of what Extropic is building.
    Significance: Summarizes the three-part innovation at the core of their approach.

  4. “If you want the chatbot to be expert level, which I think is getting closer to like Jensen’s AI coworker, um, then it’s just like I don’t I can’t even count how many zeros are there, right? It’s a lot a lot a lot a lot of power.”

    Context: Describing the energy requirements for expert-level AI assistants.
    Significance: Illustrates the scale of the energy problem in relatable terms.

  5. “The energy dissipated when you switch a digital logic gate from one state to another can’t get that much smaller than it is today operating [at] room temperature, right?”

    Context: Explaining why traditional computing efficiency improvements are reaching limits.
    Significance: Provides the physical basis for why new approaches are needed.

  6. “Machine learning is about fitting distributions and sampling from them, right? And to make that more concrete, we can look at a specific example, which is diffusion models.”

    Context: Explaining the fundamental nature of machine learning.
    Significance: Establishes why sampling-based hardware might be more appropriate for AI than current architectures.

  7. “Simulations of a chip we’re building now could be around 10,000 times more efficient than a VAE running on a GPU on some simple generative AI benchmark.”

    Context: Presenting early results from their research.
    Significance: Quantifies the potential efficiency gains of their approach.


📋 APPLICATIONS/HABITS

For AI Researchers and Engineers

Adopt Energy-Aware Design: Include energy consumption as a primary metric alongside accuracy and speed when evaluating AI systems and algorithms.

Explore Probabilistic Approaches: Investigate algorithms that leverage sampling and probabilistic computation rather than purely deterministic approaches.

Hardware-Algorithm Co-Design: Design new algorithms specifically for emerging hardware paradigms like probabilistic computing.

Energy Modeling: Use energy scaling models to predict and optimize the energy requirements of AI applications before deployment.

Hybrid System Design: Consider combining probabilistic and deterministic computing elements for optimal efficiency across different tasks.

For Hardware Designers and Architects

Leverage Thermal Effects: Design circuits that work with thermal noise rather than against it for probabilistic computing applications.

Probabilistic Circuit Development: Explore circuits that naturally sample from probability distributions using conventional manufacturing processes.

Energy-First Optimization: Prioritize energy efficiency over raw computational speed in AI-specific hardware design.

TSU Integration: Study how thermodynamic sampling units can complement existing GPU and CPU architectures.

Low-Voltage Design: Investigate transistor operation at voltages where thermal fluctuations enable probabilistic behavior.

For CTOs and Technology Leaders

Energy Budget Planning: Factor energy constraints into AI strategy and infrastructure planning, not just computational requirements.

Sustainable AI Investment: Evaluate AI investments based on energy efficiency and long-term sustainability, not just short-term performance gains.

Hardware Diversification: Consider probabilistic computing approaches alongside traditional GPU acceleration for future-proofing AI infrastructure.

Research Partnerships: Collaborate with companies like Extropic to access emerging energy-efficient AI technologies.

Infrastructure Assessment: Audit current AI energy consumption and plan for the exponential growth in energy demands.

For Product Managers and Business Leaders

Energy-Constrained Product Design: Design AI products with energy efficiency as a core requirement, especially for consumer-facing applications.

Ubiquitous AI Planning: Consider how energy-efficient AI could enable truly widespread adoption of AI assistants and tools.

Cost Modeling: Include energy costs in total cost of ownership calculations for AI systems and services.

Sustainability Marketing: Highlight energy efficiency as a competitive advantage in AI product positioning.

Regulatory Preparedness: Prepare for potential energy efficiency regulations in AI development and deployment.

Common Pitfalls to Avoid

Performance-Only Focus: Ignoring energy efficiency in favor of raw computational performance, leading to unsustainable scaling.

Hardware Lock-In: Committing to traditional GPU architectures without exploring probabilistic alternatives.

Short-Term Optimization: Optimizing for current energy costs without considering exponential growth in AI capabilities.

Algorithm Inertia: Sticking with established algorithms that aren’t optimized for energy-efficient hardware.

Infrastructure Underestimation: Underestimating the energy infrastructure required for advanced AI deployment.

How to Measure Success

Energy per Task: Track energy consumption per AI operation or inference rather than just operations per second.

Efficiency Gains: Compare energy efficiency improvements across different hardware and algorithm combinations.

Scaling Projections: Model energy requirements for different AI adoption scenarios and capability levels.

Total Cost Analysis: Include energy costs in comprehensive AI system cost-benefit analyses.

Sustainability Metrics: Monitor carbon footprint and environmental impact of AI systems alongside performance metrics.


📚 REFERENCES

Research and Studies

  • Extropic Research: Two years of prototype development in thermodynamic sampling units and probabilistic circuits
  • Energy-Based Models: Foundational work from the 1980s on energy functions and probability distributions
  • Transformer Energy Analysis: Studies on computational requirements and energy consumption of transformer models
  • Transistor Efficiency Limits: Research on fundamental thermodynamic constraints in semiconductor devices

Influential Technologies and Projects

  • Diffusion Models: Modern generative AI approaches that rely heavily on sampling from probability distributions
  • Gibbs Sampling: Statistical method for sampling from complex probability distributions through iterative procedures
  • VAE (Variational Autoencoders): Generative models that learn to sample from data distributions
  • H100 GPU: Widely deployed AI accelerator whose energy cost, roughly 0.7 picojoules per FLOP, serves as the talk's efficiency baseline

Technical Concepts

  • Thermodynamic Sampling: Using thermal fluctuations in transistors to generate probabilistic behavior
  • Probabilistic Circuits: Hardware that performs sampling operations rather than deterministic computation
  • Energy-Based Models: Machine learning models that work directly with energy functions and probability distributions
  • Low-Voltage Transistor Operation: Operating regime where thermal noise dominates electrical behavior

Industry Context

  • AI Energy Consumption Trends: Current and projected energy usage patterns in AI development and deployment
  • Power Grid Capacity: Global and regional electricity generation and distribution capabilities
  • Semiconductor Manufacturing: Current capabilities at foundries like TSMC for specialized chip production

⚠️ QUALITY & TRUSTWORTHINESS NOTES

Accuracy Check

Technical Claims: Physics principles and transistor behavior are accurately described based on established semiconductor physics. Energy calculations use reasonable assumptions and established formulas.

Research Status: Clearly presented as early-stage research with simulations rather than production hardware. Efficiency claims are qualified as potential rather than achieved.

Historical Context: Computing efficiency trends and AI energy consumption patterns align with industry knowledge and published research.

No Identified Errors: Technical explanations are consistent with known physics and engineering principles.

Bias Assessment

Company Affiliation: As Extropic CTO, speaker has clear commercial bias toward their technology approach, which is expected and disclosed.

Balanced Perspective: Acknowledges early stage of research, discusses challenges and limitations, and provides reasonable counterarguments to potential criticisms.

Evidence-Based Claims: Supports assertions with technical explanations, physics principles, and early research results rather than unsubstantiated marketing claims.

Transparent Limitations: Openly discusses the need for further development and acknowledges that results are preliminary.

Source Credibility

Domain Expertise: Demonstrates deep understanding of physics, machine learning, and hardware design with first-hand experience building prototypes.

Research Track Record: Extropic has published research and built working prototypes, establishing credibility in this specialized field.

Industry Validation: References established physics principles and industry-standard energy calculations that can be independently verified.

Academic Rigor: Presentation combines technical depth with clear explanations, maintaining scholarly standards appropriate for the field.

Transparency

Clear Affiliations: Openly identifies as Extropic CTO with commercial interests in the technology.

Research Stage Disclosure: Clearly states that results are from simulations and early prototypes, not production systems.

Technical Detail: Provides sufficient technical depth for expert evaluation while remaining accessible.

Balanced Assessment: Discusses both potential benefits and current limitations of the approach.

Potential Concerns

Early-Stage Claims: Efficiency projections are based on simulations and could change with real hardware implementation.

Commercial Interests: As expected from a company CTO, presentation promotes Extropic’s approach, though with appropriate caveats.

Technical Complexity: The approach requires significant paradigm shifts that may face practical implementation challenges.

Market Uncertainty: Probabilistic computing represents a fundamental shift that may not achieve widespread adoption.

Overall Assessment

Trustworthy for Technical Content: Claims are grounded in established physics and engineering principles with appropriate qualifications about research stage.

Valuable Primary Source: Essential viewing for those interested in AI energy efficiency and novel computing paradigms.

Balanced Presentation: While commercially affiliated, the presentation maintains technical integrity and transparency.

Recommendation: Treat as authoritative on the potential of probabilistic computing for AI energy efficiency, but supplement with independent research on implementation feasibility.


Crepi il lupo! 🐺