At the recent Nvidia GTC conference, the company unveiled what it described as the first single-rack system of servers capable of one exaflop: one billion billion, or a quintillion, floating-point operations per second (FLOPS). This breakthrough is based on the new GB200 NVL72 system, which incorporates Nvidia’s latest Blackwell graphics processing units (GPUs). A standard computer rack is about 6 feet tall, a little more than 3 feet deep and less than 2 feet wide.

Shrinking an exaflop: From Frontier to Blackwell
A couple of things about the announcement struck me. First, the world’s first exaflop-capable computer was installed only a few years ago, in 2022, at Oak Ridge National Laboratory. That machine, the “Frontier” supercomputer built by HPE and powered by AMD GPUs and CPUs, originally consisted of 74 racks of servers. The new Nvidia system delivers roughly 73X greater performance density just three years later, equivalent to roughly a fourfold gain every year. This advancement reflects remarkable progress in computing density, energy efficiency and architectural design.
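As a quick back-of-the-envelope check (my own arithmetic, not a figure Nvidia published), the implied annual growth factor is the cube root of that 73X improvement:

```python
# Implied annual growth factor from a ~73X improvement in rack-level
# performance density over three years (2022 to 2025).
annual_factor = 73 ** (1 / 3)
print(round(annual_factor, 2))  # ~4.18, i.e. roughly a fourfold gain per year
```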
Second, it needs to be said that while both systems hit the exascale milestone, they are built for different challenges: one is optimized for speed, the other for precision. Nvidia’s exaflop specification is based on lower-precision math, specifically 4-bit and 8-bit floating-point operations, considered optimal for AI workloads including tasks like training and running large language models (LLMs). These calculations prioritize speed over precision. By contrast, the exaflop rating for Frontier was achieved using 64-bit double-precision math, the gold standard for scientific simulations where accuracy is critical.
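To make that trade-off concrete, here is a small illustrative sketch (my own example, not code from either system). NumPy has no native FP8 or FP4 types, so half precision (float16) stands in for the low-precision math behind Nvidia’s rating, against the 64-bit double precision behind Frontier’s:

```python
import numpy as np

# Divide 1 by 3 at two precisions. float16 stands in for low-precision AI math
# (NumPy has no FP8/FP4 types); float64 is the double precision used on Frontier.
x64 = np.float64(1.0) / np.float64(3.0)
x16 = np.float16(1.0) / np.float16(3.0)

print(f"float64: {x64:.17f}")         # 0.33333333333333331
print(f"float16: {float(x16):.17f}")  # 0.33325195312500000

# The relative error is on the order of 1e-4: tolerable for many neural-network
# weights and activations, but ruinous for simulations in which rounding errors
# compound over billions of time steps.
print(f"relative error: {abs(float(x16) - x64) / x64:.1e}")
```

The speed side of the trade-off comes from the hardware: narrower operands need less memory bandwidth and less silicon per operation, which is a large part of how Blackwell packs so many more FLOPS into a single rack.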
We’ve come a long way (very quickly)
This level of progress seems almost unbelievable, especially as I recall the state of the art when I began my career in the computing industry. My first professional job was as a programmer on the DEC KL 1090. This machine, part of DEC’s PDP-10 series of timeshare mainframes, delivered about 1.8 million instructions per second (MIPS). Beyond its modest CPU performance, the machine connected to cathode ray tube (CRT) displays via hardwired cables. There were no graphics capabilities, just light text on a dark background. And of course, no Internet. Remote users connected over phone lines using modems running at speeds of up to 1,200 bits per second.
DEC System 10. Source: Joe Mabel, CC BY-SA 3.0.
500 billion times more compute
While comparing MIPS to FLOPS gives a general sense of progress, it is important to remember that these metrics measure different computing workloads. MIPS reflects integer processing speed, which is useful for general-purpose computing, particularly in business applications. FLOPS measures floating-point performance, which is crucial for scientific workloads and the heavy number-crunching behind modern AI, such as the matrix math and linear algebra used to train and run machine learning (ML) models.
While not a direct comparison, the sheer scale of the difference between MIPS then and FLOPS now provides a powerful illustration of the rapid growth in computing performance. Using these as a rough heuristic to measure work performed, the new Nvidia system is approximately 500 billion times more powerful than the DEC machine. That kind of leap exemplifies the exponential growth of computing power over a single professional career and raises the question: If this much progress is possible in 40 years, what might the next five bring?
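For what it is worth, that 500 billion figure falls straight out of the ratio, again as a rough heuristic rather than an apples-to-apples benchmark:

```python
# Rough heuristic only: MIPS (integer instructions) and FLOPS (floating-point
# operations) measure different kinds of work.
dec_kl1090 = 1.8e6    # ~1.8 million instructions per second
gb200_nvl72 = 1.0e18  # ~1 exaFLOP of low-precision AI math

print(f"{gb200_nvl72 / dec_kl1090:.1e}")  # ~5.6e+11, roughly 500 billion times
```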
Nvidia, for its part, has offered some clues. At GTC, the company shared a roadmap predicting that its next-generation full-rack system, based on the “Vera Rubin” Ultra architecture, will deliver 14X the performance of the Blackwell Ultra rack shipping this year, reaching somewhere between 14 and 15 exaflops of AI-optimized work within the next year or two.
Just as notable is the efficiency. Achieving this level of performance in a single rack means less physical space per unit of work, fewer materials and potentially lower energy use per operation, although the absolute power demands of these systems remain immense.
Does AI really need all that compute power?
While such performance gains are indeed impressive, the AI industry is now grappling with a fundamental question: How much computing power is truly necessary and at what cost? The race to build massive new AI data centers is being driven by the growing demands of exascale computing and ever-more capable AI models.
The most ambitious effort is the $500 billion Project Stargate, which envisions 20 data centers across the U.S., each spanning half a million square feet. A wave of other hyperscale projects is either underway or in planning stages around the world, as companies and countries scramble to ensure they have the infrastructure to support the AI workloads of tomorrow.
Some analysts now worry that we may be overbuilding AI data center capacity. Concern intensified after the release of R1, a reasoning model from China’s DeepSeek that requires significantly less compute than many of its peers. Microsoft later canceled leases with multiple data center providers, sparking speculation that it might be recalibrating its expectations for future AI infrastructure demand.
However, The Register suggested that this pullback may have more to do with some of the planned AI data centers lacking the power and cooling capacity that next-gen AI systems will require. Already, AI models are pushing the limits of what present infrastructure can support. MIT Technology Review reported that this may be why many data centers in China are struggling or failing, having been built to specifications that are not optimal for present needs, let alone those of the next few years.
AI inference demands more FLOPs
Reasoning models perform most of their work at runtime through a process known as inference. These models power some of the most advanced and resource-intensive applications today, including deep research assistants and the emerging wave of agentic AI systems.
While DeepSeek-R1 initially spooked the industry into thinking that future AI might require less computing power, Nvidia CEO Jensen Huang pushed back hard. Speaking to CNBC, he countered this perception: “It was the exact opposite conclusion that everybody had.” He added that reasoning AI consumes 100X more computing than non-reasoning AI.
As AI continues to evolve from reasoning models to autonomous agents and beyond, demand for computing is likely to surge once again. The next breakthroughs may come not just in language or vision, but in AI agent coordination, fusion simulations or even large-scale digital twins, each made possible by the kind of computing ability leap we have just witnessed.
Seemingly right on cue, OpenAI just announced $40 billion in new funding, the largest private tech funding round on record. The company said in a blog post that the funding “enables us to push the frontiers of AI research even further, scale our compute infrastructure and deliver increasingly powerful tools for the 500 million people who use ChatGPT every week.”
Why is so much capital flowing into AI? The reasons range from competitiveness to national security. But one particular factor stands out, as exemplified by a McKinsey headline: “AI could increase corporate profits by $4.4 trillion a year.”
What comes next? It’s anybody’s guess
At their core, information systems are about abstracting complexity, whether through an emergency vehicle routing system I once wrote in Fortran, a student achievement reporting tool built in COBOL or modern AI systems accelerating drug discovery. The goal has always been the same: to make greater sense of the world.
Now, with powerful AI beginning to appear, we are crossing a threshold. For the first time, we may have the computing power and the intelligence to tackle problems that were once beyond human reach.
New York Times columnist Kevin Roose recently captured this moment well: “Every week, I meet engineers and entrepreneurs working on AI who tell me that change — big change, world-shaking change, the kind of transformation we’ve never seen before — is just around the corner.” And that does not even count the breakthroughs that arrive each week.
Just in the past few days, we’ve seen OpenAI’s GPT-4o generate nearly perfect images from text, Google release what may be the most advanced reasoning model yet in Gemini 2.5 Pro and Runway unveil a video model with shot-to-shot character and scene consistency, something VentureBeat notes has eluded most AI video generators until now.
What comes next is truly a guess. We do not know whether powerful AI will be a breakthrough or breakdown, whether it will help solve fusion energy or unleash new biological risks. But with ever more FLOPS coming online over the next five years, one thing seems certain: Innovation will come fast — and with force. It is clear, too, that as FLOPS scale, so must our conversations about responsibility, regulation and restraint.
Gary Grossman is EVP of technology practice at Edelman and global lead of the Edelman AI Center of Excellence.