
DeepSeek’s blockbuster release of its R1 reasoning model on Jan. 20 unleashed a firestorm of discussion about the U.S./China technological rivalry and the wisdom of AI infrastructure spending, resulting in a sharp dip in the stock prices of many leading AI companies. But buried within all this controversy lies perhaps the least-discussed and most consequential trend in the journey toward artificial general intelligence: the increasing role of AI systems in designing, building, and refining their next-generation successors. Increasingly, AI is building AI.

In the paper accompanying the launch of R1, DeepSeek explained how it took advantage of techniques such as synthetic data generation, distillation, and machine-driven reinforcement learning to produce a model that matched or exceeded the state of the art. Each of these approaches amounts to harnessing the capabilities of an existing AI model to assist in the training of a more advanced one.

DeepSeek is far from alone in using these AI techniques to advance AI. Mark Zuckerberg has predicted that mid-level engineers at Meta may soon be replaced by AI counterparts and says that Llama 3 (his company’s LLM) “helps us experiment and iterate faster, building capabilities we want to refine and expand in Llama 4.” Nvidia CEO Jensen Huang has spoken at length about creating virtual environments in which AI systems supervise the training of robotic systems: “We can create multiple different multiverses, allowing robots to learn in parallel, possibly learning in 100,000 different ways at the same time.”

This isn’t quite the singularity, when intelligent machines autonomously self-replicate, but it is something new and potentially profound. Even amid such dizzying progress in AI models, it’s not uncommon to hear observers talk about a potential slowing of the so-called scaling laws: the observed principle that AI models improve in performance in direct relationship to the quantity of data, parameters, and compute applied to them. The release from DeepSeek, and several subsequent announcements from other companies, suggest that reports of the scaling laws’ demise may be greatly exaggerated. In fact, innovations in AI development are opening entirely new vectors for scaling, all enabled by AI itself. Progress isn’t slowing down; it’s speeding up, thanks to AI.
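For readers who want the quantitative version of that claim: one widely cited formulation of the scaling laws, from DeepMind’s 2022 “Chinchilla” paper (Hoffmann et al.), models a language model’s loss as a simple function of its size and training data. It is shown here only to illustrate the general form, not any particular company’s models:

$$ L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}} $$

Here N is the number of model parameters, D the number of training tokens, E an irreducible error floor, and A, B, α, and β empirically fitted constants. Loss falls, and capability rises, predictably as N and D grow; much of the “slowing” debate concerns what happens when the supply of high-quality human-written data (D) runs low, which is exactly the gap the techniques below aim to fill.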

Perhaps the oldest method of using AI to create AI is synthetic data: using data created by AI systems to further train and refine other AI systems. The term “synthetic data” implies that the generated data is somehow inferior to “organic” data (i.e., the contents of the internet). In practice, the opposite is proving true. Synthetic data generation allows AI systems to create realistic training examples tailored to specific domains or edge cases that are underrepresented in real-world datasets. It’s reasonable to be skeptical of synthetic data as a limitless scaling vector: one recent paper observed that models trained for a few successive rounds on their own generated data degraded quickly, a failure mode researchers call model collapse. Even with these limitations, the technique can accelerate innovation in areas where acquiring real data is impractical, such as medical imaging or modeling protein folding to discover new drugs.
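To make the idea concrete, here is a minimal sketch of a synthetic-data pipeline. The teacher_generate function is a stand-in for a call to any capable existing model and is stubbed out so the example runs; the prompts, quality filter, and file format are illustrative assumptions, not anyone’s published pipeline.

```python
import json

# Stand-in for a call to an existing "teacher" model (e.g., a large LLM).
# Stubbed out so the sketch runs; in practice this would be a model call.
def teacher_generate(prompt: str) -> str:
    canned = {
        "rare billing dispute": "Customer disputes a duplicate charge posted after a refund was already issued...",
        "ambiguous lab result": "Report notes borderline values that require a confirmatory follow-up test...",
    }
    return canned.get(prompt, "Example text for: " + prompt)

def passes_quality_filter(example: str) -> bool:
    # Real pipelines filter aggressively: deduplication, length checks,
    # and often a second "verifier" model grading each example.
    return len(example.split()) >= 5

# Edge cases that are underrepresented in organic (internet) data.
edge_case_prompts = ["rare billing dispute", "ambiguous lab result"]

synthetic_dataset = []
for prompt in edge_case_prompts:
    for _ in range(3):  # several samples per edge case
        text = teacher_generate(prompt)
        if passes_quality_filter(text):
            synthetic_dataset.append({"prompt": prompt, "text": text})

# Write the surviving examples out as training data for a future model.
with open("synthetic_train.jsonl", "w") as f:
    for row in synthetic_dataset:
        f.write(json.dumps(row) + "\n")

print(f"Generated {len(synthetic_dataset)} synthetic training examples")
```

The essential pattern: an existing model manufactures examples for cases the organic data rarely covers, an automated filter discards the weak ones, and the survivors become training data for the next model.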

Another key technique that DeepSeek’s release highlighted is model distillation, in which a large, computationally expensive model transfers its knowledge and capabilities to a smaller, more efficient one. This process spreads capabilities through open-source and open-weight models, and it lets companies offer high-performing models to more users as smaller, cheaper versions. By shrinking models without giving up much of their capability, distillation makes AI more accessible and applicable to a wider range of use cases.
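For the technically curious, here is a stripped-down sketch of the classic “soft-label” distillation recipe (in the spirit of Hinton et al., not DeepSeek’s specific procedure): a small student network is trained to match the temperature-softened output distribution of a large, frozen teacher alongside the ordinary labels. The toy networks and random data exist only to keep the sketch self-contained and runnable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-ins: a large pretrained "teacher" and a much smaller "student".
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0      # temperature: softens the teacher's probabilities
alpha = 0.5  # weight between soft (teacher) and hard (label) losses

# Random data in place of a real dataset.
x = torch.randn(64, 32)
y = torch.randint(0, 10, (64,))

for step in range(100):
    with torch.no_grad():
        teacher_logits = teacher(x)          # the teacher stays frozen
    student_logits = student(x)

    # Soft-label loss: the student matches the teacher's softened distribution.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

    # Hard-label loss: the ordinary supervised objective.
    hard_loss = F.cross_entropy(student_logits, y)

    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The teacher’s full probability distribution carries far more information per example than a hard label alone, which is why small students can inherit a surprising share of a large teacher’s ability. DeepSeek’s R1 paper describes a related but simpler recipe: fine-tuning smaller open models directly on reasoning traces generated by R1 itself.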

Imagine if every student began university with the accumulated knowledge of every student and professor who had gone before them. Now imagine that same student being invited to compete with hundreds of other virtual students, all with the same knowledge, all optimizing for a specific objective. This is the idea behind machine-driven reinforcement learning, a technique in which an AI improves itself through self-play, experimentation, and the refinement of its own reasoning. This method of learning has been instrumental in some of the most famous AI breakthroughs of our time, including AlphaGo’s triumph over the world’s best players of the ancient game of Go. By leveraging AI systems to create their own training curricula, we open an entirely new vector for scale, limited only by the capacity of ever-more-intelligent machines to discover new things.
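As a toy illustration of the underlying loop (a bare-bones policy-gradient learner on a bandit-style problem, not the algorithm any lab actually runs at scale): the system tries actions, receives a purely programmatic reward, and shifts probability toward whatever worked, with no human-labeled answers anywhere in the loop.

```python
import math
import random

random.seed(0)

# Toy problem: the learner must discover which of several strategies
# earns the highest reward, purely by trial and error.
true_success_rates = [0.2, 0.5, 0.8, 0.3]      # unknown to the learner
preferences = [0.0] * len(true_success_rates)  # learned policy parameters
lr = 0.1

def softmax(prefs):
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def sample(probs):
    r, cum = random.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

baseline = 0.0
for step in range(5000):
    probs = softmax(preferences)
    action = sample(probs)
    # Programmatic reward signal (e.g., "was the answer verifiably correct?").
    reward = 1.0 if random.random() < true_success_rates[action] else 0.0
    baseline += 0.01 * (reward - baseline)  # running average reduces variance

    # Policy-gradient update: reinforce actions that beat the baseline.
    for i in range(len(preferences)):
        indicator = 1.0 if i == action else 0.0
        preferences[i] += lr * (reward - baseline) * (indicator - probs[i])

print("Learned policy:", [round(p, 3) for p in softmax(preferences)])
```

DeepSeek’s R1 paper describes the same spirit at vastly greater scale, rewarding the model when, for instance, a math answer can be checked as correct; the verifiability of the reward is what removes humans from the bottleneck.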

One of the most remarkable applications of these techniques is Google Gemini’s “co-scientist,” a multi-agent AI system designed to act as a virtual scientific collaborator, replicating the process of the scientific method at superhuman scale and speed. Google’s AI co-scientist leverages what’s called test-time compute scaling (additional computation during the inference step) to simulate scientific reasoning, test competing hypotheses, and critique its own review process over time. That additional computation lets the model combine several of the approaches described above, including synthetic data, reinforcement learning, and the agentic coordination of multiple domain-specific models, to produce scientific results. It’s akin to having an army of the world’s best-educated scientists ceaselessly competing to discover new things, an army that never tires, never complains, and constantly improves. This approach is not, strictly speaking, an example of AI building AI, but it shows how these new vectors for scaling have the potential to transform innovation in other sectors.
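The general pattern behind such systems, spending extra inference-time compute on a propose-critique-rank-revise loop, can be sketched in a few dozen lines. The propose, critique, score, and refine functions below are stubs standing in for calls to real models; nothing here reflects Gemini’s actual internals.

```python
import random

random.seed(0)

# Stubs standing in for model calls; in a real system each would be an LLM
# (or a domain-specific model) invoked with additional inference-time compute.
def propose(topic: str, n: int) -> list[str]:
    return [f"Hypothesis {i} about {topic}" for i in range(n)]

def critique(hypothesis: str) -> str:
    return f"Critique of '{hypothesis}': check confounders and prior literature."

def score(hypothesis: str, review: str) -> float:
    return random.random()  # placeholder for a learned or rubric-based grader

def refine(hypothesis: str, review: str) -> str:
    return hypothesis + " (revised after review)"

topic = "a drug-repurposing candidate"
candidates = propose(topic, n=8)

# More inference-time compute = more rounds of critique and revision.
for round_ in range(3):
    reviewed = [(h, critique(h)) for h in candidates]
    ranked = sorted(reviewed, key=lambda hr: score(*hr), reverse=True)
    survivors = ranked[: max(1, len(ranked) // 2)]  # keep the strongest half
    candidates = [refine(h, r) for h, r in survivors]

print("Top surviving hypothesis:", candidates[0])
```

Each added round of critique and revision is, quite literally, more computation spent at inference time, which is why capability in this setting scales with the compute budget rather than with new training data.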

Now imagine an army of computer scientists whose sole goal is to make LLMs faster to develop and run. That’s what Tokyo-based Sakana AI recently announced: an “AI CUDA Engineer,” a fully automated multi-agent framework for optimizing CUDA kernels, the low-level functions that run on Nvidia GPUs. In other words, it is an AI system that makes other AI systems run faster, producing kernels the company says are 10 to 100 times faster than common existing implementations. AI is building AI at an ever-faster rate.
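Sakana has not published the code behind its framework, but the general shape of such a system is an automated generate-verify-benchmark loop. The sketch below stubs out the kernel-writing model, the correctness check, and the benchmark, so every name in it is a placeholder rather than Sakana’s actual pipeline.

```python
import random

random.seed(0)

# Placeholders for the real components: an LLM that drafts CUDA kernels,
# a compile-and-correctness harness, and a benchmark on real GPUs.
def draft_kernel(spec: str, feedback: str) -> str:
    return f"// candidate CUDA kernel for {spec} ({feedback})"

def is_correct(kernel_src: str) -> bool:
    return random.random() > 0.3     # stand-in for compile + numerical checks

def benchmark_ms(kernel_src: str) -> float:
    return random.uniform(0.5, 5.0)  # stand-in for timing on real hardware

spec = "fused softmax over rows of a matrix"
baseline_ms = 4.0   # runtime of the unoptimized reference implementation
best_src, best_ms = None, float("inf")
feedback = "first attempt"

# Automated optimization loop: propose, verify, benchmark, keep the best.
for attempt in range(20):
    candidate = draft_kernel(spec, feedback)
    if not is_correct(candidate):
        feedback = "previous kernel failed correctness checks"
        continue
    ms = benchmark_ms(candidate)
    if ms < best_ms:
        best_src, best_ms = candidate, ms
        feedback = f"best so far runs in {ms:.2f} ms; try to go faster"

print("Best candidate:", best_src)
print(f"Speedup over baseline: {baseline_ms / best_ms:.1f}x")
```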

We must accept our inability to perfectly predict how these AI systems will develop and what innovations they might unlock. Most innovations are born of trial and error over time, often across many years or decades. These AI systems replicate that trial-and-error process through ceaseless experimentation at astounding scale. We scarcely have a conception of what capabilities, even what creativity, might emerge from AI systems as they tackle computation and reasoning at levels that may soon surpass anything a human can even imagine.

Observers of technological progress are in for a wild ride these next few years. Even the very notion of the “innovator” will change as more breakthroughs come not from a single individual’s achievement or discovery, but from AI systems endlessly iterating. For the past several decades, humans have contributed to the field of computer science and artificial intelligence with the hope of creating AI systems that can replicate the best of human knowledge and reason. But recent developments in the field of AI suggest that we may be approaching a moment when the AI systems we have created progressively bootstrap themselves to build their successors. The next great inventors—those who discover the next critical medical treatment, create new materials, unlock the mysteries of the cosmos or the atom—may not be human at all.

The opinions expressed in Fortune.com commentary pieces are solely the views of their authors and do not necessarily reflect the opinions and beliefs of Fortune.
