The Godmother of AI on jobs, robots & why world models are next

Few individuals have shaped the modern technological landscape more profoundly than Dr. Fei-Fei Li. Often dubbed the “godmother of AI,” Dr. Li is the co-creator of the ImageNet dataset, a pivotal work that helped usher in the era of deep learning. Today, as the co-director of the Stanford Human-Centered AI Institute and the co-founder of the frontier model company World Labs, she is pushing AI beyond language, championing a new generation of models focused on spatial intelligence, known as World Models.
In a recent interview, Dr. Li shared her journey from the depths of the “AI Winter” to the current explosion of computational power, offering a humanist’s perspective on the technology’s future and revealing why she believes that understanding the 3D world is the key to unlocking true intelligence.
The Spark of a Revolution: From AI Winter to ImageNet
The journey to the current AI boom was not linear. Dr. Li recounted how, in the late 1990s and early 2000s, the field was mired in the “AI Winter”—a period of skepticism and low funding. Early AI research, dating back to the 1950s, focused on rule-based or small-scale neural network models, but these approaches quickly plateaued.
Dr. Li, who focused on visual intelligence, recognized a critical flaw in the existing paradigm: the models lacked the sheer volume of data required to learn patterns the way a human or animal does.
“It dawned on me that human learning as well as evolution is actually a big data learning process,” she explained. This realization led her and her students to embark on the ambitious ImageNet project in 2006. The goal was to build a comprehensive visual database by curating 15 million labeled images across a taxonomy of 22,000 object concepts from the internet.
The scale and meticulous labeling of ImageNet provided the essential training data that had been missing for decades. It was an early lesson in the power of data, and it fits what Rich Sutton later called the “bitter lesson”: general methods that scale with data and computation tend to outperform approaches built on hand-crafted human knowledge.
The Golden Recipe and the Deep Learning Era
The true shift occurred in 2012 at the annual ImageNet challenge. A group of researchers, led by Professor Geoffrey Hinton, combined three ingredients:

ImageNet Data: The massive, clean, labeled dataset.
Neural Networks (Deep Learning): A deep neural network, an older family of algorithms made far deeper and more powerful.
GPU Compute: The use of two Nvidia GPUs—graphics cards initially designed for gaming—to handle the massive parallel computations.
This combination, which Dr. Li calls the “golden recipe for modern AI,” became the blueprint for every major AI model that followed, including today’s most sophisticated LLMs. She notes that less than a decade ago, calling a company an “AI company” was avoided, sometimes seen as a “dirty word”. Now, every company is an AI company, a change she finds deeply satisfying as someone who has dedicated her life to the field.
A Humanist’s Vision: The Future of AI and AGI
Dr. Li is an optimist, but her optimism is framed by humanism. “I’m not a utopian… I believe that whatever AI does currently or in the future is up to us,” she asserts. For her, technology is a double-edged sword, and the key to harnessing its power lies in individual and collective responsibility.
She also offers a grounded perspective on the pursuit of AGI (Artificial General Intelligence). She argues the term is often “more a marketing term than a scientific term”. While LLMs have performed astonishing feats in language, they are still far from true human-level intelligence.
Dr. Li challenges the notion that simply scaling up current models will achieve AGI, stating, “I definitely think we need more innovations”. She points to capabilities that today’s AI still lacks:
Abstraction and Creativity: AI cannot yet match the profound creative leaps of figures like Isaac Newton, who derived the laws of motion from observations of moving bodies.
Embodied and Emotional Intelligence: Current models struggle with real-world tasks that people handle naturally, such as making sense of a chaotic emergency scene or holding a deep conversation about motivation and passion.
This deficit in real-world understanding is what drives her to the next big leap.
The Next Frontier: World Models and Spatial Intelligence
Dr. Li believes the most critical missing ingredient for the next stage of AI is spatial intelligence, and the solution is World Models.
She contrasts this approach with LLMs: “The world is not passively watching videos passing by”. Just as the prisoners in Plato’s allegory of the cave had to make sense of a 3D world from 2D shadows, AI needs to understand the deep spatial structure and dynamics of the environment.
A World Model is defined as a foundation that allows an agent—a human or a robot—to create, reason, and interact within a deeply spatial world.
For Robots: World Models are the “key missing piece” for embodied AI. Robots need this technology to plan paths, manipulate objects, and function in the complexity of the 3D world, which is vastly more challenging than simpler tasks like autonomous driving.
For Humans: Beyond robotics, World Models will augment human agents in fields like scientific discovery, where she cites the deduction of DNA’s 3D double helix structure by Watson and Crick from 2D X-ray diffraction images, and in creative disciplines like design and art.
Bringing Worlds to Life: The Launch of Marble
To realize this vision, Dr. Li co-founded World Labs, a frontier model company anchored in deep tech research. Their first product, Marble, is an application built upon their large world model.
Marble is described as one of the world’s first models that can output genuinely 3D worlds from a simple text or image prompt—a process they call “prompt to worlds”. Users can not only generate environments but also navigate them immersively, allowing for interaction and planning.
The applications being discovered are vast:
Virtual Production (VFX): Directors are using Marble to generate 3D scenes quickly, drastically cutting production time.
Robotics Simulation: Researchers are creating diverse, synthetic environments for training complex robot tasks.
Therapy and Psychology: The ability to generate specific, immersive scenes is being explored by psychologists for research, potentially aiding in exposure therapy and the study of patient responses to different environments.
Guiding the Way: Human-Centered AI and Legacy
Dr. Li’s commitment to guiding the direction of AI extends beyond her research and commercial work. After returning to Stanford in 2018, she co-founded the Human-Centered AI Institute (HAI), which launched in 2019.
The core mission of HAI is to create an AI framework anchored in human benevolence, addressing the technology’s impact on society, policy, and education. The institute has become a thought leader, supporting interdisciplinary research and engaging in policy work, including advocating for national investment in AI research and contributing to regulatory discussions.
Dr. Li’s final message is directed at everyone, not just engineers: “Everybody has a role in AI”. She urges people to see AI as a tool to augment human dignity and agency, whether they are musicians, nurses, teachers, or farmers. The technology should not dictate the future, but rather be shaped by the human needs, passions, and unique voices of its users.
Dr. Li’s career is a testament to the power of fearless intellectual curiosity. She took the simple, yet profound, idea that AI needs more data very seriously, and now she is doing the same with the idea that AI needs to understand the physical world. Her work continues to redefine the boundaries of artificial intelligence, ensuring that as machines get smarter, they remain tethered to the human experience.

About The Author

LAJONG11

Discover more from NEWS NEST