Dwarkesh Podcast
Jeff Dean & Noam Shazeer — 25 years at Google: from PageRank to AGI
Deeply researched interviews · www.dwarkesh.com
Show Notes
This week I welcome on the show two of the most important technologists ever, in any field.
Jeff Dean is Google's Chief Scientist, and over 25 years at the company he has worked on many of the most transformative systems in modern computing: from MapReduce, BigTable, and TensorFlow to AlphaChip and Gemini.
Noam Shazeer invented or co-invented many of the main architectures and techniques used in modern LLMs: from the Transformer itself, to Mixture of Experts, to Mesh TensorFlow, to Gemini, and much more.
We talk about their 25 years at Google, going from PageRank to MapReduce to the Transformer to MoEs to AlphaChip – and maybe soon to ASI.
My favorite part was Jeff's vision for Pathways, Google's grand plan for a mutually reinforcing loop of hardware and algorithmic design, and for going past autoregression. That culminates in us imagining *all* of Google-the-company going through one huge MoE model.
And Noam just bites every bullet: 100x world GDP soon; let's get a million automated researchers running in the Google datacenter; living to see the year 3000.

Watch on YouTube; listen on Apple Podcasts or Spotify.
Sponsors
Scale partners with major AI labs like Meta, Google DeepMind, and OpenAI. Through Scale's Data Foundry, labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you're an AI researcher or engineer, learn how Scale's Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh
Curious how Jane Street teaches its new traders? They use Figgie, a rapid-fire card game that simulates the most exciting parts of markets and trading. It's become so popular that Jane Street hosts an inter-office Figgie championship every year. Download it from the app store or play on your desktop at figgie.com
Meter wants to radically improve the digital world we take for granted. They're developing a foundation model that automates network management end-to-end. To do this, they just announced a long-term partnership with Microsoft for tens of thousands of GPUs, and they're recruiting a world-class AI research team. To learn more, go to meter.com/dwarkesh
To sponsor a future episode, visit dwarkeshpatel.com/p/advertise
Timestamps
Intro
Joining Google in 1999
Future of Moore's Law
Future TPUs
Jeff’s undergrad thesis: parallel backprop
LLMs in 2007
“Holy s**t” moments
AI fulfills Google’s original mission
Doing Search in-context
The internal coding model
What will 2027 models do?
A new architecture every day?
Automated chip design and intelligence explosion
Future of inference scaling
Already doing multi-datacenter runs
Debugging at scale
Fast takeoff and superalignment
A million evil Jeff Deans
Fun times at Google
World compute demand in 2030
Getting back to modularity
Keeping a giga-MoE in-memory
All of Google in one model
What’s missing from distillation
Open research, pros and cons
Going the distance
Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe