4.0s
Tesla, Andrej Karpathy. [Music] Hello.
22.8s
Okay. So I'm excited to be here today to talk to you about software in the era of AI. I'm told that many of you are students (bachelor's, master's, PhD and so on) and you're about to enter the industry. And I think it's actually an extremely unique and very interesting time to enter the
39.0s
industry right now. I think fundamentally the reason for that is that software is changing again. I say again because I actually gave this talk already, but the problem is that software keeps changing, so I actually have a lot of material to create new talks. And I think it's changing quite fundamentally. I think
58.2s
roughly speaking, software has not changed much on such a fundamental level for 70 years, and then it's changed, I think, about twice quite rapidly in the last few years. So there's just a huge amount of work to do, a huge amount of software to write and rewrite. So let's take a look at maybe the realm of
74.2s
software. If we think of this as the map of software, this is a really cool tool called Map of GitHub. This is kind of like all the software that's written. These are instructions to the computer for carrying out tasks in the digital space. If you zoom in here, these are all
88.0s
different kinds of repositories, and this is all the code that has been written. A few years ago I observed that software was changing and there was a new type of software around, and I called this Software 2.0 at the time. The idea
102.3s
here was that Software 1.0 is the code you write for the computer. Software 2.0 is basically neural networks, and in particular the weights of a neural network. You're not writing this code directly; you are more like tuning the datasets and then running an optimizer to
118.4s
create the parameters of this neural net. I think at the time neural nets were kind of seen as just a different kind of classifier, like a decision tree or something like that, and so I think this framing was a lot more appropriate. And now what we
132.2s
have is kind of like an equivalent of GitHub in the realm of Software 2.0. I think Hugging Face is basically the equivalent of GitHub in Software 2.0. There's also Model Atlas, and you can visualize all the code written there. In case you're curious, by the way, the giant circle, the point in the middle,
148.3s
these are the parameters of Flux, the image generator. Anytime someone fine-tunes on top of a Flux model, they basically create a git commit in this space and create a different kind of image generator. So basically what we have is: Software 1.0 is the computer code that programs a computer. Software
165.9s
2.0 is the weights, which program neural networks. And here's an example of the AlexNet image recognizer neural network. Now, so far all of the neural networks that we've been familiar with until recently were kind of like fixed-function computers: image to categories or something like that. I think what's changed, and I think it's a quite
185.2s
fundamental change, is that neural networks became programmable with large language models. I see this as quite new and unique. It's a new kind of computer, so in my mind it's worth giving it a new designation: Software 3.0. Basically, your prompts are now programs that program the LLM.
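To make the three paradigms concrete, here is a minimal sketch of one task, sentiment classification, written three ways. Everything in it is illustrative and assumed rather than taken from the talk: the word lists, the four-example dataset, the function names, and the prompt wording are all hypothetical, and the Software 3.0 part only builds the prompt text without calling any particular LLM.

```python
import math

# Software 1.0: explicit rules written by hand (hypothetical word lists).
POSITIVE = {"great", "love", "good", "amazing"}
NEGATIVE = {"bad", "awful", "hate", "terrible"}

def classify_v1(text):
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    # Ties (score == 0) default to "positive" in this toy version.
    return "positive" if score >= 0 else "negative"

# Software 2.0: nobody writes the weights; an optimizer finds them from data.
DATASET = [  # toy, made-up examples: (text, label) with 1 = positive
    ("i love this movie", 1),
    ("what a great film", 1),
    ("this was awful", 0),
    ("i hate the ending", 0),
]

def train_v2(dataset, epochs=200, lr=0.5):
    """Fit a bag-of-words logistic regression with plain SGD."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in dataset:
            words = text.split()
            z = bias + sum(weights.get(w, 0.0) for w in words)
            p = 1.0 / (1.0 + math.exp(-z))   # predicted P(positive)
            grad = p - label                  # gradient of log loss w.r.t. z
            bias -= lr * grad
            for w in words:
                weights[w] = weights.get(w, 0.0) - lr * grad
    return weights, bias

def classify_v2(text, weights, bias):
    z = bias + sum(weights.get(w, 0.0) for w in text.lower().split())
    return "positive" if z >= 0 else "negative"

# Software 3.0: the "program" is an English few-shot prompt for an LLM.
def build_prompt_v3(text):
    return (
        "Classify the sentiment of each review as positive or negative.\n"
        "Review: I love this movie\nSentiment: positive\n"
        "Review: This was awful\nSentiment: negative\n"
        f"Review: {text}\nSentiment:"
    )

if __name__ == "__main__":
    weights, bias = train_v2(DATASET)
    print(classify_v1("what a great film"))
    print(classify_v2("i hate the ending", weights, bias))
    print(build_prompt_v3("The plot was terrible"))
```

Changing behavior means something different in each version: editing the rules in 1.0, editing the dataset and retraining in 2.0, and editing the English text of the prompt in 3.0.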
205.7s
And remarkably, these prompts are written in English, so it's kind of a very interesting programming language. So maybe to summarize the difference: if you're doing sentiment classification, for example, you can imagine writing some amount of Python to do sentiment classification, or you can train a neural
226.0s
net, or you can prompt a large language model. So here, this is a few-shot prompt, and you can imagine changing it and programming the computer in a slightly different way. So basically we have Software 1.0 and Software 2.0, and I think we're seeing (maybe you've seen) that a lot of GitHub code is not just code
241.9s
anymore. There's a bunch of English interspersed with code, and so I think there's a growing category of a new kind of code. So not only is it a new programming paradigm, it's also remarkable to me that it's in our native language of English. And so when this
254.9s
blew my mind, I guess a few years ago now, I tweeted this, and I think it captured the attention of a lot of people. This is my currently pinned tweet: that remarkably, we're now programming computers in English. Now, when I was at Tesla, we were
271.6s
working on the Autopilot, and we were trying to get the car to drive. I showed this slide at the time, where you can imagine that the inputs to the car are on the bottom and they're going through a software stack to produce the steering and acceleration,
287.0s
and I made the observation at the time that there was a ton of C++ code around in the Autopilot, which was the Software 1.0 code, and then there were some neural nets in there doing image recognition. I observed that over time, as we made the Autopilot better, basically the neural network grew in
302.7s
capability and size, and in addition to that, all the C++ code was being deleted, and a lot of the capabilities and functionality that were originally written in 1.0 were migrated to 2.0. So as an example, a lot of the stitching up of information across images from the different cameras
322.6s
and across time was done by a neural network, and we were able to delete a lot of code. And so the Software 2.0 stack quite literally ate through the software stack of the Autopilot. I thought this was really remarkable at the time, and I think we're seeing the same thing
337.0s
again, where basically we have a new kind of software and it's eating through the stack. We have three completely different programming paradigms, and I think if you're entering the industry it's a very good idea to be fluent in all of them, because they all have slight pros and cons, and you may want to
350.8s
program some functionality in 1.0 or 2.0 or 3.0. Are you going to train a neural net? Are you going to just prompt an LLM? Should this be a piece of code that's explicit? So we all have to make these decisions and potentially fluidly transition between these paradigms. So what I
366.8s
wanted to get into now is, in the first part, to talk about LLMs and how to think of this new paradigm and the ecosystem and what that looks like. What is this new computer? What does it look like, and what does the ecosystem look
380.2s
like? I was struck by this quote from Andrew Ng, actually many years ago now, I think, and I think Andrew is going to be speaking right after me. He said at the time, "AI is the new electricity," and I do think that it captures something very interesting, in that LLMs certainly feel
396.7s
like they have properties of utilities right now. LLM labs like OpenAI, Gemini, Anthropic, etc. spend capex to train the LLMs, and this is kind of equivalent to building out a grid. Then there's opex to serve that intelligence over APIs to all of us, and this is done through metered access, where we pay per
418.6s
million tokens or something like that, and we have a lot of demands that are very utility-like demands out of this API: we demand low latency, high uptime, consistent quality, etc. In electricity, you would have a transfer switch, so you can transfer your electricity source between grid, solar, battery, or
434.4s
generator. With LLMs, we have maybe OpenRouter, and can easily switch between the different types of LLMs that exist. Because the LLMs are software, they don't compete for physical space, so it's okay to have basically six electricity providers and switch between them, right? Because they don't compete in such a direct way. And I think what's
451.9s
also a little fascinating, and we saw this in the last few days, actually: a lot of the LLMs went down, and people were kind of stuck and unable to work. I think it's fascinating to me that when the state-of-the-art LLMs go down, it's actually kind of like an intelligence brownout in the world.
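The transfer-switch idea can be sketched in a few lines: try one provider, and if it is down, fall over to the next. This is only an illustration under stated assumptions. The `Provider` type, the `complete_with_fallback` function, and the provider names are hypothetical stand-ins, not the API of OpenRouter or of any real LLM service; a real client would wrap actual network calls behind the same shape.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical provider interface: takes a prompt, returns a completion,
# and raises an exception when the service is unavailable.
Provider = Callable[[str], str]

def complete_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Return (provider_name, completion) from the first provider that answers."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # an outage, a timeout, a rate limit, ...
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

if __name__ == "__main__":
    def provider_down(prompt: str) -> str:
        raise ConnectionError("service unavailable")

    def provider_up(prompt: str) -> str:
        return "echo: " + prompt

    used, answer = complete_with_fallback(
        "hello", [("primary", provider_down), ("backup", provider_up)]
    )
    print(used, answer)
```

Because the providers are software behind similar interfaces, the switch is just an ordered list, which is the sense in which they do not compete for physical space the way grid connections do.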
467.8s
It's kind of like when the voltage is unreliable in the grid, and the planet just gets dumber, the more reliance we have on these models, which is already really dramatic and I think will continue to grow. But LLMs don't only have properties of utilities. I think it's also fair to say that they have
483.5s
some properties of fabs. The reason for this is that the capex required for building LLMs is actually quite large. It's not just like building some power station or something like that, right? You're investing a huge amount of money, and I think the tech tree for the technology is growing quite
502.5s
rapidly. So we're in a world where we have deep tech trees, research and development secrets that are centralizing inside the LLM labs. But I think the analogy muddies a little bit also because, as I mentioned, this is software, and software is a bit less defensible because it is so malleable.
521.0s
And so I think it's just an interesting thing to think about, potentially. There are many analogies you can make: a 4-nanometer process node maybe is something like a cluster with a certain max FLOPS. You can think about when you're using Nvidia GPUs and you're only doing the software and
536.1s
you're not doing the hardware: that's kind of like the fabless model. But if you're actually also building your own hardware and you're training on TPUs, if you're Google, that's kind of like the Intel model, where you own your fab. So I think there are some analogies here that make sense. But actually, I think the
548.2s
analogy that makes the most sense perhaps is that, in my mind, LLMs have very strong analogies to operating systems, in that this is not just electricity or water. It's not something that comes out of the tap as a commodity. These are now increasingly complex software ecosystems,
565.9s
right? So they're not just simple commodities like electricity. And it's kind of interesting to me that the ecosystem is shaping up in a very similar kind of way, where you have a few closed-source providers like Windows or macOS, and then you have an open-source alternative like Linux. And I think for
582.7s
LLMs as well, we have a few competing closed-source providers, and then maybe the Llama ecosystem is currently a close approximation to something that may grow into something like Linux. Again, I think it's still very early, because these are just simple LLMs, but we're starting to see that these are
599.6s
going to get a lot more complicated. It's not just about the LLM itself; it's about all the tool use and the multimodalities and how all of that works. And so when I had this realization a while back, I tried to sketch it out, and it kind of seemed to
611.2s
me like LLMs are kind of like a new operating system, right? So the LLM is a new kind of computer. It's kind of like the CPU equivalent. The context windows are kind of like the memory, and then the LLM is orchestrating memory and compute for problem solving, using all of these
629.8s
capabilities here. And so definitely, if you look at it, it looks very much like an operating system from that perspective. A few more analogies. For example, if you want to download an app, say I go to VS Code and I go to download, you can download VS Code and you can run it on
646.2s
Windows, Linux, or Mac. In the same way, you can take an LLM app like Cursor and you can run it on GPT or Claude or the Gemini series, right? It's just a dropdown. So it's kind of similar in that way as well. Another analogy that I think strikes me
662.4s
is that we're kind of in this 1960s-ish era, where LLM compute is still very expensive for this new kind of computer, and that forces the LLMs to be centralized in the cloud, and we're all just sort of thin clients that interact with it over the network, and
680.3s
none of us have full utilization of these computers, and therefore it makes sense to use time sharing, where we're all just, you know, a dimension of the batch when they're running the computer in the cloud. And this is very much what computers used to look like during this time. The operating systems were in
695.0s
the cloud. Everything was streamed around, and there was batching. And so the personal computing revolution hasn't happened yet, because it's just not economical. It doesn't make sense. But I think some people are trying, and it turns out that Mac minis, for example, are a very good fit for some of
710.4s
the LLMs, because if you're doing batch-one inference, this is all super memory bound, so this actually works. I think these are some early indications, maybe, of personal computing, but this hasn't really happened yet. It's not clear what this looks like.
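A rough way to see why batch-one inference is memory bound: to generate each token, a dense model has to read essentially all of its weights from memory once, so decoding speed is capped at about memory bandwidth divided by model size in bytes. The sketch below applies that back-of-the-envelope bound. The function name and all of the numbers are hypothetical placeholders, not measurements of any real machine or model, and the bound ignores compute, the KV cache, and other overheads.

```python
def batch_one_tokens_per_second(num_params, bytes_per_param, memory_bandwidth_bytes_per_s):
    """Upper bound on batch-one decode speed for a dense model.

    Assumes every weight is read from memory once per generated token,
    so throughput <= bandwidth / bytes read per token.
    """
    bytes_read_per_token = num_params * bytes_per_param
    return memory_bandwidth_bytes_per_s / bytes_read_per_token

if __name__ == "__main__":
    # Hypothetical numbers: an 8-billion-parameter model on a machine
    # with 100 GB/s of memory bandwidth.
    bandwidth = 100e9
    print(batch_one_tokens_per_second(8e9, 2.0, bandwidth))   # 16-bit weights
    print(batch_one_tokens_per_second(8e9, 0.5, bandwidth))   # 4-bit weights
```

Under these assumptions the limit scales with memory bandwidth and inversely with bytes per weight, which is why quantized models on high-bandwidth consumer hardware are a plausible fit for single-user inference.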
723.5s
Maybe some of you get to invent what this is, or how it works, or what this should be. Maybe one more analogy that I'll mention is that whenever I talk to ChatGPT or some LLM directly in text, I feel like I'm talking to an operating system through the terminal. Like, it's just
741.0s
text. It's direct access to the operating system. And I think a GUI hasn't yet really been invented in a general way. Like, should ChatGPT have a GUI different than just text bubbles? Certainly some of the apps that we're going to go into in a bit have a GUI, but there's no GUI
758.5s
across all the tasks, if that makes sense. There are some ways in which LLMs are different from operating systems, in some fairly unique ways, and from early computing. I wrote about this one particular property that strikes me as very different this time around. It's that LLMs flip the direction
779.8s
of technology diffusion that is usually present in technology. So for example, with electricity, cryptography, computing, flight, internet, GPS, lots of new transformative technologies that had not been around, typically it is the government and corporations that are the first users, because it's new and expensive, etc., and it only later diffuses to consumers. But I feel
800.7s
like LLMs are kind of flipped around. So maybe with early computers, it was all about ballistics and military use, but with LLMs, it's all about how do you boil an egg or something like that. This is certainly a lot of my use. And so it's really fascinating to me that we have a new magical computer
815.6s
and it's helping me boil an egg. It's not helping the government do something really crazy, like some military ballistics or some special technology. Indeed, corporations and governments are lagging behind the adoption of all of us, of all of these technologies. So it's just backwards, and I think it informs maybe some of the
830.5s
uses of how we want to use this technology, or where some of the first apps are, and so on. So, in summary so far: LLM labs fab LLMs. I think that's accurate language to use. But LLMs are complicated operating systems. They're circa 1960s in computing, and we're redoing computing all over again. And they're currently available via time
851.8s
sharing and distributed like a utility. What is new and unprecedented is that they're not in the hands of a few governments and corporations. They're in the hands of all of us, because we all have a computer, and it's all just software, and ChatGPT was beamed down to our computers, to billions of people,
853.8s
What is new and unprecedented is that
856.0s
they're not in the hands of a few
857.4s
governments and corporations. They're in
858.9s
the hands of all of us because we all
860.2s
have a computer and it's all just
861.6s
software and ChatGPT was beamed down to
864.3s
our computers like billions of people
866.6s
like instantly and overnight and this is insane. Uh and it's kind of insane to me that this is the case and now it is our time to enter the industry and program these computers. This is crazy. So I think this is quite remarkable. Before we program LLMs, we have to kind of like
868.3s
insane. Uh and it's kind of insane to me
870.9s
that this is the case and now it is our
873.3s
time to enter the industry and program
875.0s
these computers. This is crazy. So I
877.3s
think this is quite remarkable. Before
879.7s
we program LLMs, we have to kind of like
882.1s
spend some time to think about what these things are. And I especially like to kind of talk about their psychology. So the way I like to think about LLMs is that they're kind of like people spirits. Um they are stochastic simulations of people. Um and the simulator in this case happens to be an autoregressive transformer. So
883.5s
these things are. And I especially like
885.8s
to kind of talk about their psychology.
888.3s
So the way I like to think about LLMs is
890.5s
that they're kind of like people
891.5s
spirits. Um they are stochastic
894.1s
simulations of people. Um and the
896.4s
simulator in this case happens to be an
898.0s
autoregressive transformer. So
899.8s
transformer is a neural net. Uh it's and it just kind of like is goes on the level of tokens. It goes chunk chunk chunk chunk chunk. And there's an almost equal amount of compute for every single chunk. Um and um this simulator of course is is just is basically there's some weights involved and we fit it to
902.7s
it just kind of like is goes on the
904.8s
level of tokens. It goes chunk chunk
906.5s
chunk chunk chunk. And there's an almost
908.3s
equal amount of compute for every single
910.2s
chunk. Um and um this simulator of
914.7s
course is is just is basically there's
917.0s
some weights involved and we fit it to
919.0s
all of text that we have on the internet and so on. And you end up with this kind of a simulator and because it is trained on humans, it's got this emergent psychology that is humanlike. So the first thing you'll notice is of course uh LLMs have encyclopedic knowledge and memory. Uh and they can remember lots of
920.5s
and so on. And you end up with this kind
922.2s
of a simulator and because it is trained
924.2s
on humans, it's got this emergent
926.2s
psychology that is humanlike. So the
928.4s
first thing you'll notice is of course
930.6s
uh LLMs have encyclopedic knowledge and
932.6s
memory. uh and they can remember lots of
934.6s
things, a lot more than any single individual human can because they read so many things. It's it actually kind of reminds me of this movie Rain Man, which I actually really recommend people watch. It's an amazing movie. I love this movie. Um and Dustin Hoffman here is an autistic savant who has almost
936.1s
individual human can because they read
937.6s
so many things. It's it actually kind of
939.8s
reminds me of this movie Rain Man, which
941.7s
I actually really recommend people
943.0s
watch. It's an amazing movie. I love
944.5s
this movie. Um and Dustin Hoffman here
946.7s
is an autistic savant who has almost
949.2s
perfect memory. So, he can read a he can read like a phone book and remember all of the names and phone numbers. And I kind of feel like LLMs are kind of like very similar. They can remember SHA hashes and lots of different kinds of things very very easily. So they certainly have superpowers in some set
951.6s
read like a phone book and remember all
953.3s
of the names and phone numbers. And I
955.4s
kind of feel like LLMs are kind of like
957.2s
very similar. They can remember SHA
959.0s
hashes and lots of different kinds of
960.4s
things very very easily. So they
962.5s
certainly have superpowers in some set
964.4s
in some respects. But they also have a bunch of I would say cognitive deficits. So they hallucinate quite a bit. Um and they kind of make up stuff and don't have a very good uh sort of internal model of self-knowledge, not sufficient at least. And this has gotten better but not perfect. They display jagged
966.2s
bunch of I would say cognitive deficits.
968.8s
So they hallucinate quite a bit. Um and
971.8s
they kind of make up stuff and don't
973.1s
have a very good uh sort of internal
975.3s
model of self-knowledge, not sufficient
977.7s
at least. And this has gotten better but
979.4s
not perfect. They display jagged
981.6s
intelligence. So they're going to be superhuman in some problem-solving domains. And then they're going to make mistakes that basically no human will make. Like you know they will insist that 9.11 is greater than 9.9 or that there are two Rs in strawberry. These are some famous examples, but basically there
982.8s
superhuman in some problem-solving
984.5s
domains. And then they're going to make
986.0s
mistakes that basically no human will
987.7s
make. like you know they will insist
989.9s
that 9.11 is greater than 9.9 or that
992.6s
there are two Rs in strawberry these are
994.2s
some famous examples but basically there
996.2s
are rough edges that you can trip on. So that's kind of I think also kind of unique. Um they also kind of suffer from anterograde amnesia. Um so uh and I think I'm alluding to the fact that if you have a coworker who joins your organization, this coworker will over time learn your organization and uh they
998.9s
that's kind of I think also kind of
1000.3s
unique um they also kind of suffer from
1003.3s
anterograde amnesia um so uh and I think
1006.9s
I'm alluding to the fact that if you
1008.1s
have a coworker who joins your
1009.3s
organization this coworker will over
1011.4s
time learn your organization and uh they
1014.2s
will understand and gain like a huge amount of context on the organization and they go home and they sleep and they consolidate knowledge and they develop expertise over time. LLMs don't natively do this and this is not something that has really been solved in the R&D of LLMs. I think um and so context windows
1015.9s
amount of context on the organization
1017.8s
and they go home and they sleep and they
1019.6s
consolidate knowledge and they develop
1021.1s
expertise over time. LLMs don't natively
1023.4s
do this and this is not something that
1024.6s
has really been solved in the R&D of
1026.4s
LLMs. I think um and so context windows
1029.3s
are really kind of like working memory and you have to sort of program the working memory quite directly because they don't just kind of like get smarter by uh by default and I think a lot of people get tripped up by the analogies uh in this way. Uh in popular culture I recommend people watch these two movies
1030.6s
and you have to sort of program the
1032.0s
working memory quite directly because
1033.6s
they don't just kind of like get smarter
1035.0s
by uh by default and I think a lot of
1037.0s
people get tripped up by the analogies
1039.0s
uh in this way. Uh in popular culture I
1042.2s
recommend people watch these two movies
1043.9s
uh Memento and 50 First Dates. In both of these movies, the protagonists, their weights are fixed and their context windows get wiped every single morning and it's really problematic to go to work or have relationships when this happens and this happens to LLMs all the time. I guess one more thing I would point to is security kind of related
1046.1s
these movies, the protagonists, their
1047.8s
weights are fixed and their context
1049.8s
windows get wiped every single morning
1052.2s
and it's really problematic to go to
1054.2s
work or have relationships when this
1055.8s
happens and this happens to LLMs all the
1057.5s
time. I guess one more thing I would
1059.6s
point to is security kind of related
1062.3s
limitations of the use of LLMs. So for example, LLMs are quite gullible. Uh they are susceptible to prompt injection risks. They might leak your data etc. And so um and there's many other considerations uh security related. So, so basically long story short, you have to load your you have to load your you have to simultaneously think through
1064.3s
example, LLMs are quite gullible. Uh
1066.4s
they are susceptible to prompt injection
1068.2s
risks. They might leak your data etc.
1070.8s
And so um and there's many other
1072.8s
considerations uh security related. So,
1075.3s
so basically long story short, you have
1077.5s
to load your you have to load your you
1080.0s
have to simultaneously think through
1081.3s
this superhuman thing that has a bunch of cognitive deficits and issues. How do we and yet they are extremely like useful and so how do we program them and how do we work around their deficits and enjoy their superhuman powers. So what I want to switch to now is talk about the opportunities of how do we use
1083.2s
of cognitive deficits and issues. How do
1085.4s
we and yet they are extremely like
1087.8s
useful and so how do we program them and
1090.6s
how do we work around their deficits and
1092.4s
enjoy their superhuman powers.
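A minimal sketch of the "context window as working memory" point made above, under loud assumptions: `toy_model` is a hypothetical stand-in for an LLM call (not any real API), and `Session` is an invented helper. It only illustrates that nothing carries over between sessions unless the caller puts it back into the context.

```python
from dataclasses import dataclass, field


def toy_model(context: list[str], question: str) -> str:
    """Hypothetical stand-in for an LLM: it can only use what is in `context`."""
    for note in context:
        key, _, value = note.partition(": ")
        if key.lower() in question.lower():
            return value
    return "I don't know"


@dataclass
class Session:
    """Working memory that the caller has to program explicitly."""
    context: list[str] = field(default_factory=list)

    def remember(self, key: str, value: str) -> None:
        # Store a fact as a "key: value" note in the context window.
        self.context.append(f"{key}: {value}")

    def ask(self, question: str) -> str:
        return toy_model(self.context, question)


monday = Session()
monday.remember("deploy branch", "release-2")
print(monday.ask("What is the deploy branch?"))

# A new session starts with an empty context window: the fixed "weights"
# (toy_model) are unchanged, but yesterday's fact is gone.
tuesday = Session()
print(tuesday.ask("What is the deploy branch?"))
```

The design choice worth noticing is that memory lives in the caller, not in the model: anything the model should know tomorrow has to be written down and loaded again.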
1095.8s
So what I want to switch to now is talk
1097.4s
about the opportunities of how do we use
1099.0s
these models and what are some of the biggest opportunities. This is not a comprehensive list just some of the things that I thought were interesting for this talk. The first thing I'm kind of excited about is what I would call partial autonomy apps. So for example, let's work with the example of coding.
1100.7s
biggest opportunities. This is not a
1102.4s
comprehensive list just some of the
1103.5s
things that I thought were interesting
1104.6s
for this talk. The first thing I'm kind
1106.9s
of excited about is what I would call
1109.3s
partial autonomy apps. So for example,
1112.2s
let's work with the example of coding.
1114.2s
You can certainly go to ChatGPT directly and you can start copy pasting code around and copy pasting bug reports and stuff around and getting code and copy pasting everything around. Why would you why would you do that? Why would you go directly to the operating system? It makes a lot more sense to have an app
1116.6s
and you can start copy pasting code
1118.1s
around and copy pasting bug reports and
1121.0s
stuff around and getting code and copy
1122.4s
pasting everything around. Why would you
1124.2s
why would you do that? Why would you go
1125.4s
directly to the operating system? It
1127.1s
makes a lot more sense to have an app
1128.5s
dedicated for this. And so I think many of you uh use uh Cursor. I do as well. And uh Cursor is kind of like the thing you want instead. You don't want to just directly go to ChatGPT. And I think Cursor is a very good example of an early LLM app that has a bunch of
1130.7s
of you uh use uh Cursor. I do as well.
1133.8s
And uh Cursor is kind of like the thing
1136.3s
you want instead. You don't want to just
1137.8s
directly go to ChatGPT. And I
1139.8s
think Cursor is a very good example of
1141.4s
an early LLM app that has a bunch of
1143.8s
properties that I think are um useful across all the LLM apps. So in particular, you will notice that we have a traditional interface that allows a human to go in and do all the work manually just as before. But in addition to that, we now have this LLM integration that allows us to go in
1146.2s
across all the LLM apps. So in
1148.0s
particular, you will notice that we have
1149.7s
a traditional interface that allows a
1152.0s
human to go in and do all the work
1153.8s
manually just as before. But in addition
1156.5s
to that, we now have this LLM
1157.8s
integration that allows us to go in
1159.4s
bigger chunks. And so some of the properties of LLM apps that I think are shared and useful to point out. Number one, the LLMs basically do a ton of the context management. Um, number two, they orchestrate multiple calls to LLMs, right? So in the case of Cursor, there's under the hood embedding models for all
1161.9s
properties of LLM apps that I think are
1163.5s
shared and useful to point out. Number
1165.8s
one, the LLMs basically do a ton of the
1168.1s
context management. Um, number two, they
1171.2s
orchestrate multiple calls to LLMs,
1173.2s
right? So in the case of Cursor, there's
1175.0s
under the hood embedding models for all
1177.0s
your files, the actual chat models, models that apply diffs to the code, and this is all orchestrated for you. A really big one that uh I think also maybe not fully appreciated always is application specific uh GUI and the importance of it. Um because you don't just want to talk to the operating
1179.2s
models that apply diffs to the code, and
1181.8s
this is all orchestrated for you. A
1183.9s
really big one that uh I think also
1186.1s
maybe not fully appreciated always is
1188.5s
application specific uh GUI and the
1190.5s
importance of it. Um because you don't
1193.1s
just want to talk to the operating
1194.6s
system directly in text. Text is very hard to read, interpret, understand and also like you don't want to take some of these actions natively in text. So it's much better to just see a diff as like red and green change and you can see what's being added and subtracted. It's much easier to just do command Y to
1196.6s
hard to read, interpret, understand and
1199.0s
also like you don't want to take some of
1200.5s
these actions natively in text. So it's
1203.1s
much better to just see a diff as like
1205.0s
red and green change and you can see
1206.8s
what's being added and subtracted. It's
1208.5s
much easier to just do command Y to
1210.2s
accept or command N to reject. I shouldn't have to type it in text, right? So, a GUI allows a human to audit the work of these fallible systems and to go faster. I'm going to come back to this point a little bit uh later as well. And the last kind of feature I
1211.9s
shouldn't have to type it in text,
1213.1s
right? So, a GUI allows a human to
1215.5s
audit the work of these fallible systems
1217.8s
and to go faster. I'm going to come back
1220.0s
to this point a little bit uh later as
1221.8s
well. And the last kind of feature I
1223.8s
want to point out is that there's what I call the autonomy slider. So, for example, in Cursor, you can just do tab completion. You're mostly in charge. You can select a chunk of code and command K to change just that chunk of code. You can do command L to change the entire
1225.2s
call the autonomy slider. So, for
1227.7s
example, in Cursor, you can just do tab
1229.4s
completion. You're mostly in charge. You
1231.5s
can select a chunk of code and command K
1233.6s
to change just that chunk of code. You
1236.0s
can do command L to change the entire
1237.9s
file. Or you can do command I which just you know let it rip do whatever you want in the entire repo and that's the sort of full autonomy agent agentic version and so you are in charge of the autonomy slider and depending on the complexity of the task at hand you can uh tune the
1240.4s
you know let it rip do whatever you want
1242.2s
in the entire repo and that's the sort
1244.1s
of full autonomy agent agentic version
1246.4s
and so you are in charge of the autonomy
1248.3s
slider and depending on the complexity
1250.2s
of the task at hand you can uh tune the
1253.0s
amount of autonomy that you're willing to give up uh for that task. Maybe to show one more example of a fairly successful LLM app, uh Perplexity. Um it also has very similar features to what I've just pointed out to in Cursor. Uh it packages up a lot of the information. It orchestrates multiple LLMs. It's got a
1254.3s
to give up uh for that task maybe to
1257.1s
show one more example of a fairly
1258.6s
successful LLM app uh Perplexity um it
1263.0s
also has very similar features to what
1264.6s
I've just pointed out to in Cursor uh it
1267.2s
packages up a lot of the information. It
1268.7s
orchestrates multiple LLMs. It's got a
1271.0s
GUI that allows you to audit some of its work. So, for example, it will cite sources and you can imagine inspecting them. And it's got an autonomy slider. You can either just do a quick search or you can do research or you can do deep research and come back 10 minutes later.
1273.4s
work. So, for example, it will cite
1275.6s
sources and you can imagine inspecting
1277.3s
them. And it's got an autonomy slider.
1279.0s
You can either just do a quick search or
1280.6s
you can do research or you can do deep
1282.3s
research and come back 10 minutes later.
1284.3s
So, this is all just varying levels of autonomy that you give up to the tool. So, I guess my question is I feel like a lot of software will become partially autonomous. I'm trying to think through like what does that look like? And for many of you who maintain products and
1285.7s
autonomy that you give up to the tool.
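A small sketch of the autonomy slider described above, assuming four levels that mirror the Cursor modes mentioned in the talk (tab completion, edit a selection, edit a file, edit the whole repo). The `Autonomy` enum and `editable_files` helper are hypothetical names for illustration, not part of Cursor or any real tool.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    """Hypothetical slider positions, ordered from least to most autonomy."""
    TAB_COMPLETION = 1   # human writes, tool suggests the next few tokens
    EDIT_SELECTION = 2   # tool rewrites only the selected chunk
    EDIT_FILE = 3        # tool may change the current file
    EDIT_REPO = 4        # tool may change anything in the repo


def editable_files(level: Autonomy, current_file: str, repo_files: set[str]) -> list[str]:
    """Return the files the tool is allowed to touch at a given slider position."""
    if level is Autonomy.EDIT_REPO:
        return sorted(repo_files)
    # Every lower level is confined to the file the human is looking at.
    return [current_file]


repo = {"app.py", "utils.py", "tests/test_app.py"}
for level in Autonomy:
    print(level.name, editable_files(level, "app.py", repo))
```

Making the levels an ordered enum keeps the "slider" idea explicit: the human picks one position per task, and the blast radius of the tool grows monotonically with it.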
1287.7s
So, I guess my question is I feel like a
1290.2s
lot of software will become partially
1292.0s
autonomous. I'm trying to think through
1293.5s
like what does that look like? And for
1295.3s
many of you who maintain products and
1297.0s
services, how are you going to make your products and services partially autonomous? Can an LLM see everything that a human can see? Can an LLM act in all the ways that a human could act? And can humans supervise and stay in the loop of this activity? Because again, these are fallible systems that aren't
1299.0s
products and services partially
1300.2s
autonomous? Can an LLM see everything
1302.7s
that a human can see? Can an LLM act in
1305.1s
all the ways that a human could act? And
1307.0s
can humans supervise and stay in the
1309.4s
loop of this activity? Because again,
1310.9s
these are fallible systems that aren't
1312.3s
yet perfect. And what does a diff look like in Photoshop or something like that? You know, and also a lot of the traditional software right now, it has all these switches and all this kind of stuff that's all designed for humans. All of this has to change and become accessible to LLMs.
1314.9s
like in Photoshop or something like
1316.6s
that? You know, and also a lot of the
1318.8s
traditional software right now, it has
1320.1s
all these switches and all this kind of
1321.8s
stuff that's all designed for humans. All
1323.4s
of this has to change and become
1324.7s
accessible to LLMs.
1327.8s
So, one thing I want to stress with a lot of these LLM apps that I'm not sure gets as much attention as it should is um we we're now kind of like cooperating with AIs and usually they are doing the generation and we as humans are doing the verification. It is in our interest
1329.5s
lot of these LLM apps that I'm not sure
1331.1s
gets as much attention as it should is
1334.2s
um we we're now kind of like cooperating
1336.8s
with AIs and usually they are doing the
1338.6s
generation and we as humans are doing
1340.2s
the verification. It is in our interest
1342.6s
to make this loop go as fast as possible. So, we're getting a lot of work done. There are two major ways that I think uh this can be done. Number one, you can speed up verification a lot. Um, and I think GUIs, for example, are extremely important to this because a GUI utilizes your computer vision GPU
1344.5s
possible. So, we're getting a lot of
1345.8s
work done. There are two major ways that
1348.0s
I think uh this can be done. Number one,
1350.4s
you can speed up verification a lot. Um,
1352.7s
and I think GUIs, for example, are
1354.2s
extremely important to this because a
1356.1s
GUI utilizes your computer vision GPU
1359.3s
in all of our heads. Reading text is effortful and it's not fun, but looking at stuff is fun and it's it's just a kind of like a highway to your brain. So, I think GUIs are very useful for auditing systems and visual representations in general. And number two, I would say is we have to keep the
1361.4s
effortful and it's not fun, but looking
1363.2s
at stuff is fun and it's it's just a
1365.8s
kind of like a highway to your brain.
1367.4s
So, I think GUIs are very useful for
1369.7s
auditing systems and visual
1371.7s
representations in general. And number
1373.6s
two, I would say is we have to keep the
1376.1s
AI on the leash. We I think a lot of people are getting way over excited with AI agents and uh it's not useful to me to get a diff of 10,000 lines of code to my repo. Like I have to I'm still the bottleneck, right? Even though that 10,000 lines come out instantly, I have
1378.9s
people are getting way over excited with
1380.6s
AI agents and uh it's not useful to me
1383.6s
to get a diff of 10,000 lines of code to
1385.8s
my repo. Like I have to I'm still the
1387.9s
bottleneck, right? Even though that
1389.2s
10,000 lines come out instantly, I have
1391.1s
to make sure that this thing is not introducing bugs. It's just like and that it's doing the correct thing, right? And that there's no security issues and so on. So um I think that um yeah basically you we have to sort of like it's in our interest to make the
1392.2s
introducing bugs. It's just like and
1395.4s
that it's doing the correct thing,
1396.6s
right? And that there's no security
1397.8s
issues and so on. So um I think that um
1402.9s
yeah basically you we have to sort of
1405.4s
like it's in our interest to make the
1408.2s
the flow of these two go very very fast and we have to somehow keep the AI on the leash because it gets way too overreactive. It's uh it's kind of like this. This is how I feel when I do AI assisted coding. If I'm just vibe coding everything is nice and great but if I'm
1410.3s
and we have to somehow keep the AI on
1412.2s
the leash because it gets way too
1413.1s
overreactive. It's uh it's kind of like
1415.3s
this. This is how I feel when I do AI
1417.3s
assisted coding. If I'm just vibe coding
1419.2s
everything is nice and great but if I'm
1420.9s
actually trying to get work done it's not so great to have an overreactive uh agent doing all this kind of stuff. So this slide is not very good. I'm sorry, but I guess I'm trying to develop like many of you some ways of utilizing these agents in my coding workflow and to do
1422.4s
not so great to have an overreactive uh
1424.7s
agent doing all this kind of stuff. So
1427.3s
this slide is not very good. I'm sorry,
1428.8s
but I guess I'm trying to develop like
1431.1s
many of you some ways of utilizing these
1433.8s
agents in my coding workflow and to do
1435.8s
AI assisted coding. And in my own work, I'm always scared to get way too big diffs. I always go in small incremental chunks. I want to make sure that everything is good. I want to spin this loop very very fast and um I sort of work on small chunks of single concrete
1438.1s
I'm always scared to get way too big
1439.8s
diffs. I always go in small incremental
1442.2s
chunks. I want to make sure that
1444.2s
everything is good. I want to spin this
1446.2s
loop very very fast and um I sort of
1449.1s
work on small chunks of single concrete
1450.8s
thing. Uh and so I think many of you probably are developing similar ways of working with the with LLMs. Um, I also saw a number of blog posts that try to develop these best practices for working with LLMs. And here's one that I read recently and I thought was quite good. And it kind of discussed
1453.2s
probably are developing similar ways of
1454.6s
working with the with LLMs.
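The small-chunk workflow above can be sketched as a generate-then-verify loop that refuses diffs too large for a human to audit. Everything here is an assumption for illustration: `run_in_small_chunks`, the 50-line review budget, and the toy `toy_generate` / `toy_verify` functions stand in for a real model call and a real review step.

```python
from typing import Callable


def run_in_small_chunks(
    tasks: list[str],
    generate: Callable[[str], str],
    verify: Callable[[str, str], bool],
    max_diff_lines: int = 50,
) -> tuple[list[tuple[str, str]], list[tuple[str, str]]]:
    """Generate one diff per small task; keep only diffs that are reviewable and pass."""
    accepted, rejected = [], []
    for task in tasks:
        diff = generate(task)
        if len(diff.splitlines()) > max_diff_lines:
            # Keep the AI on the leash: the human is the bottleneck for review.
            rejected.append((task, "diff too large to review"))
        elif verify(task, diff):
            accepted.append((task, diff))
        else:
            rejected.append((task, "failed verification"))
    return accepted, rejected


def toy_generate(task: str) -> str:
    # Hypothetical generator: a vague, sweeping task produces a huge diff.
    lines = 500 if task == "rewrite everything" else 3
    return "\n".join(f"+ line {i} for {task}" for i in range(lines))


def toy_verify(task: str, diff: str) -> bool:
    # Hypothetical check standing in for human review or a test run.
    return task in diff


accepted, rejected = run_in_small_chunks(
    ["add a unit test", "rewrite everything"], toy_generate, toy_verify
)
print([task for task, _ in accepted])
print(rejected)
```

The size gate runs before verification on purpose: a concrete, narrow task tends to produce a diff that can be checked quickly, which is what keeps the loop spinning fast.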
1457.6s
Um, I also saw a number of blog posts
1459.6s
that try to develop these best practices
1462.2s
for working with LLMs. And here's one
1464.0s
that I read recently and I thought was
1465.4s
quite good. And it kind of discussed
1466.8s
some techniques and some of them have to do with how you keep the AI on the leash. And so, as an example, if you are prompting, if your prompt is vague, then uh the AI might not do exactly what you wanted and in that case, verification will fail. You're going to ask for
1468.2s
do with how you keep the AI on the
1469.9s
leash. And so, as an example, if you are
1472.0s
prompting, if your prompt is vague, then
1475.0s
uh the AI might not do exactly what you
1477.0s
wanted and in that case, verification
1478.9s
will fail. You're going to ask for
1480.2s
something else. If a verification fails, then you're going to start spinning. So it makes a lot more sense to spend a bit more time to be more concrete in your prompts which increases the probability of successful verification and you can move forward. And so I think a lot of us
1482.1s
then you're going to start spinning. So
1483.7s
it makes a lot more sense to spend a bit
1485.1s
more time to be more concrete in your
1486.8s
prompts which increases the probability
1488.5s
of successful verification and you can
1490.2s
move forward. And so I think a lot of us
1492.1s
are going to end up finding um kind of techniques like this. I think in my own work as well I'm currently interested in uh what education looks like in um together with kind of like now that we have AI uh and LLMs what does education look like? And I think a a large amount
1494.1s
techniques like this. I think in my own
1496.3s
work as well I'm currently interested in
1497.8s
uh what education looks like in um
1500.1s
together with kind of like now that we
1501.8s
have AI uh and LLMs what does education
1504.5s
look like? And I think a a large amount
1507.0s
of thought for me goes into how we keep AI on the leash. I don't think it just works to go to chat and be like, Hey, teach me physics. I don't think this works because the AI is like gets lost in the woods. And so for me, this is actually two separate apps. For example,
1509.7s
AI on the leash. I don't think it just
1511.4s
works to go to chat and be like, Hey,
1513.2s
teach me physics. I don't think this
1514.8s
works because the AI is like gets lost
1516.9s
in the woods. And so for me, this is
1518.8s
actually two separate apps. For example,
1520.9s
there's an app for a teacher that creates courses and then there's an app that takes courses and serves them to students. And in both cases, we now have this intermediate artifact of a course that is auditable and we can make sure it's good. We can make sure it's consistent. and the AI is kept on the
1522.6s
creates courses and then there's an app
1524.9s
that takes courses and serves them to
1526.5s
students. And in both cases, we now have
1529.1s
this intermediate artifact of a course
1531.2s
that is auditable and we can make sure
1532.7s
it's good. We can make sure it's
1533.8s
consistent. and the AI is kept on the
1535.9s
leash with respect to a certain syllabus, a certain like um progression of projects and so on. And so this is one way of keeping the AI on leash and I think has a much higher likelihood of working and the AI is not getting lost in the woods. One more kind of analogy I wanted to
1537.1s
syllabus, a certain like um progression
1540.2s
of projects and so on. And so this is
1542.6s
one way of keeping the AI on leash and I
1544.2s
think has a much higher likelihood of
1545.8s
working and the AI is not getting lost
1547.8s
in the woods.
1549.9s
One more kind of analogy I wanted to
1551.9s
sort of allude to is I'm not I'm no stranger to partial autonomy and I kind of worked on this I think for five years at Tesla and this is also a partial autonomy product and shares a lot of the features like for example right there in the instrument panel is the GUI of the
1554.5s
stranger to partial autonomy and I kind
1556.2s
of worked on this I think for five years
1557.8s
at Tesla and this is also a partial
1560.2s
autonomy product and shares a lot of the
1561.9s
features like for example right there in
1563.5s
the instrument panel is the GUI of the
1565.4s
autopilot so it's showing me what the what the neural network sees and so on and we have the autonomy slider where over the course of my tenure there we did more and more autonomous tasks for the user and maybe the story that I wanted to tell very briefly is uh actually the first time I drove a
1567.6s
what the neural network sees and so on
1569.2s
and we have the autonomy slider where
1570.8s
over the course of my tenure there we
1573.4s
did more and more autonomous tasks for
1575.6s
the user and maybe the story that I
1578.3s
wanted to tell very briefly is uh
1581.1s
actually the first time I drove a
1582.6s
self-driving vehicle was in 2013 and I had a friend who worked at Waymo and uh he offered to give me a drive around Palo Alto. I took this picture using Google Glass at the time and many of you are so young that you might not even know what that is. Uh but uh yeah, this
1585.2s
had a friend who worked at Waymo and uh
1587.3s
he offered to give me a drive around
1589.1s
Palo Alto. I took this picture using
1591.5s
Google Glass at the time and many of you
1593.9s
are so young that you might not even
1595.3s
know what that is. Uh but uh yeah, this
1597.3s
was like all the rage at the time. And we got into this car and we went for about a 30-minute drive around Palo Alto highways uh streets and so on. And this drive was perfect. There was zero interventions and this was 2013 which is now 12 years ago. And it kind of struck
1599.4s
we got into this car and we went for
1601.0s
about a 30-minute drive around Palo Alto
1603.0s
highways uh streets and so on. And this
1605.1s
drive was perfect. There was zero
1607.0s
interventions and this was 2013 which is
1609.8s
now 12 years ago. And it kind of struck
1612.5s
me because at the time when I had this perfect drive, this perfect demo, I felt like, wow, self-driving is imminent because this just worked. This is incredible. Um, but here we are 12 years later and we are still working on autonomy. Um, we are still working on driving agents and even now we haven't
1614.0s
perfect drive, this perfect demo, I felt
1616.2s
like, wow, self-driving is imminent
1619.5s
because this just worked. This is
1620.8s
incredible. Um, but here we are 12 years
1623.4s
later and we are still working on
1624.9s
autonomy. Um, we are still working on
1627.0s
driving agents and even now we haven't
1629.2s
actually like really solved the problem. Like you may see Waymos going around and they look driverless but you know there's still a lot of teleoperation and a lot of human in the loop of a lot of this driving so we still haven't even like declared success but I think it's definitely like going to succeed at this
1630.8s
like you may see Waymos going around and
1632.9s
they look driverless but you know
1635.0s
there's still a lot of teleoperation and
1636.8s
a lot of human in the loop of a lot of
1638.7s
this driving so we still haven't even
1641.0s
like declared success but I think it's
1642.6s
definitely like going to succeed at this
1644.4s
point but it just took a long time and so I think like like this is software is really tricky I think in the same way that driving is tricky and so when I see things like oh 2025 is the year of agents I get very concerned and I kind of feel like you know this is the decade
1646.6s
so I think like like this is software is
1649.4s
really tricky I think in the same way
1651.6s
that driving is tricky and so when I see
1654.7s
things like oh 2025 is the year of
1656.5s
agents I get very concerned and I kind
1658.7s
of feel like you know this is the decade
1661.0s
of agents and this is going to be quite some time. We need humans in the loop. We need to do this carefully. This is software. Let's be serious here. One more kind of analogy that I always think through is the Iron Man suit. Uh I think this is I always love Iron Man. I think
1664.1s
some time. We need humans in the loop.
1665.8s
We need to do this carefully. This is
1667.2s
software. Let's be serious here. One
1671.0s
more kind of analogy that I always think
1672.9s
through is the Iron Man suit. Uh I think
1676.1s
this is I always love Iron Man. I think
1678.2s
it's so correct in a bunch of ways with respect to technology and how it will play out. What I love about the Iron Man suit is that it's both an augmentation that Tony Stark can drive and also an agent. In some of the movies, the Iron Man suit is quite
1691.8s
autonomous and can fly around and find Tony and all this kind of stuff. So this is the autonomy slider: we can build augmentations or we can build agents, and we kind of want to do a bit of both. But at this stage, I would say, working with fallible LLMs and so
1705.9s
on, it's less Iron Man robots and more Iron Man suits that you want to build. It's less about building flashy demos of autonomous agents and more about building partial autonomy products. These products have custom GUIs and UI/UX, and this is done so that
1723.8s
the generation-verification loop of the human is very, very fast. But we are not losing sight of the fact that it is in principle possible to automate this work. There should be an autonomy slider in your product, and you should be thinking about how you can slide that autonomy slider and make your product
1738.6s
more autonomous over time. But this is kind of how I think about it, and there are lots of opportunities in these kinds of products. I want to now switch gears a little bit and talk about one other dimension that I think is very unique. Not only is there a new type of programming language that allows for
1753.0s
autonomy in software, but also, as I mentioned, it's programmed in English, which is this natural interface. Suddenly everyone is a programmer, because everyone speaks a natural language like English. So this is extremely bullish and very interesting to me, and also completely unprecedented, I would say. It used to be the case that you
1769.5s
needed to spend five to ten years studying something to be able to do something in software. This is not the case anymore. So, I don't know if by any chance anyone has heard of vibe coding. This is the tweet that kind of introduced this, but I'm told that
1784.2s
this is now a major meme. A fun story about this is that I've been on Twitter for 15 years or something like that at this point, and I still have no clue which tweet will go viral and which tweet fizzles and no one cares. And I thought that this tweet was
1800.8s
going to be the latter. I don't know, it was just some shower thoughts. But this became a total meme, and I really just can't tell. But I guess it struck a chord and gave a name to something that everyone was feeling but couldn't quite put into words. So now
1813.3s
there's a Wikipedia page and everything. [Applause] Yeah, this is like a major contribution now or something like that. So, Tom Wolf from Hugging Face shared this beautiful video that I really love. These are kids vibe coding.
1844.4s
I love this video. How can you look at this video and feel bad about the future? The future is great. I think this will end up being a gateway drug to software development. I'm not a doomer about the future of
1859.2s
the generation, and yeah, I love this video. So, I tried vibe coding a little bit as well, because it's so fun. Vibe coding is so great when you want to build something super duper custom that doesn't appear to exist and you just want to wing it because it's a
1873.7s
Saturday or something like that. So, I built this iOS app, and I can't actually program in Swift, but I was really shocked that I was able to build a super basic app. I'm not going to explain it, it's really dumb, but this was
1887.4s
just a day of work, and it was running on my phone later that day, and I was like, "Wow, this is amazing." I didn't have to read through Swift for five days or something like that to get started. I also vibe coded this app called MenuGen. And
1900.5s
this is live; you can try it at menugen.app. I basically had this problem where I show up at a restaurant, I read through the menu, and I have no idea what any of the things are. I need pictures. This didn't exist, so I was like, hey, I'm going to vibe code
1913.0s
it. So, this is what it looks like. You go to menugen.app, you take a picture of a menu, and then MenuGen generates the images. Everyone gets $5 in credits for free when you sign up, and therefore this is a major cost center in my life. So, this is a
1933.8s
negative-revenue app for me right now. I've lost a huge amount of money on MenuGen. Okay. But the fascinating thing about MenuGen for me is that the code, the vibe coding part, was
1950.2s
actually the easy part of vibe coding MenuGen. Most of the work was when I tried to make it real, so that you can actually have authentication and payments and the domain name and a Vercel deployment. This was really hard, and all of this was not code. All of this DevOps
1964.2s
stuff was me in the browser clicking stuff, and this was extremely slow and took another week. So it was really fascinating that I had the MenuGen demo basically working on my laptop in a few hours, and then it took me a week because I was trying to make it real. And
1981.2s
the reason for this is that it was just really annoying. So for example, if you try to add Google login to your web page (I know this is very small), there's just a huge amount of instructions from this Clerk library telling me how to integrate this. And this is crazy. Like,
1995.2s
it's telling me: go to this URL, click on this dropdown, choose this, go to this, and click on that. It's telling me what to do. A computer is telling me the actions I should be taking. You do it! Why am I doing this?
2008.6s
What the hell? I had to follow all these instructions. This was crazy. So the last part of my talk therefore focuses on: can we just build for agents? I don't want to do this work. Can agents do this? Thank you. Okay. So roughly speaking, I think
2028.6s
there's a new category of consumer and manipulator of digital information. It used to be just humans through GUIs or computers through APIs. And now we have a completely new thing. Agents are computers, but they are humanlike, kind of, right? They're people spirits. There are people spirits on the internet, and they need to interact with our
2046.7s
software infrastructure. Can we build for them? It's a new thing. As an example, you can have robots.txt on your domain, and you can instruct, or advise I suppose, web crawlers on how to behave on your website. In the same way, you can have maybe an llms.txt
2061.5s
file, which is just simple markdown that tells LLMs what this domain is about, and this is very readable to an LLM. If it instead had to get the HTML of your web page and try to parse it, that is very error-prone and difficult; it will screw it up and it's
2075.7s
not going to work. So we can just speak directly to the LLM. It's worth it. A huge amount of documentation is currently written for people, so you will see things like lists and bold and pictures, and this is not directly accessible to an LLM. So I see some services now transitioning a lot
2092.8s
of their docs to be specifically for LLMs. Vercel and Stripe, as examples, are early movers here, but there are a few more that I've seen already, and they offer their documentation in markdown. Markdown is super easy for LLMs to understand. This is great. Maybe one simple example from my
2112.3s
experience as well. Maybe some of you know 3Blue1Brown. He makes beautiful animation videos on YouTube. [Applause] He wrote Manim, and I wanted to make my own. There's extensive documentation on how to use Manim, and I didn't want to actually read
2134.0s
through it. So I copy-pasted the whole thing to an LLM, described what I wanted, and it just worked out of the box. The LLM just vibe coded me an animation, exactly what I wanted, and I was like, "Wow, this is amazing." So if we can make docs legible to LLMs, it's going to unlock a
2148.2s
huge amount of use, and I think this is wonderful and should happen more. The other thing I wanted to point out is that, unfortunately, it's not just about taking your docs and making them appear in markdown. That's the easy part. We actually have to change the
2161.9s
docs, because anytime your docs say "click," this is bad. An LLM will not be able to natively take this action right now. So Vercel, for example, is replacing every occurrence of "click" with an equivalent curl command that your LLM agent could take on your behalf. I think this is very interesting. And
2179.8s
then, of course, there's the Model Context Protocol from Anthropic. This is another way, a protocol, of speaking directly to agents as this new consumer and manipulator of digital information. So I'm very bullish on these ideas. The other thing I really like is a number of little tools here and there that help ingest data
2196.6s
in very LLM-friendly formats. So for example, when I go to a GitHub repo like my nanoGPT repo, I can't feed this to an LLM and ask questions about it, because this is a human interface on GitHub. But when you just change the URL from GitHub to Gitingest,
2210.5s
this will actually concatenate all the files into a single giant text and create a directory structure, etc., and this is ready to be copy-pasted into your favorite LLM, and you can do stuff. Maybe an even more dramatic example of this is DeepWiki, where it's not just the raw
2225.4s
content of these files. This is from Devin. They have Devin basically do an analysis of the GitHub repo, and Devin builds up whole docs pages just for your repo, and you can imagine that this is even more helpful to copy-paste into your LLM. So I love all the little tools
2243.4s
where you basically just change the URL and it makes something accessible to an LLM. So this is all well and great, and I think there should be a lot more of it. One more note I wanted to make is that it is absolutely possible that in the future LLMs will be able to, and this is
2258.0s
not even the future, this is today, they'll be able to go around and click stuff and so on. But I still think it's very worth basically meeting LLMs halfway and making it easier for them to access all this information, because this is still fairly expensive, I would say, to use and
2274.4s
a lot more difficult. And so I do think that for lots of software there will be a long tail that won't adapt, because these are not live-player sorts of repositories or digital infrastructure, and we will need these tools. But for everyone else, I think it's very worth
2289.7s
meeting at some middle point. So I'm bullish on both, if that makes sense.
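Editor's sketch, not part of the talk: two of the "meet LLMs halfway" ideas from this section can be pictured in a few lines of Python. The first helper mirrors the URL trick described for GitHub repos, under the assumption that the tool is served from the host gitingest.com and keeps the same path. The second renders a tiny markdown file in the spirit of the llms.txt idea; the function names, the "## Docs" section title, and the example.com links are hypothetical and only illustrate the shape of such a file.

```python
from urllib.parse import urlsplit, urlunsplit


def to_gitingest_url(github_url: str) -> str:
    """Swap the host of a GitHub URL for gitingest.com (assumed host), keeping the path."""
    parts = urlsplit(github_url)
    if parts.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a github.com URL: {github_url!r}")
    return urlunsplit(
        (parts.scheme, "gitingest.com", parts.path, parts.query, parts.fragment)
    )


def render_llms_txt(site_name: str, summary: str, docs: list[tuple[str, str, str]]) -> str:
    """Render a small markdown description of a site for LLM readers.

    docs is a list of (title, url, description) tuples. The layout is an
    illustrative assumption, not a normative format.
    """
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Docs", ""]
    for title, url, description in docs:
        lines.append(f"- [{title}]({url}): {description}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_gitingest_url("https://github.com/karpathy/nanoGPT"))
    print(
        render_llms_txt(
            "Example Service",  # hypothetical site
            "Plain-markdown docs intended for LLM agents.",
            [("Quickstart", "https://example.com/docs/quickstart.md", "first steps")],
        )
    )
```

Both helpers only build strings; fetching the resulting page or publishing the file is left to whatever tooling is already in place.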
2294.6s
So in summary, what an amazing time to get into the industry. We need to rewrite a ton of code. A ton of code will be written by professionals and by coders. These LLMs are kind of like utilities, kind of like fabs, but
2307.5s
they're especially like operating systems. But it's so early. It's like the 1960s of operating systems, and I think a lot of the analogies cross over. These LLMs are kind of like these fallible people spirits that we have to learn to work with. And in order to do that properly,
2325.6s
we need to adjust our infrastructure towards it. So when you're building these LLM apps, I described some of the ways of working effectively with these LLMs, some of the tools that make that possible, and how you can spin this loop very, very quickly and basically create partial autonomy
2340.8s
products. And then, yeah, a lot of code also has to be written for the agents more directly. But in any case, going back to the Iron Man suit analogy, I think what we'll see over the next decade, roughly, is we're going to take the slider from left to right.
2355.9s
And it's going to be very interesting to see what that looks like. And I can't wait to build it with all of you. Thank you.