Tesla Autonomy Day.mp3

4:02AM Apr 23, 2019

Speakers:

Keywords:

car

tesla

neural network

data

fleet

drive

design

chip

self driving

network

predictions

system

images

computer

hardware

vehicle

build

question

cameras

lidar

Hi, everyone,

I'm sorry for being late. Welcome to our very first analyst day for autonomy. I really hope that this is something we can do a little bit more regularly now, to keep you posted about the development we're doing with regards to autonomous driving.

About three months ago, we were getting prepped for our Q4 earnings call with Elon and quite a few other executives. And one of the things that I told the group is that, from all the conversations that I keep having with investors on a regular basis, the biggest gap that I see between what I see inside the company and the outside perception is our ability in autonomous driving. And it kind of makes sense, because for the past couple of years, we've been really talking about the Model 3 ramp, and a lot of the debate has revolved around Model 3. But in reality, a lot of things have been happening in the background: we've been working on the new full self-driving chip, we've had a complete overhaul of our neural net for vision recognition, etc. So now that we finally started to produce our full self-driving computer, we thought it's a good idea to just open the veil, invite everyone in, and talk about everything that we've been doing for the past two years. About three years ago, we wanted to find the best possible chip for autonomy, and we found out that there's no chip that's been designed from the ground up for neural nets. So we invited my colleague Pete Bannon, the VP of silicon engineering, to design such a chip for us. He's got about 35 years of experience building and designing chips. About 12 of those years were for a company called PA Semi, which was later acquired by Apple. So he worked on dozens of different architectures and designs, and he was the lead designer, I think, for the Apple iPhone 5 chip, just before joining Tesla. And he's going to be joined on stage by Elon

Musk,

thank you.

Actually, I was going to introduce Pete, but Martin's done it. So

he's just the best

chip and system architect that I know in the world, and it's an honor to have you and your team at Tesla.

And take it away. Just tell them about the incredible work that you and your team have done.

Thanks, Elon. It's a pleasure to be here this morning, and a real treat to tell you about all the work that my colleagues and I have been doing here at Tesla for the last three years.

I think we'll tell you a little bit about how the whole thing got started, and then I'll introduce you to the full self-driving computer and tell you a little bit about how it works. We'll dive into the chip itself and go through some of those details, I'll describe how the custom neural network accelerator that we designed works, and then I'll show you some results. And hopefully I'll still be awake by then.

I was hired in February of 2016. I asked Elon if he was willing to spend all the money it takes to do full custom system design. And he said, well, are we going to win? And I said, well, yeah, of course. So he said, I'm in. And so that kind of started it. We hired a bunch of people and started thinking about what a custom-designed chip for autonomy would look like. We spent 18 months doing the design, and then in August of 2017 we released the design for manufacturing. We got it back in December, it powered up, and it actually worked very, very well on the first try. We made a few changes and released a B0 rev in April of 2018. In July of 2018, the chip was qualified and we started full production of production-quality parts. In December of 2018, we had the autonomous driving stack running on the new hardware, and we were able to start retrofitting employee cars and testing the hardware and software out in the real world. Just last March, we started shipping the new computer in the Model S and X, and just earlier in April, we started production in the Model 3. So this whole program, from the hiring of the first few employees to having it in full production in all three of our cars, is just a little over three years, and is probably the fastest system development program I've ever been associated with. It really speaks to the advantages of having a tremendous amount of vertical integration, which allows you to do concurrent engineering and speed up deployment. In terms of goals, we were focused exclusively on Tesla requirements, and that makes life a lot easier: if you have one and only one customer, you don't have to worry about anything else. One of those goals was to keep the power under 100 watts so that we could retrofit the new machine into the existing cars.

We also wanted to lower part costs so we could enable full redundancy for safety. At the time, we estimated that it would take at least 50 trillion operations per second of neural network performance to drive a car, and so we wanted to get at least that much, and really as much as we possibly could. Batch size is how many items you operate on at the same time. So for example, Google's TPU v1 has a batch size of 256, and you have to wait around until you have 256 things to process before you can get started. We didn't want to do that, so we designed a machine with a batch size of one. As soon as an image shows up, we process it immediately to minimize latency, which maximizes safety. We needed a GPU to run some post-processing. At the time, we were doing quite a lot of that, but we speculated that over time the amount of post-processing on the GPU would decline as the neural networks got better and better, and that has actually come to pass. So we took a risk by putting a fairly modest GPU in the design, as you'll see, and that turned out to be a good bet. Security is super important: if you don't have a secure car, you can't have a safe car. So there's a lot of focus on security, and then of course safety.
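The latency argument for batch size one can be sketched with some back-of-the-envelope arithmetic. The frame rate and per-image processing time below are assumed, illustrative numbers, not figures from the talk:

```python
# Why batch size 1 minimizes latency: with a large batch, the first image to
# arrive must sit and wait for the whole batch to fill before processing starts.

FRAME_INTERVAL_MS = 1000 / 36   # assumed camera frame rate of 36 fps
PROCESS_MS = 1.0                # assumed per-image inference time

def first_image_latency_ms(batch_size):
    # Time the earliest-arriving image waits for the batch to fill,
    # plus the processing time itself.
    fill_wait = (batch_size - 1) * FRAME_INTERVAL_MS
    return fill_wait + PROCESS_MS

print(first_image_latency_ms(1))     # batch of 1: just the ~1 ms of processing
print(first_image_latency_ms(256))   # batch of 256: several seconds of waiting
```

With one camera feeding a batch of 256, the first frame would be seconds old before inference even begins, which is why the design processes each image on arrival.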

In terms of actually doing the chip design, as Elon alluded to earlier, there was really no ground-up neural network accelerator in existence in 2016. Everybody out there was adding instructions to their CPU or GPU or DSP to make it better for inference, but nobody was really just doing it

natively. So we set out to do that ourselves. And then for other components on the chip, we purchased industry standard IP for CPUs and GPUs, that allowed us to minimize the design time and also the risk to the program.

Another thing that was a little unexpected when I first arrived was our ability to leverage existing teams at Tesla. Tesla had wonderful power supply design teams, signal integrity analysis, package design, system software, firmware board designs, and really good system validation program that we were able to take advantage of to accelerate this program. Here's what it looks like.

Over there on the right, you see all the connectors for the video that comes in from all the cameras in the car. You can see the two self-driving computers in the middle of the board, and then on the left is the power supply and some control connections. I really love it when a solution is boiled down to its barest elements: you have video, computing, and power, and it's

straightforward and simple.

Here's the original Hardware 2.5 enclosure that the computer went into, which we've been shipping for the last two years, and here's the new design for the FSD computer. It's basically the same, and that, of course, is driven by the constraints of having a retrofit program for the cars. I'd like to point out that this is actually a pretty small computer: it fits behind the glove box, between the glove box and the firewall of the car. It does not take up half your trunk.

As I said earlier, there are two fully independent computers on the board. You can see them there highlighted in blue and green. To either side of the large SoC, you can see the DRAM chips that we use for storage, and then below left you see the flash chips that represent the file system. So these are two independent computers that boot up and run their own operating system. And

yeah, if I can add something: the general principle here is that any part of this could fail and the car will keep driving. You could have cameras fail, you could have power circuits fail, you could have one of the Tesla full self-driving computer chips fail, and the car keeps driving. The probability of this computer failing is substantially lower than somebody losing consciousness. That's the key metric, by at least an order of magnitude.

Yep. So an additional thing we do to keep the machine going is to have redundant power supplies in the car. One machine is running on one power supply and the other one's on the other. The cameras are the same: half of the cameras run on the blue power supply, the other half run on the green power supply, and both chips receive all of the video and process it independently. In terms of driving the car, the basic sequence is: collect lots of information from the world around you. Not only do we have cameras, we also have radar, GPS, maps, the IMUs, and ultrasonic sensors around the car; we have wheel ticks and steering angle, and we know what the acceleration and deceleration of the car is supposed to be. All of that gets integrated together to form a plan. Once we have a plan, the two machines exchange their independent versions of the plan to make sure they're the same, and assuming that they agree, we act and drive the car. Now, once you've driven the car with some new control, you want some way to validate it. So we validate that what we transmitted was what we intended to transmit to the actuators in the car, and then you can use the sensor suite to make sure that it happens. If you ask the car to accelerate or brake or steer right or left, you can look at the accelerometers and make sure that you are in fact doing that. So there's a tremendous amount of redundancy and overlap in both our data acquisition and our data monitoring capabilities here.

Moving on to talk about the full self driving chip a little bit.

It's packaged in a 37.5 millimeter BGA with 1,600 balls; most of those are used for power and ground, but plenty for signal as well. If you take the lid off, it looks like this: you can see the package substrate, and you can see the die sitting in the center there. If you take the die off and flip it over, it looks like this: there are 13,000 C4 bumps scattered across the top of the die, and underneath those are 12 metal layers, which are obscuring all the details of the design. So if you strip that off, it looks like this.

This is a 14-nanometer FinFET CMOS process, and the die is 260 square millimeters, which is a modest size. For comparison, a typical cell phone chip is about 100 square millimeters, so we're quite a bit bigger than that, but a high-end GPU would be more like 600 to 800 square millimeters. So we're sort of in the middle; I would call it the sweet spot. It's a comfortable size to build. There are 250 million logic gates on there, and a total of 6 billion transistors, which, even though I work on this all the time, is mind-boggling to me.

The chip is manufactured and tested to AEC-Q100 standards, which is a standard automotive criterion.

Next, I'd like to just walk around the chip and explain all the different pieces of it, and I'm going to go in the order that a pixel coming in from the camera would visit all the different pieces. Up there in the top left, you can see the camera serial interface. We can ingest 2.5 billion pixels per second, which is more than enough to cover all the sensors that we know about. We have an on-chip network that distributes data from the memory system, so the pixels travel across the network to the memory controllers on the right and left edges of the chip. We use industry-standard LPDDR4 memory running at 4,266 megabits per second, which gives us a peak bandwidth of 68 gigabytes a second, which is a pretty healthy bandwidth. But again, this is not ridiculous; we're sort of trying to stay in the comfortable sweet spot for cost reasons.
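The quoted peak bandwidth can be sanity-checked from the data rate. The 128-bit aggregate bus width below is an assumption, chosen because it is what reproduces the quoted figure:

```python
# Sanity check of the quoted DRAM bandwidth: LPDDR4 at 4,266 mega-transfers
# per second over an assumed 128-bit (16-byte) aggregate bus.

transfers_per_second = 4266e6   # LPDDR4-4266 data rate
bus_width_bytes = 16            # assumed 128-bit total bus width
peak_gb_per_s = transfers_per_second * bus_width_bytes / 1e9
print(round(peak_gb_per_s, 1))  # ~68.3, matching the quoted ~68 GB/s
```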

The image signal processor has a 24-bit internal pipeline that allows us to take full advantage of the HDR sensors that we have around the car. It does advanced tone mapping, which helps to bring out details in shadows, and it has advanced noise reduction, which just improves the overall quality of the images that we're using in the neural network.

As for the neural network accelerator itself, there are two of them on the chip. They each have 32 megabytes of SRAM to hold temporary results and minimize the amount of data that we have to transmit on and off the chip, which helps reduce power. Each one has a 96 by 96 multiply/add array with in-place accumulation, which allows us to do almost 10,000 multiply/adds per cycle. There's dedicated ReLU hardware and dedicated pooling hardware, and each one delivers 36 trillion operations per second, operating at 2 gigahertz. The two of them together on a die deliver 72 trillion operations per second, so we exceeded our goal of 50 TOPS by a fair bit.
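The TOPS figure follows directly from the array geometry and clock, counting each multiply/add as two operations (the usual TOPS convention):

```python
# Reproducing the quoted accelerator throughput from the array geometry:
# a 96x96 multiply/add array at 2 GHz, two operations per multiply/add.

array_dim = 96
macs_per_cycle = array_dim * array_dim        # 9,216 multiply/adds per cycle
ops_per_cycle = 2 * macs_per_cycle            # one multiply + one add each
clock_hz = 2e9
tops_per_accelerator = ops_per_cycle * clock_hz / 1e12
print(tops_per_accelerator)                   # 36.864, the quoted ~36 TOPS
print(2 * tops_per_accelerator)               # two accelerators: ~72 TOPS
```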

There's also a video encoder. We encode video and use it in a variety of places in the car, including the backup camera display, optionally as a user feature for dash cam, and also for clip logging of data to the cloud, which Stuart and Andrej will talk about more later. There's a GPU on the chip; it's modest performance, with support for both 32- and 16-bit floating point. And then we have 12 A72 64-bit CPUs for general-purpose processing. They operate at 2.2 gigahertz, and this represents about two and a half times the performance available in the current solution.

There's a safety system that contains two CPUs that operate in lockstep. This system is the final arbiter of whether it's safe to actually drive the actuators in the car; this is where the two plans come together and we decide whether it's safe to move forward. And lastly, there's a security system, and basically the job of the security system is to ensure that this chip only runs software that's been cryptographically signed by Tesla.

If it's not been signed by Tesla, then the chip does not operate.

Now, I've told you a lot of different performance numbers, and I thought it'd be helpful to put them into perspective a little bit. Throughout this talk, I'm going to talk about a neural network from our narrow camera. It uses 35 billion operations, 35 GOPs. If we used all 12 CPUs to process that network, we could do one and a half frames per second, which is super slow, not nearly adequate to drive the car. If we used the 600 GFLOP GPU on the same network, we'd get 17 frames per second, which is still not good enough to drive the car. The neural network accelerators on the chip can deliver 2,100 frames per second, and you can see from the scaling as we move along that the amount of compute in the CPU and GPU is basically insignificant compared to what's available in the neural network accelerator. It's really night and day.
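These frame rates line up with simple division of throughput by work per frame, under the idealized assumption of full utilization:

```python
# Frame rate is just throughput divided by operations per frame, assuming
# full utilization (an idealization; real numbers depend on efficiency).

NETWORK_OPS_PER_FRAME = 35e9     # the narrow-camera network: 35 GOPs/frame

def frames_per_second(throughput_ops_per_s):
    return throughput_ops_per_s / NETWORK_OPS_PER_FRAME

print(frames_per_second(600e9))   # 600 GFLOP GPU: ~17 fps, as quoted
print(frames_per_second(72e12))   # 72 TOPS of accelerators: ~2,057 fps,
                                  # in the ballpark of the quoted 2,100
```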

So moving on to talk about the neural network accelerator, we're just going to stop for some water.

On the left there is a cartoon of a neural network, just to give you an idea of what's going on. The data comes in at the top and visits each of the boxes, and the data flows along the arrows to the different boxes. The boxes are typically convolutions or deconvolutions with ReLUs; the green boxes are pooling layers. And the important thing about this is that

the data produced by one box is then consumed by the next box, and then you don't need it anymore; you can throw it away. So all of that temporary data that gets created and destroyed as you flow through the network, there's no need to store it off-chip in DRAM. We keep all that data in SRAM, and I'll explain why that's super important in a few minutes. If you look over on the right side of this, you can see that in this network, of the 35 billion operations, almost all of them are convolutions, which are based on dot products; the rest are deconvolutions, also based on dot products, and then ReLU and pooling, which are relatively simple operations. So if you were designing some hardware, you'd clearly target doing dot products, which are based on multiply/add, and really crank on that. But imagine that you sped that up by a factor of 10,000: suddenly the ReLU and pooling operations, which were only 0.1% and 0.01% of the work, are going to be quite significant. So our hardware design includes dedicated resources for processing ReLUs and pooling as well.

Now, this chip is operating in a thermally constrained environment, so we had to be very careful about how we burn that power; we want to maximize the amount of arithmetic we can do. So we picked integer add, which takes nine times less energy than a corresponding floating point add, and we picked 8-bit by 8-bit integer multiply, which uses significantly less power than other multiply operations and provides enough accuracy to get good results. In terms of memory, we chose to use SRAM as much as possible, and you can see there that going off-chip to DRAM is approximately 100 times more expensive in terms of energy consumption than using local SRAM. So clearly, we want to use local SRAM as much as possible. In terms of control, this is data that was published in a paper by Mark Horowitz, where he critiqued how much power it takes to execute a single instruction on a regular CPU, and you can see that the add operation is only 0.15% of the total power; all the rest of the power is control, overhead, and bookkeeping. So in our design, we set out to basically get rid of all that as much as possible, because what we're really interested in is arithmetic. So here's the design that we finished.

You can see that it's dominated by the 32 megabytes of SRAM; there are big banks on the left and right and in the center bottom, and all the computing is done in the upper middle. Every single clock, we read 256 bytes of activation data out of the SRAM array and 128 bytes of weight data out of the SRAM array, and we combine them in a 96 by 96 multiply/add array, which performs over 9,000 multiply/adds per clock. At 2 gigahertz, that's a total of 36.8 TOPS.

Now, when we're done with a dot product, we unload the engine: we shift the data out across the dedicated ReLU unit, optionally across a pooling unit, and then finally into a write buffer, where all the results get aggregated, and then we write out 128 bytes per cycle back into the SRAM. This whole thing cycles along continuously: we're doing dot products while we're unloading previous results, doing pooling, and writing back into the memory. If you add it all up, at 2 gigahertz you need one terabyte per second of SRAM bandwidth to support all that work, and the hardware supplies that: one terabyte per second of bandwidth per engine, and there are two on the chip, so two terabytes per second.
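The one-terabyte-per-second figure follows from the per-clock traffic just described:

```python
# The quoted 1 TB/s of SRAM bandwidth per engine: 256 B of activations and
# 128 B of weights read, plus 128 B of results written back, every cycle
# at 2 GHz.

bytes_per_clock = 256 + 128 + 128   # reads + write-back per cycle
clock_hz = 2e9
tb_per_s_per_engine = bytes_per_clock * clock_hz / 1e12
print(tb_per_s_per_engine)          # 1.024 TB/s; two engines -> ~2 TB/s
```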

The accelerator has a relatively small instruction set. We have a DMA read operation to bring data in from memory, a DMA write operation to push results back out to memory, three dot-product-based instructions (convolution, deconvolution, and inner product), and then two relatively simple ones: scale, which is a one-input, one-output operation, and eltwise, which is two inputs and one output. And then, of course, stop when you're done.

We had to develop a neural network compiler for this. We take the neural network that's been trained by our vision team, as it would be deployed on older cars, and we compile it for use on the new accelerator.

The compiler does layer fusion, which allows us to maximize the computing each time we read data out of the SRAM and put it back. It also does some smoothing, so that the demands on the memory system aren't too lumpy. We also do channel padding to reduce bank conflicts, and we do bank-aware SRAM allocation. This was a case where we could have put more hardware in the design to handle bank conflicts, but by pushing it into software, we save hardware and power at the cost of some software complexity. We also automatically insert DMAs into the graph, so that data arrives just in time for computing without having to stall the machine. And then at the end, we generate all the code, we generate all the weight data, we compress it, and we add a CRC checksum for reliability.

To run a program, all the neural network descriptions, or programs, are loaded into SRAM at the start, and they sit there ready to go all the time. To run a network, you program the address of the input buffer, which presumably is a new image that just arrived from a camera, you set the output buffer address, you set the pointer to the network weights, and then you set go. The machine then goes off and sequences through the entire neural network all by itself, usually running for a million or two million cycles, and when it's done, you get an interrupt and you can post-process the results.
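The launch sequence just described might look like this from the host side. The register names and the polling stand-in below are hypothetical illustrations, not Tesla's actual interface:

```python
# Sketch of launching a network: program the input and output buffer
# addresses and the weight pointer, then set "go" and wait for completion.

class AcceleratorStub:
    """Stand-in for the accelerator's memory-mapped control registers."""
    def __init__(self):
        self.regs = {}
        self.done = False

    def write_reg(self, name, value):
        self.regs[name] = value
        if name == "GO":
            self.done = True   # real hardware signals completion via interrupt

def run_network(accel, input_addr, output_addr, weights_addr):
    # Network programs are already resident in SRAM; per-inference setup is
    # just three pointers and a go bit.
    accel.write_reg("INPUT_BUFFER", input_addr)
    accel.write_reg("OUTPUT_BUFFER", output_addr)
    accel.write_reg("WEIGHTS_PTR", weights_addr)
    accel.write_reg("GO", 1)
    while not accel.done:      # stand-in for waiting on the interrupt
        pass
    return accel.regs["OUTPUT_BUFFER"]
```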

Moving on to results: we had a goal to stay under 100 watts. This is measured data from cars driving around running the full Autopilot stack. We're dissipating 72 watts, which is a little bit more power than the previous design, but with the dramatic improvement in performance, it's still a pretty good answer. Of that 72 watts, about 15 watts is being consumed running the neural networks.

In terms of cost, the silicon cost of this solution is about 80% of what we were paying before, so we are saving money by switching to this solution. And in terms of performance, we took the narrow camera neural network, which I've been talking about, that has 35 billion operations in it. We ran it on the old hardware in a loop, as fast as possible, and we delivered 110 frames per second. We took the same data and the same network, compiled it for the new FSD computer, and using all four accelerators, we can get 2,300 frames per second processed. So, a factor of 21.
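The "factor of 21" is just the ratio of the two measured frame rates on the same network:

```python
# Speedup as the ratio of measured frame rates on the same 35-GOP network.

old_hw_fps = 110     # previous hardware
fsd_fps = 2300       # new FSD computer using all four accelerators
print(round(fsd_fps / old_hw_fps))   # -> 21
```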

I think this is perhaps the most significant slide.

It's night and day.

I've never worked on a project where the performance increase was more than three.

So this was pretty fun.

If you compare it to, say, Nvidia's Drive Xavier solution, a single chip delivers 21 TOPS; our full self-driving computer with two chips is 144 TOPS. So,

to conclude, I think we've created a design that delivers outstanding performance, 144 TOPS of neural network processing, and outstanding power performance; we managed to jam all of that performance into the thermal budget that we had. It enables a fully redundant computing solution at a modest cost. And really, the important thing is that this FSD computer will enable a new level of safety and autonomy in Tesla's vehicles without impacting their cost or range, something that I think we're all looking forward to.

Yeah. Why don't we do Q&A after each segment? So if people have questions about the hardware, they can ask right now. The reason I asked Pete to do such a detailed,

far more detail than perhaps most people would appreciate.

dive into the Tesla full self-driving computer is because at first it seems improbable. How could it be that Tesla, who has never designed a chip before, would design the best chip in the world? But that is objectively what has occurred. Not best by a small margin, best by a huge margin.

It's in the cars right now. All Teslas being produced right now have this computer. We switched over from the Nvidia solution for S and X about a month ago, and we switched over Model 3 about 10 days ago. All cars being produced have all the hardware necessary, compute and otherwise, for full self-driving.

I'll say that again: all Tesla cars being produced right now have everything necessary for full self-driving. All you need to do is improve the software. And later today, you will drive the cars with the development version of the improved software, and you will see for yourselves.

Questions for Pete?

Trip Chowdhry, Global Equities Research. Very, very impressive in every shape and form. I was wondering, I took some notes: you are using the activation function

ReLU, the rectified linear unit. But if you think about a deep neural network, it has multiple layers, and some algorithms may use different activation functions for different hidden layers, like softmax or tanh. Do you have flexibility for incorporating different activation functions, rather than ReLU, in your platform? Then I have a follow-up.

Yes. We have implementations of tanh and sigmoid, for example.

Beautiful. One last question, on the nanometers: you mentioned 14 nanometers. I was wondering, wouldn't it make sense to come in at a lower node, maybe 10 nanometers two years down, or maybe seven?

At the time we started the design, not all the IP that we wanted to purchase was available in 10 nanometer, so we finished the design in 14.

It's maybe worth pointing out that we finished this design maybe one and a half to two years ago and began design of the next generation. We're not talking about the next generation today, but we're about halfway through it.

All the things that are obvious for a next-generation chip, we're doing.

You talked about the software piece now. You did a great job; I was blown away, understood 10% of what you said, but I trust it's in good hands.

Thanks.

So it feels like you've got the hardware piece done, and that was really hard to do, and now you have to do the software piece. Now maybe that's outside of your expertise, but how should we think about that software piece?

Well, I couldn't ask for a better introduction to Andrej and Stuart, I think.

Are there any questions for this part before we do the next part of the presentation, which is neural nets and software?

So, maybe on the chip side: the last slide was 144 trillion operations per second versus, was it, Nvidia at 21? That's right. And maybe can you just contextualize that for a finance person: why is that gap so significant?

Thank you. Well, I mean, it's a factor of seven performance delta. So that means you can do seven times as many frames, and you can run neural networks that are seven times larger and more sophisticated. So it's a very big currency that you can spend on lots of interesting things to make the car better. I think that Xavier's power usage is higher than ours,

Is Xavier's power comparable?

I don't know that exactly, but to the best of my knowledge, the power requirements would increase at least to the same degree, a factor of seven, and costs would also increase by a factor of seven.

Great. So, yeah, power is a real problem, because it also reduces range, so the cost of power is very high. And then you have to get rid of that power; the thermal problem becomes really significant, because you've got to get rid of all that power.

Thank you very much. I think we have quite a bit of time to ask questions, if you guys don't mind the day running a bit long. We're going to do the drive demos afterwards, so if anybody needs to pop out and do drive demos a little sooner, you're welcome to do that. But we want to make sure we answer your questions.

Yep.

[inaudible] from UBS. Intel and AMD, to some extent, have started moving towards a chiplet-based architecture. I did not notice a chiplet-based design here. Do you think that, looking forward, that would be something that might be of interest to you guys from an architecture standpoint?

A chiplet-based architecture?

Yes.

We're not currently considering anything like that. I think that's mostly useful when you need to use different styles of technology, so if you want to integrate silicon germanium or DRAM technology on the same silicon substrate, that gets pretty interesting. But until the die size gets obnoxious, I wouldn't go there.

Okay, so to be clear, the strategy here, which started basically over three years ago, was to design and build a computer that is fully optimized and aimed at full self-driving, then write software that is designed to work specifically on that computer and get the most out of it. So you have tailored hardware that is a master of one trade: self-driving.

Nvidia is a great company, but they have many customers, and so, as they apply their resources, they need to do a generalized solution.

We care about one thing: self-driving. So the hardware was designed to do that incredibly well, and the software is also designed to run on that hardware incredibly well. The combination of the software and the hardware, I think, is unbeatable.

This chip is designed to process video input. In case you use, let's say, LIDAR, would it be able to process that as well? Or is it primarily for video?

What I will explain to you today is that LIDAR is a fool's errand, and anyone relying on LIDAR is doomed.

Doomed. Expensive sensors that are unnecessary. It's like having a whole bunch of expensive appendices. Like, one appendix is bad; well, now you have a whole bunch of them. That's ridiculous. You'll see.


Hi. So, just two questions on the power consumption: is there a way to maybe give us a rule of thumb on, you know, every watt reduces range by a certain percent or a certain amount, just so we can get a sense of how much?

On a Model 3, the target consumption is 250 watt-hours per mile.

It depends on the nature of the driving as to how many miles that affects; in the city it would have a much bigger effect than on the highway. So, you know, if you're driving for an hour in a city,

and you had a solution, hypothetically, that drew a kilowatt, you'd lose four miles

on a Model 3. So if you're only going, say, 12 miles an hour, then there would be a 25% impact on range in the city. Basically, the power of the system has a massive impact on city range, which is where we think most of the robotaxi market will be.
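The arithmetic behind that 25% figure checks out from the numbers given:

```python
# Reproducing the city-range arithmetic: at 250 Wh/mile, an extra kilowatt
# of compute costs about 4 miles of range per hour of driving (1000/250),
# and at a 12 mph average city speed that is a 25% range hit.

WH_PER_MILE = 250.0

def city_range_impact(compute_watts, avg_speed_mph):
    drive_watts = avg_speed_mph * WH_PER_MILE      # power to move the car
    return compute_watts / (drive_watts + compute_watts)

print(city_range_impact(1000, 12))   # 1000 / (3000 + 1000) = 0.25 -> 25%
```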

So power is extremely important.

I'm sorry, I couldn't hear you. Thank you.

What's the primary design objective of the next-generation chip?

We don't want to talk too much about the next-generation chip, but it'll be at least, let's say, three times better than the current system.

About two years away.

Is the chip made in-house? You don't manufacture the chip, you contract that out? And how much cost reduction does that save on the overall vehicle cost?

Well, the 20% cost reduction I mentioned was the piece cost per vehicle reduction. That wasn't the development cost, just the actual part cost.

Yeah, I'm saying, if you're manufacturing these in mass, is it saving money doing it yourself?

Yes. A little bit.

I mean, most chips are made that way; most people don't make chips with their own fab. It's pretty unusual.

So you don't think you'll see any supply issues with getting the chip mass-produced?

The cost saving pays for the development? I mean, the basic pitch to Elon was: we're going to build this chip that's going to reduce the cost. And Elon said, times a million cars a year? Deal.

That's correct. Yes.

Sorry.

If there are really specific questions, we can ask them later. There will be a Q&A opportunity after Andrej talks and after Stuart talks, so there will be two other Q&A opportunities. If this is very specific, then—

Also, I'll be here all afternoon.

Yeah, exactly. And he will be here at the end as well. So, good. The one near you, thanks.

In that die photo you had, the neural processor takes up quite a bit of the die. I'm curious, is that your own design, or is there some external IP there?

Yes, that was custom designed by Tesla.

Okay, and then I guess the follow-on would be: there's probably a fair amount of opportunity to reduce that footprint as you tweak the design?

It's actually quite dense, so in terms of reducing it, I don't think so. It will greatly enhance the functional capabilities in the next generation.

Okay, and then last question: can you share where you're having this part made?

Where are we fabbing it?

Oh, at Samsung. Samsung.

Yes. Austin, Texas. Thank you.

Grant to knock at to knock apple.

Just curious how defensible your chip technology and design is from an IP point of view, and hoping that you won't be offering a lot of the IP outside for free. Thanks.

We have filed on the order of a dozen patents on this technology.

Fundamentally, it's linear algebra, which I don't think you can patent. Not sure. But

I think if somebody started today, and they were really good, they might have something like what we have right now in three years. But in two years, we'll have something three times better.

Coming back to intellectual property protection: you can have the best intellectual property, and some people just steal it for the fun of it. If we look at a few interactions with Aurora, some in the industry believe they stole your intellectual property. I think the key ingredient that you need to protect is the weights that are associated with the various parameters. Do you think your chip can do something to prevent that, maybe encrypt all the weights so that even you don't know what the weights are at the chip level, so that your intellectual property remains inside it, nobody knows about it, and nobody can just steal it?

Well, I'd like to meet the person that could do that, because I would hire them in a heartbeat. Yeah, it's a real hard problem.

Yeah. I mean, we do encrypt it, and it's hard to crack. So if they can crack it, they're very good. They'd have to crack it, and then also figure out the software and the neural net system and everything else. At that point they could just design it from scratch.

It's our intention to prevent people from stealing all that stuff. I mean, if they do, we hope it at least takes a long time.

It will definitely take them a long time. Yeah. I mean, if it was our goal to do that, how would we do it? It would be very difficult.

But the thing that's, I think, a very powerful, sustainable advantage for us is the fleet. Nobody has the fleet, those weights are constantly being updated and improved, based on billions of miles driven.

Tesla has 100 times more cars with the full self driving hardware than everyone else combined.

You know, we have

By the end of this quarter, we'll have 500,000 cars with the full eight-camera, twelve-ultrasonic setup. Some will still be on Hardware 2, but we still have the data-gathering ability. And then a year from now, we'll have over a million cars with the full self-driving computer, the hardware, everything.

Yeah. So we have just a massive data advantage. It's similar to how the Google search engine has a massive advantage because people use it, and people effectively program Google with their queries and the results.

Just to press you on that, and please reframe the question if it's appropriate. When we talk to Waymo or Nvidia, they speak with equivalent conviction about their leadership because of their competence in simulating miles driven. Can you talk about the advantage of having real-world miles versus simulated miles? They express that by the time you get a million miles, they can simulate a billion, and that no Formula One race car driver, for example, could ever successfully complete a real-world track without driving it in a simulator first. Can you talk about the advantages you perceive to have, associated with having data ingestion coming from real-world miles versus simulated miles?

Absolutely. We have quite a good simulation too, but it just does not capture the long tail of weird things that happen in the real world. If the simulation fully captured the real world, well, that would be proof that we're living in a simulation, I think.

it doesn't, I wish.

But simulations do not capture the real world. The real world is really weird and messy. You need the cars on the road.

We're actually going to get into that in Andrej's presentation. So okay, why don't we move on to Andrej? Great, thanks. Thank you.

Thank you, everybody. Thank you very much.

The last question was actually a very good segue.

Because one thing to remember about our FSD computer is that it can run much more complex neural nets for much more precise image recognition. And to talk to you about how we actually get that image data and how we analyze it, we have our Senior Director of AI, Andrej Karpathy, who's going to explain all of that to you. Andrej has a PhD from Stanford University, where he studied computer science, focusing on image recognition and deep learning.

Andrej, why don't you just do your own intro? There are a lot of PhDs from Stanford; that's not important.

Yes. We don't care. Come on.

Thank you.

Andrej started the computer vision class at Stanford. That's much more significant; that's what matters. So please talk about your background.

That's what's valuable. Just tell the story you told about what you've done. Yeah, and then

Sure, yeah. So I think I've been training neural networks basically for what is now a decade, and these neural networks were not actually really used in the industry until maybe five or six years ago. So it's been some time that I've been training these networks, and that included institutions at Stanford, at OpenAI, and at Google, really just training a lot of neural networks, not just for images but also for natural language, and designing architectures that couple those two modalities, for my PhD.

So it was a computer science class.

Oh, yeah. At Stanford, I actually taught the convolutional neural networks class. I was the primary instructor for that class; I actually started the course and designed the entire curriculum. In the beginning it was about 150 students, and then it grew to 700 students over the next two or three years. So it's a very popular class, one of the largest classes at Stanford right now.

So that was also really successful. I mean, Andrej is really one of the best computer vision people in the world, arguably the best.

Okay, thank you.

Yeah. So hello, everyone. Pete told you all about the chip that we've designed that runs neural networks in the car; my team is responsible for training these neural networks, and that includes all of the data collection from the fleet, the neural network training, and then some of the deployment onto that chip.

So what do the neural networks do, exactly, in the car? What we are seeing here is a stream of videos from across the car; these are eight cameras that send us videos, and these neural networks are looking at those videos, processing them, and making predictions about what they're seeing. Some of the things we're interested in, and some of the things you're seeing on this visualization, are lane line markings, other objects, the distances to those objects, what we call drivable space, shown in blue, which is where the car is allowed to go, and a lot of other predictions like traffic lights, traffic signs, and so on.

Now,

For my talk, I will go roughly in three stages. First, I'm going to give you a short primer on neural networks and how they work and how they're trained. I need to do this because I need to explain, in the second part, why it is such a big deal that we have the fleet, why it's so important, and why it's a key enabling factor to really training these networks and making them work effectively on the roads. And in the third stage, I'll talk about vision and LIDAR, and how we can estimate depth just from vision alone.

So the core problem that these networks are solving in the car is that of visual recognition. For you and I, these are very simple problems: you can look at all of these four images and you can see that they contain a cello, a boat, an iguana, or scissors. This is very simple and effortless for us. This is not the case for computers, and the reason for that is that, to a computer, these images are really just a massive grid of pixels, and at each pixel you have the brightness value at that point. So instead of just seeing an image, a computer really gets a million numbers in a grid that tell you the brightness values at all the positions. The matrix, if you will,

it really is the matrix.

Yeah. And so we have to go from that grid of pixels and brightness values to high-level concepts like iguana, and so on. As you might imagine, this iguana has a certain pattern of brightness values, but iguanas can actually take on many appearances: different poses, different brightness conditions, different backgrounds, different crops of that iguana. We have to be robust across all those conditions, and we have to understand that all those different patterns actually correspond to iguanas. Now, the reason you and I are very good at this is because we have a massive neural network inside our heads that's processing those images. Light hits your retina and travels to the back of your brain, to the visual cortex, and the visual cortex consists of many neurons that are wired together and that are doing all the pattern recognition on top of those images. Over roughly the last five years, the state-of-the-art approaches to processing images with computers have also started to use neural networks, in this case artificial neural networks. These artificial neural networks, and this is just a cartoon diagram of one, are a very rough mathematical approximation of your visual cortex. They really do have neurons, and they are connected together. Here I'm only showing three or four neurons in four layers, but a typical neural network will have tens to hundreds of millions of neurons, and each neuron will have a thousand connections. So these are really large pieces of almost-simulated tissue. What we can then do is take those neural networks and show them images. For example, I can feed my iguana into this neural network, and the network will make predictions about what it's seeing. Now, in the beginning, these neural networks are initialized completely randomly.
So the connection strengths between all those different neurons are completely random, and therefore the predictions of the network are also going to be completely random. It might think that you're actually looking at a boat right now, and that it's very unlikely this is an iguana. During the training process, we know that this is actually an iguana; we have a label. So what we're basically saying is: we'd like the probability of iguana to be larger for this image, and the probabilities of all the other things to go down. Then there's a mathematical process called backpropagation, with stochastic gradient descent, that allows us to propagate that signal back through those connections and update every one of those connections.
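The loop being described here, random initialization, forward pass, compare against the label, backpropagate, and nudge every connection a little, can be sketched as a toy example. This is a minimal single-layer softmax classifier on made-up data, not Tesla's network; the sizes, the four-class setup, and the learning rate are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 64-pixel "images", 4 classes (cello, boat, iguana, scissors).
n_pixels, n_classes = 64, 4
W = rng.normal(scale=0.01, size=(n_pixels, n_classes))  # random init: predictions start random

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def train_step(x, label, lr=0.1):
    """One forward/backward pass: nudge every weight so P(label) goes up a little."""
    global W
    p = softmax(x @ W)       # forward pass: class probabilities
    grad = np.outer(x, p)    # gradient of cross-entropy loss w.r.t. W...
    grad[:, label] -= x      # ...minus the one-hot target column
    W -= lr * grad           # stochastic gradient descent update
    return p[label]          # probability of the correct class (before this update)

# A fake labeled "iguana" image, shown to the network over and over.
x_iguana = rng.random(n_pixels)
before = train_step(x_iguana, label=2)   # near chance (~25%) at random init
for _ in range(200):                     # repeated updates: the probability climbs
    after = train_step(x_iguana, label=2)
print(before, after)
```

Each call makes the iguana probability for this image a little larger, exactly the "goes up a little bit, might become 14%" dynamic described next.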

And update every one of those connections just a little amount. Once the update is complete, the probability of iguana for this image will go up a little bit, so it might become 14%, and the probabilities of the other things will go down. And of course, we don't just do this for a single image; we have entire large data sets that are labeled. So we have lots of images; typically you might have millions of images and thousands of labels, or something like that, and you are doing forward-backward passes over and over again. You're showing the computer: here's an image, it has an opinion, and then you're saying, this is the correct answer, and it tunes itself a little bit. You repeat this millions of times, and sometimes you show the same image to the computer hundreds of times as well. So network training typically takes on the order of a few hours or a few days, depending on how big of a network you're training. And that's the process of training a neural network. Now, there's something very unintuitive about the way neural networks work that I have to really get into, and that is that they really do require a lot of these examples, and they really do start from scratch. They know nothing, and it's really hard to wrap your head around this. So as an example, here's a cute dog. You probably may not know the breed of this dog, but the correct answer is that this is a Japanese spaniel. All of us are looking at this and seeing a Japanese spaniel and thinking, okay, I understand roughly what this Japanese spaniel looks like, and if I show you a few more images of other dogs, you can probably pick out the other Japanese spaniels here. In particular, those three look like a Japanese spaniel, and the other ones do not. You can do this very quickly, and you need one example. But computers do not work like this; they actually need a ton of data of Japanese spaniels.
So this is a grid of Japanese spaniels, showing that you need thousands of examples in different poses, different brightness conditions, different backgrounds, different crops. You really need to teach the computer, from all the different angles, what this Japanese spaniel looks like, and it really requires all that data to work; otherwise, the computer can't pick up on that pattern automatically. So what does this imply about the setting of self-driving? Of course, we don't care about dog breeds too much; maybe we will at some point. For now, we really care about lane line markings, objects, where they are, where we can drive, and so on. The way we do this is, we don't have labels like iguana for these images, but we do have images from the fleet like this, and we're interested in, for example, lane markings. So a human typically goes into an image and, using a mouse, annotates the lane line markings. Here's an example of an annotation that a human could create, a label for this image, and it's saying that that's what you should be seeing in this image: these are the lane line markings. Then we can go to the fleet and ask for more images from the fleet. And if you do a naive job of this and just ask for images at random, the fleet might respond with images like this, typically going forward on some highway. This is what

you might get: a random collection like this, and we would annotate all that data. Now, if you're not careful and you only annotate a random distribution of this data, your network will kind of pick up on this random distribution of data and work only in that regime. So if you show it a slightly different example,

For

example, here is an image where the road is actually curving, and it's a bit more of a residential neighborhood. If you show the neural network this image, the network might make a prediction that is incorrect. It might say, okay, well, I've seen lots of times on highways, let's just go forward. Here's a possible prediction, and of course this is very incorrect. But the neural network really can't be blamed. It does not know whether the tree on the left matters or not; it does not know whether the car on the right matters or not toward the lane line; it does not know whether the buildings in the background matter or not. It really starts completely from scratch. You and I know the truth, that none of those things matter; what actually matters is that there are a few white lane markings over there and a vanishing point, and the fact that they curve a little bit should pull the prediction. Except there's no mechanism by which we can just tell the neural network, hey, those lane markings actually matter. The only tool in the toolbox that we have is labeled data. So what we do is take images like this, where the network fails, and label them correctly. In this case, we would turn the lane to the right, and then we need to feed lots of images like this to the neural net, and the neural net over time will basically pick up on this pattern, that those things there don't matter but those lane line markings do, and will learn to predict the correct lane. So what's really critical is not just the scale of the data set; we don't just want millions of images, we actually need to do a really good job of covering the possible space of things that the car might encounter on the roads. We need to teach the computer how to handle scenarios where it's night and you have all these different specular reflections.
And as you might imagine, the brightness patterns in those images will look very different. We have to teach the computer how to deal with shadows, how to deal with forks in the road, how to deal with large objects that might be taking up most of the image, how to deal with tunnels, or how to deal with construction sites. In all these cases, again, there's no explicit mechanism to tell the network what to do; we only have massive amounts of data. We want to source all those images and annotate the correct lines, and the network will pick up on those patterns. Now, large and varied data sets basically make these networks work very well. This is not just a finding for us here at Tesla; this is a ubiquitous finding across the entire industry. Experiments and research from Google, from Facebook, from Baidu, from Alphabet's DeepMind all show similar plots, where neural networks really love data, scale, and variety. As you add more data, these neural networks start to work better and get higher accuracies for free. More data just makes them work better. Now, a number of people have pointed out that we could potentially use simulation to achieve the scale of these data sets, since in a simulator we're in charge of a lot of the conditions and can achieve some variety. That was also brought up in the questions just before this. Now, at Tesla, and this is actually a screenshot of our simulator, we use simulation extensively. We use it to develop and evaluate the software, and we've even used it for training quite successfully. But really, when it comes to training data for neural networks, there is no substitute for real data. Simulations have a lot of trouble with modeling appearance, physics, and the behaviors of all the agents around you.

So

here are some examples to really drive that point across: the real world really throws a lot of crazy stuff at you. In this case, for example, we have very complicated environments with snow, with trees, with wind. We have various visual artifacts that are hard to simulate, potentially. We have complicated construction sites, bushes, and plastic bags that can kind of blow around with the wind, and construction sites that might feature lots of people, kids, and animals all mixed in. Simulating how those things interact and flow through a construction zone might actually be completely intractable. It's not about the movement of any one pedestrian in there; it's about how they respond to each other, how those cars respond to each other, and how they respond to you driving in that setting. All of those are actually really tricky to simulate. It's almost like you have to solve the self-driving problem just to simulate the other cars in your simulation. So it's really complicated. We have dogs, exotic animals, and in some cases it's not even that you can't simulate it, it's that you can't even come up with it. For example, I didn't know that you could have truck on truck like that, but in the real world you find this, and you find lots of other things that are very hard to even come up with. So really, the variety that I'm seeing in the data coming from the fleet is just crazy with respect to what we have in the simulator, and we have a really good simulator.

The thing with simulation is, you're fundamentally grading your own homework. If you know that you're going to simulate it, okay, you can definitely solve for it. But as Andrej is saying, you don't know what you don't know. The world is very weird, and it has millions of corner cases.

And if somebody can produce a self-driving simulation that accurately matches reality, that in itself would be a monumental achievement of human capability. They can't. There's no way.

Yeah. So

I think the three points that I really tried to drive home until now are that to get neural networks to work well, you require three essentials: a large data set, a varied data set, and a real data set. If you have those, you can actually train your networks and make them work very well. So why is Tesla in such a unique and interesting position to really get all three of these essentials right? The answer, of course, is the fleet. We can really source data from it and make our network systems work extremely well. So let me take you through a concrete example, say making the object detector work better, to give you a sense of how we develop these neural networks, how we iterate on them, and how we actually get them to work over time. Object detection is something we care a lot about; we'd like to put bounding boxes around, say, the cars and the objects here, because we need to track them and understand how they might move around. So again, we might ask human annotators to give us some annotations for these, and humans might go in and tell you, okay, those patterns over there are cars and bicycles and so on, and you can train your neural network on this. But if you're not careful, the neural network will make mispredictions in some cases. As an example, if we stumble on a car like this, that has a bike on the back of it, the neural network, back when I joined, would actually create two detections: a car detection and a bicycle detection. And that's actually kind of correct, because I guess both of those objects actually exist. But for the purposes of the controller and the planner downstream, you really don't want to deal with the fact that this bicycle can go with the car; the truth is that that bike is attached to that car, so in terms of objects on the road, there's a single object, a single car.
And so what you'd like to do now is annotate lots of those images as just a single car. So the process we go through internally in the team is that we take this image, or a few images that show this pattern, and we have a machine learning mechanism by which we can ask the fleet to source examples that look like that. The fleet might respond with images that contain those patterns; as an example, these six images might come from the fleet, and they all contain bikes on the backs of cars. We would go in and annotate all of those as just a single car, and then the performance of that detector actually improves. The network internally understands that, hey, when the bike is just attached to the car, that's actually just a single car, and it can learn that, given enough examples. That's how we fixed that problem. I will mention that I've talked quite a bit about sourcing data from the fleet; I just want to make a quick point that we've designed this from the beginning with privacy in mind, and all the data that we use for training is anonymized. Now, the fleet doesn't just respond with bicycles on the backs of cars; we look for lots of things all the time. For example, if we look for boats, the fleet can respond with boats. We look for construction sites, and the fleet can send us lots of construction sites from across the world. We look for even slightly rarer cases; for example, finding debris on the road is pretty important to us. These are examples of images that have streamed to us from the fleet that show tires, cones, plastic bags, and things like that. If we can source these at scale, we can annotate them correctly, and the neural network will learn how to deal with them in the world. Here's another example: animals, of course, also a very rare occurrence.
But we want the neural network to really understand what's going on here, that these are animals, and we want to deal with that correctly. So to summarize, the process by which we iterate on neural network predictions looks something like this. We start with a seed data set that was potentially sourced at random; we annotate that data set; and then we train neural networks on that data set and put them in the car. Then we have mechanisms by which we notice inaccuracies in the car, when a detector may be misbehaving. For example, if we detect that the neural network might be uncertain, or if we detect that,

or if there's a driver intervention, or any of those settings, we can create trigger infrastructure that sends us data on those inaccuracies. So for example, if we don't perform very well on lane detection in tunnels, we can notice that there's a problem in tunnels. That image would enter our unit tests, so we can verify that we're actually fixing the problem over time. But what you do to fix this inaccuracy is source many more examples that look like that. So we ask the fleet to please send us many more tunnels; we label all those tunnels correctly; we incorporate that into the training set; and we retrain the network, redeploy, and iterate the cycle over and over again. We refer to this iterative process by which we improve these predictions as the data engine: iteratively deploying something, potentially in shadow mode, sourcing inaccuracies, and incorporating them into the training set, over and over again. And we do this basically for all the predictions of these neural networks. Now, so far I've talked about a lot of explicit labeling. Like I mentioned, we ask people to annotate data, and this is an expensive process, in time and in money.
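The data engine loop just described, notice an inaccuracy, add it to the unit tests, ask the fleet for more examples like it, label, retrain, redeploy, can be sketched as a skeleton. The class and method names here are purely illustrative, not Tesla's infrastructure; the fleet query and the retrain/redeploy steps are stubbed out:

```python
from dataclasses import dataclass, field

@dataclass
class DataEngine:
    """Skeleton of the iterative data engine described above (names invented)."""
    dataset: list = field(default_factory=list)     # labeled training set
    unit_tests: list = field(default_factory=list)  # failure cases we must keep passing

    def run_iteration(self, seed_inaccuracies, fleet_matches):
        # 1. A noticed failure (e.g. lane detection in a tunnel) enters the unit tests,
        #    so we can verify we actually fix it over time.
        self.unit_tests.extend(seed_inaccuracies)
        # 2. Ask the fleet for many more examples that look like it (stubbed:
        #    `fleet_matches` stands in for the trigger-infrastructure response).
        # 3. Label them and fold everything into the training set.
        self.dataset.extend(seed_inaccuracies + fleet_matches)
        # 4. Retrain the network and redeploy, potentially in shadow mode (stubbed).
        return len(self.dataset)

engine = DataEngine()
n = engine.run_iteration(seed_inaccuracies=["tunnel_clip_01"],
                         fleet_matches=["tunnel_clip_02", "tunnel_clip_03"])
print(n)  # 3 clips now in the training set; the loop then repeats
```

The point of the structure is that each turn of the crank grows both the training set and the regression suite, so fixed failures stay fixed.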

Yeah, it's just an expensive process. And so these annotations, of course, can be very expensive to achieve. So what I also want to talk about is really utilizing the power of the fleet: you don't want to go through this human annotation bottleneck. You want to just stream in data and annotate it automatically, and we have multiple mechanisms by which we can do this. As one example of a project that we recently

worked on, take the detection of cut-ins. You're driving down the highway, someone is on the left or on the right, and they cut in in front of you into your lane. So here's a video showing the Autopilot detecting that this car is intruding into our lane. Now, of course, we'd like to detect a cut-in as fast as possible. The way we approach this problem is, we don't write explicit code for: is the left blinker on, is the right blinker on, track the car over time and see if it's moving horizontally. We actually use a fleet learning approach. The way this works is, we ask the fleet to please send us data whenever they see a car transition from a right lane to the center lane, or from the left to the center. Then we rewind time backwards, and we can automatically annotate that, hey, that car will, in 1.3 seconds, cut in in front of you, and then we can use that for training the neural net. The neural net will automatically pick up on a lot of these patterns: for example, the cars are typically yawed a bit, they're moving this way, maybe the blinker is on. All that stuff happens internally inside the neural net, just from these examples. So we ask the fleet to automatically send us all this data; we can get half a million or so images, all annotated for cut-ins, and then we train the network. Then we took this cut-in network and deployed it to the fleet, but we didn't turn it on yet; we ran it in shadow mode. In shadow mode, the network is always making predictions: hey, I think this vehicle is going to cut in, from the way it looks this vehicle is going to cut in. And then we look for mispredictions. As an example, this is a clip that we had from shadow mode of the cut-in network, and it's kind of hard to see, but the network thought that the vehicle right ahead of us, on the right, was going to cut in.
You can sort of see that it's slightly flirting with the lane line; it's sort of trying to encroach a little bit, and the network got excited and thought that was going to be a cut-in, that that vehicle would actually end up in our center lane. That turned out to be incorrect; the vehicle did not actually do that. So what we do now is just turn the data engine: the network runs in shadow mode making predictions, and it makes some false positives and some false negative detections. We got overexcited sometimes, and sometimes we missed a cut-in when it actually happened. All of those create a trigger that streams them to us, and that gets incorporated, now for free, with no humans harmed in the process of labeling this data, into our training set. We retrain the network and redeploy it to shadow mode, and we can spin this loop a few times, always looking at the false positives and negatives coming from the fleet. Once we're happy with the false positive to false negative ratio, we actually flip the bit and let the car control to that network. You may have noticed we actually shipped one of our first versions of the cut-in detector approximately three months ago, so if you've noticed that the car is much better at detecting cut-ins, that's fleet learning operating at scale. Yes,

it actually works quite nicely.

So that's fleet learning: no humans were harmed in the process. It's just a lot of neural network training based on data and a lot of shadow mode, and looking at those results.

Essentially, everyone's training the network all the time, is what it amounts to. Whether Autopilot is on or off, the network is being trained. Every mile that's driven, for cars that are Hardware 2 or above, is training the network.
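The automatic labeling trick behind the cut-in detector (rewind the log, find the moment the neighboring car actually crossed into the lane, and mark the earlier frames as "will cut in") might look something like this minimal sketch. The lane geometry, thresholds, and time horizon are invented for illustration; a real system would work over full multi-camera sensor logs:

```python
def autolabel_cut_in(lateral_positions, dt=0.1, horizon=1.5, lane_edge=1.0):
    """Rewind a logged track of a neighboring car and label each frame:
    True if the car will cross our lane edge within `horizon` seconds.
    The future of the log provides the answer, so no human labeling is needed.
    `lateral_positions` is lateral distance from our lane center, in meters
    (toy representation); frames at or after the crossing are labeled False."""
    crossing = next((i for i, y in enumerate(lateral_positions) if abs(y) < lane_edge), None)
    labels = []
    for i in range(len(lateral_positions)):
        will_cut_in = (crossing is not None
                       and i < crossing
                       and (crossing - i) * dt <= horizon)
        labels.append(will_cut_in)
    return labels

# A car drifting from the right lane (y = 3 m) into ours (|y| < 1 m), 4 frames/s.
track = [3.0, 2.6, 2.2, 1.8, 1.5, 1.2, 1.05, 0.9, 0.7, 0.5]
print(autolabel_cut_in(track, dt=0.25, horizon=1.0))
# Frames within 1 s of the crossing get the positive label automatically.
```

Those auto-generated labels are then what the "will this car cut in?" network trains against, with shadow-mode false positives and negatives feeding the next iteration.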

Another interesting way that we use fleet learning, and the other project that I'll talk about, is path prediction. While you are driving the car, what you're actually doing is annotating the data, because you are steering the wheel; you're telling us how to traverse different environments. What we're looking at here is some person in the fleet who took a left through an intersection. We have the full video from all the cameras, and we know the path that this person took, because of the GPS, the inertial measurement unit, the wheel angle, and the wheel ticks. We put all that together, and we understand the path that this person took through this environment, and then, of course, we can use this as supervision for the network. We just source a lot of this from the fleet, we train the neural network on those trajectories, and then the network predicts paths just from that data. So what this is typically referred to as is imitation learning: we're taking human trajectories from the real world, and we're just trying to imitate how people drive in the real world. We can also apply the same data engine crank to all of this and make it work over time.
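At its core, imitation learning as described here is just supervised learning on logged human behavior: the observation is what the cameras saw, the label is what the driver did. A minimal sketch on made-up linear data (the real networks, features, and trajectory targets are far richer than this):

```python
import numpy as np

# Toy imitation learning: fit a policy to human driving logs.
# `observations` stands in for features the cameras saw; `steering` stands in
# for the human action, recovered from GPS, IMU, wheel angle, and wheel ticks.
# All numbers below are fabricated for the example.
rng = np.random.default_rng(1)
observations = rng.random((500, 3))       # 500 logged frames, 3 features each
true_policy = np.array([0.8, -0.3, 0.1])  # the hidden "human" behavior
steering = observations @ true_policy     # what the human driver actually did

# Supervised fit: find w minimizing ||observations @ w - steering||^2.
w, *_ = np.linalg.lstsq(observations, steering, rcond=None)
print(np.round(w, 3))  # recovers the human policy from the logs alone
```

Because the targets come straight from the logs, the labeling is free, the same property Andrej highlights: the fleet annotates the data simply by driving.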

So here's an example of bad prediction.

Going through kind of a complicated environment. So what you're seeing here is a video, and we are overlaying the predictions of the network. So this is the path that the network would follow, in green, and —

some

— the crazy thing is, the network is predicting paths it can't even see, with incredibly high accuracy. It can't see around the corner, but it's saying the probability of that path is extremely high. So that's the path,

and it nails it. You will see that in the cars today — we're going to turn on augmented vision, so you can see the lane lines and the path predictions of the cars overlaid on the video.

There's actually more going on under the hood than we can even tell you;

it's kind of scary.

Of course, there are a lot of details I'm skipping over. You might not want to annotate on all the drivers; you might want to just imitate the better drivers, and there are many technical ways that we actually slice and dice that data. But the interesting thing here is that this prediction is actually a 3D prediction that we project back to the image here. So the path forward is a three-dimensional thing that we're just rendering in 2D. And we know about the slope of the ground from all of this, and that's actually extremely valuable for driving. So path prediction is actually live in the fleet today, by the way. So if you're in a cloverleaf on the highway — until maybe five months ago, your car would not be able to do the cloverleaf; now it can. That's path prediction running live in your cars; we shipped this a while ago. And today you're going to get to experience this for traversing intersections: a large component of how we go through intersections in your drives today is all sourced from path prediction, from automatic labels.

So what I talked about so far is really the three key components of how we iterate on the predictions of the network and how we make it work over time: you require a large, varied, and real data set, and we can really achieve that here at Tesla. And we do that through the scale of the fleet, the data engine — shipping things in shadow mode, iterating that cycle — and potentially even using fleet learning, where no human annotators are harmed in the process and we just use data automatically. And we can really do that at scale.

So in the next section of my talk, I'm going to talk especially about depth perception using vision only. So you might be familiar that there are at least two sensors people put on cars. One is vision — cameras, just getting pixels. And the other is LIDAR, which a lot of companies also use, and LIDAR gives you these point measurements of distance around you.

Now, one thing I'd like to point out, first of all, is: you all came here, you drove here, many of you, and you used your neural net and vision. You were not shooting lasers out of your eyes, and you still ended up here. So the car might as well do the same.

So clearly, the human neural net derives distance and all the measurements and the 3D understanding of the world just from vision. It actually uses multiple cues to do so; I'll just briefly go over some of them, to give you a sense of roughly what's going on inside. As an example, we have two eyes pointed out, so you get two independent measurements at every single time step of the road ahead of you, and your brain stitches this information together to arrive at some depth estimate, because you can triangulate any points across those two viewpoints. A lot of animals instead have eyes that are positioned on the sides, so they have very little overlap in their visual fields. They will typically use structure from motion: the idea is that they bob their heads, and because of the movement they actually get multiple observations of the world, and again you can triangulate depths. And even with one eye closed and completely motionless, you still have some sense of depth perception. If you did this, I don't think you would conclude that I just came two meters towards you, or went 100 meters back. And that's because there are a lot of very strong monocular cues that your brain also takes into account. This is an example of a pretty common visual illusion: these two blue bars are identical, but the way your brain stitches up the scene, it just expects one of them to be larger than the other, because of the vanishing lines of this image. So your brain does a lot of this automatically, and artificial neural nets can as well. So let me give you three examples of how you can arrive at depth perception from vision alone: a classical approach, and two that rely on neural networks. So here's a video going down — I think this is San Francisco — from a Tesla. So this is what our cameras are sensing; I'm only showing the main camera, but all eight cameras are turned on.
And if you just have this six-second clip, what you can do is stitch up this environment in 3D using multi-view stereo techniques. So this —

this is supposed to be a video

— is not playing? No, it's... oh, there we go.

So this is the 3D reconstruction of those six seconds of that car driving through that path, and you can see that this information is very well recoverable from just videos — roughly through a process of triangulation, and, as I mentioned, multi-view stereo. And we've applied similar techniques, slightly more sparse and approximate, in the car as well.

So it's remarkable all that information is really there in the sensor, and just a matter of extracting it.

The other project that I want to briefly talk about is, as I mentioned — neural networks are very powerful visual recognition engines, and if you want them to predict depth, then you just need, for example, labels of depth, and they can actually do that extremely well. So there's nothing limiting networks from predicting monocular depth, except for labeled data. So one example project that we've actually looked at internally is: we use the forward-facing radar, which is shown in blue, and that radar is looking out and measuring depths of objects. And we use that radar to annotate what vision is seeing — the bounding boxes that come out of the neural networks. So instead of a human annotator telling you, okay, this car in this bounding box is roughly 25 meters away, you can annotate that data much better using sensors — sensor annotation. So as an example, radar is quite good at that distance; you can annotate with it, and then you can train your network on it. And if you just have enough data of it, this neural network becomes very good at predicting those depths. So here's an example of predictions of that. In circles I'm showing radar objects, and the cuboids that are coming out here are purely from vision. So the cuboids here are just coming out of vision, and the depth of those cuboids is learned via sensor annotation from the radar. So if this is working very well, you would see that the circles in the top-down view agree with the cuboids — and they do. And that's because neural networks are very competent at predicting depths: they can learn the different sizes of vehicles internally, they know how big those vehicles are, and you can actually derive depth from that quite accurately.
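The sensor-annotation idea — radar ranges becoming free depth labels for vision bounding boxes — might look roughly like this toy association step. The data structures and the nearest-match rule here are assumptions for illustration, not the production pipeline.

```python
def annotate_depth(boxes, radar_objects, max_px=50.0):
    """boxes: [(cx, cy)] detection centers in image pixels.
    radar_objects: [(cx, cy, range_m)] radar returns projected into the image.
    Returns a depth label per box (None if no radar object is close enough);
    these labels then supervise a monocular depth head."""
    labels = []
    for bx, by in boxes:
        best, best_d = None, max_px
        for rx, ry, rng in radar_objects:
            d = ((bx - rx) ** 2 + (by - ry) ** 2) ** 0.5
            if d < best_d:                 # nearest radar return wins
                best, best_d = rng, d
        labels.append(best)
    return labels
```

Boxes with no nearby radar return simply go unlabeled — which is fine, since the fleet supplies labeled examples in volume.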

The last mechanism I will talk about, very briefly, is slightly more fancy and gets a bit more technical, but it is a mechanism that has emerged recently —

there have been a few papers, basically over the last year or two, on this approach. It's called self-supervision. So what you do in a lot of these papers is you only feed raw videos into neural networks, with no labels whatsoever, and you can still get neural networks to learn depth. It's a little bit technical, so I can't go into the full details, but the idea is that the neural network predicts the depth at every single frame of that video. There are no explicit targets that the neural network is supposed to regress to with labels; instead, the objective for the network is to be consistent over time. So whatever depths you predict should be consistent over the duration of that video, and the only way to be consistent is to be right — so the neural network automatically predicts the correct depths for all the pixels. And we've reproduced some of these results internally, so this also works quite well.
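To make the "consistent over time" objective concrete, here is a deliberately tiny illustration: per-frame depth estimates for one tracked point, pulled into agreement by gradient descent on a consistency loss. The actual papers use photometric reprojection consistency across frames with ego-motion; this sketch shows only the consistency mechanism, with all names invented.

```python
def consistency_loss(depths):
    """Squared disagreement of per-frame depth estimates for one point."""
    mean = sum(depths) / len(depths)
    return sum((d - mean) ** 2 for d in depths)

def train(depths, lr=0.1, steps=200):
    """Gradient descent on the consistency loss: no depth labels anywhere,
    only the requirement that frames agree with each other."""
    depths = list(depths)
    for _ in range(steps):
        mean = sum(depths) / len(depths)
        # gradient of sum_j (d_j - mean)^2 w.r.t. d_i is 2*(d_i - mean)
        # up to a (1 - 1/n) factor from the mean term; fine for a toy
        depths = [d - lr * 2 * (d - mean) for d in depths]
    return depths
```

In the real setting, geometry (reprojection into neighboring frames) is what makes "consistent" imply "correct"; this toy only converges the estimates to agreement.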

So, in summary: people drive with vision only — no lasers are involved — and this seems to work quite well. The point that I'd like to make is that visual recognition, and very powerful visual recognition, is absolutely necessary for autonomy. It's not a nice-to-have; we must have neural networks that actually really understand the environment around you. And LIDAR points are a much less information-rich representation of the environment. Vision really understands the full details; just a few points around you carry much less information. So, as an example, on the left here —

is that a plastic bag, or is that a tire? LIDAR might just give you a few points on that, but vision can tell you which one of those two it is — and that impacts your control. That person on the bike who is slightly looking backwards — are they trying to merge into your lane, or are they just going forward? In the construction sites, what do those signs say? How should I behave in this world? The entire infrastructure that we have built up for roads is all designed for human visual consumption. So all the signs, all the traffic lights — everything is designed for vision, and that's where all that information is. So you need that ability. Is that person distracted and on their phone? Are they going to walk into your lane? The answers to all these questions are only found in vision, and they are necessary for level four, level five autonomy. And that is the capability that we are developing at Tesla, and it's done through a combination of large-scale neural network training, the data engine, getting that to work over time, and using the power of the fleet. And so, in this sense, LIDAR is really a shortcut. It sidesteps the fundamental problem — the important problem of visual recognition — that is necessary for autonomy. And so it gives a false sense of progress, and is ultimately a crutch. It does give, like, really fast demos.

So if I was to summarize the entire,

my entire talk in one slide, it would be this.

All of autonomy — because you want to build level four, level five systems that can handle all the possible situations, in 99.99% of the cases. And chasing some of those last few nines is going to be very tricky and very difficult, and it's going to require a very powerful visual system. So I'm showing you some images of what you might encounter in any one slice of those nines. In the beginning, you just have very simple cars going forward. Then those cars start to look a little bit funny. Then maybe you have bikes on cars, then maybe you have cars on cars. Then maybe you start to get into really rare events, like cars turned over, or even cars airborne. We see a lot of things coming from the fleet, and we see them at some rate — at, like, a really good rate compared to all of our competitors. And so the rate of progress at which you can actually address these problems, iterate on the software, and really feed the neural networks with the right data — that rate of progress is really just proportional to how often you encounter these situations in the wild. And we encounter them significantly more frequently than anyone else, which is why we're going to do extremely well.

Thank you.

It's all super impressive. Thank you so much.

How much data — how many pictures — are you collecting, on average, from each car, per period of time? And then, it sounds like the new hardware, with the dual active-active computers, gives you some really interesting opportunities: to run in full simulation one copy of the neural net while you're running the other one — let the other one drive the car and compare the results, to do quality assurance. And then I was also wondering if there are other opportunities to use the computers for training when they're parked in

the garage, for the 90% of the time that I'm not driving my Tesla around. Thank you very much.

Yep. So for the first question — how much data do we get from the fleet — it's really important to point out that it's not just the scale of the data set; it really is the variety of the data set that matters. If you just have lots of images of a car going forward on the highway, at some point the network just gets it; you don't need that data. So we are really strategic in how we pick and choose, and the trigger infrastructure that we've built up is quite sophisticated, so we get just the data that we need right now. And so it's not a massive amount of data. It's just very well-picked

data.

For the second question, with respect to redundancy: absolutely, you can run basically a copy of the network on both, and that is actually how it's designed to achieve a level four, level five system that is redundant. So that's absolutely the case. And your last question — I'm sorry, I did not —

— training. The computer in the car is an inference-optimized computer. We do have a major program at Tesla, which we don't have enough time to talk about today, called Dojo. That's a super powerful training computer. The goal of Dojo will be to be able to take in vast amounts of data and train at a video level — do unsupervised, massive training on vast amounts of video — with the Dojo computer. But that's for another day.

I'm a test pilot, in a way, because I drive the 405 and the 10, and all these really tricky, really long-tail things happen every day. But the one challenge that I'm curious how you're going to solve is changing lanes. Because whenever I try to get into a lane with traffic, everybody cuts you off. And so human behavior is very irrational when you're driving in LA, and the car just wants to do it safely — and you almost have to do it unsafely. So I was wondering how you're going to solve that problem?

Yeah. So one thing I will point out is: I spoke about the data engine as iterating on the neural networks, but we do the exact same thing at the level of the software and all the hyperparameters that go into the choices of when we actually change lanes and how aggressive we are. We're always changing those, potentially running them in shadow mode and seeing how well they work. And so, to tune our heuristics around when it's okay to lane change, we would also potentially utilize the data engine and the shadow mode, and so on. Ultimately, actually designing all the different heuristics for when it's okay to lane change is actually a little bit intractable, I think, in the general case. And so, ideally, you actually want to use fleet learning to guide those decisions. So when do humans lane change, in what scenarios, and when do they feel it's not safe to lane change? Let's just look at a lot of the data, and train machine learning classifiers for distinguishing when it is safe to do so. And those machine learning classifiers can write much better code than humans, because they have massive data backing them, so they can really tune all the right thresholds, agree with humans, and do something safe.
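The idea of letting classifiers learn thresholds from fleet data, rather than hand-writing heuristics, can be shown with a one-feature toy: fitting a single gap-size threshold for "okay to lane change" from human examples. The feature and data here are invented; a real classifier would use many signals.

```python
def fit_gap_threshold(examples):
    """examples: [(gap_meters, human_changed_lanes)]. Returns the threshold
    minimizing errors for the rule 'change iff gap >= threshold' — a
    one-parameter stand-in for learning the decision from fleet data."""
    candidates = sorted({g for g, _ in examples})
    best_t, best_err = None, float("inf")
    for t in candidates:
        err = sum((g >= t) != changed for g, changed in examples)
        if err < best_err:
            best_t, best_err = t, err
    return best_t
```

The point of the sketch: the threshold comes out of data that humans generated by driving, not out of an engineer's guess.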

We will probably have a mode that goes beyond Mad Max mode to LA traffic mode. Yeah, well, you know, Mad Max would have a hard time in LA traffic, I think.

Yeah. So it really is a trade-off: you don't want to create unsafe situations, but you want to be assertive. And that little dance of how you make that work as a human is actually very complicated, and it's very hard to write in code. But it really does seem like the machine learning approach is kind of the right way to go about it, where we just look at a lot of the ways that people do this and try to imitate

that. We're just being, like, more conservative right now. And then, as we gain higher and higher confidence, we'll allow users to select a more aggressive mode.

That'll be up to the user.

But in the more aggressive modes, in trying to merge in traffic, there is a slight — you know, no matter how good you are, there's a slight chance of, like, a fender bender. Not a serious accident. But you basically will have a choice: do you want to have a nonzero chance of a fender bender in freeway traffic? Which is, unfortunately, the only way to navigate LA traffic. Yes.

Yeah.

Yes, yes. Yeah. I mean, yes.

Yes.

And it was like LA Story. That was a great movie.

Because there's this game of chicken.

Yeah.

We will offer more aggressive options over time; that will be user-specified. Yes — Mad Max plus, exactly.

Oh, yes.

Hello.

Hi, George Summer from Canada Ingenuity. Thank you, and congratulations on everything that you've developed. When we look at the AlphaZero project, it was a very defined and limited set of variables, in terms of the parameters, which allowed the learning curve to be so quick.

The risk of what you're trying to do here is almost developing consciousness in the cars through the neural network. And so

I guess the challenge is: how do you not create a circular reference, in terms of pulling from the centralized model of the fleet, to that handoff where the car has enough information?

Where's that line, I guess, in terms of the point in the learning process of handing it off — where there's enough information in the car, and it doesn't have to pull from the fleet,

but the car can operate even if it's completely disconnected from the fleet?

It just uploads the training that's, you know, better and better as the fleet gets better and better. So, simply: if you're disconnected from the fleet, from that point onwards it would stop getting better,

But it will still function fine.

In the portion of your presentation on the previous version of the hardware, you talked about a lot of the power benefits of not storing a lot of the images. And in this portion, you're talking about the learning that's going on by pulling from the fleet.

I guess I'm having a hard time reconciling how, if there was a situation where I'm driving up the hill, as you showed, and the car is predicting where the road is going to go — that's coming from all of the fleet data that led to that intelligence — how I'm also getting the benefit of the low power while using the cameras with the neural network. That's where I'm losing the two. Maybe it's just me, but I guess that's — I mean,

the compute power in the full self-driving computer is incredible.

And I mentioned that, if it had never seen that road before, it would still have made those predictions, provided it was a road in the United States.

In the case of LIDAR and the march of nines — and I don't want to just get you to slam on LIDAR, because it's pretty clear you don't like LIDAR —

— Frickin' lidars, man. —

Isn't there, like, a case where, at some point far down the road of nines, LIDAR may actually be helpful? And why not have it as some sort of redundancy or backup? That's my first question — you can still have your focus on computer vision, but just have it as a redundancy. My second question is: if that is true, what happens to the rest of the industry that's been basing their autonomy solutions on LIDAR?

They're all going to dump LIDAR. That's my prediction — mark my words.

I should point out that I don't actually super hate LIDAR as much as it may sound. At SpaceX, the SpaceX Dragon uses LIDAR to navigate to the space station and dock. Not only that — SpaceX developed its own LIDAR from scratch to do that, and I spearheaded that effort personally, because in that scenario, LIDAR makes sense. In cars, it's friggin' stupid. It's expensive and unnecessary. And, as Andrej was saying, once you solve vision, it's worthless — so you have expensive hardware that's worthless on the car. We do have a forward radar, which is low cost and is helpful, especially for occlusion situations. So if there's, like, fog, or dust, or snow, the radar can see through that. If you're going to use active photon generation, don't use the visible wavelength, because with passive optical you've taken care of all the visible-wavelength stuff. You want to use a wavelength that is occlusion-penetrating, like radar. So LIDAR is just active photon generation in the visual spectrum.

If you're going to do active photon generation, do it outside of the visual spectrum — in the radar spectrum. So, like, 3.8 millimeters versus 400 to 700 nanometers — you're going to get much better occlusion penetration. And that's why we have a forward radar. And then we also have

twelve ultrasonics, for near-field information, in addition to the eight cameras and the forward radar. You only need the radar in the forward direction, because that's the only direction you're going real fast. So that's — we've gone over this multiple times. Should we add anything more now?

Hi — right here. So you had mentioned that you ask the fleet for the information that you're looking for, for some of the vision. I have two questions about that. It sounds like the cars are doing some computation to determine what kind of information to send back to you — is that a correct assumption? And are they doing that in real time, or are they doing it based on stored information?

Yep. So they absolutely do computation, in real time, on the car. We will basically specify a condition that we're interested in, and those cars do their computation there. If they did not, then we'd have to send all the data and do that offline in our backend. We don't want to do that, so all that computation happens on the car.
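The trigger mechanism described — a condition pushed to the fleet, evaluated on-car in real time, with only matching clips uploaded — might be sketched like this. The names and frame format are hypothetical.

```python
def run_triggers(frames, triggers):
    """frames: stream of per-moment signal dicts; triggers: {name: predicate}.
    Returns only the frames that fire a trigger and would be uploaded;
    everything else stays on the car, keeping bandwidth tiny."""
    uploads = []
    for frame in frames:
        fired = [name for name, pred in triggers.items() if pred(frame)]
        if fired:
            uploads.append((fired, frame))
    return uploads
```

In practice the predicates would be things like "network predicted a cut-in" or "the driver intervened", so the backend receives exactly the interesting clips.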

So it's

Based on that question, it sounds like you guys are in a really good position: you currently have half a million cars, and in the future potentially millions of cars, that are essentially computers — representing almost free data centers for you to do computation. Is that a huge future opportunity for Tesla, beyond its current operation, that's not really factored in for anything yet? That's incredible. Thank you.

We're at 425,000 cars with hardware 2 and beyond, which means they've got all eight cameras, the radar, and ultrasonics. And they've got at least an Nvidia computer, which is enough to

essentially figure out what information is important and what is not — compress the information down to the most salient elements, and upload it to the network for training. It's a massive compression of real-world data.

You have this sort of network of millions of computers, which is like massive data centers, essentially — distributed data centers of computational capacity. Do you see it being used for other things besides self-driving in the future?

I suppose it could possibly be used for something besides self-driving, but we're going to be focused on self-driving. So, you know, as we get that really nailed, maybe there's going to be some other use for, you know, millions — and then tens of millions — of computers with the hardware 3 full self-driving computer.

Yeah, maybe there would be.

There could be, like, some sort of AWS angle here. It's possible.

Hello. Hi, Elon — Matt from Loup Ventures. I own a Model 3 in Minnesota, where it snows a lot. Since camera and radar cannot see road markings through snow, what is your technical strategy to solve this challenge? Does it involve high-precision GPS

at all?

Yeah, so

actually, like, today, Autopilot will do a decent job in snow, even when lane markings are covered — even when the markings are faded, covered, or when there's lots of rain on them, we still seem to drive relatively well. We didn't specifically go after snow yet with our data engine, but I actually think this is completely tractable, because in a lot of those images, even when things are snowy, when you ask a human annotator, "where are the lane lines?", they can actually tell you — they're relatively consistent, even when the markings don't show. As long as the annotators are consistent on your data, the neural network will pick up on those patterns and will do just fine. So it's really just about: is the signal there, even for the human annotator? If the answer to that is yes, then we know that we can do it just fine.

Yeah, there are actually a number of important signals, as he was saying. So lane lines are one of those things, but one of the most important signals is drivable space: what is drivable space, and what is not drivable space? And what actually really matters the most is drivable space, more than lane lines. And the prediction of drivable space is extremely good — and I think, especially after this upcoming winter, it will be incredible. It will be, like, how could it possibly be that good? That's crazy.

The other thing to point out is that maybe it's not even only about human annotators. As long as you, as a human, can drive through that environment, then through fleet learning we actually know the path you took. And you obviously used vision to guide you through that path — you did not just use the lane line markings, you used the entire geometry of the scene. You see, like, you know, how the world is roughly curving, you see how the cars are positioned around you. The network will pick up on all those patterns automatically, inside it, if you just have enough of the data of people traversing those environments.

It's actually extremely important that things not be rigidly tied to GPS, because GPS error can vary quite a bit — and so can the actual situation on the road; it can change quite a bit. There could be construction, there could be a detour. And if the car is using GPS as primary, that's a real bad situation; it's asking for trouble. It's fine to use GPS for, like, tips and tricks. It's like: you can drive your home neighborhood better than a neighborhood in some other part of the country, because you know your neighborhood well, and you use kind of the knowledge of your neighborhood to drive with more confidence, to maybe take counterintuitive shortcuts, that kind of thing. But the GPS overlay data should only ever be helpful, never primary. If it's ever primary, it's a problem.

So question back here in the back corner.

I just wanted to follow up partially

on that, because several of your competitors in the space, over the past few years, have talked about how they are augmenting all of their perception and path-planning capabilities that are kind of

on the car platform

with high-definition maps of the areas that they are driving. Does that play a role in your system? Do you see it adding any value? Are there areas where you would like to get more data that is not collected from the fleet, but is more kind of mapping-style data?

I think high-precision GPS maps and lanes are a really bad idea. The system becomes extremely brittle — with any change to the environment, it can't adapt. So if it locks onto GPS and high-precision lane lines,

and does not allow vision to override — in fact, vision should be the thing that does everything. Lane lines are a guideline, but they're not the main thing. We briefly barked up the tree of high-precision lane lines, and then realized that was a huge mistake and reversed it out.

It's not good.

So this is

very helpful for understanding annotation — where the objects are and how the car drives. But what about the negotiation aspect, for parking and roundabouts and other things where there are other cars on the road that are human-driven? Where it's more art than science?

It does pretty well, actually — like, with cut-ins and stuff, it's doing really well.

Yeah. So, like I mentioned, we're using a lot of machine learning right now in terms of predicting — kind of creating an explicit representation of what the road looks like — and then there's an explicit planner and a controller on top of that representation, and there are a lot of heuristics for how to traverse and negotiate and so on. There is a long tail just in what visual scenes look like, and there's a long tail in just those negotiations and the little games of chicken that you play with other people, and so on. And so I think we have a lot of confidence that eventually there must be some kind of a fleet learning component to how you actually do that, because writing all those rules by hand is going to get intractable quickly, I think.

Yeah, we've dealt with this issue with cut-ins, and, like, we'll allow gradually more aggressive behavior on the part of the user. They can just dial the setting up and say: be more aggressive, or less aggressive. You know — drive easy, chill mode, aggressive.

Yeah.

Incredible progress — phenomenal. Two questions. First, in terms of platooning: do you think the system is geared for that? Because somebody asked about when there is snow on the road — but if you have a platooning feature, you can just follow the car in front. Is your system capable of doing that? Then I have two follow-ups.

So you're asking about platooning. I think, like, we could absolutely build those features. But again, if you just train your networks, for example, on imitating humans — humans already follow the car ahead. So the neural network actually incorporates those patterns internally; it just figures out that

there's a correlation between the way the car ahead of you is facing and the path that you are going to take. But that's all done internally, in the net. So you're just concerned with getting enough data, and tricky data, and the neural network training process actually is quite magical — it does all the other stuff automatically. So you turn all the different problems into just one problem: just curate your data set, and use a lot more training.

Yeah, there are these three steps to self-driving: you know, there's being feature complete; then there's being feature complete to the degree where we think the person in the car does not need to pay attention; and then there's

being at a reliability level where we've also convinced regulators that that is true. So there are, kind of, like, three levels. We expect to be feature complete in self-driving this year. And we expect to be confident enough, from our standpoint, to say that we think people do not need to touch the wheel or look out of the window, sometime probably around — I don't know — the second quarter of next year. And then we expect to start to get regulatory approval, at least in some jurisdictions, for that towards the end of next year. That's roughly the timeline that I expect things to go on.

And probably for trucks, platooning will be approved by regulators before anything else. And you can have, like — maybe if you're doing long-haul freight, you can have one driver in the front and then have four semis trailing behind in a platoon. And I think that probably the regulators will be quicker to approve that than other things.

Regarding LIDAR — of course, you don't have to convince us; LIDAR is a technology which, in my opinion, is an answer looking for a question. I mean, this is very impressive, what we saw today, and probably the demo could show something more. I was wondering: what is the maximum dimension of a matrix that you may be having in your training, or in your deep learning pipeline, ballpark figure?

The largest dimension of the matrix — so,

you know, of the matrices built by operations inside the neural network? There are many different ways to answer that question, but I'm not 100% sure they're useful answers. These neural networks typically have, like I mentioned, about tens to hundreds of millions of neurons, and each of them, on average, has about 1,000 connections to neurons below. So these are the typical scales that are used across the industry, and that we would use as well.
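As a rough editorial sketch, the scale just quoted (tens to hundreds of millions of neurons with roughly 1,000 incoming connections each) implies the following back-of-envelope connection counts. This is purely illustrative arithmetic, assuming one weight per connection and ignoring the weight sharing that convolutions provide in practice.

```python
# Back-of-envelope arithmetic for the quoted network scale.
# Assumes one weight per connection; purely illustrative.

def approx_connections(neurons: int, connections_per_neuron: int) -> int:
    """Total connections if every neuron has the average fan-in."""
    return neurons * connections_per_neuron

low = approx_connections(10_000_000, 1_000)    # 10 billion connections
high = approx_connections(100_000_000, 1_000)  # 100 billion connections
print(low, high)
```

With convolutional weight sharing, the number of stored parameters is far smaller than the raw connection count, which is part of why such networks fit on an in-car computer at all.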

Yeah, I've actually been very impressed by the rate of improvement of Autopilot over the past year on my Model 3. And there are two scenarios from last week I wanted your feedback on. In the first scenario, I was in the rightmost lane of the freeway, and there was a highway on-ramp. My Model 3 was actually able to detect two cars on the side, slow down, and let one car go in front of me and one car go behind me. And I was like, oh my gosh, this is insane. I didn't think my Model 3 could do that. So that was super impressive. But the same week, there was another scenario: I was in the right-hand lane again, but my right-hand lane was merging with the left lane. It wasn't an on-ramp, just a normal highway freeway lane. And my Model 3 wasn't really able to detect that situation; it wasn't able to slow down or speed up, and I had to intervene. So can you, from your perspective, share some background on how Tesla might adjust the neural network for that, and how that could be improved over time?

Yeah, so like I mentioned, we have a very sophisticated trigger infrastructure. If you have intervened, it's actually quite likely that we receive that clip, and that we can analyze it, see what happened, and tune the system. So it probably enters some statistics over, okay, at what rate are we correctly merging into traffic? And we look at those numbers, we look at the clips, we see what's wrong, and we try to fix those clips and make progress against those benchmarks.

So

Yeah, so we would potentially go through a phase of categorization, and then we look at some of the biggest categories that actually seem to semantically be related to the same problem. And then we look at some of those and try to develop software against that.
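The clip-triage loop just described can be sketched roughly as follows. Everything here (the `Clip` type, its fields, the `triage` helper) is a hypothetical stand-in for illustration; the talk does not describe Tesla's actual data structures.

```python
# Hypothetical sketch of the clip-triage loop: gather clips matching a
# trigger category, measure the intervention rate, and surface failing
# clips for review and retraining. All names are invented for illustration.
from dataclasses import dataclass

@dataclass
class Clip:
    clip_id: str
    scenario: str      # trigger category, e.g. "highway_merge"
    intervened: bool   # True if the driver took over

def triage(clips, scenario):
    """Return (success rate, clips needing review) for one scenario."""
    matching = [c for c in clips if c.scenario == scenario]
    failures = [c for c in matching if c.intervened]
    success_rate = 1 - len(failures) / len(matching) if matching else None
    return success_rate, failures

clips = [
    Clip("a", "highway_merge", False),
    Clip("b", "highway_merge", True),   # intervention: goes to review
    Clip("c", "lane_keep", False),
]
rate, to_review = triage(clips, "highway_merge")
print(rate, [c.clip_id for c in to_review])  # 0.5 ['b']
```

The failing clips would then feed the categorization step described above, with each large category becoming a benchmark to make progress against.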

Okay, we do have

one more presentation, which is the software. Essentially, there was the hardware with Pete, the neural net vision with Andrej, and then there's the software engineering at scale, which will be presented by Stuart. There will be an opportunity afterwards to ask questions. So yeah, thanks.

I just wanted to very briefly say, if you have an early flight and you want to do a test ride with our latest development software, could you please speak to my colleague or drop her an email, and we can take you out for a test ride. And Stuart, over to you.

Alright, so that was actually a clip from a longer-than-30-minute uninterrupted drive, with no interventions, on Navigate on Autopilot on the highway system, which is in production today on hundreds of thousands of cars. So I'm Stuart, and I'm here to talk about how we build these systems at scale. Just a really short introduction first, on where I'm coming from and what I do.

So I've been at a couple of companies, and I've been writing software professionally for about 12 years. The thing that excites me most, and that I'm really passionate about, is taking the cutting edge of machine learning and actually connecting that with customers through robustness and scale. At Facebook, I worked initially inside of our ads infrastructure to take the machine learning built by really, really smart people and build it into a single platform that we could then scale to all the other aspects of the business, from how we rank the news feed to how we deliver search results to how we make every recommendation across the platform. And that became the Applied Machine Learning group, which I'm incredibly proud of. A lot of that wasn't just the core algorithms and the really important improvements that happened there; what matters a lot is actually the engineering practices to build these systems at scale. The same thing was true at Snap, where I went next, where we were really excited to actually help monetize the product. We were using Google at the time, and they were effectively running us at a very small scale. We wanted to build that same infrastructure: take an understanding of these users, connect that with cutting-edge machine learning, and build that at massive scale, hundreds of billions and then trillions of predictions and auctions every day, in a way that is really robust. And so when the opportunity came to come to Tesla, that was something I was just incredibly excited to do: specifically, to take the amazing things that are happening both on the hardware side and on the computer vision and AI side, and actually package that together with all the planning and controls, the testing, the kernel patching of the operating system, all of our continuous integration, our simulation, and build that into a product we can get onto people's cars in production today.
And so I want to talk about the timeline for how we did that with Navigate on Autopilot, and how we're going to do that as we take Navigate on Autopilot off the highway and onto city streets.

So we're at 770 million miles already for Navigate on Autopilot, which is something really, really cool. And I think one thing that is worth calling out here is that we're continuing to accelerate and keep learning from this data, like the data engine Andrej talked about. As this accelerates, we actually do make more and more assertive lane changes. We are learning from the cases where people intervene, because we failed to detect a merge correctly, or because they wanted the car to be a little more peppy in different environments. And we just want to keep making that progress. So to start all of this, we begin by trying to understand the world around us. We've talked about the different sensors in the vehicle, but I want to dig in a little bit more here. We have eight cameras, but we also additionally have 12 ultrasonic sensors, a radar, an inertial measurement unit, and GPS. And one thing we forget about is that we also have the pedal and steering actions. So not only can we look at what's happening around the vehicle, we can look at how humans chose to interact with that environment. The clip I'm showing right now is basically what's happening today in the car, and we're continuing to push this forward. So we start with a single neural network, and we see the detections around it. We then bring all that together, multiple neural networks and multiple predictions, we bring in the other sensors, and we convert that into a vector-space understanding of the world around us. And as we continue to get better and better at this, we're moving more and more of this logic into the neural networks themselves. The obvious endgame here is that the neural network looks across all the cameras, brings all that information together, and ultimately just outputs a single source of truth for the world around us. And this is actually not an artist's rendering.
In a sense, this is actually the output of one of the debugging tools that we use on the team every day to understand what the world looks like around us. So here's another thing that is really exciting to me. When I hear about sensors like LIDAR, a common question is around just having extra sensor modalities: why not have some redundancy on the vehicle? And I want to dig in on one thing that is not always obvious with neural networks themselves. We have a neural network running on a wide fisheye camera. That neural network is not making one prediction about the world; it's making many separate predictions, some of which actually audit each other. As a real example, we have the ability to detect a pedestrian. That's something we trained very, very carefully on and put a lot of work into. We also have the ability to detect obstacles in the roadway, and a pedestrian is an obstacle. It's shown differently to the neural network, which says, oh, there's a thing I can't drive through. And these together combine to give us an increased sense of what is in front of the vehicle and how to plan for that. We then do this across multiple cameras, because we have overlapping fields of view in many places around the vehicle. In front, we have a particularly large number of overlapping fields of view. Lastly, we can combine that with things like the radar and the ultrasonics to build these extremely precise understandings of what's happening in front of the car. We can use that both to learn future behaviors that are very accurate, and to build very accurate predictions of how things will continue to evolve in front of us. So one example that's really exciting is we can actually look at bicyclists and people and not just ask, where are you now, but where are you going?
And this is actually at the heart of our next-generation automatic emergency braking system, which will not just react to people in your path; it will react to people who are going to be in your path. That's running in shadow mode right now and will go out to the fleet this quarter. I'll talk about shadow mode in a second.
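The "predictions audit each other" point above can be made concrete with some illustrative arithmetic. Assuming, purely for illustration, that a dedicated pedestrian head and a generic obstacle head make independent errors, combining them sharply reduces the chance of a miss; the function name and probabilities below are invented, not from the talk.

```python
# Illustrative redundancy arithmetic: two prediction heads that both fire
# on the same object. Assumes independent errors (a simplification).

def fused_confidence(p_head_a: float, p_head_b: float) -> float:
    """P(at least one head flags the object) = 1 - P(both heads miss it)."""
    return 1 - (1 - p_head_a) * (1 - p_head_b)

# Two heads that are each 90% confident leave only a ~1% joint miss rate.
print(fused_confidence(0.9, 0.9))  # ~0.99
```

Real detector errors are correlated (bad lighting hurts both heads), so the true gain is smaller, but the qualitative point stands: separate predictions over the same scene provide software-level redundancy without an extra sensor.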

So when you want to start a feature like Navigate on Autopilot on the highway system, you can start by learning from data, and you can just look at how humans do things today. What is their assertiveness profile? How do they change lanes? What causes them to either abort or change their maneuvers? And you can see things that are not immediately obvious, like, oh yeah, simultaneous merging is rare, but very complicated and very important. And you can start to build opinions about different scenarios, such as a fast overtaking vehicle. So this is what we do when we initially have some algorithm we want to try out: we can put it on the fleet, and we can see what it would have done in a real-world scenario, such as this car that's overtaking us very quickly. This is taken from our actual simulation environment, showing different paths that we have considered taking, and how those overlay on the real-world behavior of a user. When you get those algorithms tuned up and you feel good about them (and this is really taking the output of the neural network, putting it in that vector space, and building and tuning these parameters on top of it, which ultimately I think we can do through more and more machine learning), you go into a controlled deployment, which for us is our early access program. You get this out to a couple thousand people who are really excited to give you highly vigilant but useful feedback about how it behaves, not in an open-loop but in a closed-loop way, in the real world. And you watch their interventions, like we talked about: when somebody takes over, we can actually get that clip and try to understand what happened. And one thing we can really do is play this back again in an open-loop way and ask, as we build our software, are we getting closer to or further from how humans behave in the real world?
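The open-loop replay idea just described might look like the following minimal sketch: replay logged inputs from a human drive, run a candidate planner on them, and score how far its proposed path diverges from what the human actually drove. The logs, planner outputs, and metric here are made-up stand-ins for illustration.

```python
# Minimal open-loop replay sketch: compare a candidate planner's path to
# the logged human path on the same inputs. Data is invented.

def replay_divergence(human_path, planned_path):
    """Mean absolute lateral offset (meters) between human and planner."""
    assert len(human_path) == len(planned_path)
    return sum(abs(h - p) for h, p in zip(human_path, planned_path)) / len(human_path)

human      = [0.0, 0.1, 0.3, 0.6]    # logged lateral positions during a merge
planner_v1 = [0.0, 0.0, 0.1, 0.2]    # older build: hugs the lane
planner_v2 = [0.0, 0.1, 0.25, 0.55]  # newer build: closer to the human

# Lower divergence means the new build behaves more like the human did.
print(replay_divergence(human, planner_v1))
print(replay_divergence(human, planner_v2))
```

Run over many logged clips, a metric like this gives a regression signal for each software build before it ever controls a car.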
And one thing that's just super cool with the full self-driving computer: we're actually building our own racks and infrastructure. So we can put four full self-driving computers fully racked up, build these into our own cluster, and actually run this very sophisticated data infrastructure to understand, over time, as we tune and fix these algorithms, whether we are getting closer and closer to how humans behave, and ultimately whether we can exceed their capabilities. And so once we had this and we felt really good about it, we wanted to do our wide rollout. But to start, we actually asked everybody to confirm the car's behavior. So we started making lots and lots of predictions about how we should be navigating the highway, and we asked people to tell us: is this right, or is this wrong? And this is, again, a chance to turn that data engine. We did spot some really tricky and interesting long tails. In this case, a really fun example: there are these very interesting cases of simultaneous merging, where you start going and then somebody moves either behind or before you, not noticing you. What is the appropriate behavior here, and what tunings of the neural network do we need to do to be super precise about the appropriate behaviors here? We worked, we tuned these in the background, we made them better. And over the course of time, we got 9 million successfully accepted lane changes. We use these, again, with our continuous integration infrastructure to understand when we think we're ready. And this is one thing about full self-driving that's also really exciting to me: since we own the entire software stack, straight from the kernel patching all the way to the tuning on the image signal processor, we can start to collect even more data that is even more accurate. And this allows us to do even better tuning in these faster iteration cycles.
And so earlier this month, we were ready to deploy an even more seamless version of Navigate on Autopilot on the highway system. That seamless version does not require a confirmation. So you can sit there, relax, put your hand on the wheel, and just oversee what the car is doing. And in this case, we're actually seeing over 100,000 automated lane changes every single day on the highway system, which is just super cool to us to deploy at scale. And the thing that I'm most excited about from all of this is the actual life cycle of it, and how we're able to turn that data engine crank faster and faster with time. I think one thing that's becoming very clear is that with the combination of the infrastructure we have built, the tooling we've built on top of that, and the combined power of the full self-driving computer, I believe we can do this even faster as we move Navigate on Autopilot from the highway system onto city streets. And so, yeah, I'll hand it off to Elon.

To the best of my knowledge, all those lane changes have occurred with zero accidents.

That is correct. Yeah,

I watch every single accident.

So it's conservative, obviously. But to have hundreds of thousands, going to millions, of lane changes and zero accidents is, I think, a great achievement by the Tesla team.

Thank you.

So let me see, there are a few other things that are worth mentioning.

In order to have a self-driving car or robotaxi, you really need redundancy throughout the vehicle at the hardware level. So starting in, I believe, October 2016, all cars made by Tesla have redundant power steering. We have redundant motors in the power steering, so if any one motor fails, the car can still steer.

All of the power and data lines have redundancy, so you can sever any given power line or any data line and the car will keep driving. Even if you lose complete power in the main pack, the car is capable of steering and braking using the auxiliary power system. So you can completely lose the main pack and the car is safe.

The whole system, from a hardware standpoint, has been designed to be a robotaxi since basically October 2016.

So when we rolled out Autopilot hardware version 2,

we did not expect to upgrade cars made before that. We think it would actually cost more to upgrade those cars than to make a new car. That gives you a sense of how hard it is to do this.

Unless it's designed in, it's not worth it.

So we've gone through the pieces of self-driving: there's the hardware, there's vision, and then there's a lot of software. And the software problem here should not be minimized; it's a massive software problem.

Yeah, managing vast amounts of data, training against the data, how you control a car based on the vision: it's a very difficult software problem.

So going over the Tesla master plan: obviously, we've made a bunch of forward-looking statements, as they're called.

Let's go through some of the forward-looking statements we've made. Way back when we created the company, we said we'd build the Tesla Roadster. They said it was impossible, and that even if we did build it, nobody would buy it.

The near-universal opinion was that building an electric car was extremely dumb and would fail.

I agreed that the probability of failure was high, but that this was important. So we built the Tesla Roadster.

We got it into production in 2008 and shipped that car. It's now a collector's item.

Then we said we'd build a more affordable car with the Model S, and we did that. Again, we were told that was impossible, that it was never going to happen; I was called a fraud and a liar. This was all untrue. Okay, famous last words. We went into production with the Model S in 2012 and exceeded all expectations. There is still, in 2019, no car that can compete with the Model S of 2012. It's seven years later.

Still waiting.

So then we said we'd build an even more affordable car, the Model 3. We built the Model 3, we're in production, and we said we'd get to over 5,000 cars a week with the Model 3. At this point, 5,000 cars a week is a walk in the park for us. It's not even hard.

Then we said we'd deploy large-scale solar, which we did through the SolarCity acquisition. And we're developing the solar roof, which is going really well; we're now on version three of the solar tile roof. And we expect to ramp production of the solar tile roof significantly later this year.

I have it on my house and it's great.

And

We said we'd make the Powerwall and the Powerpack, and we made the Powerwall and the Powerpack. The Powerpack is now deployed in massive grid-scale utility systems around the world, including the largest operating battery project in the world, at above 100 megawatts. And probably by next year, two years at the most, we expect to have a gigawatt-scale battery project completed. So all these things: I said we would do them, and we did them. We're going to do the robotaxi thing too. The only criticism, and it's a fair one, is that sometimes I'm not on time.

But I get it done. And the Tesla team gets it done.

So what we're going to do this year is reach a combined production of 10,000 a week between the Model S, X, and 3. We feel very confident about that, and we feel very confident about being feature complete with self-driving.

Next year, we'll expand the product line with the Model Y and the Semi. And we expect to have the first operating robotaxis next year,

With no one in them next year.

It's always difficult when things are improving at an exponential rate. It's very difficult to wrap one's mind around it, because we're used to extrapolating on a linear basis.

But we've got massive amounts of hardware on the road, the cumulative data is increasing exponentially, and the software is getting better at an exponential rate.

I feel very confident predicting autonomous robotaxis for Tesla next year. Not in all jurisdictions, because we won't have regulatory approval everywhere, but I'm confident we will have regulatory approval at least somewhere, literally next year.

So any customer will be able to add or remove their car from the Tesla Network. We expect this to operate sort of like a combination of the Uber and Airbnb models. If you own the car, you can add or subtract it from the Tesla Network, and Tesla would take 25 to 30% of the revenue. And then in places where there aren't enough people sharing their cars, we would just have dedicated Tesla vehicles.

So when you use the app, it will show you a ride-sharing view. You'll be able to summon the car from the parking lot, get in, and go for a drive.

It's really simple.

It's just the same Tesla app that you currently have. We'll update the app and add the ability to summon your car or commit your car to the fleet. So you'll be able to summon your car, or add or subtract your car from the fleet, from your phone.

You can see the potential for smoothing out the demand distribution curve.

And there's having a car operate at a much higher utility than an owned car does today. Typically, the use of a car is about 10 to 12 hours a week. Most people will drive one and a half to two hours a day, so typically 10 to 12 hours a week of total driving. But if you have a car that can operate autonomously, then most likely you could have that car operate for a third of the week or longer. There are 168 hours in a week, so probably you've got something on the order of 55 to 60 hours a week of operation, maybe a bit longer.

So the fundamental utility of the vehicle increases by a factor of five. If you look at this from a macroeconomic standpoint, if we were operating in some big simulation and you could upgrade the simulation to increase the utility of cars by a factor of five, that would be a massive increase in the economic efficiency of the simulation. Just gigantic.
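The utilization arithmetic behind that factor of five can be made explicit. All figures (10 to 12 driving hours per week today, roughly a third of the week for a robotaxi) are the speaker's rough estimates, restated here for illustration.

```python
# Utilization arithmetic from the passage, made explicit.

HOURS_PER_WEEK = 24 * 7               # 168 hours in a week

personal_hours = 11                   # ~1.5-2 hours/day, i.e. 10-12 hours/week
robotaxi_hours = HOURS_PER_WEEK / 3   # "a third of the week or longer"

print(HOURS_PER_WEEK)                          # 168
print(robotaxi_hours)                          # 56.0, in the 55-60 hour range quoted
print(round(robotaxi_hours / personal_hours))  # ~5x utility increase
```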

So

So we'll do the Model 3 as robotaxis, but we made an important change to our leases. If you lease a Model 3, you don't have the option of buying it at the end of the lease; we want them back. If you buy the car, you can keep it, but if you lease it, you have to give it back.

And as I said, in any locations where there's not enough supply of shared cars, Tesla will just make its own cars and add them to the network in that place.

So the current cost of a Model 3 robotaxi is

less than $38,000. We expect that number to improve over time. And presently, the cars being built are all designed for a million miles of operation. The drive unit is designed, tested, and validated for a million miles of operation. The current battery pack is good for maybe 300,000 to 500,000 miles; the new battery pack that will probably go into production next year is designed explicitly for a million miles of operation. The entire vehicle, battery pack inclusive, is designed to operate for a million miles with minimal maintenance. So we'll actually be adjusting

the tire design and really optimizing the car to be a hyper-efficient robotaxi. And at some point, you won't need steering wheels or pedals, and we'll just delete those. As these things become less and less important, we'll just delete parts; they won't be there.

I'd say probably two years from now, we'll make a car that has no steering wheel or pedals. And if we need to accelerate that timeline, we can always just delete parts. Easy.

I'd probably say, long term, three years: robotaxis with those parts eliminated. Maybe it ends up being $25,000 or less.

And you want a super-efficient car, so the electricity consumption is very low. We're currently at four and a half miles per kilowatt-hour, but we will improve that to five and beyond.
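As a rough sketch of why that efficiency matters, here is the energy cost per mile those figures imply. The electricity price used is my assumption for illustration; the talk does not state one.

```python
# Energy-cost-per-mile arithmetic. Electricity price is an assumed figure,
# not from the talk; the efficiency numbers are the ones quoted.

price_per_kwh = 0.13        # $/kWh, assumed average rate (illustrative)
miles_per_kwh_now = 4.5     # current efficiency quoted
miles_per_kwh_goal = 5.0    # stated target

cost_now = price_per_kwh / miles_per_kwh_now     # ~$0.029 per mile
cost_goal = price_per_kwh / miles_per_kwh_goal   # ~$0.026 per mile
print(round(cost_now, 3), round(cost_goal, 3))
```

At a few cents per mile for energy, the dominant robotaxi costs become depreciation and maintenance, which is why the million-mile design life matters.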

And there's really no other company that has the full-stack integration. We've got the vehicle design and manufacturing, we've got the computer hardware in-house, we've got the in-house software development

and AI, and we've got by far the biggest fleet. It's extremely difficult, maybe not impossible, but extremely difficult, to catch up when Tesla has 100 times more miles per day than everyone else combined.

This is the cost of running a gasoline car today. The average cost of running a car in the US, taken from AAA, is currently about 62 cents a mile.

30,000 miles across 50 million vehicles adds up to about $2 trillion a year. These numbers are literally just taken from the AAA website.

The cost of ride-sharing, shown on the left, is $2 to $3 a mile. The cost of running a robotaxi, we think, is less than 18 cents a mile.

And dropping.

This would be the current cost. Future costs will be lower.
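The per-mile figures quoted above can be put side by side; the figures are from the talk, and the comparison arithmetic is just for illustration.

```python
# Per-mile economics quoted in the talk, compared directly.

aaa_cost_per_mile = 0.62      # AAA-style average cost of owning and running a car
rideshare_per_mile = 2.50     # "$2 to $3 a mile" for ride-sharing today
robotaxi_per_mile = 0.18      # claimed robotaxi operating cost

print(round(rideshare_per_mile / robotaxi_per_mile, 1))  # ~13.9x cheaper than ride-sharing
print(round(aaa_cost_per_mile - robotaxi_per_mile, 2))   # ~$0.44/mile saved vs. owning
```

The gap between what ride-sharing charges today and the claimed robotaxi operating cost is the margin the later gross-profit estimate rests on.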

So what would be the probable gross profit from a single robotaxi?

We think probably something on the order of $30,000 per year.

And we're literally designing the cars the same way that commercial semi trucks are designed. Commercial semi trucks are all designed for a million-mile life, and we're designing the cars for a million-mile life as well.

So in nominal dollars, that would be, you know, a little over $300,000 over the course of 11 years, maybe higher. I think this is actually relatively conservative, and it assumes that 50% of the miles driven are not useful, so there's only 50% utility.

By the middle of next year, we'll have over a million Tesla cars on the road with full self-driving hardware, feature complete, at a reliability level where we consider that no one needs to pay attention. Meaning you could go to sleep, from our standpoint. If you fast forward a year, maybe a year and three months,

But next year for sure.

we will have over a million robotaxis on the road.

The fleet wakes up with an over the air update.

That's all it takes.

If you say, what is the net present value of a robotaxi? Probably on the order of a couple hundred thousand dollars.

So buying a Model 3 is a good deal.
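As a quick consistency check, the figures given (roughly $30,000 gross profit per year over an 11-year, million-mile life) do land in the "couple hundred thousand dollars" range once discounted. The 10% discount rate below is my assumption; the talk does not state one.

```python
# NPV sanity check on the quoted figures: ~$30,000/year over 11 years.
# Discount rate is an assumption for illustration.

def npv(cash_per_year: float, years: int, rate: float) -> float:
    """Present value of a level annual cash flow received at year end."""
    return sum(cash_per_year / (1 + rate) ** t for t in range(1, years + 1))

value = npv(30_000, 11, 0.10)
print(round(value))  # roughly 195,000, i.e. "a couple hundred thousand dollars"
```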

Questions?

Well,

I don't know. I'd guess, long term, we'd have probably on the order of 10 million vehicles.

Production rates generally: if you look at the compound annual production growth since 2012, which was our first full year of Model S production, we went from 23,000 vehicles produced in 2013 to

around 250,000 vehicles produced last year. So over the course of five years, we increased output by a factor of ten. I would expect something similar to occur over the next five or six years.
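The compound growth rate implied by those numbers works out as follows; this is illustrative arithmetic on the figures quoted, nothing more.

```python
# Compound annual growth implied by 23,000 vehicles (2013) -> ~250,000 (2018).

start, end, years = 23_000, 250_000, 5
growth_factor = end / start               # ~10.9x over five years
cagr = growth_factor ** (1 / years) - 1   # ~61% compound annual growth

print(round(growth_factor, 1))
print(round(cagr, 2))
```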

As for sharing versus not sharing, I don't know. But the nice thing is that essentially customers are fronting us the money for the car, which is great.

Um, two things. One is the snake charger; I'm curious about that. And how did you determine the pricing? It looks like you're undercutting the average Lyft or Uber ride by about 50%, so I'm curious if you could talk a little bit about the pricing strategy.

Sure. Solving for the snake charger is pretty straightforward. From a vision standpoint, it's a known situation, and any kind of known situation with vision, like finding a charge port, is trivial.

So

So yeah, the cars will just automatically park and automatically plug in.

There would be no human supervision required.

Yeah. So

As for the pricing, we just threw some numbers up there. I mean, you can plug in whatever pricing you think makes sense. We just kind of randomly said, okay, maybe $1 a mile.

And there are on the order of 2 billion cars and trucks in the world, so robotaxis will be in extremely high demand for a very long time. And from my observation thus far, the auto industry is very slow to adapt. I mean, like I said, there's still not a car on the road that you can buy today that is as good as the Model S was in 2012.

So that suggests a pretty slow rate of adaptation for the car industry, and so probably $1 is conservative for the next 10 years. Because there's actually not enough appreciation for the difficulty of manufacturing. Manufacturing is insanely difficult. A lot of people I talk to think that if you just have the right design, you can instantly make as much of that thing as the world wants. This is not true.

It's extremely hard to design a new manufacturing system for new technology.

I mean, Audi is having major problems manufacturing the e-tron, and they are extremely good at manufacturing. And if they're having problems, what about others?

So, you know, there are on the order of 2 billion cars and trucks in the world, and on the order of about 100 million units per year of vehicle production capacity, but only of the old design.

It will take a very long time to convert all of that to self-driving cars. And they really need to be electric, because the cost of operating a gasoline or diesel car is much higher than an electric car. So any robotaxi that isn't electric will absolutely not be competitive.

Elon, it's Colin Rusch from Oppenheimer,

over here. You know, obviously, we appreciate that the customers are fronting some of the cash for this fleet getting built up. But it sounds like a massive balance sheet commitment from the organization over the course of time. Can you talk a little bit about what that looks like, and what your expectations are in terms of financing over the next, call it three or four years, for building up this fleet and trying to monetize it with your customer base?

We're aiming to be, you know, approximately cash-flow neutral during the fleet build-up phase,

and then I expect to be extremely cash-flow positive once the robotaxis are enabled.

But I don't want to talk about financing rounds in this venue.

I think we'll make the right moves, or at least the moves we think we should make.

Um, I have a question, if I'm Uber, why wouldn't I just buy all your cars? You know, why would I let you put me out of business?

There's a clause that we put into our cars, I think about three or four years ago, that they can only be used on the Tesla Network.

So even as a private person, if I go out and buy 10 Model 3s, I can only run them on the network. Is that right?

Right, you can only use them on the Tesla Network.

Right. But if I use the Tesla Network, in theory, I could run a car-sharing robotaxi business with my 10 Model 3s.

Yes, but it's like the App Store: you can only add or remove them through the Tesla Network, and then Tesla gets a revenue share.

But it's similar to Airbnb, though, in that I have this home, my car, and now I can just rent it out. So I can make extra income from owning multiple cars and renting them out. Like, I have a Model 3, I aspire to get this Roadster here next, when you build it, and I'm going to just rent my Model 3 out. Why would I give it back to you? You know?

I guess you could operate a rental car fleet. But I think this is very unwieldy. Yeah.

Seems easy.

Okay, try it.

In order to operate a robotaxi network, it sounds like you have to solve certain problems. For example, with Autopilot today, if you oversteer, it lets you take over. But if it's a ride-sharing product where someone else is getting in the passenger seat, moving the steering wheel can't let that person take over the car, for example, because they might not even be in the driver's seat. So is the hardware already there for it to be a robotaxi? And it might get into situations, such as a cop pulling it over, where some human might need to intervene.

Like using a central fleet of operators that remotely interact with humans. I mean, is all that type of infrastructure already built into each of the cars? Does that make sense?

I think there will be sort of a phone-home thing, where if the car gets stuck, it will just phone home to Tesla and ask for a solution. Things like being pulled over, you know, we can handle in software. That's easy for us to program, and that's not a problem.

But it will be possible for somebody to take over using the steering wheel, at least for some period of time. And then probably down the road, we'll just cap the steering wheel, so there's no steering control.

We'll just take the steering wheel off and put a cap on the end — you know, give it a couple of years.

Would it require a hardware modification to the car in order to enable that, or would you literally just unbolt the steering wheel and put a cap where the steering wheel currently is? But that's a future car that you would put out. What about today's cars, where the steering wheel is the mechanism to take over Autopilot? If it's in robo-taxi mode, would someone be able to take it over by simply moving the steering wheel?

Yes, I think there will be a transition period where a few people will be able to take over — and should be able to take over — from the robo-taxi. And then once regulators are comfortable with us not having a steering wheel, we'll just delete that. And for cars that are already in the fleet — obviously with the permission of the owner, if it's owned by somebody else — we would just take the steering wheel off and put a cap where the steering wheel currently attaches.

So there might be two phases to robo-taxi: one where the service is provided and you come in as the driver, but could potentially take over; and then in the future, there might not be a driver option. Is that how you see it as well?

In the future, the probability of the steering wheel being taken away is 100%.

Consumers will demand it. But initially you would —

To be clear, this is not me prescribing a point of view about the world. This is me predicting what consumers will demand: consumers will demand, in the future, that people are not allowed to drive these two-ton death machines.

I totally agree with that. But in order for a Model 3 today to be part of the robo-taxi network, when you call it, you would then get into the driver's seat, essentially? Because — yeah.

Just to be on the safe side. Okay, makes sense. — It's like, you know, there were amphibians, but then pretty much everything just became land creatures. A little bit of an amphibian phase.

Hi.

Sorry. Okay.

Yes. The strategy we've heard from other players in the robo-taxi space is to select a certain municipal area to create geofenced self-driving. That way, you're using an HD map to have a more confined area with a bit more safety. We didn't hear much today about the importance of HD maps — to what extent is an HD map necessary for you? And second, we also didn't hear much about deploying this into specific municipalities, where you're working with the municipality to get buy-in from them, and you're also getting a more defined area. So: what's the importance of HD maps, and to what extent are you looking at specific municipalities for rollout?

I think HD maps are a mistake. We actually had HD maps for a while, and we canned that — because either you need HD maps, in which case, if anything changes about the environment, the car will break down,

or you don't need HD maps, in which case, why are you wasting your time doing HD maps? So the two main crutches that should not be used — and that, in retrospect, will be obviously false and foolish — are lidar and HD maps.

Hello.

If you need a geofenced area, you don't have real self-driving.

It sounds like battery supply could be the only bottleneck left toward this vision. And could you also clarify how you get the battery packs to last a million miles?

I think cells will be a constraint. That's — that's a subject for a whole separate day; it's a whole separate subject.

And I think we're actually going to want to push our Standard Range Plus battery more than our Long Range battery, because the energy content of the Long Range pack is 50% higher in kilowatt-hours.

So essentially, you can make, you know, a third more cars if they're all Standard Range Plus instead of the Long Range pack — one is around 50 kilowatt-hours, the other is around 75 kilowatt-hours. So we're actually probably going to bias our sales intentionally toward the small battery, in order to have higher volume. One of the obvious things to do is maximize the number of autonomous units — maximize the output that will subsequently result in the biggest autonomous fleet down the road.
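Read literally, the pack arithmetic here works out as follows. The pack sizes are the rough figures quoted above; the total cell supply is an assumed round number purely for illustration:

```python
# Cars producible from a fixed cell supply, using the approximate pack
# sizes quoted in the talk. The 1 GWh supply figure is illustrative only.
CELL_SUPPLY_KWH = 1_000_000

STANDARD_RANGE_PLUS_KWH = 50  # ~50 kWh pack
LONG_RANGE_KWH = 75           # ~75 kWh pack (50% more energy)

cars_small_pack = CELL_SUPPLY_KWH // STANDARD_RANGE_PLUS_KWH  # 20,000 cars
cars_big_pack = CELL_SUPPLY_KWH // LONG_RANGE_KWH             # 13,333 cars

print(cars_small_pack, cars_big_pack)
```

Note the two equivalent framings: 13,333 is one-third fewer cars than 20,000, while 20,000 is 50% more than 13,333 — presumably what "a third more cars" is gesturing at.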

So we're doing a number of things in that regard, but that's just not for today's meeting.

The million-mile life is basically just about getting the cycle life of the pack up. You know, it's basic math: if you've got a 250-mile-range pack, you're going to need 4,000 cycles.

So, very achievable. We already do that with our stationary storage — our stationary storage solutions like Powerpack. We already deploy Powerpack with four-thousand-cycle-life capability.
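The "basic math" referenced here is just total target miles divided by the miles delivered per full charge cycle:

```python
# Charge cycles needed for a million-mile pack, per the figures above.
TARGET_MILES = 1_000_000
MILES_PER_CYCLE = 250  # range of the pack on one full charge

cycles_needed = TARGET_MILES // MILES_PER_CYCLE
print(cycles_needed)  # 4000
```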


Sorry — yeah, it's like ventriloquism.

There are obviously significant, very constructive margin implications to the extent you can drive attach rates much higher on the full self-driving option. I'd just be curious if you can level-set where you are in terms of those attach rates, and how you expect to educate consumers about the robo-taxi scenario so that attach rates do materially improve over time.

Sorry — I only heard part of your question.

Yeah — I'm just curious where we are today in terms of full self-driving attach rates, and in terms of the financial implications. I think it's hugely beneficial if those attach rates materially increase, because of the higher gross margin dollars that flow through to the extent people do sign up for full FSD. Just curious how you see that ramping —

what the attach rates are today versus where and when you expect them to be. How do you expect to educate consumers and get them aware that they should be attaching FSD to their vehicle purchases?

We'll ramp that up massively after today.

Yeah, I mean, the really fundamental message that consumers should be taking away today is that it's financially insane to buy anything other than a Tesla. It will be like owning a horse in three years — I mean, fine if you want to own a horse, but you should go into it with that expectation. If you buy a car that does not have the hardware necessary for full self-driving, it is like buying a horse. And the only car that has the hardware necessary for full self-driving is a Tesla.

People should really be thinking about their purchases.

Any other vehicle — it's basically crazy to buy any other car than a Tesla.

We need to convey that argument clearly, and we will after today.

Thanks for bringing the future to the present — a very informational time today. I was wondering: you did not talk much about the Tesla pickup. Let me give some context for that. I could be wrong, but the way I'm looking at the Tesla Network, it will act as an early adopter and something of a test bed. I think the Tesla pickup may be the first vehicle to put on the network, because the utility of a Tesla pickup would mostly be for people who are either loading a lot of stuff, or are in the construction profession hauling items here and there, or picking up stuff from Home Depot. So maybe it needs to be a two-stage process: pickup trucks exclusively for the Tesla Network as a starting point, and then people like me can buy them later. What are your thoughts on that?

Well, today was really just about autonomy. There's a lot we could talk about — cell production, the pickup truck, and future vehicles — but today was just focused on autonomy. And I agree, it's a major thing. I'm very excited for the Tesla pickup truck unveil later this year. It's going to be great.

Colin Langan, UBS.

Just so we understand the definitions: when you refer to "feature complete" self-driving, it sounds like you're talking Level 5, no geofence. Is that what's expected by the end of the year?

And then on the regulatory front — what's the process? Have you talked to regulators about this? This seems quite an aggressive timeline compared with what other people have put out there. What are the hurdles, and what is the timeline to get approval? Do you need things like — in California, you know, they're tracking miles with an operator riding there — do you need those things? What is that process going to look like?

Yeah, I mean, we talk to regulators around the world all the time as we introduce additional features, like Navigate on Autopilot. This requires regulatory approval on a per-jurisdiction basis.

Right — but I think, fundamentally, regulators in my experience are convinced by data. So if you have a massive amount of data that shows autonomy is safe, they listen to it. They may take time to digest the information — the process may take a bit of time — but they have always come to the right conclusion, from what I've seen.

I have a question over here.

Next to the pillar — okay.

I just wanted to ask: some of the work we've done trying to better understand the ride-hailing market suggests it's very concentrated in major, dense urban centers. So is the way to think about this that the robo-taxis would probably deploy more into those areas, and the additional full self-driving for personally owned vehicles would be in the suburban areas?

I think probably Tesla-owned robo-taxis would be in dense urban areas, along with customer vehicles. And then, as you get to medium- and low-density areas, it would tend to be more that people own the car and occasionally lend it out.

Yeah, there are a lot of edge cases in Manhattan and, say, downtown San Francisco. And there are various cities around the world that have challenging urban environments. But we do not expect this to be a significant issue. When I say feature complete, I mean it'll work in downtown San Francisco and downtown Manhattan this year.

Hi — I have a neural net architecture question. Do you use different models for, say, path planning and perception, or different types of AI? How do you split up that problem across the different pieces of autonomy?

Well, essentially, right now our neural nets are really for object recognition, and we're still basically just using them on still frames — identifying objects in still frames and tying it together in a perception and path-planning layer afterwards. But what's steadily happening is that the neural net is eating into the software base more and more, and so over time we expect the neural net to do more and more.

Now, from a computational cost standpoint, there are some things that are very simple for a heuristic and very difficult for a neural net. So it probably makes sense to maintain some level of heuristics in the system, because they're computationally a thousand times easier than a neural net. Compute-wise, a neural net is like a cruise missile — and if you're trying to swat a fly, just use a fly swatter, not a cruise missile.

But over time, I would expect that it moves to just training against video — and then video in, steering and pedals out. Basically, video in, lateral and longitudinal acceleration out, almost entirely.

That's what we're going to use the Dojo system for. There's no system that can currently do that.
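As an illustrative sketch only — none of this is Tesla's actual code, and `detect_objects_nn`, `plan_path`, and the time-to-collision threshold are all hypothetical — the division of labor described above (a neural net for per-frame object recognition, with cheap heuristics layered on top for planning) might look like:

```python
def detect_objects_nn(frame):
    """Stand-in for per-frame object recognition (the expensive NN part).

    A real system would run a convolutional network on the camera frame;
    here we just fake a single detection for illustration.
    """
    return [{"kind": "car", "distance_m": 40.0}]

def plan_path(detections, speed_mps):
    """Heuristic perception/path-planning layer on top of NN outputs."""
    for obj in detections:
        # Simple time-to-collision check: a "fly swatter" that is far
        # cheaper to evaluate than any neural network.
        if speed_mps > 0 and obj["distance_m"] / speed_mps < 2.0:
            return {"steer": 0.0, "accel": -3.0}  # brake
    return {"steer": 0.0, "accel": 0.5}  # clear ahead: proceed

def drive_step(frame, speed_mps):
    detections = detect_objects_nn(frame)    # NN: still-frame recognition
    return plan_path(detections, speed_mps)  # heuristics: planning

print(drive_step(frame=None, speed_mps=30.0))  # {'steer': 0.0, 'accel': -3.0}
```

The "video in, controls out" end state described in the answer would collapse both functions into a single learned model; the modular split shown here reflects the current architecture as described, not a prescription.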

Maybe over here,

Just going back to the sensor suite discussion: the one area I'd like to talk about is the lack of side radars. In a situation where you have an intersection with a stop sign, where there's maybe 35-40 mile-per-hour cross traffic, are you comfortable with the sensor suite — the side cameras — being able to handle that? Maybe just talk about that.

Yeah, no problem.

Essentially, the car is going to do kind of what a human would do. A human is basically a camera on a slow gimbal. And it's quite remarkable that people are able to drive the car the way they do — because you can't look in all directions at once, whereas the car can literally look in all directions at once, with multiple cameras. Humans are able to drive just by sort of looking this way and looking that way; they're obviously stuck in the driver's seat and can't really get out of it, so it's kind of one camera on a gimbal — and yet a conscientious driver drives with very high safety.

The cameras in the cars have a better vantage point than the person.

They're up in the B-pillar, or in front of the rear-view mirror — they've really got a great vantage point. So if you're turning onto a road that's got a lot of high-speed traffic, you can just gradually turn a little bit, then go fully into the road once the cameras see what's going on — if things look good and the rear cameras don't show any oncoming traffic, you go; and if it looks sketchy, you pull back a little bit. The behavior starts to become remarkably lifelike — it's quite eerie, actually. The car just starts behaving like a person.

Over here — go ahead.

The ventriloquist right here. Okay.

Given all the value you're creating in your auto business by wrapping all of this technology around your cells, I'm curious why you would still take some of your cell capacity and put it into Powerwall and Powerpack. Wouldn't it make sense to put every single unit you can make into this part of your business?

We already stole almost all of the cell lines that were meant to go to Powerwall and Powerpack and used them for Model 3. I mean, last year, in order to make our Model 3 production and not be cell-starved, we had to convert all of the 2170 lines at the Gigafactory to car cells.

The actual output in total gigawatt-hours of stationary storage compared to vehicles is an order of magnitude different. And for stationary storage, we can basically use a whole bunch of miscellaneous cells out there — we can gather cells from multiple suppliers around the world, and you don't have a homologation issue or a safety issue like you have with cars. So our stationary battery business has basically been feeding off scraps for quite a while.

But really, think of production as a massive, constrained production system — there are many, many constraints. The degree to which manufacturing and supply chain are underappreciated is amazing. There's a whole series of constraints, and what is the constraint one week may not be the constraint another week.

It's insanely difficult to make a car, especially one that is rapidly evolving. So, yeah — I'll just take a few more questions, and then I think we should break so you can try out the cars.

Hi, Elon — Adam Jonas. A question on safety: what data can you share with us today on how safe this technology is? That would obviously be important in a regulatory or insurance discussion.

We publish the accidents-per-mile record. And what we see right now is that Autopilot is about twice as safe as a normal driver, on average. And we expect that to increase quite a bit over time.

Like I said, in the future, consumers will want to outlaw — and I'm not saying they will succeed, nor am I saying I agree with this position — but in the future, consumers will want to outlaw people driving their own cars, because it's unsafe. Think of elevators: elevators used to be operated with a big lever, to go up and down between floors — there was a big relay, and there were elevator operators. But periodically they would get tired, or drunk, or something, and then they'd turn the lever at the wrong time and sever somebody in half. So now you do not have elevator operators, and you'd be quite alarmed if you went into an elevator that had a big lever that could just move between floors arbitrarily. There are just buttons. Long term — again, this is not a value judgment; I'm not saying I want the world to be this way — I'm saying consumers will most likely demand that people not be allowed to drive cars.

Elon, a follow-up: can you share with us how much Tesla is spending on Autopilot or autonomous technology — maybe an order of magnitude, on an annual basis?

Thank you. It's basically our entire expense structure.

Here, on the economics of the Tesla Network, just so I understand it: it looks like you get a Model 3 off-lease, $25,000 goes on the balance sheet as an asset, and then you would cash-flow roughly $30,000 a year. Is that the way to think about it?

Yeah, something like that, yeah.

And then, just in terms of the financing of it — in a question earlier, you mentioned it would be cash-flow neutral. Is it cash-flow neutral to the robo-taxi program, or cash-flow neutral to Tesla as a whole?

Sorry?

The question earlier — he asked about financing the robo-taxis. It looks to me like they're self-financing, but you mentioned it would be basically cash-flow neutral. Is that what you were referring to?

I just think that, between now and when the robo-taxis are fully deployed throughout the world, the sensible thing for us is to maximize the rate of production and drive the company to cash-flow neutral.

Once the robo-taxi fleet is active, I would expect it to be extremely cash-flow positive. — So you were talking about production?
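As a back-of-the-envelope check on the numbers floated in this exchange (both the $25,000 asset value and the ~$30,000/year cash flow come from the questioner, not from any Tesla disclosure):

```python
# Simple payback period on the robo-taxi numbers discussed above.
asset_cost = 25_000        # dollars: vehicle on the balance sheet
annual_cash_flow = 30_000  # dollars per year, rough gross figure

payback_years = asset_cost / annual_cash_flow
print(round(payback_years, 2))  # 0.83 -> roughly ten months
```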

Yeah, to produce. Good, thanks — maximize the number of autonomous units made. Thank you.

Okay, maybe one last question here.

Hello. If I add my Tesla to the robo-taxi network, who is liable for an accident — is it Tesla, or is it me — if the vehicle has an accident that harms someone?

I mean, it's probably Tesla — probably Tesla. But the right thing to do is just make sure there are very, very few accidents.

Alright, thanks everyone. Please enjoy the drives. Thank you.