Tesla is finally getting serious about self-driving
"AI Day" presentation was long on technical details and short on Musk’s usual boosterism.
Last month, Elon Musk gave the public a remarkably in-depth look at Tesla’s artificial intelligence efforts.
“Tesla is much more than an electric car company,” Musk said at the “AI Day” event held on August 19. “We have deep AI activity in hardware, on the inference level, on the training level. We are arguably the leaders in real-world AI.”
This is typical Musk overstatement, but the presentation made clear that Tesla is making a serious effort to develop fully autonomous vehicle technology. Tesla is doing everything from designing custom AI chips to hiring more than 1,000 people to label real-world road images.
For years, Tesla fans have insisted that Tesla has a unique approach that will enable the company to leapfrog past rivals like Waymo, Cruise, Argo, Mobileye, and Zoox. But last month’s presentation left me with a different impression: Tesla is doing a lot of the same things as those other companies.
I don’t mean that as a criticism. Over the last five years, Elon Musk has repeatedly declared that Tesla was far ahead of its rivals, even as the real-world performance of Tesla’s software didn’t seem to support that claim. Last month’s presentation was less bombastic than some of Musk’s earlier statements, but it revealed a company that is more clear-eyed about the challenges ahead.
If you’re a hard-core Tesla fan, you might have been disappointed to learn that Tesla isn’t as far ahead as Musk had previously claimed. For example, Musk admitted that a key part of Tesla’s self-driving strategy—its Dojo supercomputer for training neural networks—isn’t even going to start operating until next year.
But if—like me—you’ve long suspected that Musk was exaggerating Tesla’s self-driving capabilities, the AI Day presentation should actually make you more optimistic about the company. Engineers acknowledged that they are using techniques that Musk had previously dismissed as unimportant or even counterproductive. Whatever Musk might say publicly, the people actually doing the technical work know that this isn’t an easy problem, and they’re pragmatically adopting industry best practices when necessary.
Elon Musk’s Reality Distortion Field
In his biography of Steve Jobs, Walter Isaacson writes about the “reality distortion field”—Jobs’s legendary ability to convince people to see the world his way.
“He can deceive himself,” said Bill Atkinson, an engineer who worked for Jobs on the original Macintosh. “It allowed him to con people into believing his vision, because he has personally embraced and internalized it.”
This is a surprisingly common trait for successful leaders. In his famous biography of Lyndon Johnson, Robert Caro argued that LBJ had a similar capacity to convince himself that whatever was convenient to say was actually the truth.
Musk has this capacity in spades, and it has been essential to Tesla’s success. Building a car company from scratch is insanely capital-intensive, and Musk has used every possible advantage to get the cash Tesla needed.
Some critics view Tesla’s entire self-driving program as a cynical play for cash. Since 2016, Tesla has claimed that its cars ship with all the hardware needed for full self-driving capability, so that only a software upgrade would be required. Over the same period, Tesla has charged customers thousands of dollars for its “full self-driving” (FSD) software package, despite the fact that the software wasn’t yet close to enabling driverless operation.
And on numerous occasions since 2016, Musk has claimed that the company is less than two years away from delivering fully driverless capabilities to customers. In April 2019, for example, he predicted that there would be thousands of driverless Teslas serving as autonomous taxis by the end of 2020. Needless to say, that didn’t happen.
Not only has Tesla reaped millions of dollars from selling the FSD software package, it has also benefited from a more general perception that the company is at the forefront of self-driving technology. Yet videos from last fall’s beta release of Tesla’s “full self-driving” software showed the technology making errors far more frequently than Waymo’s taxi service in the Phoenix area.
But it’s crucial to remember that men like Jobs and Musk don’t just use their charisma to get people to believe things that may not be true. Often they use those persuasive powers in the pursuit of great goals. Steve Jobs really did create the Mac and the iPhone. Elon Musk really did create the Model 3 electric car and the Falcon 9 reusable rocket. Their flexible approach to truth-telling just means that you always have to take what they say with a grain of salt.
This was the context for last month’s AI Day event. Until now, Musk has had a strong incentive to exaggerate Tesla’s self-driving capabilities. But while Tesla customers are remarkably patient, they’re not infinitely so. Sooner or later, Tesla is going to have to actually deliver full self-driving technology. The presentation was a chance for Tesla to show that it was making real progress.
Autonomy requires massive training data
The first stage in any self-driving system is perception: taking inputs from various sensors (cameras, radar, lidar, etc.) and combining them into a detailed 3-dimensional model. Over the last decade, neural networks have emerged as an industry-standard way to tackle the perception problem. They’re powerful because they don’t require human programmers to write explicit rules to distinguish, say, a fire hydrant from a toddler. Instead, neural networks can be “trained” by looking at millions of example images.
But first, somebody or something needs to look at each training image and mark the locations of toddlers and fire hydrants—not to mention curbs, traffic cones, and other cars. Human workers can do this, but it’s tedious to label millions of images by hand.
Obviously it would be nice to have software label the training images, but there’s a chicken-and-egg problem: if you already had software to accurately label images, you could just use that in your perception system. So the training process usually needs a certain amount of human input.
During the AI Day presentation, Tesla revealed it had a 1,000-person team dedicated to labeling training images. But even with 1,000 workers, it’s not possible to hand-label enough images for Tesla’s training needs. So Tesla described several strategies for boosting the productivity of its labeling team. For example, if you know there’s a car at a particular location in one frame of a video, software can make an educated guess about the location of the same car in the next frame, and the next one, and so forth.
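To make the frame-to-frame idea concrete, here is a minimal Python sketch. It is not Tesla's method, which the presentation did not describe at this level of detail. It assumes a simple constant-velocity model, and the `Box` type and `propagate_label` function are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x: float  # center x
    y: float  # center y
    w: float  # width
    h: float  # height


def propagate_label(prev: Box, curr: Box, steps: int = 1) -> list[Box]:
    """Guess where a labeled object will be in the next `steps` frames.

    Assumption: the box keeps moving and resizing at the same per-frame
    rate observed between the two hand-labeled frames `prev` and `curr`.
    """
    dx, dy = curr.x - prev.x, curr.y - prev.y
    dw, dh = curr.w - prev.w, curr.h - prev.h
    guesses = []
    for k in range(1, steps + 1):
        guesses.append(
            Box(
                x=curr.x + k * dx,
                y=curr.y + k * dy,
                w=max(curr.w + k * dw, 0.0),  # a box cannot have negative size
                h=max(curr.h + k * dh, 0.0),
            )
        )
    return guesses


# A human labels a car in frames 0 and 1; software proposes frames 2 and 3.
frame0 = Box(x=100.0, y=200.0, w=40.0, h=30.0)
frame1 = Box(x=110.0, y=198.0, w=42.0, h=31.0)
proposals = propagate_label(frame0, frame1, steps=2)
```

In a real labeling pipeline, a human would presumably review and correct proposals like these rather than accept them blindly, since the guess degrades quickly when the car brakes or turns.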
A similar approach works when a car has multiple cameras with overlapping views. Rather than labeling each camera’s images individually, Tesla’s software projects all the images into a 3-dimensional “vector space.” Then when a human worker provides a label in this vector space, the software can use some basic geometry to label images from other cameras that captured the same object. Tesla says that techniques like these allowed human labelers to be 100 times more productive than if they had to hand-label every single video frame from each of a vehicle’s eight cameras.
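The "basic geometry" here can be sketched with an idealized pinhole camera model. The Python below is an illustration under stated assumptions, not Tesla's implementation: the `Camera` type, the `project` function, the camera placements, and the focal length are all hypothetical, and real camera rigs also have to account for lens distortion and full 3D rotation.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Camera:
    """Idealized pinhole camera mounted on the vehicle (hypothetical model)."""
    x: float         # mount position in the vehicle frame, meters
    y: float
    z: float
    yaw: float       # heading of the optical axis, radians (0 = straight ahead)
    focal_px: float  # focal length in pixels
    cx: float        # principal point (image center), pixels
    cy: float


def project(cam: Camera, point: tuple[float, float, float]) -> Optional[tuple[float, float]]:
    """Project a 3D point (vehicle frame: x forward, y left, z up) to pixels.

    Returns None when the point is behind the camera.
    """
    dx, dy, dz = point[0] - cam.x, point[1] - cam.y, point[2] - cam.z
    # Rotate into the camera's heading: distance along the optical axis,
    # and sideways offset to the camera's left.
    depth = dx * math.cos(cam.yaw) + dy * math.sin(cam.yaw)
    left = -dx * math.sin(cam.yaw) + dy * math.cos(cam.yaw)
    if depth <= 1e-6:
        return None
    # Image u grows to the right and v grows downward.
    u = cam.cx - cam.focal_px * left / depth
    v = cam.cy - cam.focal_px * dz / depth
    return (u, v)


cameras = {
    "front": Camera(0.0, 0.0, 0.0, 0.0, 1000.0, 640.0, 360.0),
    "left_pillar": Camera(0.0, 0.0, 0.0, math.pi / 4, 1000.0, 640.0, 360.0),
    "rear": Camera(0.0, 0.0, 0.0, math.pi, 1000.0, 640.0, 360.0),
}

# One human label placed in 3D: 10 m ahead, 4 m to the left, 0.5 m up.
label_3d = (10.0, 4.0, 0.5)
pixel_labels = {name: project(cam, label_3d) for name, cam in cameras.items()}
```

A single 3D label yields a pixel label in every camera that can see the object (here the front and left-pillar cameras) and nothing for cameras facing away from it, which is where the productivity gain comes from.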
Tesla’s reluctant embrace of HD maps
This approach can be taken one step further by correlating data from multiple trips and multiple vehicles. If 10 Tesla cars drive through the same intersection in a day, software can line their sensor data up precisely enough that labels from one trip can be automatically applied to corresponding images from other trips. Effectively, Tesla is building what’s known in industry jargon as an “HD map”: a three-dimensional map that marks the precise location of significant objects like curbs and stop signs.
Most companies working on self-driving technology supply HD maps for their vehicles to use as they drive around. HD maps can be helpful in several ways:
They enable a vehicle to triangulate its own location with centimeter precision.
They increase the software’s confidence when it identifies a static object like a curb or a stop sign, allowing it to focus more on objects that move.
They can be especially valuable when objects are occluded—for example, locating curbs after a snowfall.
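The first of these benefits, localization, can be illustrated with a toy calculation. The sketch below is not how any particular company does it. Production systems typically match lidar or camera features against the map and fuse the result with other sensors. As a stand-in, it uses range-based trilateration in two dimensions, and the `locate` function and the landmark coordinates are hypothetical.

```python
import math


def locate(landmarks, ranges):
    """Estimate a 2D position from distances to three mapped landmarks.

    Each landmark defines a circle (x - xi)^2 + (y - yi)^2 = ri^2.
    Subtracting the first circle equation from the other two leaves a
    2x2 linear system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = landmarks
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("landmarks are collinear; position is ambiguous")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Mapped positions of three static objects (say, two stop signs and a curb corner).
mapped_landmarks = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]

# Distances the vehicle measures to each landmark from its true position (3, 4).
true_position = (3.0, 4.0)
measured = [math.dist(true_position, lm) for lm in mapped_landmarks]

estimate = locate(mapped_landmarks, measured)
```

The precision of the estimate depends directly on how precisely the landmarks are mapped, which is why the "HD" in HD map matters.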
Musk has long staked out a contrarian position on HD maps, insisting that systems relying on them were “extremely brittle.” He often pairs this argument with a rejection of another technology that’s widely used in the self-driving industry: lidar sensors that use lasers to measure distances to nearby objects.
Musk and Tesla partisans argue that camera-based vision is so essential to the self-driving problem that it’s a distraction to use other information sources. They argue that either your vision-based perception system can identify objects around a vehicle (in which case you don’t need lidar or HD maps) or it can’t (in which case it’s not safe to use it). But this argument underrates the value of redundancy and the difficulty of reaching 100 percent certainty. HD maps and lidar will occasionally help identify objects the vision system missed, or help confirm identifications the vision system wasn’t sure about.
Tesla fans also argue that it’s not realistic to create a nationwide HD map and keep it regularly updated. They worry that self-driving software could blindly follow an outdated HD map, causing it to drive somewhere that’s no longer a lane.
But in its AI Day presentation, Tesla’s own experts effectively debunked these concerns by explaining how to update HD maps in a largely automated fashion. Popular self-driving platforms will have cars driving through any given stretch of roadway many times per day. These cars can compare their sensor readings to the latest map and report discrepancies. This information can then be used to update the map automatically.
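A rough sketch of that compare-and-report loop might look like the following. It is an illustration only: the presentation did not describe an update algorithm, so the `update_map` function, the tolerance, the minimum number of reports, and the use of a median are all assumptions made for this example.

```python
import math
from statistics import median


def update_map(hd_map, reports, tolerance_m=0.5, min_reports=3):
    """Return a copy of the map with landmarks moved where cars disagree with it.

    hd_map:  {landmark_id: (x, y)} positions currently in the map
    reports: {landmark_id: [(x, y), ...]} positions observed by passing cars

    A landmark is updated only when at least `min_reports` observations
    differ from the mapped position by more than `tolerance_m`, so a single
    noisy reading cannot rewrite the map.
    """
    updated = dict(hd_map)
    for landmark_id, observed in reports.items():
        mapped = hd_map.get(landmark_id)
        if mapped is None:
            continue  # adding brand-new objects is out of scope for this sketch
        disagreeing = [p for p in observed if math.dist(p, mapped) > tolerance_m]
        if len(disagreeing) >= min_reports:
            # The median resists outliers better than the mean.
            updated[landmark_id] = (
                median(p[0] for p in disagreeing),
                median(p[1] for p in disagreeing),
            )
    return updated


hd_map = {"stop_sign_17": (10.0, 5.0), "curb_3": (0.0, 0.0)}
reports = {
    # Three cars agree that the stop sign has moved about 2 m.
    "stop_sign_17": [(12.0, 5.0), (12.2, 5.1), (11.8, 4.9)],
    # Only one car disagrees about the curb, so it is treated as noise.
    "curb_3": [(0.1, 0.0), (0.0, 0.1), (3.0, 3.0)],
}
new_map = update_map(hd_map, reports)
```

Requiring several independent reports before changing the map is one simple way to address the worry about cars blindly following bad data: the fleet itself acts as the check.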
During AI Day, speakers seemed to recognize the tension between their own presentations and the past statements of their CEO. At one point, Tesla AI chief Andrej Karpathy acknowledged that in Tesla’s system “a number of cars could be collaborating to build this map, basically effectively an HD map.”
At another point, however, Autopilot software director Ashok Elluswamy said that “the point of this is not to just build HD maps or anything like that.” He said that it was only to speed up labeling images for training purposes. Unlike some other companies, he said, Tesla wasn’t planning to provide cars with HD maps they could use as they drove around.
Tesla’s business model is a poor fit for HD maps
But the obvious question is: why not? Wouldn’t it be helpful to make this data available to cars as they’re driving around?
Nobody at Tesla’s AI Day presentation addressed this question directly. But my guess is that the reasons are more economic than technical. Tesla competitors like Waymo and Cruise are planning to sell taxi services that will initially be limited to a small geographical area. Waymo, for example, is currently operating in only a portion of the Phoenix area. So if Waymo decides to use HD maps, they only need to maintain maps of a few Phoenix suburbs. Waymo can gradually expand the coverage of its maps as its business grows.
In contrast, Tesla customers can take their cars anywhere. So if Tesla’s self-driving software depends on HD maps, then Tesla will need an HD map that covers every road in the United States, if not the whole world.
A similar point applies to lidar. Good lidar sensors cost thousands of dollars apiece—too expensive for Tesla to make them a standard feature on its vehicles. Most of Tesla’s self-driving competitors are working on autonomous taxi services. The economics of this business leave more room to drop thousands—or even tens of thousands—of dollars on lidar and other custom hardware.
And this is where Musk’s reality distortion field comes in. Rather than just admitting that the economics don’t work out for Tesla to use lidar or HD maps, Musk has insisted—and convinced many fans—that doing without them is actually an advantage for Tesla.
A big danger for an organization with a charismatic leader like Musk is that people start believing their own propaganda. But the AI Day presentation gave me some reassurance that that isn’t happening. Tesla’s engineers are building and using HD maps in situations where it makes sense to do so. If it becomes economical to ship HD maps—or lidar—on its vehicles, I suspect Tesla will do that without worrying about Musk’s contradictory past statements.
Tesla is building a supercomputer
At a 2019 event focused on Tesla’s self-driving technology, Musk teased the existence of Dojo, a supercomputer to train the neural networks at the heart of Tesla’s self-driving software.
Musk didn’t reveal much information about Dojo in 2019. In particular, he didn’t say whether the Dojo supercomputer was already up and running or was scheduled to be brought online at some point in the future.
At last month’s AI Day presentation, Tesla finally offered specific details about Dojo. It’s a massive array of thousands of powerful computer chips organized into ten large cabinets. The system will have a computing capacity of more than an exaflop—that’s the capacity to perform a billion billion floating point operations per second. Oh, and it won’t be operational until 2022—three years after Musk first revealed its existence.
Powering the cluster will be a Tesla-designed chip that’s optimized for training neural networks.
So how impressed should we be with Tesla’s Dojo supercomputer? Google makes its own custom chip for neural network operations called the Tensor Processing Unit (TPU) and rents it out to cloud computing customers. In May, Google announced a new version of the TPU chip and said the new chip could be organized into clusters with as much as 1 exaflop of computing power.
Tesla says its chips are specifically optimized for the needs of self-driving neural networks, which may enable Tesla to achieve more real-world performance for a given exaflop rating. But there’s little reason to think Dojo will give Tesla a big performance edge when it comes online next year. Alphabet can supply Waymo with all the cash and chips it needs to stay competitive in this area.
Tesla is getting serious about simulation
Another place where Tesla is working to keep up with the competition is on simulation. Gathering data from real roads is a slow and expensive (and potentially dangerous) process, so self-driving companies have developed software to drive virtual cars on simulated roads. In the last couple of years, Waymo, Cruise, Aurora, and other self-driving companies have revealed a lot of details about their own simulation efforts.
At the AI Day event, Tesla showed off its own simulation software, which produces photo-realistic images of various driving scenarios. This wasn’t the first time Tesla revealed that it was using simulation, but there has been something of a change of emphasis. Back in 2019, Musk downplayed the simulation efforts of other companies, arguing that “we have quite a good simulation, too, but it just does not capture the long tail of weird things that happen in the real world.”
In last month’s AI Day presentation, on the other hand, a Tesla engineer argued that simulation was needed precisely to generate training data for these weird “edge case” situations. A presenter showed examples of a moose ambling through a busy intersection and a family of three sprinting along a freeway lane. Once again, Tesla seems to be bringing its approach more in line with that of major rivals.
A healthy focus on recruitment
Tesla closed AI Day with a call for engineers to join the company. I suspect that one reason Tesla revealed so many technical details was that it wanted to convince engineers that Tesla would be a fun and interesting place to work.
Engineers like to work on hard technical problems, but they usually don’t like to work for bosses with wildly unrealistic expectations. In its early years, Tesla suffered from high turnover in its Autopilot software team, with Musk’s outlandish promises a key source of friction. Last month’s presentation seemed designed to allay some of those fears: it was long on technical details and short on Musk’s usual boosterism.
But there’s something else to note about the event’s focus on recruitment: you don’t embark on a big push to hire engineers if you think your product is six or 12 months away from launch. So Tesla is probably still quite a ways from achieving full self-driving capabilities. But the company seems to be taking the problem seriously in a way that it didn’t before.