What is 3D sensing anyway?
Pop quiz: What do self-driving cars, the Mars 2020 Rover, the iPhone X and Lighthouse have in common? If you answered, “they’re all really cool,” okay, you get partial credit. But the answer I was really looking for, as you might’ve guessed from the title, is that each of these uses some form of 3D sensing. More specifically, depth sensing. For us, it’s a big part of what makes Lighthouse so different, and why it can do what traditional home security cameras simply can’t.
Before we start geeking out a bit, the main thing you need to know is that depth sensing is used to gather really detailed information about the environment around us by, in essence, bouncing light off objects to “see” in 3D. After all, humans see in 3D, so getting things like Lighthouse, cars, phones, and things that are (literally) out of this world to also see in 3D leads to extraordinary things.
There are different types of systems for depth sensing depending on things like where you want to use it, the range, the cost and how much space you have to accommodate it. After all, you can’t make a self-driving car by strapping a Lighthouse camera to the top of your ride, and it’s definitely not a good idea to take an $80,000 Lidar from a self-driving car to make an interactive assistant and security camera for your home (we got ya covered there – and it’s only $299).
So, let’s take a quick look at what makes it possible to navigate Mars or just let you know if the dog walker is running late.
Time of Flight Sensor
Lighthouse uses something called a time-of-flight (TOF) sensor. There’s a set of illuminators at the top of Lighthouse that continuously emit invisible light into a room. The TOF sensor measures the time it takes for this light to bounce back from what’s in the room, whether it’s stationary or moving (like people and pets). When you put all of this data together, you can “see” the scene in 3D.
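The core idea behind time-of-flight is simple enough to sketch in a few lines. This is purely an illustrative calculation (not Lighthouse’s actual firmware): the light makes a round trip out to the object and back, so halving the travel time gives the one-way distance.

```python
# Illustrative sketch of the time-of-flight principle (not actual
# Lighthouse firmware). Light travels out to an object and back; the
# measured round-trip time, halved, gives the distance.

SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def depth_from_round_trip(time_seconds: float) -> float:
    """Distance to an object given the light's round-trip time."""
    return SPEED_OF_LIGHT * time_seconds / 2.0

# A return after ~20 nanoseconds means the object is about 3 meters away.
print(round(depth_from_round_trip(20e-9), 2))  # → 3.0
```

Doing this for every pixel of the sensor, many times per second, is what builds up the 3D picture of the room.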
There are lots of things going on under the hood so to speak, but here’s an example of the kind of data Lighthouse uses to let you know what’s happening at home:
On the left side, you see a depth image from Lighthouse’s TOF sensor that shows one of my colleagues feeding treats to his cat. Using the data in this scene, the images on the right show an example of our AI models segmenting out what’s moving from the stationary background. Since the 3D sensor provides more data about what’s going on in the scene than what’s possible with traditional 2D-only sensors, it opens up lots of possibilities for us to surface the kind of information you want about what’s happening at home.
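To get a feel for why depth data makes this kind of segmentation easier, here’s a toy sketch (nothing like Lighthouse’s actual AI models): with a depth image, you can compare each frame against a background depth map and flag pixels that suddenly got closer.

```python
import numpy as np

# Toy sketch of the simplest depth-based motion segmentation (not
# Lighthouse's actual models): compare a frame against a stored
# background depth map and flag pixels that got noticeably closer.

background = np.full((4, 4), 3.0)         # empty room: wall ~3 m away
frame = background.copy()
frame[1:3, 1:3] = 1.5                     # something moved in, ~1.5 m away

moving_mask = (background - frame) > 0.5  # pixels now >0.5 m closer
print(int(moving_mask.sum()))             # → 4 pixels flagged as moving
```

A 2D-only camera has to infer all of this from color and texture changes, which is much more easily fooled by shadows and lighting shifts.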
Speaking of data, here’s a fun fact: Did you know that Lighthouse captures more data points per second than the typical sensor on a self-driving car? We’re talking 3 million data points per second for Lighthouse vs. 2 million for a self-driving car’s sensor. For something like Lighthouse, that denser data helps inform our computer vision algorithms.
Lidar
While TOF sensors gather more data points per second, there are some key reasons Lidar works better for self-driving cars. First off, Lidar uses multiple beams (common systems have 64). By spinning them continuously (for a 360-degree view), you get around 2 million data points per second. Lidar also works at much longer distances, which is obviously critical when you’re talking about a car navigating the open road.
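Where does a number like 2 million come from? Here’s a back-of-the-envelope sketch; the rotation speed and firing rate below are illustrative round numbers, not the spec sheet of any particular Lidar unit.

```python
# Back-of-the-envelope sketch of where a spinning Lidar's point count
# comes from. Rotation speed and firing rate are illustrative round
# numbers, not the spec of any particular unit.

beams = 64                   # separate laser beams stacked vertically
rotations_per_second = 10    # the whole assembly spins for a 360° view
firings_per_rotation = 3125  # measurements each beam takes per revolution

points_per_second = beams * rotations_per_second * firings_per_rotation
print(points_per_second)     # → 2000000
```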
Even though Lidar doesn’t provide data as “dense” as a TOF sensor, the data is paired with other sophisticated and expensive equipment to see the world in a cool (and essential) way:
Structured Light
In contrast to Lidar, which is built for the great outdoors, structured light is commonly used for close-range sensing. The most prominent example of this method is the Face ID feature on the new iPhone X that’s used to unlock your phone. With structured light, a known light pattern is projected onto an object at a distance of a few meters or less (at longer ranges, the light disperses to the point where it wouldn’t work well for creating a 3D map).
But in the case of iPhone X, this method makes perfect sense since you hold your phone relatively close to your face.
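For a rough intuition about why structured light stays short-range: the projected pattern’s brightness on a surface falls off with the square of the distance, so it quickly becomes too faint and spread out to decode. The numbers below are purely illustrative.

```python
# Rough intuition for why structured light is a short-range technique:
# the projected pattern's intensity on a surface falls off roughly with
# the square of distance (inverse-square law). Illustrative numbers only.

def relative_pattern_intensity(distance_m: float, reference_m: float = 0.5) -> float:
    """Pattern intensity at distance_m relative to reference_m."""
    return (reference_m / distance_m) ** 2

print(relative_pattern_intensity(0.5))              # → 1.0 (phone-to-face range)
print(round(relative_pattern_intensity(5.0), 4))    # → 0.01 (100x dimmer at 5 m)
```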
Stereo Cameras
This method hits closest to home, since it works very much like our own eyes. As the name implies, a depth map is computed from the perspectives of two cameras. While there’s a lot more to it, it’s a bit like alternately closing one eye and then the other: each eye sees a slightly different perspective, and your brain combines the two images into what you actually see.
Unlike the other methods, stereo cameras use passive illumination (whatever light is already available) rather than actively bouncing emitted light off objects. One advantage is that this method works easily both indoors and outdoors, and over a much greater range, anywhere from 0.5 to 20 meters. There’s a lot more that goes into the calculations this method requires, but since I’m no mathematician, I’ll leave it at that.
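To give a flavor of those calculations without the full math: the same point lands at slightly different pixel positions in the left and right images, and that offset (the “disparity”) is what gives you depth. Here’s a minimal sketch of the classic pinhole stereo relation, with hypothetical camera values.

```python
# Minimal sketch of stereo triangulation (hypothetical camera values).
# A point seen by both cameras lands at slightly different pixel columns;
# that offset (the "disparity") determines its depth.

def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d, the classic pinhole stereo relation."""
    return focal_px * baseline_m / disparity_px

# Example: 700-pixel focal length, cameras 12 cm apart, 42-pixel disparity
print(depth_from_disparity(700, 0.12, 42))  # → 2.0 (meters)
```

Note how a smaller disparity means a farther object, which is also why depth precision degrades with range: distant objects produce tiny offsets that are harder to measure.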
NASA actually uses stereo cameras on the Mars 2020 Rover to gather a 3D view of the terrain, helping it drive on its own. Check out the front “Hazcams” below.
So, there you have it. I hope this gives you a better idea of just how sophisticated, useful and cool it is to have 3D sensing capabilities in Lighthouse. If there are other technologies you want us to highlight in future posts, leave a comment here or hit us up on Facebook or Twitter.
- Sean Lindo, Product Marketing