For what seems like forever, we've been consuming flat content on flat screens. We use traditional cameras to capture all of this, but those cameras aren't suited for the new world of virtual and augmented realities. There needs to be a new technology that can provide immersive experiences.
The solution is volumetric capture. The easy way to think about this is that it's the complete opposite of another capture tech that's proven popular in VR: 360 video recording. While 360 video cameras are placed in the centre of something and capture everything around them, volumetric capture cameras are placed around something and capture every detail from the outside in.
Microsoft recently invited me to check out its brand new capture studio in the newly renovated Microsoft Reactor in San Francisco. It turns out they were going to put me directly in the capture studio and turn me into a hologram.
Microsoft has been working on volumetric capture for the past seven years, says Steve Sullivan, general manager of Microsoft Mixed Reality Capture Studio. It started out as a project within Microsoft Research before moving to an incubator, then graduated into the HoloLens project, which is now part of the larger Mixed Reality platform.
The newly opened Capture Studio in San Francisco is only the second, after the original on Microsoft's Redmond campus. Sullivan says the company is already receiving a ton of demand from creatives to use the space, and is opening a third location called Dimension Studios in London in collaboration with Hammerhead and Digital Catapult, giving European creators a chance to play with the tech without having to make the trek across the Atlantic and continental US.
Stepping into the Capture Studio is bizarre. It's absurdly bright, with what feels like a thousand lights concentrated on you from every direction. All of those lights also mean that you can feel the temperature rise once you walk past the grey curtains that separate the recording space from the Capture Studio's technical staff.
You stand in the centre of the space, which Microsoft says is big enough to fit two sumo wrestlers into (they actually did do that). Around you, 106 cameras are trained on you, capturing every moment. Half of those cameras are, well, cameras that are capturing imaging data. The other half of those cameras are infrared cameras shooting dots at you. As one of the staff members told me, the tech is similar to the infrared cameras in the old Xbox Kinect.
There aren't too many things you have to worry about when you step into the Capture Studio. You probably shouldn't wear anything too bright, like a pure white shirt, because it'll bounce the light away. You also shouldn't wear anything too dark, because that absorbs a whole lotta light. Other than that though, it's all pretty simple.
I was told to come up with a movement about five to ten seconds in length. I'd have to start with a base pose, do my little action, and then return to the base pose. As you can see, I went with a more contemplative action, weighing the great problems our world faces. Like, for instance, whether you should put ketchup or mustard on your hot dog first.
And that's it. I had been captured, the system taking 10GB of data a second at 30 frames a second to record my performance. The back end of the system is able to streamline and condense all of that data so that it can easily be shared over Wi-Fi. Or, as Microsoft put it, you can stream its holograms anywhere you can stream Netflix.
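To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the two numbers quoted above (10GB a second, 30 frames a second) and assumes "GB" means decimal gigabytes. The function names are made up for illustration, and nothing here reflects how Microsoft's back end actually condenses the data.

```python
# Back-of-the-envelope arithmetic for the raw capture figures quoted in the article.
# Assumption: "10GB" means 10 * 10**9 bytes (decimal gigabytes).
RAW_BYTES_PER_SECOND = 10 * 10**9
FRAMES_PER_SECOND = 30


def raw_bytes_per_frame(bytes_per_second=RAW_BYTES_PER_SECOND, fps=FRAMES_PER_SECOND):
    """Raw data recorded for a single frame, in bytes."""
    return bytes_per_second / fps


def raw_bytes_for_clip(seconds, bytes_per_second=RAW_BYTES_PER_SECOND):
    """Raw data recorded for a clip of the given length, in bytes."""
    return bytes_per_second * seconds


if __name__ == "__main__":
    print(f"Per frame: about {raw_bytes_per_frame() / 10**6:.0f} MB")
    # The performance described in the article lasted roughly five to ten seconds.
    for seconds in (5, 10):
        print(f"{seconds}s clip: {raw_bytes_for_clip(seconds) / 10**9:.0f} GB raw")
```

On those assumptions, each frame works out to roughly a third of a gigabyte of raw data, which is why the condensing step matters so much before anything can be streamed.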
A developer would be able to use my mesh and place it into things like AR or VR experiences that are playable on low-end devices like phones or tablets as well as high-end devices like beefed up desktop PCs. And you wouldn't need AR or VR to view them either – we were able to play with and manipulate the same experiences on regular PCs with mice and keyboards and they played and looked fine. They were much better in VR, but it was still possible to see them – and they still looked amazing.
But how exactly does this all work?
A mesh of a time
The Capture Studio and all of those cameras create a mesh version of yourself. It's not exactly a video or photograph as you and I have come to understand it. It's a little closer to a digital creation, like a character from a video game or a Pixar movie. Except, well, it looks absolutely incredible.
If someone told you this was a video you'd probably believe it. In some demos, Microsoft showed me – and the rest of the gaggle of press that were present – other captures it had done throughout the years. Some early captures, like that of Xbox head Phil Spencer's golf coach, had some artefacts that kept it from looking perfect. However, the more recent ones were simply stunning.
Microsoft's technology is advanced enough that it can do materials and cloth really well, which is why brands and companies who want to use the system to push clothing are incredibly interested in it, according to Sullivan. It flows and bends and crumples like real fabric.
Creators can also adjust the mesh after the fact. For instance, in one demo a Microsoft employee simply moved a slider and turned a mesh of a Cirque du Soleil dancer into a rainbow of colours. You could even change the colour of someone's clothes if you really wanted to. One of the more advanced things that was done for a mesh was for a Buzz Aldrin experience.
The developer wanted to create an immersive environment that users could look and move around in, but they also realised that this freedom could itself break immersion. If the user is noodling around in one area, and then they turn around and see Aldrin has his back to them, explaining something to no one, the feeling of actually being with Aldrin is gone. So they adjusted the mesh so that it would always turn toward the user.
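The article doesn't say how the developer pulled this off, but the general idea of turning a character toward a viewer can be sketched in a few lines of Python. This is an illustration only, not the developer's or Microsoft's method. It assumes a coordinate system where y is up and the mesh faces along +z when unrotated, and the function names are hypothetical.

```python
import math


def yaw_to_face(mesh_pos, user_pos):
    """Yaw angle (radians, about the vertical y axis) that points the mesh's
    forward direction (+z when unrotated) at the user. Height is ignored so
    the figure turns on the spot rather than tilting up or down."""
    dx = user_pos[0] - mesh_pos[0]
    dz = user_pos[2] - mesh_pos[2]
    return math.atan2(dx, dz)


def rotate_about_y(point, yaw):
    """Rotate a single (x, y, z) point about the y axis by the given yaw."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x + s * z, y, -s * x + c * z)


if __name__ == "__main__":
    mesh_position = (0.0, 0.0, 0.0)
    user_position = (0.0, 1.7, -2.0)  # hypothetical: user standing behind the mesh
    yaw = yaw_to_face(mesh_position, user_position)
    forward = rotate_about_y((0.0, 0.0, 1.0), yaw)
    print(f"Turn by {math.degrees(yaw):.0f} degrees; forward is now {forward}")
```

In a real experience the same yaw would be applied to every vertex of the mesh each frame, most likely by the engine's own transform system rather than by hand.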
And really, that's the big goal with the Capture Studio. Microsoft wants to enable creators of AR and VR experiences to make immersive, engaging performances.
"That's our goal, to give people the ability to have human beings, real human beings, performing and engaging with them in these experiences," Sullivan says. "So it's much more connecting than working with a CG environment, for example."
In the immediate future, Microsoft sees its capture studio – and volumetric capture as a whole – in the hands of creators making entertainment experiences. It's definitely the future of VR filmmaking, for instance. Educational and instructional institutions have also been turning toward the Capture Studio to add authenticity to experiences. However, Microsoft is hoping that one day it will have made the technology mainstream and prevalent enough that it can be for personal use.
"I've done this for the past four years with my kids and it's really interesting to go back and see them four years ago," Sullivan explains. "It's a very different sense of who they were than just looking at a photo or video."
A sense of scale
Currently, Microsoft's studios are built for more personal experiences. They can capture the performances of a couple of dancers or sumo wrestlers, and creators can take those performances and do what they need for their pieces.
It can be as simple as an immersive piece of tourism, like putting break dancers on the Seattle waterfront at night. Or it could be a couple of actors playing out a dramatic scene in a gritty detective movie. Or it could be a big budget companion piece to a blockbuster movie like Blade Runner 2049. These aren't possibilities, they're realities.
However, this system can also scale up. For example, Sullivan says you could theoretically volumetrically capture an entire football stadium. You'd lose out on some detail, and placing a mess of cameras around a stadium would be a massive engineering challenge, but it's possible.
For now, however, Microsoft is going to be working on opening Capture Studios in most major media markets so that creatives can capture performances for their pieces. Sullivan says the company is also hoping to create mobile kits that it can take to big events, like the Oscars, so that it can capture performances there.
Microsoft's capture technology feels like a "future is here" moment. You can almost taste the possibilities you'll experience one day on your Samsung Odyssey. However, there does seem to be a technological bottleneck that hampers the experience. The mesh captures are so incredible that they make the backgrounds around them look obviously fake. It's kind of like watching an old movie with bad CGI.
Let's hope the technology surrounding volumetric capture is able to improve and match the captures themselves, because the potential is here to fuel the AR and VR experiences of the future.