“Makes HDTV look low-res.”
“Wow, how did they make this?”
“Hyper-detailed and actually looks 3-D.”
These are just some of the words reviewers have used to describe the new Photosynth. A technical preview, launched December 10, enables photography enthusiasts to create even more realistic 3-D views of objects and locations captured through still photos.
A user can upload multiple images of the same object or physical space to the Photosynth cloud service, which then processes the images into a “synth”—a composite of overlapping photographs that create a 3-D model of the space, with added depth and transitional images for smooth 3-D viewing.
Photosynth stitches together the photos to create spin, panorama, walk, or wall synths that draw the viewer along a path or around an axis. It’s obvious there’s sophisticated technology at work here. Less apparent is the extent and duration of the relationship between Microsoft Research’s Interactive Visual Media (IVM) group and the Photosynth product team.
It Began with Photo Tourism
“The first bit of ground-breaking work was back in 2006 with Photo Tourism,” says Eric Stollnitz, IVM principal developer. “It began as a collaboration between Noah Snavely and Steve Seitz at the University of Washington and my colleague Richard Szeliski. The goal was to take images of the same landmark from the web (Paris’ Notre-Dame or the Eiffel Tower, for example), taken by different photographers and at different times, and combine them to produce a 3-D visualization of a landmark. It was a photo-crowdsourcing idea.”
A Microsoft product group refined Photo Tourism and, in 2008, launched Photosynth, a desktop product that stitched together a user’s collection of images into a 3-D model that could be uploaded to the Photosynth website for sharing.
Photosynth was enhanced in 2010 with the incorporation of Image Composite Editor (ICE), an advanced image stitcher that combines a set of overlapping photographs to create a seamless, 360-degree panorama at full resolution.
“The Bing team was really excited about this,” Stollnitz recalls, “and they added a panorama-viewing feature to the website, while the IVM team added the ability for ICE to export panoramas to the Photosynth desktop application, which meant Photosynth could upload your panorama to the website for sharing.”
In quick succession, Bing launched its Mobile Panoramas capture-and-display application for iOS devices in 2011 and added a Windows Phone app and improved social sharing in 2012. And now, the new Photosynth delivers even more 3-D realism, thanks to innovative work on transitions and navigation.
The Parallax View
Transitions are reconstructions of plausible views between actual photographs, synthesized to fill in gaps and provide smoother movement. As part of the IVM’s Spin project, a 2009 paper titled Piecewise Planar Stereo for Image-based Rendering, by researcher Sudipta N. Sinha; Drew Steedly, principal development manager at Microsoft; and Szeliski, proposed a way to create more realistic transitions when moving from photograph to photograph.
“They decided to do something about synths that looked as though they’d been projected onto a single plane,” Stollnitz explains. “Essentially, we were showing flat screen after flat screen joined together. But in real life, you’d see objects from different angles and depths as you move along. We had to fill in more information gaps.”
The answer to more realistic transitions was to use computer-vision techniques to calculate the depth of each pixel. This required analyzing each pair of overlapping photographs and comparing objects that appeared in both images to determine how far they were from the camera.
“Objects that shift just a little bit from one image to the next image are farther away,” Stollnitz says, “while objects that have shifted quite a bit from one image to the next are closer in. It’s all about parallax. Sudipta really focused on this challenge. He’s a master at this ‘depth from stereo’ technique.”
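The parallax relationship Stollnitz describes can be sketched with the textbook formula for two rectified, side-by-side views: depth equals focal length times camera separation, divided by the pixel shift (disparity). This is a minimal illustration, not Photosynth’s actual code; the function name and the sample focal length, baseline, and disparity values are assumptions chosen for the example.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth (in meters) of a point seen in two rectified, horizontally offset views.

    Assumes an idealized pinhole stereo pair; all parameter values are hypothetical.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# An object that shifts a lot between the two photos is close to the camera...
near = depth_from_disparity(focal_length_px=1000.0, baseline_m=0.5, disparity_px=50.0)
# ...while an object that barely shifts is far away.
far = depth_from_disparity(focal_length_px=1000.0, baseline_m=0.5, disparity_px=5.0)

print(near, far)
```

With these sample numbers, a tenfold smaller shift corresponds to a tenfold greater distance.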
With a depth calculation for every pixel, the researchers were able to simplify the information and construct 3-D surfaces from a relatively small number of planes. By projecting images onto these coarse 3-D approximations instead of a single plane, the team created transitions that are far more immersive, with different depths and angles.
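To make the idea of reducing per-pixel depth to a handful of planes concrete, here is a deliberately simplified sketch: each reconstructed 3-D point is labeled with whichever candidate plane it lies closest to. The published method is considerably more sophisticated than this nearest-plane rule; the plane coefficients, the sample points, and the function names below are all assumptions made up for illustration.

```python
def point_plane_distance(point, plane):
    """Distance from (x, y, z) to the plane a*x + b*y + c*z + d = 0.

    Assumes (a, b, c) is already a unit-length normal.
    """
    a, b, c, d = plane
    x, y, z = point
    return abs(a * x + b * y + c * z + d)


def assign_to_planes(points, planes):
    """Label each 3-D point with the index of its nearest candidate plane."""
    return [
        min(range(len(planes)), key=lambda i: point_plane_distance(p, planes[i]))
        for p in points
    ]


# Two hypothetical planes: a building facade at z = 10 and the ground at y = 0.
planes = [
    (0.0, 0.0, 1.0, -10.0),
    (0.0, 1.0, 0.0, 0.0),
]
# Hypothetical points recovered from per-pixel depth.
points = [(1.0, 2.0, 10.1), (3.0, 0.05, 4.0), (-2.0, 5.0, 9.8)]

labels = assign_to_planes(points, planes)
print(labels)
```

Once every point carries a plane label, a renderer can project the photograph onto a few flat surfaces at different depths rather than onto one flat screen.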
Spinning Through 3-D
Another enhancement the Spin project brought to Photosynth was simpler navigation. Depending on the number of images and their relationship to the physical space, there could be many potential paths through which to travel in a synth.
“That means many 3-D relationships,” Stollnitz says, “which is ultimately very powerful, but it can also be overwhelming and confusing if you’re able to rotate as well as move forward and back, left and right, up and down as you’re making your way through a collection. Furthermore, in situations where photos are only loosely connected to each other, it becomes difficult aligning points in each image and handling the transition in a smooth manner if you try to accommodate every possible path.”
Instead, researcher Johannes Kopf simplified navigation and, in doing so, also enhanced the 3-D experience. Restricting navigation to a circular path (either an outward-looking “panorama” from a fixed spot or a “spin” around an object) made transitions between photos much smoother.
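One way to picture why a circular path simplifies things: every viewpoint in a “spin” can be described by a single angle, so moving through the synth means stepping to the previous or next angle. The sketch below places hypothetical cameras evenly on a circle, each looking at the center. The function name, the 2-D ground-plane layout, and the even spacing are assumptions for illustration, not a description of how Photosynth arranges real photos.

```python
import math


def spin_path(center, radius, num_views):
    """Return (position, look_direction) pairs for cameras on a circle facing inward."""
    cx, cz = center
    views = []
    for i in range(num_views):
        angle = 2.0 * math.pi * i / num_views
        position = (cx + radius * math.cos(angle), cz + radius * math.sin(angle))
        # Unit vector pointing from the camera back toward the center.
        look = (-math.cos(angle), -math.sin(angle))
        views.append((position, look))
    return views


def next_view(index, num_views):
    """Navigation has one degree of freedom: step around the circle and wrap."""
    return (index + 1) % num_views


views = spin_path(center=(0.0, 0.0), radius=2.0, num_views=8)
print(views[0], next_view(7, len(views)))
```

Flipping the sign of the look direction would give the outward-looking “panorama” case instead.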
“The computer-vision work for pixel depth was that much easier,” Stollnitz says, “and projections onto the 3-D surfaces looked really good. It was definitely a ‘wow’ moment when we saw how well these two approaches came together. The transitions in these synths offer a much more realistic sense of depth.”
The “spin” navigation and depth from stereo also impressed the Bing team, which immediately added the new technology to Photosynth and suggested two more synth scenarios: “walk,” for images taken with the camera moving forward into a scene, and “wall,” in which the camera takes images perpendicular to the direction of movement.
Another key decision was to host the new Photosynth in the cloud. Both the Bing and IVM teams felt the amount of processing demanded for 3-D spins would take too much time on smaller devices, so they re-engineered the technology to run on Microsoft Azure and to handle processing for thousands of incoming photo collections, a feat Stollnitz considers a significant engineering accomplishment.
“Photography—particularly travel photography—is one of my hobbies,” he says. “My wife and I always come back from trips with tons of photos. Collaboration with the Photosynth product team has been particularly satisfying, not just because I get to indulge my passion for visual media, but also because we’ve been able to amplify the ideas of the original research work and develop a robust, powerful, yet easy-to-use tool and viewing experience. And now, by releasing Photosynth as a cloud-based service, we’ve made it possible for just about anybody to use our technology.”
Close, Ongoing Collaboration
Stollnitz has been the interface between the IVM and Photosynth teams, responsible for product contributions: making research code reliable, usable, and solid enough to hand to product developers.
“A number of researchers at IVM have been involved with Photosynth,” Stollnitz says. “Rick with Photo Tourism and the first generation of Photosynth, for example, and Matt Uyttendaele with ICE. They’ve been great to have as technical advisers for this latest generation of Photosynth.
“On the Photosynth product team, David Gedye has been the lead program manager since the beginning, so we’ve had that continuity of vision over a multiyear collaboration, which we feel has been very productive.”
Gedye feels the same.
“Over the years,” he says, “the IVM group has moved beyond contributing just ideas and prototypes. Now, they’re providing shipping-quality code and development support. Just as an example, with the new Photosynth, Sudipta contributed the core computer-vision algorithms, made important technical-design contributions, and came to every product meeting. He’s been one of the driving forces behind this release. We feel very tightly coupled with the research team.”
The process culminating in the new Photosynth, Szeliski observes, has been a delight to behold.
“When you look back,” he says, “and watch how Photosynth capabilities have evolved, it’s terrific how different members of the IVM have contributed a diverse set of research ideas, including Sudipta’s foundational work in 3-D reconstruction, Johannes’ work on image-based rendering and 3-D navigation, and Eric’s work on user interfaces and cloud services. The Spin project, which produced all of these breakthrough features, also represents our closest working relationship yet with the Photosynth product team.”
The fruits of that partnership are now yours to enjoy. Sign up, and you’ll receive confirmation and access within 24 hours.