Picture Perfect
When taking a picture, a photographer must typically commit to a composition that cannot be changed after the shutter is released. For example, when using a wide-angle lens to capture a subject in front of an appealing background, it is difficult to include the entire background and still have the subject be large enough in the frame.
Moving the subject closer to the camera makes it larger in the frame, but introduces unwanted perspective distortion. Shooting with a telephoto lens reduces this distortion, since the photographer can step back while keeping the foreground subject at a reasonable size, but the narrower field of view then excludes most of the background. In either case, the photographer has to settle for a suboptimal composition that cannot be modified later.
As described in a technical paper to be presented July 31 at the ACM SIGGRAPH 2017 conference, UC Santa Barbara Ph.D. student Abhishek Badki and his advisor Pradeep Sen, a professor in the Department of Electrical and Computer Engineering, along with NVIDIA researchers Orazio Gallo and Jan Kautz, have developed a new system that addresses this problem. Specifically, it allows photographers to compose an image post-capture by controlling the relative positions and sizes of objects in the image.
Computational Zoom, as the system is called, gives photographers the flexibility to generate novel image compositions — even some that cannot be captured by a physical camera — by controlling the sense of depth in the scene, the relative sizes of objects at different depths, and the perspectives from which those objects are viewed.
For example, the system makes it possible to automatically combine wide-angle and telephoto perspectives into a single multi-perspective image, so that the subject is properly sized and the full background remains visible. In a standard image, light rays travel in straight lines into the camera, within a field of view determined by the focal length of the lens. This new functionality, however, lets photographers produce physically impossible images in which the light rays effectively “bend,” transitioning from a telephoto to a wide-angle perspective as they pass through the scene.
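To make the relationship between focal length and field of view concrete, here is a minimal sketch (not from the paper) using the standard pinhole-lens relation; the full-frame sensor width of 36 mm is an assumption:

```python
import math

def horizontal_fov_degrees(focal_length_mm, sensor_width_mm=36.0):
    """Horizontal field of view of a rectilinear lens.

    Standard pinhole relation: fov = 2 * atan(sensor_width / (2 * f)).
    The 36 mm default assumes a full-frame sensor.
    """
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))

print(horizontal_fov_degrees(24.0))   # wide-angle lens: ~73.7 degrees
print(horizontal_fov_degrees(200.0))  # telephoto lens:  ~10.3 degrees
```

A shorter focal length admits rays over a wider angle, which is why the wide-angle shot captures the full background while the telephoto shot does not.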
Achieving a custom composition is a three-step process. First, the photographer captures a “stack” of multiple images, moving the camera gradually closer to the scene between shots without changing the focal length of the lens. Second, the system uses the captured image stack and a standard structure-from-motion algorithm to automatically estimate the camera position and orientation for each image, and then a novel multi-view 3D reconstruction method to estimate a depth map for each image. Finally, a user interface lets the photographer use all of this information to synthesize multi-perspective images with novel compositions.
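As a rough illustration of the synthesis step, the toy sketch below composites a telephoto-style and a wide-angle-style rendering per pixel using a depth map. This is a deliberate simplification, not the authors' code: the real system reconstructs the camera poses and depth maps itself and renders views along a continuous camera path, whereas here the depth map is given and the blend is binary.

```python
import numpy as np

def toy_multi_perspective(wide_img, tele_img, depth, split):
    """Toy depth-based composite: pixels nearer than `split` come from the
    telephoto-style rendering, the rest from the wide-angle rendering.
    A stand-in for the paper's far more sophisticated synthesis step."""
    near = depth[..., None] < split               # per-pixel boolean mask
    return np.where(near, tele_img, wide_img)

# Tiny synthetic example: 4x4 three-channel "images" and a depth map.
wide = np.zeros((4, 4, 3), dtype=np.uint8)          # wide-angle rendering (black)
tele = np.full((4, 4, 3), 255, dtype=np.uint8)      # telephoto rendering (white)
depth = np.tile(np.linspace(1.0, 10.0, 4), (4, 1))  # depth grows left to right
composite = toy_multi_perspective(wide, tele, depth, split=5.0)
print(composite[..., 0])  # near (left) columns come from tele, far from wide
```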
“This new framework really empowers photographers by giving them much more flexibility later on to compose their desired shot,” said Sen. “It allows them to tell the story they want to tell.”
“Computational Zoom is a powerful technique to create compelling images,” said Gallo, NVIDIA senior research scientist. “Photographers can manipulate a composition in real time, developing plausible images that cannot be captured with a physical camera.”
Eventually, the researchers hope to integrate the system as a plug-in to existing image-processing software, allowing a new kind of post-capture compositional freedom for professional and amateur photographers alike.
Find out more about the project, or see results of the post-capture method on YouTube.