Deep Learning To Soon Bring Pro-Level Photography To Smartphones

Google researchers have been working on a project in which deep learning is used to create “professional-level photographs.” The project shows that in the near future, all photos could look like they were taken by professionals, and pictures taken with smartphones could benefit the most from this technology over the coming years.

Deep Learning Photography

Google’s researchers have created a deep learning system that mimics the workflow of a professional photographer in order to achieve similar results with its automatically edited photos. The system browsed panoramas from Google Street View and searched each one for the best composition. Afterward, it applied various post-processing techniques to create an “aesthetically pleasing image.”

Google said that using supervised learning and manually selecting labels for various features of aesthetically pleasing images would have been an intractable task. Therefore, the company used a collection of professional-quality photos to automatically break down the aesthetics into multiple aspects.

The researchers used traditional image filters to generate negative training examples for saturation, HDR detail, and composition. They randomly applied image filters and brightness changes to professional photos to degrade their appearance, then trained the system on the degraded and original versions so that it could learn what separates a “good” image from a degraded one. Google describes its approach as relying on a “generative adversarial network” (GAN), a setup in which one model generates images and a second model tries to tell them apart from real professional photos.
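As a rough illustration of the degradation step described above, the following Python sketch builds (degraded, original) training pairs by randomly changing saturation and brightness. It is not Google's code: the function names, the filter formulas, and the strength ranges are all assumptions chosen for the example, and images are assumed to be NumPy arrays of shape (height, width, 3) with values between 0 and 1.

```python
import numpy as np

def adjust_saturation(img, factor):
    """Scale colour saturation; a factor of 1.0 leaves the image unchanged."""
    gray = img.mean(axis=2, keepdims=True)  # simple per-pixel grey level
    return np.clip(gray + factor * (img - gray), 0.0, 1.0)

def adjust_brightness(img, factor):
    """Scale brightness; a factor of 1.0 leaves the image unchanged."""
    return np.clip(img * factor, 0.0, 1.0)

def make_negative(img, rng):
    """Degrade a 'good' photo with randomly chosen filter strengths."""
    # Hypothetical ranges: each factor is pushed clearly below or above 1.0
    # so that the degraded copy is visibly worse than the original.
    sat = rng.choice([rng.uniform(0.0, 0.5), rng.uniform(1.5, 2.5)])
    bright = rng.choice([rng.uniform(0.3, 0.7), rng.uniform(1.3, 1.8)])
    return adjust_brightness(adjust_saturation(img, sat), bright)

def build_pairs(photos, seed=0):
    """Return a list of (degraded, original) pairs for training."""
    rng = np.random.default_rng(seed)
    return [(make_negative(photo, rng), photo) for photo in photos]
```

A model trained on such pairs sees the same scene in a “bad” and a “good” version, which is what lets a single aspect such as saturation be learned in isolation without hand-written labels.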

Turing Test For Aesthetics

The researchers couldn’t develop an “objective” arbiter to evaluate the photographs’ beauty because, as they say, beauty is in the eye of the beholder. As far as we know, Google’s AI also doesn’t yet have feelings and emotions to be able to see “beauty” with its own “eyes.”

This is why the company used the next best thing: a panel of professional photographers who were shown a mix of photos, some edited by humans and some by the deep learning system, without being told which were which. The judges were instructed to assign one of these four levels to each photograph they saw:

  1. Point-and-shoot without consideration for composition, lighting, etc.
  2. Good photos from general population without a background in photography. Nothing artistic stands out.
  3. Semi-pro. Great photos showing clear artistic aspects. The photographer is on the right track of becoming a professional.
  4. Pro.

Google said that 40% of the photos that were automatically edited by its deep learning system received either a “semi-pro” or a “pro” rating.
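To make the arithmetic behind that figure concrete, here is a minimal Python sketch that computes the share of photos rated “semi-pro” (3) or “pro” (4) on the four-level scale above. The function name is hypothetical and the scores are made up for illustration; they are not data from Google's study.

```python
def share_semi_pro_or_better(scores, threshold=3):
    """Fraction of scores at or above the 'semi-pro' level (3 on the 1-4 scale)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s >= threshold) / len(scores)

# Made-up ratings for ten photos, NOT results from Google's study.
example_scores = [1, 2, 3, 4, 2, 3, 2, 1, 4, 2]
share = share_semi_pro_or_better(example_scores)
print(f"{share:.0%} rated semi-pro or pro")
```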

From Street View To Smartphones?

The researchers working on this project said that for now, they used only Street View panoramas to test their project, but that the technology could one day be applied to take better photos in the real world.

Most people take photos with their smartphones these days, and Google already has a horse in the race with Android making up the vast majority of those handsets. Although we’ve seen impressive improvements to smartphone photo quality in the past few years, most people would argue that we’re still not close to taking photos that look as good as those taken by thousand-dollar-plus professional DSLRs (and commensurately talented photographers).

Based on physical qualities alone, it should be impossible for smartphone cameras to catch up to DSLRs, maybe ever. First, it’s unlikely that smartphone lenses will match the quality of DSLR lenses anytime soon. Second, DSLRs use significantly larger sensors to capture much more light, and sensors of that size couldn’t fit in a smartphone without making it look like a point-and-shoot camera.

Deep Learning Accelerators For Mobile

Deep learning photography systems, such as the ones being developed by Google right now, could optimize photos to the extreme and turn even mediocre-looking pictures into professional-quality photography.

To achieve those kinds of complex edits, smartphones are going to need specialized chips that can do all of those operations on a tight power budget. We’ve already seen interest from Qualcomm, Samsung, and MediaTek in designing such machine learning accelerators for mobile, but the first generation of such chips may not be completely optimized or adequate for such tasks.

We may first need software that is capable of automatic professional editing before chips that are optimized for that software can arrive. Similarly, Nvidia’s GPUs have delivered higher deep learning performance with each new architecture over the past few years, but only after machine learning researchers had settled on a handful of frameworks. Mobile deep learning chips will need to follow a similar path over the next few years to produce the best computational photography results.