A camera consists of a lens, a shutter, a light-sensitive surface and, increasingly, a set of highly sophisticated algorithms. Though the physical components continue to evolve, Google, Samsung, and Apple are investing ever more in (and showcasing) enhancements that are entirely code-based. The only real battleground right now is computational photography.
The explanation for this shift is simple: cameras can't get much better than they are now, at least not without some major changes in how they work. Here's how smartphone makers hit a brick wall in photography, and how they were forced to get past it.
The sensors in our smartphone cameras are remarkable devices. The work that companies like Sony, OmniVision, Samsung, and others have done to design and fabricate tiny yet sensitive and versatile chips is impressive. For a photographer who has watched digital imaging evolve from the beginning, the quality these microscopic sensors deliver is nothing short of astonishing.
But those sensors don't follow Moore's Law. Or, to put it another way, just as Moore's Law is running into quantum limits at sub-10-nanometer scales, camera sensors are hitting physical limits even sooner. Think of light hitting the sensor as rain falling on a set of buckets: you can use bigger buckets, but there will be fewer of them; you can use smaller buckets, but each one catches less; you can make them square, stagger them, or try all sorts of other tricks, but there are only so many raindrops, and no amount of bucket-rearranging can change that.
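To put rough numbers on the bucket analogy, here is a minimal sketch. It assumes an idealised sensor limited only by photon shot noise, where a pixel that collects N photons has a signal-to-noise ratio of about the square root of N. The photon budget, the pixel counts, and the function name are illustrative assumptions, not measurements from any real phone.

```python
import math

def per_pixel_snr(total_photons: float, pixel_count: int) -> float:
    """Shot-noise-limited SNR of one pixel, assuming the light is spread evenly.

    Photon arrival is modelled as a Poisson process: a pixel collecting N
    photons on average has noise sqrt(N), so its SNR is N / sqrt(N) = sqrt(N).
    """
    photons_per_pixel = total_photons / pixel_count
    return math.sqrt(photons_per_pixel)

# Hypothetical photon budget for one exposure on a fixed-size sensor.
TOTAL_PHOTONS = 1.2e10

for megapixels in (12, 48):
    pixels = megapixels * 1_000_000
    snr = per_pixel_snr(TOTAL_PHOTONS, pixels)
    print(f"{megapixels} MP: {TOTAL_PHOTONS / pixels:.0f} photons per pixel, SNR about {snr:.1f}")
```

Under these assumptions, quadrupling the number of buckets halves the signal-to-noise ratio of each one, while the total amount of rain collected stays exactly the same.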
Yes, sensors are improving, but the pace is too slow to keep customers buying new phones year after year (imagine trying to sell a camera that's 3% better). Phone manufacturers also mostly use the same or similar camera stacks, so innovations (such as the recent move to backside illumination) are shared among them. As a result, no one is getting ahead on sensors alone.
Could they improve the lens instead? Not really. Lenses have reached a level of sophistication and perfection that is hard to improve on, particularly at small scale. To say that space inside a smartphone's camera stack is limited is an understatement: there's barely a square micron to spare. You may be able to improve them slightly in how much light passes through and how little distortion there is, but these are old problems that have already been refined about as far as they can go.
The only way to collect more light is to increase the lens's size, either by having it protrude outwards from the body, by displacing vital components inside the body, or by making the phone thicker. Which of those options does Apple seem most likely to accept?
In retrospect, Apple (and Samsung, and Huawei, and others) had no choice but to select D: none of the above. If you can't get more light, you have to do more with the light you have.
Isn’t all photography computational?
Computational photography, in its broadest sense, refers to any form of digital imaging. Even the simplest digital camera, unlike film, involves computation to convert the light hitting the sensor into a usable image. And camera manufacturers use a variety of methods to accomplish this, including various JPEG processing methods, RAW formats, and colour science.
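As a toy illustration of that baseline computation, the sketch below turns raw sensor readings into colour pixels. It assumes a Bayer colour filter laid out in 2x2 RGGB blocks (a common arrangement, but an assumption here) and simply collapses each block into one RGB pixel. The function name and sample values are hypothetical, and real cameras go much further: they interpolate to full resolution and then apply white balance, tone curves, and noise reduction.

```python
def demosaic_rggb_blocks(raw):
    """Collapse each 2x2 RGGB block of a Bayer mosaic into one RGB pixel.

    raw: list of rows of sensor readings, with even width and height.
    Returns a half-resolution image as rows of (r, g, b) tuples.
    """
    image = []
    for y in range(0, len(raw), 2):
        row = []
        for x in range(0, len(raw[y]), 2):
            r = raw[y][x]                             # top-left site: red
            g = (raw[y][x + 1] + raw[y + 1][x]) / 2   # two green sites, averaged
            b = raw[y + 1][x + 1]                     # bottom-right site: blue
            row.append((r, g, b))
        image.append(row)
    return image

# Hypothetical raw readings: two rows, four columns, so two RGGB blocks.
raw = [
    [200, 120, 180, 100],
    [110,  40,  90,  30],
]
print(demosaic_rggb_blocks(raw))
# prints [[(200, 115.0, 40), (180, 95.0, 30)]]
```

Even this crude version shows that no pixel in a digital photo is a direct reading: every colour value is computed from several neighbouring sensor sites.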
For a long time there wasn't much competition on top of this basic layer, partly for lack of computing power. Sure, there were filters and simple in-camera tweaks to enhance contrast and colour. But in the end, these amount to automated dial-twiddling. Object recognition and tracking for autofocus were arguably the first true computational photography features. Face and eye tracking made it easier to photograph people in difficult lighting or poses, while object tracking made sports and action photography easier by moving the system's AF point to follow a target as it travelled across the frame.
These were some of the first examples of extracting metadata from an image and putting it to use, either to improve that image or to feed forward into the next one.
Autofocus precision and flexibility are marquee features in DSLRs, so this early use case made sense; but apart from a few gimmicks, these “serious” cameras used computation in a fairly conventional way. Faster image sensors meant faster sensor offloading and burst rates, plus a few extra processing cycles devoted to preserving colour and detail. DSLRs weren't being used for live video or virtual reality. Similarly, until fairly recently, mobile cameras were more like point-and-shoot cameras than the all-purpose communication tools we know them as today.