
 

 

Equivalence

 

 

INTRODUCTION              

EQUIVALENCE ON THE QUICK

Q&A

DEFINITIONS OF TERMS AND ABBREVIATIONS

DEFINITION OF EQUIVALENCE

THE PURPOSE OF EQUIVALENCE

COMPARING SYSTEMS

 

THE SIX POSTULATES OF EQUIVALENCE

          perspective
          framing (FOV/AOV)
               EFL for a lens as a function of perspective
               thin lens approximation formula
               sensor sizes
          DOF / aperture
               background blur and sharpness vs DOF
               diffraction
               examples of format equivalents
          exposure time
          lightness
          display dimensions

 

IMAGE QUALITY

     attributes of a camera
     image quality vs a quality image
     attributes of IQ
     subjective vs objective
     how equipment affects IQ
     post-processing
     PPI & DPI
     role of sensor size in IQ

 

MYTHS AND COMMON MISUNDERSTANDINGS

     f/2 = f/2 = f/2
     larger sensor systems are bulky and heavy
     larger sensor systems have a DOF that is "too shallow"; smaller sensor systems have "more DOF"
     larger sensors require sharper lenses
     larger sensor systems have softer edges and more vignetting than smaller sensor systems
     assuming "equivalent" means "equal"
     assuming "equivalence" is based on equal noise
     larger sensors have less noise because they have larger pixels / higher ISOs result in more noise
     comparing images at their native sizes rather than the same output size
     larger sensor systems gather more light and have less noise than smaller sensor systems


     
EXPOSURE, LIGHTNESS, & TOTAL LIGHT

     the role of f-ratio in exposure and total light
     T-stop vs F-stop / fast lenses and lost light
     metering

 

NOISE

     photon noise
     read noise
     the role of ISO
     efficiency
     the three primary factors in noise
     pixel size vs noise
     quality vs quantity of noise
     detail vs noise

 

DYNAMIC RANGE

LENS VS SENSOR

MEGAPIXELS:  QUALITY VS QUANTITY

EQUIVALENT LENSES

IQ VS OPERATION

EXAMPLES

RELATED ARTICLES

CONCLUSION

ACKNOWLEDGEMENTS

 

 

 

 

 

INTRODUCTION

 

It is my hope and purpose to remodel the whole of dynamics, in the most extensive sense of the word, by the idea of my characteristic function.

--  William Rowan Hamilton, 1834, in a letter to his uncle

 

We came up with this notion of equality.  It should have been equivalence all along.

--  Jonathan Campbell

 

Probably the most famous statement of equivalence is Einstein's equation, E = mc², which reveals the equivalence between mass and energy.  This essay is about relating different systems on the basis of six parameters (perspective, framing, DOF / diffraction / total amount of light on the sensor, exposure time, lightness, and display size) and defining "equivalent photos" as photos that share these six visual properties (exposure time becomes a visual property in terms of the resulting motion blur).

But why these six properties and not others, such as noise, detail, dynamic range, color, bokeh, distortion, etc.?  The reason is that these six properties are independent of the technology.  However, with proper assumptions about the technology, these other attributes can be added into the equation, and this essay goes into great depth on those points (with special attention being paid to noise, detail, and dynamic range).

This essay is not a guide on how to take "good" photos.  Instead, it is a tutorial on the physics and technology of digital cameras and lenses with an emphasis on how to relate systems of different formats (sensor sizes) on the basis of the visual properties of the photo.  If this sounds like something you're interested in, please read on.  If not, thanks for coming this far!  Either way, please allow me to suggest spending a few moments to watch a hilarious video rant on the Nikon D3x, a quick lesson in photography, and a parody on tech-talk that many would say sums up this essay quite nicely.  :  )

 

 

 

 

 

  

QUICK SUMMARIES

 

Equivalence in 10 Bullets:

 

 

Before the 10 bullets that spell out what it means for two photos from two systems to be Equivalent, first a few bullets on what Equivalence is all about:

  • Equivalence is only relevant when comparing across formats, using a TC/FR, or cropping.  It is not relevant for using the camera in hand, although the individual principles of Equivalence are.
  • Equivalence is *not* about showing one format to be "better than" another, especially as there are so many attributes to a system that Equivalence does not cover (e.g. AF speed/accuracy, frame rate, video, build, ergonomics, etc., etc., etc.).
  • Equivalence is *not* a mandate that one should take photos with one format that are Equivalent to photos in another format (Equivalence says that one should endeavor to take the best photo one can with the equipment in hand).
  • If there is a motive to Equivalence, it is to replace relative aperture (f-number) and exposure (density of light projected on the sensor) with aperture diameter (entrance pupil) and total light (total amount of light projected on the sensor) as the relevant measures for cross-format comparisons, and thus show that statements like "F2 = F2 = F2" are no more or less meaningful than "50mm = 50mm = 50mm".

 

That out of the way, Equivalence relates the visual properties of photos from different formats based on the focal length and aperture of the lens.  Neither the focal length nor the relative aperture of a lens changes as a function of sensor (for example, a 50mm f/1.4 lens is a 50mm f/1.4 lens, regardless of the sensor behind the lens).  However, the effect of both the focal length and the relative aperture on the visual properties of the photo very much depends on the sensor, and scales in direct proportion to the size of the sensor:

 

For a given scene and exposure time, 25mm f/1.4 on mFT is equivalent to 31mm f/1.8 on 1.6x (Canon APS-C), 33mm f/1.9 on 1.5x (all others' APS-C), and 50mm f/2.8 on FF (FX), where "equivalent to" means:

 

  • The photos all have the same diagonal angle of view (25mm x 2 = 31mm x 1.6 = 33mm x 1.5 = 50mm) and aperture diameter (25mm / 1.4 = 31mm / 1.8 = 33mm / 1.9 = 50mm / 2.8 ≈ 18mm).
  • The photos all have the same perspective when taken from the same position.
  • The photos all have the same DOF (as well as diffraction softening) when they are taken from the same position with the same focal point and have the same display size.
  • The photos all have the same motion blur for the same exposure time (regardless of pixel count).
  • The same total amount of light is projected on the sensor for the same scene, DOF, exposure time, vignetting, and lens transmission (e.g. if the 25mm lens is t/1.6 at f/1.4, the 31mm lens is t/2 at f/1.8, the 33mm lens is t/2.1 at f/1.9, and the 50mm lens is t/3.2 at f/2.8).
  • The same total light projected on the larger sensor will result in a lower exposure than the smaller sensor (the same total light over a larger area results in a lower density of light on the sensor).
  • The larger sensor system will use a concomitantly higher ISO setting for a given lightness on the LCD playback and/or for the OOC (out-of-the-camera) jpg due to the lower exposure.  For example, ISO 200 on mFT is equivalent to ISO 320 on 1.6x which is equivalent to ISO 800 on FF.
  • However, the same total light will result in the same noise if the sensors record the same proportion of light falling on them (same QE) and add in the same electronic noise (the noise from the sensor and supporting hardware), regardless of pixel count and ISO setting.  It should be noted that sensors of the same, or nearly the same, generation typically record very nearly the same proportion of light falling on them regardless of brand, size, or pixel count (a notable exception would be BSI tech which records a third to half a stop more light for a given exposure than non-BSI tech) and that the electronic noise matters only for the portions of the photo made with very little light.  It should also be understood that, for a given exposure, the ISO setting affects noise only inasmuch as higher ISO settings result in less electronic noise than lower ISO settings -- e.g. a photo "properly exposed" at f/2.8 1/100 ISO 1600 will be less noisy than a photo of the same scene at f/2.8 1/100 ISO 200 pushed to the same lightness.
  • In addition, if the 25mm lens at f/1.4 in the example above is twice as sharp (lp/mm), the 31mm lens is 1.6x as sharp at f/1.8, and the 33mm lens is 1.5x as sharp at f/1.9 as the 50mm lens at f/2.8 (or any equivalent relative apertures), the sensors have the same number of pixels, and the AA filter introduces the same blur, then all systems will also resolve the same in the photo (lw/ph).
  • Elements of IQ, such as bokeh, color, flare handling, distortion, etc., as well as elements of operation, such as AF speed/accuracy, stabilization, size, weight, etc., are not covered in this use of the term "equivalent".  For example, the Canon 50 / 1.4 on the Canon 5D (13 MP FF) is equivalent to the Sigma 50 / 1.4A on the Nikon Z7 (46 MP FF) despite the fact that the latter system will have significantly higher resolution, lower noise, smoother bokeh, etc.
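
The scaling in the bullets above can be sketched in a few lines of Python. This is only an illustration under the essay's own simplifications: crop factors are taken relative to FF, focal length and f-number scale with the ratio of crop factors, and the ISO setting scales with its square. The function names and the `CROP_FACTORS` table are made up for this sketch and do not come from any camera library.

```python
# Sketch only: crop factors are relative to FF; all names here are illustrative.
CROP_FACTORS = {"mFT": 2.0, "1.6x": 1.6, "1.5x": 1.5, "FF": 1.0}


def equivalent_settings(focal_mm, f_number, iso, crop_from, crop_to):
    """Scale focal length, f-number, and ISO from one format to another."""
    k = crop_from / crop_to  # linear size of the target sensor relative to the source
    return focal_mm * k, f_number * k, iso * k ** 2


def aperture_diameter_mm(focal_mm, f_number):
    """Entrance pupil diameter = focal length / f-number."""
    return focal_mm / f_number


if __name__ == "__main__":
    # Start from 25mm f/1.4 ISO 200 on mFT and convert to each format.
    for name, crop in CROP_FACTORS.items():
        fl, n, iso = equivalent_settings(25, 1.4, 200, 2.0, crop)
        print(f"{name}: {fl:.2f}mm f/{n:.2f} ISO {iso:.0f}, "
              f"aperture {aperture_diameter_mm(fl, n):.2f}mm")
```

The aperture diameter comes out the same for every format because the focal length and f-number are scaled by the same factor. The figures quoted in the bullets are these results rounded to common lens and camera markings.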

 

 

 

Equivalence as a Teleconverter / Focal Reducer:

 

 

Given...
  • ...a 25 / 1.4 lens on an mFT camera,
  • ...the same lens and a perfect (aberration free) 1.25x TC (teleconverter) on a 1.6x camera,
  • ...the same lens and a perfect 1.33x TC on a 1.5x camera,
  • ...the same lens and a perfect 2x TC on a FF camera...
...and a photo of the same scene taken from the same position with the same focal point at...
  • ...f/1.4 1/200 ISO 100 on the mFT camera,
  • ...f/1.8, 1/200, ISO 160 on the 1.6x camera,
  • ...f/1.9, 1/200, ISO 180 on the 1.5x camera,
  • ...and f/2.8, 1/200, ISO 400 on the FF camera...
...then the resulting photos will not merely be Equivalent, but be identical (aside from differences in the aspect ratio), if the cameras had the same pixel count, recorded the same proportion of light falling on them, added in the same amount of electronic noise, and had the same sensor stack (color filter array, low-pass filter, IR filter, etc.).



 

Alternatively, given...

  • ...a 50 / 1.4 lens on a FF camera,
  • ...the same lens and a perfect (aberration free) 0.67x FR (focal reducer) on a 1.5x camera,
  • ...the same lens and a perfect 0.625x FR on a 1.6x camera,
  • ...the same lens and a perfect 0.5x FR on an mFT camera...
...and a photo of the same scene taken from the same position with the same focal point at...
  • ...f/1.4 1/200 ISO 400 on the FF camera,
  • ...f/0.9, 1/200, ISO 180 on the 1.5x camera,
  • ...f/0.9, 1/200, ISO 160 on the 1.6x camera,
  • ...and f/0.7, 1/200, ISO 100 on the mFT camera...
...then the resulting photos will not merely be Equivalent, but be identical (aside from differences in the aspect ratio), if the cameras had the same pixel count, recorded the same proportion of light falling on them, added in the same amount of electronic noise, and had the same sensor stack (color filter array, low-pass filter, IR filter, etc.).
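
As a rough sketch of the two lists above, an ideal teleconverter or focal reducer can be modeled as a single magnification factor. The assumptions are that the converter is aberration free and lossless, that it leaves the entrance pupil unchanged, and that the ISO setting needed for the same lightness scales with the square of the factor. The helper names are hypothetical.

```python
def with_converter(focal_mm, f_number, factor):
    """Focal length and f-number of a lens behind an ideal converter.

    factor > 1 models a teleconverter, factor < 1 a focal reducer.
    Both values scale by the same factor, so focal / f-number
    (the entrance pupil diameter) does not change.
    """
    return focal_mm * factor, f_number * factor


def iso_for_same_lightness(base_iso, factor):
    """ISO setting that keeps the lightness once the converter is fitted."""
    return base_iso * factor ** 2


if __name__ == "__main__":
    # 25mm f/1.4 at ISO 100 on mFT, then a 2x TC for a FF camera.
    print(with_converter(25, 1.4, 2.0), iso_for_same_lightness(100, 2.0))
    # 50mm f/1.4 at ISO 400 on FF, then a 0.5x FR for an mFT camera.
    print(with_converter(50, 1.4, 0.5), iso_for_same_lightness(400, 0.5))
```

The intermediate cases (1.25x, 1.33x, 0.67x, 0.625x) follow the same arithmetic, with the results rounded to the nearest marked setting.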

 

 

 

Equivalence as Cropping:

 

 

Given four cameras, one with...
  • ...an mFT (2x) sensor,
  • ...another with a Canon APS-C (1.6x) sensor,
  • ...another with all others' APS-C (1.5x) sensor,
  • ...and another with a FF sensor...
...and...
  • ...a photo of a scene from the same position with the same focal point and the same settings (e.g. 25mm f/1.4 1/200 ISO 400) with all cameras,
  • ...the photos cropped to the same framing as the photo from the mFT camera,
  • ...and the photos are displayed at the same size...
...then the resulting photos will be Equivalent. In addition, if...
  • ...the sensors record the same proportion of light falling on them and add in the same amount of electronic noise, then all the photos will also have the same noise,
  • ...the pixels are all the same size, the AA filter the same strength, and the lens is the same sharpness, then all the photos will also have the same detail,
  • ...the exact same lens is used and the sensors are of the exact same design with the exact same size pixels, AA filter, CFA, and processing...
...then the photos will not merely be Equivalent, but be identical (aside from differences in the aspect ratio).
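
The cropping case can be checked with a little area arithmetic. The sketch below assumes all sensors share one aspect ratio, so that area scales as 1 / crop factor², and that identical settings give identical exposure (light per unit area) on every sensor. The function names are illustrative only.

```python
def relative_sensor_area(crop_factor):
    """Sensor area relative to FF, assuming the same aspect ratio."""
    return 1.0 / crop_factor ** 2


def kept_fraction(sensor_crop, target_crop):
    """Fraction of the frame kept when cropping to the framing of target_crop."""
    if target_crop < sensor_crop:
        raise ValueError("cannot crop a smaller sensor to a larger format's framing")
    return (sensor_crop / target_crop) ** 2


def light_in_crop(exposure, sensor_crop, target_crop):
    """Total light inside the cropped region: exposure x cropped area."""
    area = relative_sensor_area(sensor_crop) * kept_fraction(sensor_crop, target_crop)
    return exposure * area


if __name__ == "__main__":
    # Same lens and settings on every camera, each cropped to mFT (2x) framing.
    for name, crop in [("mFT", 2.0), ("1.6x", 1.6), ("1.5x", 1.5), ("FF", 1.0)]:
        print(name, kept_fraction(crop, 2.0), light_in_crop(1.0, crop, 2.0))
```

Each crop ends up covering the same area as the mFT sensor, so the same total light makes up every cropped photo; that is why equal noise follows once the sensors are equally efficient.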

 

 

 

On the Quick:

 

 

 

  • Equivalent photos, as opposed to "equal" photos, are photos that have the same perspective, framing, DOF, shutter speed, lightness, and display dimensions.  Equivalent lenses, then, are lenses on different formats that produce Equivalent images (same diagonal angle of view and same aperture diameter).

  • Equivalence is only relevant when comparing different formats, comparing a crop with a wider focal length to the whole photo with a longer focal length, and/or comparing a lens with a TC (teleconverter) or FR (focal reducer) to the bare lens. Equivalence is also relevant when comparing a single exposure to multiple exposures combined by stitching and/or merging.  However, if we are comparing the performance of a single exposure from a 50mm lens designed for FF to a 50mm lens designed for APS-C or mFT (4/3), both lenses being used on the same camera, Equivalence does not come into play.

  • Neither the focal length nor the f-ratio of a lens changes as a function of format:  50mm = 50mm and f/2 = f/2 regardless of the format the lens is used on.

  • The effect of the focal length and f-ratio of a lens, however, does change as a function of format.

  • The DOF is the same for all systems for a given perspective, framing, aperture diameter, and display size.  For the same aperture diameter and shutter speed, the same total amount of light will fall on the sensor for all systems, resulting in a lower exposure for larger sensor systems (the same total amount of light distributed over a larger area results in a lower exposure, since exposure is the density of the light falling on the sensor).

  • The same total amount of light falling on the sensor will result in the same noise for equally efficient sensors, regardless of pixel size or the ISO setting.  Typically, sensors of the same generation are rather close in efficiency, but there are most certainly exceptions.

  • The reason that smaller sensors are more noisy than larger sensors is not because they are less efficient, but because less light falls on them for a given exposure.  If the larger sensor is more efficient than the smaller sensor, then the noise gap will widen; if the smaller sensor is more efficient, the noise gap will shrink.

  • Larger formats do not necessarily have a more shallow DOF than smaller formats.  Larger formats have the option of a more shallow DOF than smaller formats for a given perspective and framing when using a lens that has a larger aperture diameter, as the lenses for larger formats usually, but not always, have larger aperture diameters for a given AOV.  However, people using Auto, P, or Tv modes on the camera will likely find that the larger format camera will choose a wider aperture in many situations, resulting in a more shallow DOF.  In addition, many choose to use a wider aperture (resulting in a more shallow DOF) to get more light on the sensor and thus less noise.

  • Equivalence says nothing about shallow DOF being superior to deep DOF, as this is entirely subjective.

  • The resolved detail is a function of the lens, the AA filter, the sensor, and the processing (RAW vs default jpg, for example).  A sharper lens (greater lp/mm) on a smaller sensor will not necessarily resolve more than a less sharp lens on a larger sensor. Instead, we need to compare the resolutions in lw/ph, as DPR does with their MTF-50 tests (discussed in more detail here).  Furthermore, the resolved detail is also a function of the number of pixels on the sensor (discussed in more detail here), and all systems suffer the same diffraction softening at the same DOF, although the system that began with more detail will retain more detail (with the advantage asymptotically vanishing as the DOF deepens -- discussed in more detail here).

  • Equivalence makes no claims whatsoever about which system is superior to another system, especially given that there are so many aspects about systems that Equivalence does not address.  For example, in terms of IQ, Equivalence says nothing about bokeh, moiré, distortion, color, etc., and in terms of operation, Equivalence says nothing about AF, build, features, etc.  In fact, Equivalence can even work against larger sensor systems by denying them their "noise advantage" when they need to match both the DOF and shutter speed of smaller sensor systems.

  • However, Equivalence does make the argument that it makes no sense to artificially handicap one system or the other by requiring identical settings for a comparison, when identical settings result in different effects on different systems.
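
The noise bullets above can be illustrated with a simple photon (shot) noise model. This is a sketch under stated assumptions: the light follows Poisson statistics, so the noise is the square root of the recorded signal, and electronic (read) noise is ignored. The function name, the exposure values, and the relative areas (1 and 4) are made up for the illustration.

```python
import math


def photon_nsr(exposure, sensor_area, qe):
    """Noise-to-signal ratio from photon (shot) noise alone.

    exposure    -- photons per unit area falling on the sensor
    sensor_area -- sensor area in the same area units
    qe          -- fraction of the photons that the sensor records
    """
    electrons = exposure * sensor_area * qe  # total light collected
    return 1.0 / math.sqrt(electrons)        # Poisson: noise = sqrt(signal)


if __name__ == "__main__":
    small = photon_nsr(1000, 1.0, 0.5)
    large_same_exposure = photon_nsr(1000, 4.0, 0.5)  # 4x the total light
    large_same_light = photon_nsr(250, 4.0, 0.5)      # same total light
    print(small, large_same_exposure, large_same_light)
```

With equal efficiency, the sensor with 4x the area has half the noise-to-signal ratio at the same exposure and the same ratio at the same total light. Changing `qe` for one of the sensors widens or narrows the gap.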

 

 

 

 

 

 

 

Q&A

 

 

Q:  Are bigger formats better than smaller formats?

A:  For some specific purposes, yes; for others, no.  The more specific the purpose of the photography, the easier it becomes to say that System A is "better than" System B for a particular photographer; the more broad the photography, the less easy it is for one system to be superior to the other.

 

Q:  When are larger formats better than smaller formats?

A:  To answer this question, we need to invoke the "all else equal" clause, because there are so many variables that may make one system better than another for any particular photographer.  In short, the advantage of a larger sensor system over a smaller sensor system is that the larger sensor system will generally have lenses that have wider aperture (entrance pupil) diameters for a given AOV (diagonal angle of view) than smaller sensor systems, which allows for more shallow DOFs (as an option, not a requirement) and will put more light on the sensor for a given exposure, resulting in less noise.  In addition, larger sensors typically have more pixels which, when combined with a lesser enlargement factor from the recorded photo to the displayed photo, results in more detailed photos (at least for a given DOF).  Whether or not these advantages are more important than the disadvantages (size, weight, cost, etc.) is another matter altogether.

 

Q:  Isn't Equivalence a vehicle for promoting the "superiority" of larger sensor systems, specifically FF?

A:  Not by a long shot.  Many believe that Equivalence is based on the presumed "superiority of FF" because equivalents are typically given in terms of FF.  However, that has nothing to do with Equivalence, per se.  The notion of FF as being the standard comes from the popularity of the 35mm film format immediately before the advent of consumer digital cameras, especially because FF lenses were adopted for use on the first DSLRs.  However, there is no need to use FF as the reference format -- any format can be the reference format.  If there is an agenda to Equivalence, it is to replace the photographic paradigm based on the relative aperture (f-ratio) and exposure with a new paradigm based on the effective aperture (entrance pupil) and total amount of light falling on the sensor, at least for cross-format comparisons.

 

Q:  So Equivalence is about the lens as opposed to the sensor?

A:  That's a good way to put it -- it's the effective aperture (entrance pupil) for a given AOV that is of central importance.  However, sensor size still plays a role, as larger sensors typically have more pixels and typically can absorb more light for a given exposure.

 

Q:  Isn't Equivalence all about DOF?

A:  No, Equivalence is not "all about DOF" -- it's also about the amount of light that makes up the photo.  Understanding that both DOF and the amount of light making up the photo are intimately connected to the aperture is central to Equivalence.  That said, DOF, by itself, is still a critical consideration to the captured detail in the photo, since portions of the scene outside the DOF, by definition, will not be sharp, and all systems suffer diffraction softening equally at the same DOF.  Likewise, the noise in the photo is primarily due to the amount of light making up the photo.

 

Q:  Doesn't Equivalence say that we should shoot different formats at the same DOF?

A:  Not at all, and, in fact, quite the opposite.  That is, one does not choose one format over another to get photos Equivalent to what one would get on another format.  Rather, one chooses one format over another to get photos they could not get on another format, or get better photos than they could get on another format, assuming, of course, that differences in operation, size, weight, and cost are not significant enough to be the primary consideration.

 

Q:  Overall, then, isn't FF the best choice?

A:  Again, which is best is completely subjective.  While I personally prefer FF, it is my opinion that the vast majority are better served with smaller formats.  As all systems continue to improve, the number of situations where FF has a significant advantage over smaller formats narrows.  Of course, if size, weight, and price were not considerations, then larger is almost always better.  However, since size, weight, and price not only matter, but are often (usually) the primary considerations, then it is my opinion that the advantages of FF over smaller formats are not enough to offset the disadvantages for most people in most situations.

 

 

 

 

 

DEFINITIONS OF TERMS AND ABBREVIATIONS

 

Many of the misunderstandings come from people using different definitions for the same words. In particular, "f-ratio" is often confused with "aperture", and "exposure" is confused with "lightness" and "total light". The importance of these distinctions is often overlooked or simply not understood, so a quick browse through this section would be helpful in understanding the rest of the essay.

  • IQ:  Image Quality.

  • QT:  Quality Threshold.

  • PP:  Post Processing.

  • PPI:  Pixels per inch (not to be confused with DPI -- dots per inch -- which is a function of the printer).

  • lp/ph:  line pairs per picture height on the photo.

  • lw/ph:  line widths per picture height on the photo (lw/ph = 2 · lp/ph, lp/ph = ½ · lw/ph).

  • lp/mm:  line pairs per mm on the sensor (lp/ph = lp/mm · sensor height).

  • NR:  Noise Reduction.

  • Relative Sharpness:  Two lenses that resolve the same on the systems they are used on have the same relative sharpness.

  • AF:  Auto Focus.

  • AOV:  Angle of View (of the diagonal, unless otherwise specified).

  • FOV:  Field of View (framing).

  • UWA:  Ultra Wide Angle.

  • Format (Sensor Size):  The name for the size of the sensor: (e.g. 1", mFT (4/3), 1.5x, 1.6x, FF, etc.).

  • Aspect Ratio:  The ratio of the width to height (e.g. 3:2, 4:3, 16:9, etc.).

  • Output Size:  The number of pixels making up an image, or the dimensions of a print.

  • Perspective:  The relative position of objects in the frame (a function only of subject-camera distance -- format and focal length independent).

  • FL:  Focal Length.

  • EFL:  Equivalent Focal Length (usually in terms of FF).

  • Reach:  We say that System A has, for example, 50% more reach than System B if System A resolves 50% more detail than System B when System B is cropped to the same framing as System A, or when the two systems resolve the same detail when System B uses a 50% longer focal length than System A.

  • TC:  Teleconverter (usually 1.4x or 2x, and mounted on the rear of the lens, unless otherwise specified).

  • DOF:  Depth of Field (the depth of the image from the focal plane that is considered to be in critical focus).

  • aperture:  The physical aperture is the narrowest opening in a lens, the effective aperture (entrance pupil) is the image of the physical aperture when viewed through the FE (front element), and the relative aperture (f-ratio) is the quotient of the focal length and effective aperture.  In this essay, when the term "aperture" is used without a qualifying adjective, it is taken to be synonymous with the effective aperture (entrance pupil).

  • F-Ratio:  The ratio of the focal length and the aperture diameter (e.g. the f-ratio for a focal length of 50mm and an aperture diameter of 25mm is 50mm / 25mm = f/2).

  • Stop:  A difference of one stop represents a doubling/halving of the amount of light that falls on the sensor (e.g. f/2.8 to f/4 or 1/100 to 1/50) or a doubling/halving of the processing of the light that falls on the sensor (e.g. ISO 100 to ISO 200).

  • Ev:  Exposure Value: 0 Ev = 2.5 lux·seconds. A scene metered for f/1 and 1s has an Ev of 0.  Brighter scenes have higher Ev's, darker scenes have lower Ev's. A difference of 1 Ev is 1 stop.

  • Exposure:  The total light per area (photons / mm²) that falls on the sensor while the shutter is open, which is usually expressed as the product of the illuminance of the sensor and the time the shutter is open (lux · seconds). The only factors in the exposure are the scene luminance, t-stop (where the f-ratio is often a good approximation for the t-stop), and the shutter speed (note that neither sensor size nor ISO are factors in exposure).

  • Lightness:  The lightness of a photo (what people usually think of as "exposure") is how light or dark the photo appears overall. For example, an exposure at an ISO 400 setting will be mapped into the image file such that the photo will appear 4x lighter than if the same exposure were taken at ISO 100.  Alternatively, an exposure at ISO 400 will result in a photo with the same lightness as a photo with 4x the exposure at ISO 100.

  • Total Light:  The total number of photons that falls on the sensor (lumen·seconds, or, equivalently, photons):  Total Light = Exposure x Effective Sensor Area.

  • Total Light Collected:  The total number of electrons released by the sensor (Total Light Collected = Total Light x QE, where the QE is the proportion of photons falling on the sensor that release electrons).

  • Noise:  The standard deviation of the recorded signal from the mean signal, but more usually used to mean the density of the noise (NSR -- Noise to Signal Ratio).

  • Efficiency:  How well the sensor captures and records the light that falls on it.

  • DR:  Dynamic Range -- the number of stops from the noise floor to the saturation limit of a particular area of the photo.

  • Diffraction Softening:  Loss of detail due to diffraction as the lens is stopped down.

  • Vignetting:  The radial light falloff from the center of a photo.

  • Distortion:  As used in this essay, the degree to which parallel lines stay parallel in the photo.

  • Bayer:  A color array where each pixel records one color (usually red, green, or blue).

  • Foveon:  A color array where each pixel records three colors.
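
The resolution units defined above (lp/mm, lp/ph, lw/ph) are related by simple multiplication, which the following sketch spells out. The sensor heights of 24mm and 12mm are illustrative round numbers chosen to give an exact 2:1 ratio; they are not measurements of any particular camera, and the function names are hypothetical.

```python
def lp_mm_to_lp_ph(lp_per_mm, sensor_height_mm):
    """lp/ph = lp/mm x sensor height."""
    return lp_per_mm * sensor_height_mm


def lp_mm_to_lw_ph(lp_per_mm, sensor_height_mm):
    """lw/ph = 2 x lp/ph (one line pair is two line widths)."""
    return 2.0 * lp_mm_to_lp_ph(lp_per_mm, sensor_height_mm)


if __name__ == "__main__":
    # A lens resolving 50 lp/mm on a 24mm-tall sensor...
    print(lp_mm_to_lw_ph(50, 24.0))
    # ...matches a lens resolving 100 lp/mm on a 12mm-tall sensor.
    print(lp_mm_to_lw_ph(100, 12.0))
```

In the sense of the Relative Sharpness entry, these two hypothetical lenses have the same relative sharpness: the lens on the smaller sensor needs twice the lp/mm to deliver the same lw/ph in the photo.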

 

 

 

 

 

DEFINITION OF EQUIVALENCE

 

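Equivalent photos are photos of a given scene that have the:

  • same perspective,
  • same framing,
  • same DOF / diffraction / total amount of light on the sensor,
  • same exposure time,
  • same lightness,
  • same display dimensions.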

As a corollary, Equivalent lenses are lenses that produce Equivalent photos on the format they are used on which means they will have the same AOV (angle of view) and the same aperture diameter. The following rules of thumb, which are a consequence of the above definition, are also helpful to understand:

  • For a given exposure, more light is projected on a larger sensor.
  • For a given scene, DOF, and exposure time, the same amount of light is projected on all sensors, regardless of size.
  • Thus, the only way for a larger sensor to collect more light is to use a more shallow DOF or longer exposure time.

Let's consider, for example, 25mm f/1.4 1/100 ISO 100 on mFT (4/3), 31mm f/1.8 1/100 ISO 150 on 1.6x (Canon APS-C), 33mm f/1.9 1/100 ISO 180 on 1.5x (all other APS-C), and 50mm f/2.8 1/100 ISO 400 on FF:

  • The focal lengths all have the same [diagonal] AOV, so we say the focal lengths are equivalent.
  • The apertures (entrance pupils) all have the same diameters (25mm / 1.4 = 31mm / 1.8 = 33mm / 1.9 = 50mm / 2.8 ≈ 18mm), so we say the relative apertures are equivalent.
  • The same total amount of light falls on the sensor for the same equivalent relative aperture and same exposure time, so we say the exposures are equivalent.
  • The photos will have the same lightness for the same equivalent exposure and equivalent ISO setting, so we say the ISO settings are equivalent.

An important consequence of the same aperture (entrance pupil) diameter is that the DOF and diffraction will also be the same for a given perspective, framing, display size, viewing distance, and visual acuity.  In addition, since the exposure times are the same, the same total amount of light will fall on the sensor for all systems (with a given scene luminance and lens transmission), which means the photos will have the same noise if the sensors are equally efficient.  In addition, if the mFT (4/3) lens is twice as sharp as the FF lens, the 1.6x lens is 1.6x as sharp as the FF lens, the 1.5x lens is 1.5x as sharp as the FF lens, and the sensors have the same pixel count and AA filter, then all will capture the same detail.
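
The claim that the four focal lengths in this example share one diagonal AOV can be checked with the usual rectilinear formula, AOV = 2 · atan(diagonal / (2 · focal length)). The sketch assumes focus at infinity, a distortion-free lens, and a sensor diagonal equal to the FF diagonal (36mm x 24mm frame) divided by the crop factor. The function name is illustrative.

```python
import math

FF_DIAGONAL_MM = math.hypot(36.0, 24.0)  # diagonal of a 36mm x 24mm frame


def diagonal_aov_deg(focal_mm, crop_factor):
    """Diagonal angle of view in degrees for a rectilinear lens at infinity."""
    diagonal = FF_DIAGONAL_MM / crop_factor
    return math.degrees(2.0 * math.atan(diagonal / (2.0 * focal_mm)))


if __name__ == "__main__":
    # Rounded focal lengths from the example; the AOVs agree to within rounding.
    for name, focal, crop in [("mFT", 25, 2.0), ("1.6x", 31, 1.6),
                              ("1.5x", 33, 1.5), ("FF", 50, 1.0)]:
        print(f"{name}: {focal}mm -> {diagonal_aov_deg(focal, crop):.1f} degrees")
```

With the unrounded focal lengths (50mm divided by the crop factor) the angles are identical; the small differences printed for 31mm and 33mm come only from rounding to whole millimetres.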

It is important to understand that "equivalent to" is not the same as "equal to".  For example, sensors are not equally efficient (although often, but not always, close for a given generation), nor are all lenses proportionally sharp, have the same color, bokeh, distortion, or flare characteristics, nor do all sensors have the same pixel count, CFA, and AA filter.  However, that doesn't stop us from saying that a Canon 50 / 1.4 on a Canon 5D is equivalent to a Sigma 50 / 1.4 "Art" on a Nikon D850 (46 MP FF) despite the fact that the D850 sensor records twice as much light as the 5D sensor for a given exposure, introduces less electronic noise, has nearly 3x the pixel count, and that the Sigma 50 / 1.4A is significantly sharper than the Canon 50 / 1.4 (at least at wide apertures) with smoother bokeh.

Indeed, according to Webster's, the primary definition of "equivalent" is:

1: equal in force, amount, or value

So, Equivalent photos have equal perspective, equal framing, equal DOF, equal shutter speeds, equal lightness, and equal display dimensions, which are all visual properties that are independent of the equipment.  Equality of other visual properties, such as noise, detail, etc., will require specific assumptions about the hardware.

The second and third definitions of "equivalent" also fit:

2a:  like in signification or import
3:    corresponding or virtually identical especially in effect or function

This is not to say that photos that are not Equivalent may not look more similar than photos that are Equivalent.  For example, let's say we have an old FF DSLR and a modern mFT camera that has a much more efficient sensor.  We may find that a photo at 25mm f/2.8 1/100 ISO 400 with the mFT camera may look more similar to a photo at 50mm f/4 1/100 ISO 800 on the FF DSLR than an Equivalent photo at 50mm f/5.6 1/100 ISO 1600, even though the f/4 photo on the FF DSLR has less DOF than the mFT photo, since the f/5.6 photo is much more noisy due to the less efficient sensor, and noise may matter more than DOF, depending on the scene and how large the photo is displayed.  On the other hand, sensors of the same generation are usually pretty close in terms of efficiency (see here for quite a few examples).  Still, there are other issues to consider, such as color and distortion (although these two can usually be corrected for in processing), bokeh, lens flare, etc.

So while differences in the technology may well make for differences that matter more than one or more of the parameters of Equivalence, Equivalent photos will typically be the closest in appearance (more so, of course, when the sensors are of the same generation).  The larger the photo is displayed, the more extreme the processing, and the lower the amount of light that makes up the photo, the more obvious the role that differences in technology will play.

In addition, there is a small niggle in the parameter of "same framing" for systems with different aspect ratios (e.g. 4:3 vs 3:2).  We can either crop one image to the aspect ratio of the other (or crop both to a common aspect ratio) or compare at the same AOV and display with the same diagonal measurement.  The details of this are discussed at the end of this section.

It is important to note that Equivalent photos on different formats will not have the same exposure, and this is the source of most all resistance to the concept.  The reason is quite simple:  The same total amount of light will fall on the sensor for Equivalent photos which results in a lower density of light (exposure) on a larger sensor.  Many feel that exposure has been usurped with DOF, but this reflects not only a lack of understanding of what exposure actually is, but how much of a role DOF plays in a photo, even if DOF, per se, is not a consideration.  While the artistic value of DOF is subjective, the fact is that both the total light falling on the sensor and the DOF are functions of the aperture diameter.  Larger aperture diameters admit more light, but they also introduce more aberrations from the lens.  Thus, DOF, noise, and sharpness are all intrinsically related through the aperture of the lens.

Exposure is not how bright or dark a photo appears -- we can lighten or darken a photo as we see fit.  Exposure is the density of light (photons / mm²) that falls on the sensor.  However, it is not the density of light falling on the sensor that matters, but the total light (photons) that makes up the photo, since the total light, combined with the sensor efficiency, determine the image noise.  This crucial distinction between exposure and total light has an entire section of the essay devoted to it, as does the section on noise.

So, while equal noise is not a parameter of Equivalent images, it is a consequence of Equivalent images if the sensors are equally efficient.  The primary elements in image noise, in order, are:

  • The Total Amount of Light that falls on the sensor (exposure · sensor area)
  • The percent of this light that is recorded by the sensor (QE -- quantum efficiency)
  • The additional noise added by the sensor and supporting hardware (electronic noise)

Other factors, such as ISO and pixel count / size, play a minor role in relative noise compared to the above three factors.  Because equivalent images are made from the same total amount of light (since equivalent images, by definition, will have the same framing, aperture diameter, and shutter speed), and sensors of the same generation usually have similar QE / read noise, equivalent images from cameras of the same generation will usually have similar relative noise.  People commonly believe that larger sensor systems have less relative noise because they have better sensors, when, in fact, it is instead because they collect more total light for a given exposure.
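As a rough illustration of the first factor, the sketch below computes total light as exposure times sensor area for FF and mFT, using the sensor areas quoted in this essay.  The function name and the unitless "exposure" values are illustrative assumptions, not something taken from a camera or library.

```python
import math

# Sensor areas in mm², as quoted in this essay.
FF_AREA_MM2 = 864.0
MFT_AREA_MM2 = 225.0

def total_light_ratio(area_a, area_b, exposure_a=1.0, exposure_b=1.0):
    """Total light on sensor A relative to sensor B (exposure x area)."""
    return (exposure_a * area_a) / (exposure_b * area_b)

# Same exposure (same f-ratio and shutter speed) on both formats.
same_exposure = total_light_ratio(FF_AREA_MM2, MFT_AREA_MM2)
print(f"FF vs mFT at the same exposure: {same_exposure:.2f}x "
      f"({math.log2(same_exposure):.2f} stops) more total light")

# FF exposure two stops lower (1/4), as for Equivalent photos with R = 2.
equivalent = total_light_ratio(FF_AREA_MM2, MFT_AREA_MM2, exposure_a=0.25)
print(f"FF vs mFT for Equivalent photos: {equivalent:.2f}x")
```

The second ratio comes out slightly below 1 only because the 3:2 and 4:3 sensor areas are not in an exact 4:1 ratio; the difference is a small fraction of a stop.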

Thus, breaking the properties of Equivalence down into the properties of the photo, lens, and sensor:

  • If photos are taken of the same scene from the same position with the same focal point, have the same framing, same display dimensions, and same aperture diameter, they will have the same DOF (and diffraction).
  • If the exposure time is also the same, then the photos will also have the same motion blur / camera shake, as well as be made from the same total amount of light.
  • If the sensors record the same proportion of light falling on them (same QE) and add in the same electronic noise (the noise from the sensor and supporting hardware), then the noise will be the same regardless of pixel count and ISO setting (keeping in mind that sensors of the same, or nearly the same, generation typically record very nearly the same proportion of light falling on them regardless of brand, size, or pixel count (a notable exception would be BSI tech which records a third to half a stop more light for a given exposure than non-BSI tech) and that the electronic noise matters only for the portions of the photo made with very little light).
  • If the lenses resolve proportionally the same on their respective sensors at the same DOF (e.g. if a 25mm lens at f/1.4 on mFT resolved 2000 lp/mm, a 31mm lens on 1.6x at f/1.8 resolved 1600 lp/mm, a 33mm lens on 1.5x at f/1.9 resolved 1500 lp/mm, and a 50mm lens on FF at f/2.8 resolved 1000 lp/mm, or likewise for any other equivalent relative apertures), the sensors have the same number of pixels, and the AA filter introduces the same blur, then all photos will have the same resolution (lw/ph).

Note that it was said above that "equivalent images on different formats will usually have the most similar visual properties" -- but not always.  For example, if one system has a significantly less efficient sensor than another system, and motion blur and/or camera shake are not an issue, then a longer exposure at a lower ISO on the system with the less efficient sensor may more closely match the shorter exposure on the system with the more efficient sensor.  Or, if the extreme corners are of some importance in the composition, and one system has greater edge sharpness than the other, the system with the softer edges may need to stop down more to achieve sharper corners.  But, in most circumstances, Equivalent images will be, if not the most similar, very close.

A competent photographer will use their equipment to obtain the best image possible, which often means trading one IQ component for another (for example, when using a wider aperture to get less noise at the expense of less sharpness and greater vignetting).  However, it is important that we understand that these compromises represent choices that a photographer makes, and are not requirements imposed by the format.  Extending the example, it is disingenuous to say that a smaller format is superior to a larger format because it has more DOF, or is sharper, "wide open" than the larger format, when "wide open" is a choice, not a mandate, that results in a different (sometimes radically different) DOF and relative noise, and the larger format can simply be stopped down for greater DOF and sharpness (although, if stopping down requires a concomitant increase in ISO to maintain a sufficient shutter speed, then the larger format may have to sacrifice some, or even all, of its noise advantage).

Thus, Equivalence is about the consequences of choices a photographer has in terms of IQ as a function of format.

Understanding the fundamental concepts of Equivalence requires making important distinctions between various terms which people often take to mean the same thing.  It is very much akin to making the distinction between "mass" and "weight", two terms which most people take to mean the same thing, when, in fact, they measure two different (but related) quantities.  While there are circumstances where making the distinction is unnecessary, there are other times when it is critical.

The first of these distinctions that needs to be made is between aperture and f-ratio.  The term "aperture", by itself, is vague -- we need a qualifying adjective to be clear.  There are three different terms using "aperture":

  1. The physical aperture (iris) is the smallest opening within a lens.
  2. The effective aperture (entrance pupil) is the image of the physical aperture when looking through the front element of the lens.
  3. The relative aperture (f-ratio) is the [reciprocal of the] quotient of the focal length and the effective aperture.

For example f/2 on a 50mm lens means the diameter of the effective aperture (entrance pupil) is 50mm / 2 = 25mm, since the "f" in the f-ratio stands for "focal length".  Likewise, a 50mm lens with a 25mm effective aperture has an f-number of 50mm / 25mm = 2 which gives us an f-ratio of 1:2.

The same relative aperture (f-ratio) will result in the same density of light falling on the sensor (exposure) for a given scene luminance and exposure time for all systems, whereas the same effective aperture (entrance pupil) will result in the same total amount of light falling on the sensor for a given shutter speed (as well as same DOF for a given perspective, framing, and display size).  Thus, equivalent lenses are lenses that have the same AOV (angle of view) and effective aperture (entrance pupil).

Interestingly, a "constant aperture" zoom is a zoom lens where the physical aperture (iris) remains constant, but the effective aperture (entrance pupil) scales with the focal length, thus keeping the relative aperture (f-ratio) constant as well.  Let's consider a 70-200 / 2.8 zoom.  Assuming the diagram in the link is accurate, the proportions of the diagram give a physical aperture (iris) diameter of 38.5mm, based on the assumption that the diameter of the FE (front element) is 77mm and on the relative sizes of the FE and physical aperture (iris) in the diagram.

The effective aperture (entrance pupil) is the image of the physical aperture (iris), that is, it is how large the physical aperture appears when viewed through the front element.  The diameter of the effective aperture is 25mm at 70mm f/2.8 (70mm / 2.8 = 25mm) and 71mm at 200mm f/2.8 (200mm / 2.8 = 71mm).  So, as the lens zooms, neither the physical aperture (iris) nor the relative aperture (f-ratio) changes, but the effective aperture (entrance pupil) changes in direct proportion to the magnification.  Since the diameter of the physical aperture (iris) is 38.5mm in the above design, we can basically think of this particular 70-200 / 2.8 lens as a 108 / 2.8 prime (assuming that the effective aperture and physical aperture have the same diameter with the prime, which is probably not a good assumption) with additional elements that magnify or reduce the image as it zooms.
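The arithmetic relating focal length, f-ratio, and entrance pupil can be sketched in a few lines.  The function name is an illustrative choice, and the 38.5mm iris figure is the estimate from the diagram discussed above, not a measured value.

```python
def entrance_pupil_mm(focal_length_mm, f_number):
    """Effective aperture (entrance pupil) diameter = focal length / f-number."""
    return focal_length_mm / f_number

# 50mm at f/2, as in the example above.
print(f"50mm f/2: {entrance_pupil_mm(50, 2):.0f}mm")

# A constant f/2.8 zoom: the entrance pupil grows with focal length.
for focal_length in (70, 200):
    diameter = entrance_pupil_mm(focal_length, 2.8)
    print(f"{focal_length}mm f/2.8: {diameter:.0f}mm")

# Focal length at which a 38.5mm entrance pupil would give f/2.8
# (iris diameter estimated from the lens diagram; an assumption).
IRIS_MM = 38.5
print(f"Equivalent prime: {IRIS_MM * 2.8:.0f}mm f/2.8")
```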

For simplicity, this essay uses the term "aperture diameter" to refer to the entrance pupil (effective aperture) diameter, and f-ratio to refer to the relative aperture.

The concepts of, and connections between, total light, DOF, and noise, are much more easily understood in terms of aperture rather than f-ratio, especially when comparing different formats.  While the same f-ratio will result in the same exposure (where exposure is the density of light that falls on the sensor -- photons / mm²), regardless of the format, the aperture diameter, together with the shutter speed, determines the total amount of light that falls on the sensor, where the same total amount of light falling on the sensor results in the same noise in the photo, for equally efficient sensors.  In addition, for a given perspective, framing, display size, viewing distance, and visual acuity, the same aperture diameter will result in the same DOF.

Because Equivalent photos result in a lower exposure for larger sensors (same total amount of light distributed over a larger sensor area results in a lower density of light on the sensor), we typically increase the ISO setting on the camera with the larger sensor to achieve the same lightness as the equivalent photo from the camera with the smaller sensor.  A common misunderstanding is that higher ISO settings are the cause of more noise, but this puts the cart before the horse.  We use higher ISO settings because less light is falling on the sensor, and it is this lesser amount of light falling on the sensor that results in greater noise, not the higher ISO setting.

Specifically, a higher ISO setting results in either a faster shutter speed, smaller aperture, or both in an AE (automatic exposure) mode where flash is not used.  The effect of a faster shutter speed and/or smaller aperture is to reduce the amount of light falling on the sensor, and it is the reduced amount of light on the sensor, not the higher ISO, per se, that increases the noise.  Since the same total amount of light falls on the sensor for equivalent photos, any differences in noise will be due to differences in sensor efficiency.  If anything, higher ISO settings result in greater sensor efficiency than lower ISO settings, which is discussed in more detail here.

Of course, we seek to maximize the light on the sensor, regardless of the system, but this maximization occurs within the constraints of DOF / sharpness, motion blur / camera shake, and blown highlights / noise where the competent photographer strives to get the optimal balance.  Specifically, this means that to maximize the total light on the sensor, the larger sensor system can use a larger aperture, but this will also result in a more shallow DOF, which may, or may not, be desirable.  However, in good light, or when motion blur is not an issue and a tripod is used, the larger sensor system can use whatever aperture is necessary to get the desired DOF, and use a longer shutter speed to collect more light.  Of course, if one system is using IS (image stabilization) and the other is not, that will often give the IS system an advantage in low light when motion blur is not an issue (or is desirable), since it will be able to use a lower shutter speed to get more light on the sensor in low light.

A common misunderstanding about Equivalence is the misguided notion that Equivalence is based on the "superiority" of FF.  This is not even remotely true.  Due to the popularity of the 35mm format in film before the advent of consumer digital cameras, it is common to use FF as the reference frame, but this has nothing to do with Equivalence, per se -- any format can be the reference format for a comparison between systems.

The equivalence ratio (more commonly referred to as the "crop factor") is the ratio that gives us equivalent settings between formats.  Most have no qualms applying the equivalence ratio to obtain the same AOV ([diagonal] angle of view) with respect to the focal length, but it also applies to the f-ratio, as this gives the same aperture diameter for a given angle of view.  The equivalence ratio is typically obtained by taking the quotient of the reference system sensor diagonal to the sensor diagonal of the target format.  For example, the equivalence ratio for FF and mFT using FF as the reference format is 43.3mm / 21.6mm = 2.  Likewise, the equivalence ratio for 1" and mFT using mFT as the reference format is 21.6mm / 15.9mm = 1.36.

However, if the aspect ratios of the sensors are different, and we wish to compare both formats with the same aspect ratio, then we will need to frame wider with one system and crop to the same framing as the other.  In this instance, we compute the equivalence ratio as the ratio of the shorter dimensions of the sensors if cropping the more elongated image to the aspect ratio of the more square sensor, or the ratio of the longer dimensions of the sensors if cropping the more square image to the aspect ratio of the more elongated sensor.  It's often convenient to express the equivalence ratio, R, in stops and then round to the nearest 1/3 stop:  R (in stops) = 2 · log2 R ~ 2.885 · ln R.  Since the camera settings are often in 1/3 stop increments, it's helpful to recall that 1/3 ~ 0.33, and 2/3 ~ 0.67.

Let's calculate the equivalence ratio (crop factor) for various scenarios with FF (24mm x 36mm, 43.3mm diagonal, 864mm² area) and mFT (13mm x 17.3mm, 21.6mm diagonal, 225mm² area):

 

Framing                                                         Multiplicative Equivalence Ratio          Additive Equivalence Ratio

Same diagonal angle-of-view:                                    R = 43.3mm / 21.6mm = 2.00x               R = 2 · log2 2 = +2 stops
3:2 FF photo cropped to the same framing as a 4:3 mFT photo:    R = 24mm / 13.0mm = 1.85x                 R = 2 · log2 1.85 = +1.77 stops
4:3 mFT photo cropped to the same framing as a 3:2 FF photo:    R = 36mm / 17.3mm = 2.08x                 R = 2 · log2 2.08 = +2.11 stops
Same display area:                                              R = sqrt (864mm² / 225mm²) = 1.96x        R = 2 · log2 1.96 = +1.94 stops

 

Since the most common aspect ratios, by far, for digital cameras are 3:2 and 4:3, we can see that the practical differences in the equivalence ratio for the various framings differ by less than 1/3 of a stop, so it is not a significant factor in terms of total light gathered, and thus noise.  When 3:2 is cropped to 4:3, or vice-versa, 1/9 of the pixels will be cropped away from the edges, which will have negligible impact on the PPI of a print, but may be important in terms of comparing corner sharpness.  Regardless, as this essay regards differences of less than 1/3 of a stop as trivial, the differences in equivalence ratios as a function of aspect ratio are not considered.
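The four equivalence ratios for FF and mFT can be reproduced with a short sketch.  The function and dictionary key names are illustrative assumptions; the sensor dimensions are the ones quoted in this essay.

```python
import math

# (short side, long side) in mm, as quoted in this essay.
FF = (24.0, 36.0)
MFT = (13.0, 17.3)

def stops(ratio):
    """Equivalence ratio expressed in stops: 2 * log2(R)."""
    return 2 * math.log2(ratio)

def equivalence_ratios(reference, target):
    """Equivalence ratios of `reference` relative to `target` for each framing."""
    ref_short, ref_long = reference
    tgt_short, tgt_long = target
    return {
        "same diagonal AOV": math.hypot(*reference) / math.hypot(*target),
        "crop using short sides": ref_short / tgt_short,
        "crop using long sides": ref_long / tgt_long,
        "same display area": math.sqrt((ref_short * ref_long) / (tgt_short * tgt_long)),
    }

ratios = equivalence_ratios(FF, MFT)
for framing, r in ratios.items():
    print(f"{framing}: R = {r:.2f}x = {stops(r):+.2f} stops")
```

The stops values printed here are computed from unrounded ratios, so they can differ in the second decimal place from values computed from ratios rounded to two places.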

In general, then, given two systems with an equivalence ratio, R, we compute the equivalent settings in the following manner:

  • The focal length of the target format times R gives the focal length for the same AOV on the reference format (e.g. 25mm on mFT x 2 gives 50mm FF equivalent).

  • The relative aperture of the target format times R (or plus R, if R is calculated in stops) gives the same aperture diameter for the same AOV as the reference format (e.g. f/1.4 on mFT x 2, or f/1.4 on mFT plus two stops, gives f/2.8 FF equivalent).

  • The ISO setting of the target format times R² (or plus R, if R is calculated in stops) gives the ISO for the reference format that results in the same LCD playback / OOC jpg lightness for the equivalent settings (e.g. ISO 400 on mFT x 2², or plus two stops, gives ISO 1600 FF equivalent).

  • The exposure times for both systems remain the same.

Equivalent photos, then, will be photos of the same scene taken from the same position using equivalent settings and displayed at the same size.
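The four conversion rules above can be sketched as a small helper.  The `Settings` class and `to_reference` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class Settings:
    focal_length_mm: float
    f_number: float
    exposure_time_s: float
    iso: float

def to_reference(target: Settings, ratio: float) -> Settings:
    """Equivalent settings on the reference format for equivalence ratio R."""
    return Settings(
        focal_length_mm=target.focal_length_mm * ratio,  # same AOV
        f_number=target.f_number * ratio,                # same aperture diameter
        exposure_time_s=target.exposure_time_s,          # unchanged
        iso=target.iso * ratio ** 2,                     # same lightness
    )

mft = Settings(focal_length_mm=25, f_number=1.4, exposure_time_s=1 / 100, iso=400)
ff = to_reference(mft, ratio=2.0)
print(ff)
```

With R = 2 this turns 25mm f/1.4 1/100 ISO 400 on mFT into 50mm f/2.8 1/100 ISO 1600 on FF, matching the examples in the list above.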

 

 

 

 

 

THE PURPOSE OF EQUIVALENCE

 

A common criticism of Equivalence is that it does nothing to help people take better pictures, but this represents a misunderstanding of what Equivalence is all about.  Equivalence is simply a framework by which six visual properties -- perspective, framing, DOF / diffraction / total amount of light projected on the sensor, exposure time (motion blur), lightness, and display size -- relate between different formats.  It is not an "instruction manual" for how to take a photo, it is not an argument that "FF is best", and it does not say that "bigger is always better".  In a word or two, it simply explains why the mantra "f/2 = f/2" is no more or less true, or useful, than saying "50mm = 50mm".

If one system can take a photo that another system cannot, and that results in a "better" photo, then, of course, we would do so.  For example, if low noise meant more than a deeper DOF in a scene where motion blur were a factor, then we would compare both systems wide open with the same shutter speed, as that would maximize the amount of light falling on the sensor and thus minimize the noise.  Equivalence tells us, however, that this would *necessarily* result in a more shallow DOF for the system using a wider aperture, and thus most likely result in softer corners.  So, we surely would not criticize the larger sensor system for having softer corners on the basis of a *choice* the photographer made.

The point of photography is making photos.  As such, one doesn't choose a particular system to get photos which are equivalent to another system.  A person chooses a particular system for the best balance of the factors that matter to them, such as price, size, weight, IQ, DOF range, AF, build, etc.  By understanding which settings on which system create equivalent images, these factors can be more evenly assessed to choose the system that provides the optimum balance of the needs and wants of a particular photographer.

 

 

 

 

 

COMPARING SYSTEMS

 

As discussed in the section above on the Purpose of Equivalence, Equivalence is merely the baseline for a meaningful comparison between systems based on the visual properties of the final photo.  Often, it makes much more sense to compare systems on the basis of images that are not fully equivalent, in order to maximize the IQ of the systems being compared for specific shooting situations.  This section discusses these situations.

The first consideration is if the photographer is focal length and/or magnification limited by a particular system.  For example, let's say that due to size, weight, or financial reasons, the longest lens they can carry is 400mm.  In this instance, 400mm on a mFT (4/3) sensor is going to give a substantial IQ advantage over 400mm on FF cropped to the same framing due to the smaller pixels of the mFT (4/3) sensor, which will put more pixels on the subject.  If, however, the FF sensor had the same size pixels (4x the total pixel count), then the IQ differential would come down to differences between the lenses and the sensor efficiency, since the FF photo would have the same number of pixels on the subject, and could thus be cropped with no penalty (so long as the lens were as sharp for the portion of the scene spanned by the subject).

The next situation to consider is Equivalent photos, that is, a limited light situation where shallow DOF and motion blur are detrimental to the IQ of the photo, for example:

• 6D (FF) at 50mm, f/5.6, 1/200, ISO 1600
• D500 (1.5x) at 33mm, f/3.5, 1/200, ISO 640
• 80D (1.6x) at 31mm, f/3.5, 1/200, ISO 640
• EM5II (mFT) at 25mm, f/2.8, 1/200, ISO 400

In this case, the aperture diameters will all be the same (50mm / 5.6 = 33mm / 3.5 = 31mm / 3.5 = 25mm / 2.8 = 9mm) so the DOFs will be the same and the same total amount of light will fall on the sensors, resulting in the same noise if the sensors are equally efficient.  The advantage of one system over another would be if it had a more efficient sensor or a lens that was sharper relative to the format.
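A quick check of the aperture diameters for the four systems listed above might look like the following sketch.  The dictionary layout is just an illustrative choice; the focal lengths and f-ratios are the ones from the example.

```python
# (focal length in mm, f-number) for each system in the example above.
SYSTEMS = {
    "6D (FF)": (50, 5.6),
    "D500 (1.5x)": (33, 3.5),
    "80D (1.6x)": (31, 3.5),
    "EM5II (mFT)": (25, 2.8),
}

# Aperture (entrance pupil) diameter = focal length / f-number.
diameters = {name: fl / n for name, (fl, n) in SYSTEMS.items()}

for name, diameter in diameters.items():
    print(f"{name}: {diameter:.1f}mm")
```

The diameters are not identical to the last decimal because real lenses and cameras only offer settings in 1/3 stop steps and whole-millimeter focal lengths, but all four come out at roughly 9mm.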

Another situation to be considered is when there is enough light to achieve the desired DOF at base ISO without adverse effects due to motion.  For example:

• 6D at 50mm, f/5.6, 1/200, ISO 100
• D500 at 33mm, f/3.5, 1/500, ISO 100
• 80D at 31mm, f/3.5, 1/500, ISO 100
• EM5II at 25mm, f/2.8, 1/800, ISO 100

The IQ advantage, all else equal, will go to the larger sensor systems in this case since they will record more total light as well as usually put more pixels on the scene.  On the other hand, the IQ differential in this situation is often going to be the least significant in that all systems are often well past "good enough" for most purposes.

Next up is when noise and/or a more shallow DOF matters more than captured detail:

• 6D at 50mm, f/1.4, 1/200, ISO 1600
• D500 at 33mm, f/1.4, 1/200, ISO 1600
• 80D at 31mm, f/1.4, 1/200, ISO 1600
• EM5II at 25mm, f/1.4, 1/200, ISO 1600

The same f-ratio on all systems results in wider aperture diameters for the larger sensor systems (50mm / 1.4 = 36mm, 33mm / 1.4 = 24mm, 31mm / 1.4 = 22mm, 25mm / 1.4 = 18mm) which results in a more shallow DOF for the larger sensor systems as well as more total light falling on the sensor for the larger sensor systems, resulting in less noise for equally efficient sensors (or, at least, close to equally efficient).
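To put numbers on this, the sketch below computes the aperture diameters at f/1.4 and the resulting difference in total light relative to mFT, in stops.  It assumes the same shutter speed and framing for all systems, so total light scales with the square of the aperture diameter; the variable names are illustrative.

```python
import math

F_NUMBER = 1.4

# Focal lengths (mm) giving the same AOV on each system, from the example above.
FOCAL_LENGTHS = {
    "6D (FF)": 50,
    "D500 (1.5x)": 33,
    "80D (1.6x)": 31,
    "EM5II (mFT)": 25,
}

diameters = {name: fl / F_NUMBER for name, fl in FOCAL_LENGTHS.items()}
reference = diameters["EM5II (mFT)"]

# Total light scales with aperture area, i.e. with diameter squared.
stops_vs_mft = {name: 2 * math.log2(d / reference) for name, d in diameters.items()}

for name in FOCAL_LENGTHS:
    print(f"{name}: {diameters[name]:.0f}mm, {stops_vs_mft[name]:+.2f} stops vs mFT")
```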

However, it is more than likely that at such wide apertures, the lens will suffer greater aberrations for the larger sensor systems.  Thus, even for the portions of the scene within the DOF, we may find that the smaller sensor system records a more detailed photo (of course, this has to be taken on a lens-by-lens basis).  In any case, we would only compare the same f-ratio on different formats if DOF and/or noise mattered more than sharpness.

This brings up the next case, when sharpness is of the same importance as noise and DOF, but the larger sensor is less efficient.  For example, consider the following:

• 6D (FF) at 50mm, f/2, 1/200, ISO 1600
• EM5II (mFT) at 25mm, f/1.4, 1/200, ISO 800

The EM5II sensor is more efficient than the FF sensor in this example, so if the FF camera were to use fully equivalent settings (50mm, f/2.8, 1/200, ISO 3200), the resulting photo would be noticeably more noisy.  At the settings listed above, the FF camera will produce a photo with approximately the same detail (in the portions of the scene within the DOF) and noise, but a more shallow DOF, which may, or may not, be an issue.

Lastly, we have to consider the effect of IS (image stabilization) when motion blur is not an issue, neither a flash nor tripod are viable options, one system has sensor IS or an IS lens when the other system does not, and a deeper DOF is desirable:

• 6D at 50mm, f/5.6, 1/50, ISO 1600
• EM5II at 25mm, f/2.8, 1/13, ISO 100

In this case, the smaller system has a distinct noise advantage over the larger sensor system due to the operational advantage of sensor IS if the larger system does not have sensor IS and is not using an IS lens.
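The size of that advantage can be estimated with a small sketch, assuming that total light is proportional to the aperture area times the exposure time for the same scene and framing.  The function name is illustrative, and the result is a relative figure rather than a photon count.

```python
import math

def relative_total_light(focal_length_mm, f_number, exposure_time_s):
    """Quantity proportional to total light: aperture diameter squared x time."""
    diameter = focal_length_mm / f_number
    return diameter ** 2 * exposure_time_s

ff = relative_total_light(50, 5.6, 1 / 50)    # 6D, no IS
mft = relative_total_light(25, 2.8, 1 / 13)   # EM5II, with sensor IS

ratio = mft / ff
print(f"mFT collects {ratio:.2f}x the light ({math.log2(ratio):.2f} stops)")
```

Both systems have the same aperture diameter here, so the roughly two-stop difference comes entirely from the longer exposure time that IS allows.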

Thus, when comparing systems, we need to specify the purpose of the comparison and compare in a manner that is optimal for each system.  Artificially handicapping one system or another by arbitrarily comparing at the same f-ratio without taking into account the effect this has on the resulting photo is counter-productive.

 

 

 

 

 

 

THE SIX POSTULATES OF EQUIVALENCE

 

PERSPECTIVE

Perspective is how objects appear in relation to other objects, and the effect it can have on the image is dramatically demonstrated with these examples.  For a given scene and framing, perspective is a function only of the position the photo was taken from.  A good way to think of perspective is to consider two objects, one 10 ft from the camera, the other 30 ft from the camera.  If both objects are in the frame with the subject being the closer object, and we shoot at 50mm from 10 ft away, then the further object is three times as far away as the subject.  If, however, we step back another 10 ft and use 100mm so that the subject is framed the same, then, if the further object is even still in the frame, the subject will be 20 ft away and the other object 40 ft away -- only twice as far.  Conversely, if we get twice as close and use 25mm for the same framing of the subject, now the subject is 5 ft away, and the other object is 25 ft away -- five times as far.
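The distances in this example can be worked through with a short sketch.  The function names are illustrative, and the focal length scaling assumes low magnification, where framing of the subject is proportional to focal length divided by subject distance.

```python
def distance_ratio(subject_ft, background_ft, camera_shift_ft):
    """Background-to-subject distance ratio after moving the camera.

    A positive shift moves the camera back, a negative shift moves it closer.
    """
    return (background_ft + camera_shift_ft) / (subject_ft + camera_shift_ft)

def focal_length_for_same_framing(focal_length_mm, old_distance_ft, new_distance_ft):
    """Focal length that keeps the subject framed the same (low magnification)."""
    return focal_length_mm * new_distance_ft / old_distance_ft

for shift in (0, 10, -5):
    subject = 10 + shift
    fl = focal_length_for_same_framing(50, 10, subject)
    ratio = distance_ratio(10, 30, shift)
    print(f"subject at {subject} ft, {fl:.0f}mm: background is {ratio:.0f}x as far")
```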

Not only does the subject-camera distance change the perspective by changing the relative distances of subjects within the frame, it also changes, in a similar fashion, how widely separated they are in the frame.  In fact, when we use a longer perspective, we will often find that much of what was in the frame of a closer perspective is now outside the frame (the tree photos here are an excellent example of this).  Inasmuch as the scene as a whole matters, rather than simply the actual subject, perspective can be one of the most striking elements of a photograph.

 

 

FRAMING

For a given perspective, the framing can be thought of as the whole of the captured scene, and is synonymous with the FOV (field of view), which is a combination of the horizontal and vertical AOV (angle of view).  Unless otherwise specified, the term "AOV" refers to the diagonal AOV.  The distinction between AOV and FOV need not be made when systems share the same aspect ratio, but the greater the difference in aspect ratios, the more important the distinction between the terms.

In addition, it is important to note that the focal length (and f-ratio) marked on a lens is for infinity focus (magnification, m, is equal to zero).  As the magnification increases (subject-camera distance decreases), both the effective focal length and the effective f-ratio will increase in the same proportion (so the AOV narrows), which is an especially important point for macro, and near macro, photography, and is discussed further down.

We can compute the horizontal, vertical, and diagonal AOVs for infinity focus with the following formula:

AOV = 2 · tan⁻¹ [ s / (2 · FL) ]

where

AOV = angle of view (degrees)
s = sensor dimension (mm)
FL = focal length (mm)

For example, the diagonal, horizontal, and vertical AOV for infinity focus (m=0) on 35mm FF at 50mm is:

Diagonal AOV for 50mm on 35mm FF = 2 · tan⁻¹ [43.3mm / (2 · 50mm)] ~ 47°
Horizontal AOV for 50mm on 35mm FF = 2 · tan⁻¹ [36mm / (2 · 50mm)] ~ 40°
Vertical AOV for 50mm on 35mm FF = 2 · tan⁻¹ [24mm / (2 · 50mm)] ~ 27°
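The AOV formula translates directly into code.  The sketch below uses only Python's standard `math` module; the function name is an illustrative choice.

```python
import math

def aov_degrees(sensor_dim_mm, focal_length_mm):
    """Angle of view at infinity focus: 2 * atan(s / (2 * FL)), in degrees."""
    return math.degrees(2 * math.atan(sensor_dim_mm / (2 * focal_length_mm)))

# 35mm FF at 50mm: diagonal, horizontal, and vertical sensor dimensions in mm.
for label, dim in (("diagonal", 43.3), ("horizontal", 36.0), ("vertical", 24.0)):
    print(f"{label} AOV: {aov_degrees(dim, 50):.1f} degrees")
```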

Solving the AOV formula for focal length, we have:

FL = s / [ 2 · tan (AOV / 2) ]

Let's now compute the focal length for 35mm FF, 1.5x, 1.6x, and 4/3 for a diagonal AOV of 47° at infinity (m=0):

FL for FF = 43.3mm / [ 2 · tan (47° / 2) ] ~ 50mm
FL for 1.5x = 28.4mm / [ 2 · tan (47° / 2) ] ~ 33mm
FL for 1.6x = 26.7mm / [ 2 · tan (47° / 2) ] ~ 31mm
FL for 4/3 = 21.6mm / [ 2 · tan (47° / 2) ] ~ 25mm

Note that these focal lengths are all proportional to the sensor ratio:

50mm / 1.5 ~ 33mm
50mm / 1.6 ~ 31mm
50mm / 2 ~ 25mm

Now we'll repeat for a horizontal AOV of 40° at infinity (m=0):

FL for FF = 36mm / [ 2 · tan (40° / 2) ] ~ 50mm
FL for 1.5x = 23.7mm / [ 2 · tan (40° / 2) ] ~ 33mm
FL for 1.6x = 22.2mm / [ 2 · tan (40° / 2) ] ~ 31mm
FL for 4/3 = 17.3mm / [ 2 · tan (40° / 2) ] ~ 24mm

Once again, we see these are proportional to the sensor ratio:

50mm / 1.5 ~ 33mm
50mm / 1.6 ~ 31mm
50mm / 2.08 ~ 24mm

And for a vertical AOV of 27° at infinity (m=0):

FL for FF = 24mm / [ 2 · tan (27° / 2) ] ~ 50mm
FL for 1.5x = 15.7mm / [ 2 · tan (27° / 2) ] ~ 33mm
FL for 1.6x = 14.8mm / [ 2 · tan (27° / 2) ] ~ 31mm
FL for 4/3 = 13mm / [ 2 · tan (27° / 2) ] ~ 27mm

And, again, these focal lengths are proportional to the sensor ratio:

50mm / 1.5 ~ 33mm
50mm / 1.6 ~ 31mm
50mm / 1.85 ~ 27mm
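The focal length calculations above can be reproduced with the inverse of the AOV formula.  This is a minimal sketch; the function name and the format table are illustrative, with sensor dimensions taken from the worked examples.

```python
import math

def focal_length_mm(sensor_dim_mm, aov_deg):
    """Focal length giving the stated AOV: s / (2 * tan(AOV / 2))."""
    return sensor_dim_mm / (2 * math.tan(math.radians(aov_deg) / 2))

# Diagonal, horizontal, and vertical sensor dimensions (mm) per format.
FORMATS = {
    "FF": (43.3, 36.0, 24.0),
    "1.5x": (28.4, 23.7, 15.7),
    "1.6x": (26.7, 22.2, 14.8),
    "4/3": (21.6, 17.3, 13.0),
}

for name, (diag, horiz, vert) in FORMATS.items():
    print(f"{name}: diagonal 47 deg -> {focal_length_mm(diag, 47):.1f}mm, "
          f"horizontal 40 deg -> {focal_length_mm(horiz, 40):.1f}mm, "
          f"vertical 27 deg -> {focal_length_mm(vert, 27):.1f}mm")
```

The 4/3 results differ between the three directions because its 4:3 aspect ratio is not the same as the 3:2 ratio of the other formats.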

 

The effective focal length (EFL) of the lens for a subject at a distance d (mm) from the aperture is given by:

 EFL = FL · (1 + m / p)

where

m = image magnification (ratio of the height of the image on the sensor to the height of the actual object)
p = pupil magnification (the ratio of the diameter of the exit pupil to the diameter of the entrance pupil)

Symmetric lenses have equal entrance pupil and exit pupil diameters.  Thus, p=1 for a symmetric lens and we can disregard it.  As a general rule, normal lenses tend to be closer to symmetric designs, longer lenses tend to have larger entrance pupils than exit pupils, thus p<1 (progressively smaller the longer the lens), and wider lenses (especially retrofocus designs) are the opposite, with p>1.

The following table demonstrates the effect of focal distance on the EFL of a symmetric 50mm lens (p=1):

 

Magnification    m           EFL for a symmetric 50mm lens

1 : ∞            m = 0       50mm
1 : 20           m = 0.05    52.5mm
1 : 10           m = 0.1     55mm
1 : 5            m = 0.2     60mm
1 : 3            m = 0.33    67mm
1 : 2            m = 0.5     75mm
1 : 1            m = 1       100mm
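The table values follow directly from the EFL formula, as the following sketch shows.  The function name is illustrative, and p defaults to 1 for a symmetric lens.

```python
def effective_focal_length(focal_length_mm, magnification, pupil_magnification=1.0):
    """EFL = FL * (1 + m / p)."""
    return focal_length_mm * (1 + magnification / pupil_magnification)

# Magnification expressed as 1 : x, so m = 1 / x.
for x in (20, 10, 5, 3, 2, 1):
    m = 1 / x
    print(f"1 : {x}  m = {m:.2f}  EFL = {effective_focal_length(50, m):.1f}mm")
```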


A useful relationship between focal length, sensor size, subject to aperture distance, and the height or width of the focal plane in the photo is:

EFL / d = s / (s + h)

where all variables below are given in mm (1m = 1000mm, 1 ft = 304.8mm)

EFL = effective focal length
s      = sensor dimension (sensor height for landscape orientation, sensor length for portrait orientation -- given in the table just a bit further down)
d     = distance to subject
h     = height of frame

For low magnifications, the formula reduces to:

FL / d ≈ s / h

For example, let's say we have a landscape oriented photo of a model who is 5' 8" (1727mm) tall, takes up 2/3 of the frame from bottom to top, and wish to know what distance the model was from the camera if taken on FF with an 85mm lens.  The calculation is as follows:

85 / d = 24 / (1727 / ⅔) → d = 9175mm = 30 ft.
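The same calculation can be sketched in code using the low-magnification form, FL / d ≈ s / h.  The function name is an illustrative assumption, and the approximation is only appropriate when the subject is far away compared with the focal length.

```python
MM_PER_FT = 304.8

def subject_distance_mm(focal_length_mm, sensor_dim_mm, frame_height_mm):
    """Low-magnification approximation: d ~ FL * h / s."""
    return focal_length_mm * frame_height_mm / sensor_dim_mm

# Model is 1727mm tall and fills 2/3 of the frame height.
frame_height_mm = 1727 / (2 / 3)

# FF in landscape orientation: the sensor height is 24mm.
distance_mm = subject_distance_mm(85, 24, frame_height_mm)
print(f"{distance_mm:.0f}mm = {distance_mm / MM_PER_FT:.1f} ft")
```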

 

Listed below are tables of common ERs (equivalence ratios -- crop factors) in relation to 35mm FF for images using the same AOV (see here for a more complete list).  When given in stops, the ER is rounded to the nearest 1/3 stop.  The reason that 35mm FF (24mm x 36mm) is chosen as a standard is due to its popularity in the days of film and the fact that there are more lenses made for this particular format which many of the smaller sensor DSLRs also use, but we can use any format as a reference.  Due to different aspect ratios, when cropping to the dimensions of the more square sensor, we use the ratio of the shorter dimensions of the sensor to compute the ER, and when cropping to the dimensions of the more elongated sensor, we use the ratio of the longer sensor dimensions.  In the case of 3:2 being cropped to 4:3, or vice-versa, this will result in less than a 1/3 stop difference.

One side effect of cropping 3:2 images to 4:3 is that it greatly mitigates any softness that might show in the extreme corners.  However, we must also realize that this comes at the expense of removing 1/9 of the pixels from the image.  But as 3:2 systems generally have more pixels than 4:3 systems of the same generation, this can be done without any detail penalty when comparing systems.  Realistically, however, the extreme corners make up so little of the image, and are so close between systems anyway at the same DOF that it is only a consideration for the most hardcore of "pixel-peepers".  Please see this image as an example of what would be called a "huge" difference in the corners of different systems at the same DOF.  I simply see it as a non-issue, especially considering that the differences elsewhere in the frame matter more by far, but others see it as a serious disadvantage.  In any event, framing slightly wider and cropping to 4:3 will basically eliminate even that extreme case.
 

Compacts / Cell Phones:
 

Sensor Size        Dimensions (mm)   Diagonal (mm)   Area (mm²)   ER      ER (stops)

1/3.2" (iPhone)    3.42 x 4.54       5.68            15.5         7.62x   5.86 → 6
1/2.7"             4.04 x 5.37       6.72            21.7         6.44x   5.38 → 5 1/3
1/2.5"             4.29 x 5.76       7.18            24.7         6.02x   5.18 → 5 1/3
1/2.33"            4.60 x 6.13       7.66            28.2         5.65x   5
1/1.8"             5.32 x 7.72       8.93            41.0         4.84x   4.56 → 4 1/2
1/1.7"             5.7 x 7.6         9.5             43.3         4.55x   4.38 → 4 1/3
2/3"               6.6 x 8.8         11.0            58.1         3.93x   3.95 → 4
1" (Sony RX100)    8.8 x 13.2        15.9            116          2.73x   2.89 → 3

DSLRs / mirrorless:

Sensor Size                                      Dimensions (mm)   Diagonal (mm)   Area (mm²)   ER      ER (stops)

CX (Nikon 1)                                     8.8 x 13.2        15.9            116          2.73x   2.89 → 3
4/3 (Olympus, Panasonic)                         13.0 x 17.3       21.6            225          2.00x   2
APS-C (Sigma)                                    13.8 x 20.7       24.9            286          1.74x   1.60 → 1 2/3
APS-C (Canon)                                    14.9 x 22.3       26.8            332          1.61x   1.38 → 1 1/3
APS-C (Sony, Nikon, K-M, Pentax, Fuji)           15.7 x 23.7       28.4            372          1.52x   1.22 → 1 1/3
APS-H (Canon 1D series)                          19.1 x 28.7       34.5            548          1.26x   0.66 → 2/3
35mm FF (Canon 1Ds series, 5D; Nikon D3, D700)   24 x 36           43.3            864          1.00x   0
Leica S2                                         30 x 45           54.1            1350         0.80x   -0.64 → -2/3
Pentax 645                                       33 x 44           55              1452         0.79x   -0.69 → -2/3
MF (Mamiya ZD)                                   36 x 48           60              1728         0.72x   -0.94 → -1
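The table columns can be derived from the sensor dimensions alone. The sketch below assumes the ER is the ratio of the sensor diagonals and that the ER in stops is 2 · log2(ER); both assumptions are inferred from the table values (for example, 2.00x corresponds to 2 stops), and the function name is hypothetical:

```python
import math

FF_DIAGONAL_MM = math.hypot(24, 36)   # 35mm FF reference diagonal, about 43.3mm


def sensor_stats(height_mm, width_mm):
    """Return (diagonal, area, ER, ER in stops) relative to 35mm FF.

    Assumptions: ER is the ratio of diagonals; stops = 2 * log2(ER).
    """
    diagonal = math.hypot(height_mm, width_mm)
    area = height_mm * width_mm
    er = FF_DIAGONAL_MM / diagonal
    stops = 2 * math.log2(er)
    return diagonal, area, er, stops


for name, (h, w) in {"4/3": (13.0, 17.3), "APS-C (Canon)": (14.9, 22.3),
                     "35mm FF": (24, 36)}.items():
    diagonal, area, er, stops = sensor_stats(h, w)
    print(f"{name:<14} diag {diagonal:5.1f}mm  area {area:5.0f}mm²  "
          f"ER {er:.2f}x  {stops:+.2f} stops")
```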

 

Rather than relate to an arbitrary standard, such as 35mm FF, we can compute the ER between any two systems directly from the lengths of their respective sensors, or, more simply, either divide the ERs of the respective systems or, when working in stops, subtract their ERs in stops, using the values in the tables above.  For example, the ER between a Canon 40D and Olympus E3 can be computed (for the same AOV) as 2.00 / 1.62 ~ 1.23 (2/3 of a stop to the nearest 1/3 stop, or, more simply:  2 stops - 1 1/3 stops = 2/3 of a stop).  Thus, 25mm f/2 ISO 100 on 4/3 would have the same AOV, DOF, and shutter speed as 31mm f/2.5 ISO 160 on 1.6x, since 25mm x 1.23 ~ 31mm, f/2 x 1.23 ~ f/2.5, and ISO 100 x 1.23² ~ ISO 160 (or, alternatively, f/2 + 2/3 stops = f/2.5 and ISO 100 + 2/3 stops = ISO 160).
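As a rough illustration of the 4/3 to 1.6x conversion, the following sketch scales the focal length and f-ratio by the relative ER and the ISO by its square. The function name is hypothetical, and the result is left unrounded, so the ISO comes out near 152 rather than the nearest 1/3 stop setting of ISO 160:

```python
def equivalent_settings(focal_length_mm, f_ratio, iso, er_from, er_to):
    """Settings on the target format with the same AOV, DOF, and shutter speed."""
    ratio = er_from / er_to            # relative ER between the two systems
    return focal_length_mm * ratio, f_ratio * ratio, iso * ratio ** 2


# 25mm f/2 ISO 100 on 4/3 (2.00x) converted to Canon 1.6x (1.62x for the 40D).
fl, f, iso = equivalent_settings(25, 2.0, 100, er_from=2.00, er_to=1.62)
print(f"{fl:.0f}mm f/{f:.1f} ISO {iso:.0f}")
```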

 

DOF / Diffraction / Total Amount of Light on the Sensor

The DOF, diffraction, and total amount of light projected on the sensor are all intimately related to the aperture diameter.  This section will begin by discussing DOF, followed by a discussion on diffraction.  The total amount of light projected on the sensor is discussed in a different section, Exposure, Lightness, and Total Light.

The DOF (depth of field) is the distance between the near and far points from the focal plane that appear to be in critical focus and is a central player in the amount of detail rendered in an image. It is also important not to confuse DOF with background blur (which is discussed further down).  Photos with:

will have the same DOF (and diffraction). Alternatively, photos with:

will also have the same DOF (and diffraction).

Note that neither the number of pixels nor the size of the pixels figures into the CoC at all, except inasmuch as the size we display a photo depends on the size and/or number of pixels that make up the photo, such as when viewing 100% crops on a computer monitor.  The mathematics demonstrating the equivalencies is worked out a bit further down -- do try to contain your excitement! ; )

Moving right along, only an infinitesimally small portion of the image is actually in focus (the focal plane), but as our eyes and brain cannot see with infinite precision, the focal plane is perceived to have some depth. As we enlarge the image, we can more clearly see that less and less of the image is within focus, and this is how the DOF changes with enlargement.

Of course, no lens is perfect, so the focal plane is not a plane at all, but rather a surface.  In some instances, the curvature of the focal plane (field curvature) can be extreme enough that what appears to be edge softness is actually a flat surface falling outside the focal "plane".  In addition, the focus falloff is gradual -- the closer elements in the scene are to the focal surface, the sharper they will appear.  The DOF is the depth from an ideal focal plane in which we consider elements of the scene to be "sharp enough".

The number of pixels and the sharpness of the lens, on the other hand, have nothing to do with DOF.  These are independent factors in the sharpness of the photo -- a low resolution image displayed with large dimensions does not necessarily have low DOF -- the blur is a result of the lower resolution.  The difference between the blur due to limited DOF and the blur due to other factors (soft lens, low pixel count, camera shake, diffraction, etc.) is that these other sources of blur affect the entire photo equally, whereas the blur associated with shallow DOF will be greater for the portions of the scene further from the focal plane.  Blur due to motion, of course, will selectively affect objects that have the greatest relative motion in the frame (that is, a slow moving object close to the camera may have greater blur than a fast moving object far from the camera) and blur due to field curvature will increase as we move away from the focal point, which in many cases may mimic a more shallow DOF.

Most, if not all, online DOF calculators (as well as DOF tables) are based on "standard viewing conditions" of an 8x10 inch photo (or any photo displayed with a 12.8 inch -- 325mm -- diagonal) viewed from a distance of 10 inches with 20-20 vision.  Change any of those parameters (and please note that the pixel size is not one of the parameters), and you'll change the DOF (although, for example, if you double both the display dimensions and the viewing distance, these two effects will cancel each other out), and these parameters are accounted for with the CoC (circle of confusion) in the DOF formula(s).

Let's compute the CoC for the "standard viewing conditions" with FF, APS-C, and mFT (4/3):

  • Viewing distance = (10 in) · (2.54 cm / in) = 25 cm
  • Final image resolution for 20-20 vision with a viewing distance of 10 in (25 cm) = 5 lp / mm
  • Enlargement:  325 mm / 43.3 mm = 7.5 for FF, 325 mm / 28.4 mm = 11.4 for 1.5x, 325 mm / 26.8 mm = 12.1 for 1.6x, and 325 mm / 21.6 mm = 15 for mFT (4/3)

Plugging into the CoC Formula, CoC (mm) = viewing distance (cm) / desired final-image resolution (lp/mm) for a 25 cm viewing distance / enlargement / 25, we get:

  • FF:    CoC = (25 cm) / (5 lp / mm) /  7.5  / 25 = 0.027 mm
  • 1.5x:  CoC = (25 cm) / (5 lp / mm) / 11.4 / 25 = 0.018 mm
  • 1.6x:  CoC = (25 cm) / (5 lp / mm) / 12.1 / 25 = 0.017 mm
  • mFT:  CoC = (25 cm) / (5 lp / mm) /  15   / 25 = 0.013 mm
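The four results above can be reproduced with a short Python sketch of the CoC formula. The function name is hypothetical, and the enlargement factors are hard-coded as rounded in the text so that the output matches the worked values:

```python
def coc_mm(viewing_distance_cm, resolution_lp_per_mm, enlargement):
    """CoC (mm) = viewing distance (cm) / resolution (lp/mm) / enlargement / 25."""
    return viewing_distance_cm / resolution_lp_per_mm / enlargement / 25


# Enlargement to a 325mm display diagonal, rounded as in the text.
ENLARGEMENT = {"FF": 7.5, "1.5x": 11.4, "1.6x": 12.1, "mFT": 15}

for name, enlargement in ENLARGEMENT.items():
    print(f"{name:>4}: CoC = {coc_mm(25, 5, enlargement):.3f} mm")
```

The same function handles other viewing conditions by changing the viewing distance and enlargement arguments.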

Let's compute one more example for the CoC using a 20x30 inch photo viewed from 2 ft away with 20-20 vision taken with a FF camera (24mm x 36mm sensor):

  • Viewing distance = (2 ft) · (12 in / ft) · (2.54 cm / in) = 61 cm
  • Final image resolution for 20-20 vision = 5 lp / mm
  • Enlargement = (30 in x 25.4 mm / in) / 36 mm = 21.2

Plugging into the CoC Formula, CoC (mm) = viewing distance (cm) / desired final-image resolution (lp/mm) for a 25 cm viewing distance / enlargement / 25, we get CoC = (61 cm) / (5 lp / mm) / (21.2) / 25 = 0.023 mm, which is what we would expect, since viewing a 20x30 inch photo at 2 ft is equivalent to viewing an 8.3x12.5 inch photo at 10 inches (very close to "standard viewing conditions").

More simply, however, there is the Zeiss Formula for calculating the CoC, which is simply the sensor diagonal divided by 1730.  The values worked out in the examples above, by comparison, are the same as the sensor diagonal divided by 1600:

  • FF:    CoC = 43.3mm / 1600 = 0.027 mm (Zeiss:  43.3mm / 1730 = 0.025 mm)
  • 1.5x:  CoC = 28.4mm / 1600 = 0.018 mm (Zeiss:  28.4mm / 1730 = 0.016 mm)
  • 1.6x:  CoC = 26.8mm / 1600 = 0.017 mm (Zeiss:  26.8mm / 1730 = 0.015 mm)
  • mFT:  CoC = 21.6mm / 1600 = 0.013 mm (Zeiss:  21.6mm / 1730 = 0.012 mm)

In any case, what this demonstrates is that the CoC is proportional to the sensor diagonal for a given display size, viewing distance, and visual acuity and independent of the pixel count.  A popular online DOF calculator, DOFMaster, uses sensor diagonal / 1400 for the CoC.  This online calculator allows you to select the CoC; however, for comparative purposes across formats, the CoC will scale by the equivalence ratio (crop factor).
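To see the proportionality concretely, note that under the standard viewing conditions the CoC formula collapses to diagonal / (5 × 325) = diagonal / 1625, which is why a divisor of 1600 reproduces the worked values. The sketch below compares the three divisors mentioned here; the function and constant names are hypothetical:

```python
STANDARD_DIVISOR = 5 * 325   # 5 lp/mm times the 325mm display diagonal = 1625


def coc_from_diagonal(diagonal_mm, divisor):
    """CoC as a fixed fraction of the sensor diagonal."""
    return diagonal_mm / divisor


for name, diagonal in {"FF": 43.3, "1.5x": 28.4, "1.6x": 26.8, "mFT": 21.6}.items():
    values = [coc_from_diagonal(diagonal, n) for n in (1600, 1730, 1400)]
    print(f"{name:>4}: /1600 = {values[0]:.3f}  /1730 = {values[1]:.3f}  "
          f"/1400 = {values[2]:.3f} mm")
```

Whichever divisor is chosen, the ratio of the CoCs between two formats equals the ratio of their diagonals.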

On the other hand, the DOF formulas do not include how closely we scrutinize a photo.  In other words, two photos might have the same DOF per the mathematical formulas, but if we scrutinize one photo more closely than another (perhaps it is more interesting, for example), then the DOFs may appear different:

Scrutinizing one image more critically than another has the same effect as looking at that image with a higher visual acuity than the other.

However, for two photos of the same scene displayed at the same size and viewed from the same distance that have the same computed DOF, then whatever the subjective impression of the DOF is for one photo, it will be the same for the other photo (although, as discussed above, it's easy to confuse "blurry" with "less DOF").

As the DOF deepens, more of the image is rendered sharply, both because more of the image is within the DOF, and because the aberrations of the lens lessen as the aperture gets smaller -- up to a point.  Depending on the sensor pixel size and display size of an image, the effects of diffraction softening will begin to degrade the sharpness of the image more than the deeper DOF and lesser aberrations increase the sharpness.  However, the point at which diffraction softening outweighs a deeper DOF and lesser aberrations depends tremendously upon the scene and the lens sharpness.  It is common to read about "diffraction limited apertures", but these are based on a "perfect" lens and images where the whole of the scene lies within the DOF.  In other words, it is quite common to achieve a sharper and more detailed image that is past the "diffraction limited" aperture due to the deeper DOF including more of the scene.

At the opposite end of the DOF spectrum, shallow DOFs serve to isolate the subject from the background.  However, while a more shallow DOF does lead to a greater background blur, it is not the only, or, in many instances, even the major player in the quantity of background blur, much in the same way that many confuse the bokeh (the quality of the out-of-focus areas of an image) with the quantity of the blur.  For example, if the subject is 10 ft from the camera, 50mm f/2 will have the same framing and DOF on the same format as 100mm f/2 for a subject 20 ft away.  That is, the same distance from the focal plane will be considered to be in critical focus.  But the nature of the background blur will be very different -- the longer focal length will magnify the background blur.

In fact, we can be more specific.  For the same framing, the amount of background blur (assuming the background is well outside the DOF) is proportional to the aperture diameter.  For example, while the DOF for 50mm f/2 and 100mm f/2 will be the same for the same framing (in most circumstances), the background blur will be double for 100mm f/2 since the aperture diameter is twice as large for 100mm f/2 as for 50mm f/2 (100mm / 2 = 50mm, 50mm / 2 = 25mm).  A good tutorial on this can be found here and here is an excellent blur calculator/demonstrator.
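The factor of two can be checked with a small sketch. It assumes a thin lens and a background at infinity, in which case the blur disk on the sensor is roughly the aperture diameter times the subject magnification, with m = FL / (d - FL). That relation is a standard thin-lens result rather than something stated in this essay, and the function name is hypothetical:

```python
MM_PER_FT = 304.8


def background_blur_mm(focal_length_mm, f_ratio, subject_distance_mm):
    """Blur disk diameter on the sensor for a background at infinity.

    Assumption: thin lens, blur = aperture diameter * subject magnification.
    """
    aperture_diameter = focal_length_mm / f_ratio
    magnification = focal_length_mm / (subject_distance_mm - focal_length_mm)
    return aperture_diameter * magnification


blur_50 = background_blur_mm(50, 2, 10 * MM_PER_FT)
blur_100 = background_blur_mm(100, 2, 20 * MM_PER_FT)
print(f"50mm f/2 at 10 ft:  {blur_50:.3f}mm")
print(f"100mm f/2 at 20 ft: {blur_100:.3f}mm (ratio {blur_100 / blur_50:.2f})")
```

Because the framing (and so the magnification) is the same in both cases, the ratio of the blur disks reduces to the ratio of the aperture diameters.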

We can now make the following generalizations about the DOF of images on different formats for non-macro situations (when the subject distance is "large" compared to the focal length), keeping in mind that aperture diameter = focal length / f-ratio, and assuming that all images are viewed from the same distance with the same visual acuity:

 

 
  • For the same perspective, framing, relative aperture, and display size, larger sensor systems will yield a more shallow DOF than smaller sensors in proportion to the ratio of the sensor sizes.

  • For the same perspective, framing, aperture diameter, and display size, all systems have the same DOF.

  • If both formats use the same focal length and relative aperture (and thus also the same aperture diameter), but the larger sensor system gets closer so that the subject occupies the same area of the frame, and the photos are displayed at the same dimensions, then the larger sensor system will have a more shallow DOF in proportion to the ratio of the sensor sizes.

  • For the same perspective and focal length, larger sensor systems will have a wider framing.  If the same relative aperture is used, then both systems will also have the same aperture diameter.  As a result, if the photo from the larger sensor system is displayed at a larger size in proportion to the ratio of the sensor sizes, or the photo from the larger sensor system is cropped to the same framing as the image from the smaller sensor system and displayed at the same size, then the two photos will have the same DOF.

 

Let's give examples for each scenario using mFT (4/3), 1.6x, and FF (forgive me for leaving out 1.5x, as it is so close to 1.6x as to be all but redundant to use for the purpose of examples, as I am repeating the process several times).  As noted earlier, the condition of "same display size" only requires the same diagonal length, rather than the same length and width.  This distinction is unnecessary when the systems have the same aspect ratio, but can sometimes be a factor when the aspect ratios are not the same (for example, if we display a photo with a 15 inch diagonal, then a 4:3 photo would be 9 x 12 inches and a 3:2 photo would be 8.3 x 12.5 inches).  In all cases, we assume the same viewing distance and visual acuity:

 

  • Let's say we are taking a photo of a subject 10 ft away, and use 40mm f/2.8 on mFT (4/3), 50mm f/2.8 on 1.6x, and 80mm f/2.8 on FF.  All will have the same perspective, since the subject-camera distance is the same, and all will have the same AOV, since 40mm x 2 = 50mm x 1.6 = 80mm.  Since all are using f/2.8, then if we display the photos at the same size, FF will have the least DOF, 1.6x will have 1.6x the DOF of FF, and mFT (4/3) will have twice the DOF of FF (1.25x the DOF of 1.6x).

  • Again, let's say we are taking a photo of a subject 10 ft away, but this time use 40mm f/4 on mFT (4/3), 50mm f/5 on 1.6x, and 80mm f/8 on FF.  Once again, all will have the same perspective since the subject-camera distances are the same, and all will have the same AOV since 40mm x 2 = 50mm x 1.6 = 80mm.  The aperture diameters will also be the same since 40mm / 4 = 50mm / 5 = 80mm / 8 = 10mm.  In this case, all photos will have the same DOF when displayed at the same dimensions.

  • This time, let's shoot the subject from 20 ft at 40mm f/4 on mFT (4/3), 16 ft at 40mm f/4 on 1.6x, and 10 ft at 40mm f/4 on FF.  While the perspectives are different (since the subject-camera distances are not the same), the subject occupies the same area of the frame since 20 ft / 2 = 16 ft / 1.6 = 10 ft, but FF will have the most shallow DOF, 1.6x will have a DOF 1.6x deeper, and mFT (4/3) will double the DOF.

  • We now shoot the same subject from 10 ft away with all formats, but this time use the same focal length and same f-ratio as well (for example, 50mm f/2.8).  If we display the mFT (4/3) photo with a 12 inch diagonal, the 1.6x photo with a 15 inch diagonal, and the FF photo with a 24 inch diagonal, and view the images from the same distance, then all will have the same DOF.  Note how the diagonals correspond to the focal multipliers of the respective systems:  12 in x 2 = 15 in x 1.6 = 24 in, which means that if we cropped the photos to the same framing, they would all be the same dimensions.

 

Let's now demonstrate the DOF equivalence mathematically.  As stated earlier, the DOF is the distance from the focal plane where objects in this zone are considered to be critically sharp.  However, the distance from the focal plane is not always an even split.  When the subject distance (d) is "large" compared to the focal length of the lens (non-macro distances), the far limit of critical focus (df), near limit of critical focus (dn), and DOF can be computed as:

  • df ~ [H · d] / [H - d]

  • dn ~ [H · d] / [H + d]

  • DOF = df - dn ~ [2 · H · d²] / [H² - d²]

where d is the distance to the subject and H is the hyperfocal distance.  We can now compute the DOF behind the subject and the DOF in front of the subject:

  • DOF behind = df - d = d² / [H - d]

  • DOF in front = d - dn = d² / [H + d]

Note that the smaller the subject-camera distance (d) becomes in comparison to the hyperfocal distance (H), the more evenly the DOF is split in front and behind the subject, since (H - d) and (H + d) are nearly equal for values of d that are small compared to H.  In other words, the common wisdom that 1/3 of the DOF is in front of the subject and 2/3 of the DOF is behind the subject is not always true.  This "rule" is valid only when the subject-camera distance, d, is equal to 1/3 the hyperfocal distance, H.  As the subject distance changes from that particular value, the 1/3 - 2/3 DOF split becomes a progressively less accurate description of the split of the DOF in front and behind the subject. In another scenario, it is also interesting to note that as the subject distance approaches the hyperfocal distance, the far distance of critical focus approaches infinity, and the near distance of critical focus approaches half the hyperfocal distance, thus giving infinite DOF beyond half the hyperfocal distance.
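The split can be sketched numerically from the df and dn formulas above. The function names and the 120 ft hyperfocal distance are hypothetical, picked only to make the 1/3 case land on a round number:

```python
def dof_limits(hyperfocal, distance):
    """Far and near limits of critical focus; requires distance < hyperfocal."""
    far = hyperfocal * distance / (hyperfocal - distance)
    near = hyperfocal * distance / (hyperfocal + distance)
    return far, near


def front_fraction(hyperfocal, distance):
    """Fraction of the total DOF that lies in front of the subject."""
    far, near = dof_limits(hyperfocal, distance)
    return (distance - near) / (far - near)


H = 120.0   # hypothetical hyperfocal distance, in ft
for d in (6.0, 40.0, 90.0):
    far, near = dof_limits(H, d)
    print(f"d = {d:4.0f} ft: near {near:5.1f} ft, far {far:6.1f} ft, "
          f"{front_fraction(H, d):.0%} of the DOF in front")
```

Algebraically the front fraction is (H - d) / (2·H), so it equals 1/3 only at d = H / 3 and approaches 1/2 as d becomes small compared to H.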

Another interesting scenario to consider is that when the subject-camera distance, d, is small compared to the hyperfocal distance, H, then, for the same format, the DOF will be essentially the same for the same framing and f-ratio.  For example, 50mm at 10 ft has the same framing as 100mm at 20 ft on 35mm FF.  If we shoot the scene at f/2 in each case, we will get the same DOF since the hyperfocal distance is 137 ft for 50mm (and longer still for 100mm) with a CoC of 0.03mm (the value used in most DOF calculators for 35mm FF, which corresponds to an 8x10 inch print viewed from a distance of 10 inches), which is much larger than the subject distance of 10 ft.  However, were we instead to compare 24mm f/2 at 30 ft to 48mm f/2 at 60 ft (same framing), we would get a different DOF since the hyperfocal distance for 24mm f/2 works out to roughly 30 ft (for a CoC of 0.03mm), which is about the same as, rather than much larger than, the subject-camera distance.
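Both comparisons can be run through the approximate formulas H ~ FL² / (f · c) and DOF ~ 2·H·d² / (H² - d²). This is a sketch with hypothetical function names, using the 0.03mm CoC quoted above:

```python
MM_PER_FT = 304.8


def hyperfocal_mm(focal_length_mm, f_ratio, coc_mm):
    """Approximate hyperfocal distance: H ~ FL^2 / (f * c)."""
    return focal_length_mm ** 2 / (f_ratio * coc_mm)


def dof_mm(hyperfocal, distance):
    """Approximate total DOF; only valid for distance < hyperfocal."""
    return 2 * hyperfocal * distance ** 2 / (hyperfocal ** 2 - distance ** 2)


results = {}
for fl, d_ft in [(50, 10), (100, 20), (24, 30), (48, 60)]:
    H = hyperfocal_mm(fl, 2, 0.03)
    results[fl] = dof_mm(H, d_ft * MM_PER_FT) / MM_PER_FT
    print(f"{fl}mm f/2 at {d_ft} ft: H = {H / MM_PER_FT:.1f} ft, "
          f"DOF = {results[fl]:.1f} ft")
```

The 50mm and 100mm cases agree to within about half a percent, while the 24mm and 48mm cases differ by a large factor because the 24mm subject distance sits almost at the hyperfocal distance.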

In any case, we can see that the DOF is a function only of the hyperfocal distance (H) and the subject distance (d).  The roles of the focal length (FL), f-ratio (f), and CoC (c) are contained in the hyperfocal distance:

H ~ FL² / (f · c)

If we scale the focal length, f-ratio, and CoC by the equivalence ratio (R), the hyperfocal distance remains the same:

H' ~ (FL·R)² / [(f · R) · (c · R)]

    = [FL² · R²] / [(f · c) · R²]

    = FL² / (f · c)

    = H

Consequently the DOF is invariant for the same perspective, framing, and aperture diameter. By expressing H in terms of aperture diameter (a), angle of view (AOV), and the proportion of the sensor diagonal that the CoC covers (p), we get a format independent expression for the hyperfocal distance, and consequently DOF:

H ~ a / [2·p·tan (AOV/2)]

Thus, for non-macro situations, the DOF for the same perspective, framing, and output size is also the same.
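A quick numerical check of both results, sketched in Python. It assumes a 40mm f/4 baseline on mFT with a CoC of diagonal / 1600, takes AOV = 2·atan(diagonal / (2·FL)), and uses hypothetical function names:

```python
import math


def hyperfocal_mm(focal_length_mm, f_ratio, coc_mm):
    """H ~ FL^2 / (f * c)."""
    return focal_length_mm ** 2 / (f_ratio * coc_mm)


def hyperfocal_format_independent(aperture_mm, aov_rad, coc_fraction):
    """H ~ a / [2 * p * tan(AOV / 2)], with p = CoC / sensor diagonal."""
    return aperture_mm / (2 * coc_fraction * math.tan(aov_rad / 2))


DIAGONAL_MM = 21.6            # mFT sensor diagonal
P = 1 / 1600                  # assumed CoC as a fraction of the diagonal
base = hyperfocal_mm(40, 4, DIAGONAL_MM * P)

# Scale focal length, f-ratio, and CoC by R: H should not change.
for R in (1.0, 1.25, 2.0):
    scaled = hyperfocal_mm(40 * R, 4 * R, DIAGONAL_MM * P * R)
    print(f"R = {R:.2f}: H = {scaled:.0f}mm")

aov = 2 * math.atan(DIAGONAL_MM / (2 * 40))
print(f"format independent: H = {hyperfocal_format_independent(10, aov, P):.0f}mm")
```

Here R = 1.25 and R = 2.0 stand for 1.6x and FF relative to mFT, matching the 40mm f/4, 50mm f/5, and 80mm f/8 example used earlier.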

A consequence of a larger sensor is that a longer focal length is required for the same perspective and framing, as well as a larger f-ratio to obtain the same aperture diameter.  For example, let's consider images taken of the same scene from the same position with the same framing:

  • A7R2 at 80mm, f/8 (aperture diameter = 80mm / 8 = 10mm)
  • D500 at 53mm, f/5 (aperture diameter = 53mm / 5 ~ 10mm)
  • 80D at 50mm, f/5 (aperture diameter = 50mm / 5 = 10mm)
  • EM1.2 at 40mm, f/4 (aperture diameter = 40mm / 4 = 10mm)

Since the perspective, framing, and aperture diameters are all the same, then for the same display size and viewing distance, their DOFs will also be the same. As an aside, if the shutter speeds are also the same (which will require a higher ISO for the higher f-ratios to maintain the same lightness), then the images will be made with the same total amount of light as well, which will result in the same relative noise if the sensors have the same efficiency.

Another reason that DOF is so important, even if DOF, per se, is not an issue to the photographer, is that it is also intimately connected with sharpness, diffraction softening, and vignetting.  The reason that DOF affects sharpness is twofold. First of all, as shown above, the DOF is directly related to the aperture, and the larger the aperture diameter, the greater the aberrations, and, in some instances, the greater the field curvature.  Secondly, a more shallow DOF means that less of the scene will be within the DOF, and, by definition, elements of the scene outside the DOF will not be sharp.  This second point is especially important, since, as noted earlier, DOF calculators usually base their calculations off a CoC for an 8x10 print viewed from 10 inches away.  Since so many now evaluate the sharpness of the lens on the basis of 100% crops on a computer monitor, the DOF that is seen at 100% on the computer screen is significantly more narrow than the DOF computed by the calculators.

 

In addition to DOF and sharpness, the aperture is also intimately connected to diffraction.  Diffraction softening is the result of the wave nature of light representing point sources as disks (known as Airy Disks), and is most definitely not, as is misunderstood by many, an effect of light "bouncing off" the aperture blades.  The diameter of the Airy Disk is a function of both the f-ratio and the wavelength of light:  d ~ 2.44·λ·f, where d is the diameter of the Airy Disk, λ is the wavelength of the light, and f is the f-ratio (relative aperture). Larger f-ratios (deeper DOFs) result in larger disks, as do longer wavelengths of light (towards the red end of the visible spectrum), so not all colors will suffer from diffraction softening equally.  The wavelengths of light in the visible spectrum differ by approximately a factor of two, so that means, for example, that red light will suffer around twice the amount of diffraction softening as blue light.

Diffraction softening is unavoidable at any aperture, and worsens as the lens is stopped down.  However, other factors mask the effects of the increasing diffraction softening:  the increasing DOF and the lessening lens aberrations.  As the DOF increases, more and more of the photo is rendered "in focus", making the photo appear sharper.  In addition, as the aperture narrows, the aberrations in the lens lessen since more of the aperture is masked by the aperture blades.  For wide apertures, the increasing DOF and lessening lens aberrations far outweigh the effects of diffraction softening.  At small apertures, the reverse is true.  In the interim (often, but not always, around a two stop interval), the two effects roughly cancel each other out, and the balance point for the edges typically lags behind the balance point for the center by around a stop (the edges usually suffer greater aberrations than the center).  In fact, it is not uncommon for diffraction softening to be dominant right from wide open for lenses slower than f/5.6 equivalent on FF, and thus these lenses are sharpest wide open (for the portions of the scene within the DOF, of course).

The optimum DOF is often more a matter of artistic intent than resolved detail.  Clearly, more shallow DOFs have less of the scene within critical focus, but this is by design.  What is not by design is that, at very wide apertures, lens aberrations reduce the detail even for the portions of the scene within the DOF, so even if the photographer prefers the more shallow DOF, they may choose to stop down simply to render more detail where detail is important.  Likewise, while a photographer may stop down with the intent to get as much of the scene as possible within the DOF so as to have a more detailed photo overall, portions of the scene that were within the DOF at wider apertures will become softer due to the effects of diffraction.  Thus, the photographer must balance the increase in detail gained by bringing more of the scene within the DOF against detail lost for portions of the scene that were within the DOF at wider apertures.  In addition, deeper DOFs require smaller apertures, which means either longer shutter speeds (increasing the risk/amount of motion blur and/or camera shake) or greater noise since less light will fall on the sensor at more narrow apertures for a given shutter speed.

A common myth is that smaller pixels suffer more from diffraction than larger pixels.  On the contrary, for a given sensor size and lens, smaller pixels always result in more detail.  That said, as we stop down and the DOF deepens, we reach a point where we begin to lose detail due to diffraction softening.  As a consequence, photos made with more pixels will begin to lose their detail advantage earlier and quicker than images made with fewer pixels, but they will always retain more detail.  Eventually, the additional detail afforded by the extra pixels becomes trivial (most certainly by f/32 on FF).  See here for an excellent example of the effect of pixel size on diffraction softening.

In terms of cross-format comparisons, all systems suffer the same from diffraction softening at the same DOF.  This does not mean that all systems resolve the same detail at the same DOF, as diffraction softening is but one of many sources of blur (lens aberrations, motion blur, large pixels, etc.).  However, the more we stop down (the deeper the DOF), the more diffraction becomes the dominant source of blur.  By the time we reach the equivalent of f/32 on FF (f/22 on APS-C, f/16 on mFT and 4/3), the differences in resolution between systems, regardless of the lens or pixel count, are trivial.

For example, consider the Canon 100 / 2.8L IS macro on a 5D2 (21 MP FF) vs the Olympus 14-42 / 3.5-5.6 kit lens on an L10 (10 MP 4/3).  Note that the macro lens on FF resolves significantly more (to put it mildly) at the lenses' respective optimal apertures, due to the macro lens being sharper, the FF DSLR having significantly more pixels, and the enlargement factor being half as much for FF vs 4/3.  However, as we stop down past the peak aperture, all those advantages are asymptotically eaten away by diffraction, and by the time we get to f/32 on FF and f/16 on 4/3, the systems resolve almost the same.

For the same color and f-ratio, the Airy Disk will have the same diameter, but span a smaller portion of a larger sensor than a smaller sensor, thus resulting in less diffraction softening in the final photo.  On the other hand, for the same color and DOF, the Airy Disk spans the same proportion of all sensors, and thus the effect of diffraction softening is the same for all systems at the same DOF.

Let's work an example using green light (λ = 530 nm = 0.00053mm). The diameter of the Airy Disk at f/8 is 2.44 · 0.00053mm · 8 = 0.0103mm, and the diameter of the Airy Disk at f/4 is half as much -- 0.0052mm.  For FF, the diameter of the Airy Disk represents 0.0103mm / 43.3mm = 0.024% of the sensor diagonal at f/8 and 0.0052mm / 43.3mm = 0.012% of the diagonal at f/4.  For mFT (4/3), the diameter of the Airy Disk represents 0.0103mm / 21.6mm = 0.048% of the diagonal at f/8 and 0.0052mm / 21.6mm = 0.024% at f/4.

Thus, at the same f-ratio, we can see that the diameter of the Airy Disk represents half the proportion of a FF sensor as mFT (4/3), but at the same DOF, the diameter of the Airy Disk represents the same proportion of the sensor. In other words, all systems will suffer the same amount of diffraction softening at the same DOF and display dimensions.  However, the system that began with more resolution will always retain more resolution, but that resolution advantage will asymptotically vanish as the DOF deepens.  In absolute terms, the earliest we will notice the effects of diffraction softening is when the diameter of the Airy Disk exceeds that of a pixel (two pixels for a Bayer CFA), but, depending on how large the photo is displayed, we may not notice until the diameter of the Airy Disk is much larger.
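The green light example can be reproduced with a few lines of Python based on d ~ 2.44·λ·f. The function and constant names are hypothetical:

```python
GREEN_MM = 0.00053   # 530 nm expressed in mm


def airy_disk_diameter_mm(wavelength_mm, f_ratio):
    """Airy Disk diameter: d ~ 2.44 * wavelength * f-ratio."""
    return 2.44 * wavelength_mm * f_ratio


def fraction_of_diagonal(f_ratio, diagonal_mm, wavelength_mm=GREEN_MM):
    """Airy Disk diameter as a fraction of the sensor diagonal."""
    return airy_disk_diameter_mm(wavelength_mm, f_ratio) / diagonal_mm


for name, diagonal in {"FF": 43.3, "mFT": 21.6}.items():
    for f_ratio in (8, 4):
        share = fraction_of_diagonal(f_ratio, diagonal)
        print(f"{name:>3} at f/{f_ratio}: {share:.3%} of the diagonal")
```

FF at f/8 and mFT at f/4 (the same DOF) come out at essentially the same share of the diagonal, while at the same f-ratio the mFT share is twice that of FF.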

Typically, the effects of diffraction softening do not even begin to become apparent until f/11 on FF (f/7.1 on APS-C and f/5.6 on mFT -- 4/3), and start to become strong by f/22 on FF (f/14 on APS-C and f/11 on mFT -- 4/3).  By f/32 on FF (f/22 on APS-C, f/16 on mFT -- 4/3) the effects of diffraction softening are so strong that there is little difference in resolution between systems, regardless of the lens, sensor size, or pixel count.

We can now summarize the effects of diffraction softening as follows:

  • Diffraction is always present. As the lens is stopped down, optical aberrations lessen and diffraction softening increases.
  • All else equal, more pixels will always resolve more detail, regardless of other sources of blur, including diffraction.
  • The "diffraction limited aperture" is the relative aperture where the effects of diffraction softening overcome the lessening lens aberrations, and will vary from lens to lens as well as where in the frame we are looking (e.g. center vs edges, where the edges typically, but not always, lag around a stop behind the center).
  • The pixel count has a very minor effect on the diffraction limited aperture. For example, if the diffraction limited aperture on a 12 MP sensor is f/5.6, it may be at f/4 on a 36 MP sensor, all else equal (but the 36 MP sensor will certainly resolve more at f/5.6 than the 12 MP sensor).
  • All systems suffer the same diffraction softening at the same DOF, but do not necessarily resolve the same detail at the same DOF, as diffraction softening is merely one of many forms of blur (e.g. lens aberrations, motion blur, large pixels, etc.).
  • As the DOF deepens, all systems asymptotically lose detail, and by f/32 on FF (f/22 on APS-C, f/16 on mFT -- 4/3), the differences in resolution between systems are trivial, regardless of the lens, sensor size, or pixel count.

It is worth noting that some lens tests show much greater discrepancies in the effects of diffraction softening than we would expect. Per the lens tests at www.slrgear.com, we can see huge disparities between f / 16 and f / 22 even with high end lenses like the Zuiko 50 / 2 macro (7 blades) and Zuiko 150 / 2 (9 blades), which are far greater than can be accounted for by the minor differences in the aperture shapes.  In fact, the Canon 100 / 2.8 macro and the Sigma 105 / 2.8 macro both have 8 blades, but show the same huge differences in sharpness from f / 22 to f / 32 on 1.6x as the Zuikos.  The most likely explanation for this is that at the minimum aperture, not all lenses are equally accurate.

For example, consider a 50mm lens and a constant "aperture bias" of -0.5mm, that is, the lens always sets the aperture 0.5mm smaller than it should be (whether as a result of sloppy quality control or sloppy design).  At f/4, the aperture diameter should be 50mm / 4 = 12.5mm.  However, a bias of -0.5mm would make the aperture diameter 12mm instead, resulting in a true f-ratio of 50mm / 12mm = f / 4.17 -- 1/9 of a stop off -- which is insignificant. At f / 8, the aperture diameter should be 50mm / 8 = 6.25mm.  Again, a bias of -0.5mm would make the aperture diameter 5.75mm, resulting in a true f-ratio of 50mm / 5.75mm = f / 8.7 -- 1/4 of a stop off -- bordering on significant, but still small enough to go unnoticed by most people.  At f / 22, however, the error becomes much more of an issue. The aperture diameter should be 50mm / 22 = 2.27mm.  This time, the -0.5mm bias would make the aperture diameter 1.77mm for a true f-ratio of 50mm / 1.77mm = f / 28 -- 2/3 of a stop different -- very noticeable, and resulting in a considerable difference in diffraction softening at such small apertures. Furthermore, the "aperture bias" need not be constant, and could vary depending on the selected f-ratio, producing even greater differences at small apertures.
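The arithmetic of this hypothetical bias can be sketched as follows, measuring the error in stops as 2·log2(true f-ratio / nominal f-ratio). The function names are illustrative only:

```python
import math


def true_f_ratio(focal_length_mm, nominal_f_ratio, bias_mm):
    """f-ratio actually delivered when the aperture diameter is off by bias_mm."""
    nominal_diameter = focal_length_mm / nominal_f_ratio
    return focal_length_mm / (nominal_diameter + bias_mm)


def stops_off(nominal_f_ratio, actual_f_ratio):
    """Exposure error in stops; one stop is a factor of sqrt(2) in f-ratio."""
    return 2 * math.log2(actual_f_ratio / nominal_f_ratio)


for nominal in (4, 8, 22):
    actual = true_f_ratio(50, nominal, bias_mm=-0.5)
    print(f"f/{nominal}: true f/{actual:.2f}, "
          f"{stops_off(nominal, actual):.2f} stops off")
```

The error grows quickly as the nominal aperture diameter shrinks toward the size of the bias itself.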

Of course, this hypothesis for the discrepancies in the effects of diffraction softening in the SLR Gear tests would need to be verified by comparing the exposures at different f-ratios.  In addition, the effects of vignetting can confound the issue at wide apertures, but, as demonstrated above, small errors in the aperture diameters are insignificant at wider apertures anyway.  Thus, we would test at small apertures, such as f / 22 and smaller, where the discrepancies due to aperture bias error are most noticeable.  Unfortunately, SLR Gear does not host (or even still have) these images to make such a comparison, so this conjecture remains unverified.  Furthermore, it is not unlikely that an "aperture bias" could have been an issue with the particular lens they tested, but not endemic to all (or most) copies of the lens. Lastly, while it is well-known that the shape of the aperture plays a role in how the bokeh is rendered, it is unlikely that it plays any role in the degree of diffraction softening so long as the area of the aperture is the same. Regardless, the effects of diffraction softening are not particularly significant until very small apertures.

To get a DOF larger than what the lens can stop down to achieve, we either use a shorter lens and TC (teleconverter), or frame wider and crop to the desired framing.  The effect of a TC is to multiply the relative aperture by the same factor as the focal length. For example, by using a 50mm macro at f/22 with a 2x TC, we would effectively be at 100mm f/45.  While more convenient than using a TC, the downside to framing wider and cropping is that it costs us pixels.  However, since the lenses for all systems can stop down to the diffraction limited resolution of the sensor, much of the detail lost by cropping would have been lost from diffraction softening regardless.  For example, an image at 100mm f/32 will have the same DOF and nearly the same detail as an image at 50mm f/16 taken from the same distance and then cropped to the same framing, despite the cropped image having 1/4 the number of pixels on the subject.  This is because the f/32 image has already lost almost the same amount of detail due to diffraction softening, although it will still retain slightly more detail, because oversampling with a greater number of diffraction limited pixels still renders slightly more detail than a smaller number of larger pixels.
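A minimal sketch of the two options, using hypothetical helper names. Note that 2 x 22 = 44, which is conventionally marked as f/45.

```python
def with_teleconverter(focal_length_mm, f_ratio, tc_factor):
    """A TC multiplies the focal length and the f-ratio by the same factor."""
    return focal_length_mm * tc_factor, f_ratio * tc_factor

def aperture_diameter_mm(focal_length_mm, f_ratio):
    """Aperture diameter = focal length / f-ratio."""
    return focal_length_mm / f_ratio

def crop_pixel_fraction(shot_focal_mm, target_focal_mm):
    """Fraction of the pixels left after cropping to the framing of a longer lens."""
    return (shot_focal_mm / target_focal_mm) ** 2

print(with_teleconverter(50, 22, 2))    # 50mm f/22 macro with a 2x TC
print(aperture_diameter_mm(100, 32))    # 100mm at f/32
print(aperture_diameter_mm(50, 16))     # 50mm at f/16: same aperture diameter
print(crop_pixel_fraction(50, 100))     # pixels remaining after the crop
```

Both options use the same aperture diameter from the same distance, which is why the DOF matches; the crop simply keeps a quarter of the pixels.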

Of course, it would be nice if we didn't have to stop down to increase sharpness for the portions of the image within the DOF, especially as this helps us avoid the effects of diffraction softening.  For example, let's say we are taking a photo of a landscape where the entire scene is within the DOF, even at f/2.8. Thus, there would be no reason to shoot at a different f-ratio on different systems to maintain the same DOF.  However, the aberrations for larger apertures are more problematic than the aberrations for smaller apertures, and, once again, we realize that a larger sensor system will require a higher f-ratio to maintain the same aperture diameter. Thus, even though the DOF may not be an issue per se, the aberrations, as well as vignetting, most certainly can be.

Of course, one might ask why we simply don't choose the settings on each system that produce the "best" results for each. Well, of course that is how we would use the systems. The section on partial equivalence talks more about this.

Putting it all together in terms of AOV, DOF, and shutter speed, let's look at some examples of equivalent settings from common cameras (using the same AOV) with all f-ratios and ISOs rounded to the nearest 1/3 stop, which show how the available DOFs on different formats differ:


Camera        Focal Multiplier   Focal Length (mm)   f-ratio   Shutter Speed   ISO

Canon S3      6.02x              8.3                 f / 2.8   1/400           100
Canon G7      4.84x              10.3                f / 3.2   1/400           125
Canon Pro1    3.93x              12.7                f / 4     1/400           160
Olympus E3    2.00x              25                  f / 8     1/400           800
Sigma SD14    1.74x              29                  f / 9     1/400           1000
Canon 40D     1.62x              31                  f / 10    1/400           1250
Nikon D300    1.52x              33                  f / 11    1/400           1250
Canon 1DIII   1.26x              40                  f / 13    1/400           1600
Canon 5D      1.00x              50                  f / 16    1/400           3200
Leica S2      0.80x              62.5                f / 20    1/400           5000
Mamiya ZD     0.72x              67                  f / 21    1/400           6400
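The relationships behind a table like this can be sketched in a few lines of Python. This is only a sketch under the usual Equivalence scaling (focal length and f-ratio divide by the focal multiplier, ISO divides by its square), taking 50mm f/16 ISO 3200 on FF as the reference; the function name is hypothetical, and the values are left unrounded rather than snapped to the nearest 1/3 stop.

```python
def equivalent_settings(focal_multiplier, ff_focal_mm=50, ff_f_ratio=16, ff_iso=3200):
    """Unrounded equivalent settings for a format with the given focal multiplier."""
    return {
        "focal_mm": ff_focal_mm / focal_multiplier,    # same AOV
        "f_ratio": ff_f_ratio / focal_multiplier,      # same aperture diameter, same DOF
        "iso": ff_iso / focal_multiplier ** 2,         # same lightness at the same shutter speed
    }

for name, multiplier in (("Olympus E3", 2.00), ("Canon 40D", 1.62), ("Canon 5D", 1.00)):
    s = equivalent_settings(multiplier)
    print(f"{name}: {s['focal_mm']:.1f}mm, f/{s['f_ratio']:.1f}, ISO {s['iso']:.0f}")
```

The shutter speed is the same for every format, so it does not appear in the calculation.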

 

 

EXPOSURE TIME

The exposure time (shutter speed) is the length of time the shutter remains open to achieve the desired exposure.  Equivalent photos have the same shutter speed because the amount of motion blur will be the same for a given shutter speed.  However, there are many times when we would not compare formats with the same shutter speed, since there is enough light to stop down to achieve the desired DOF and still have a fast enough shutter that motion blur is a non-issue.  Under these circumstances, the larger sensor system can deliver both more detail (subject to lens sharpness and pixel count) and a cleaner image, since the lower shutter speed results in more light falling on the sensor for a given DOF.

For example, let's say we are shooting a landscape.  The following settings would be likely candidates for a particular scene:

  • A7R2 at 24mm, f/11, 1/100, ISO 100
  • D500 at 16mm, f/7.1, 1/250, ISO 100
  • 80D at 15mm, f/7.1, 1/250, ISO 100
  • EM1.2 at 12mm, f/5.6, 1/400, ISO 100

While landscapes are a common scenario, and such a comparison is of practical value to most photographers, we must take care to note that this partially equivalent scenario is only valid when the shutter speeds are sufficiently high to avoid motion blur, and, if a tripod is not being used, to avoid camera shake.  If, instead, we were engaged in street photography near dusk, we would need to compare with fully equivalent settings since a sufficient shutter speed would be crucial to stopping motion blur for the required DOF:

  • A7R2 at 24mm, f/11, 1/100, ISO 400
  • D500 at 16mm, f/7.1, 1/100, ISO 250
  • 80D at 15mm, f/7.1, 1/100, ISO 250
  • EM1.2 at 12mm, f/5.6, 1/100, ISO 100
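To see why the first set of settings favors the larger format while the second does not, we can compare the total light falling on the sensor. A rough sketch, assuming the same scene and framing and ignoring differences in lens transmission: the total light is then proportional to the square of the aperture diameter times the exposure time. The function name is hypothetical.

```python
def relative_total_light(focal_mm, f_ratio, shutter_s):
    """Quantity proportional to the total light on the sensor for the same scene and AOV."""
    aperture_diameter_mm = focal_mm / f_ratio
    return aperture_diameter_mm ** 2 * shutter_s

# Landscape (partially equivalent): same DOF, different shutter speeds
a7r2_landscape = relative_total_light(24, 11, 1 / 100)
em1_landscape = relative_total_light(12, 5.6, 1 / 400)
print(a7r2_landscape / em1_landscape)   # about 4x, i.e. roughly two stops more light

# Street photography near dusk (fully equivalent): same DOF, same shutter speed
a7r2_dusk = relative_total_light(24, 11, 1 / 100)
em1_dusk = relative_total_light(12, 5.6, 1 / 100)
print(a7r2_dusk / em1_dusk)             # about 1x, i.e. essentially the same light
```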

Alternatively, if one system has IS and the other does not, and motion blur is not an issue, then the IS system will be able to use the lower shutter speed when a tripod is not used on the non-IS system.  In this case, the system with IS will have the noise advantage for a given DOF since more light will fall on the sensor.

So if we are using anything other than base ISO, then we cannot discount the importance of shutter speed in comparing systems, since the only time we would not be at base ISO is when shutter speed is a factor.  Under these circumstances, the only way for the larger formats to achieve less relative noise than the smaller formats is by using a more shallow DOF, rather than raising the ISO, to maintain the necessary shutter speed.

 

LIGHTNESS

The lightness of the photo is how bright the photo appears, and is usually adjusted by the ISO setting of the camera.  Let's say we have a perfect sensor that is a photon counter.  That is, each photon that falls on the pixel is recorded so that if 100 photons fell on a pixel, the image file would record a value of 100 at base ISO.  Then at ISO 400, the image file would record a value of 400, at ISO 1600, the image file would record a value of 1600, etc., where "brighter" values would be displayed on a computer monitor or printed on a printer with greater "lightness".  See here for a much more in depth discussion.
 

 

DISPLAY DIMENSIONS

The display dimensions are the physical size of the viewed image, whether it be a print or on a computer monitor.  People, including reviewers, tend to compare IQ at the pixel level, rather than the image level, which leads to incorrect conclusions about the image, unless the images are made from the same number of pixels.  If two images are made from a different number of pixels and we are to compare them at the pixel level, then we need to properly resample the images to a common number of pixels.  We can increase the IQ of an image by increasing either the native pixel count or the quality of the individual pixel.  Thus, if we compare two images with unequal pixel counts at the pixel level (often referred to as a "100% comparison"), we are disregarding the increase in IQ that comes from the additional pixels, which is discussed in more detail in the Megapixels:  Quality vs Quantity section of the essay.

For example, let's say we wish to compare the Canon 1DsIII (21 MP) and the Nikon D3 (12 MP).  Comparing images from the two systems at the pixel level is the same as comparing 16x24 inch prints from the 1DsIII to 12x18 inch prints from the D3, which is hardly a fair comparison.  The best way to compare images is to compare in the manner that they will be displayed.  For example, if you are going to print the images, then print them and compare. Of course, this is impractical to do unless we already have access to both systems.  And, even if the reviewer provides us with the files to print ourselves, that is a bit of a pain, and certainly not a basis for an objective conclusion that we can share with others, as not everyone will be using the same printer.

So, what to do?

The easiest solution is to resample both images to a common dimension that is at least as large as the larger image and then compare at the pixel level. The reason to compare at a dimension at least as large as the larger image is that downsampling the larger image will cause it to lose detail, which, I presume, is one of the qualities of IQ being measured in the comparison.  In addition, if we are comparing relative noise, it only makes sense to do so at the same level of detail, so we would apply NR to the more detailed image to match the level of detail of the less detailed image.  Of course, care need be taken in the resampling process, since a poor resampling method can lead to incorrect conclusions about the comparative IQ between systems.  This is especially true when comparing relative noise.  We cannot simply downsample the larger file to the dimensions of the smaller file.  We first need to apply NR (or a specific form of blur) and then downsample.  In any event, it is better to upsample the smaller image rather than downsample the larger image.

Again, using the example of the 1DsIII vs D3 comparison, we could resample both images to 54 MP (300 PPI for a 20x30 inch print) and then compare at the pixel level. There's nothing magical about 54 MP, but we would like to incorporate some kind of "future-proofing" for comparisons with future cameras, and need some value larger than 21 MP, so 300 PPI for a 20x30 inch print sounds like a good "standard", as very few would print larger than this, no matter what pixel counts the future holds or what format they shoot.  Those that do print larger would, of course, want to compare at the larger output size.
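The 54 MP figure follows directly from the print dimensions and the PPI, as this small sketch shows (the function name is just for illustration):

```python
def megapixels_for_print(short_side_in, long_side_in, ppi):
    """Pixel count, in MP, needed to print at the given size and PPI."""
    return (short_side_in * ppi) * (long_side_in * ppi) / 1e6

print(megapixels_for_print(20, 30, 300))   # the 20x30 inch "standard" at 300 PPI
```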

Another option would be for a reviewer to print the images at a variety of sizes (e.g. 4x6, 8x12, 12x18, 16x24, and 20x30 inches) on a top-of-the-line printer, scan the prints, and then compare the scans from the same size prints.  'Tis a pain, but probably the most fair way to compare, although I honestly don't know if it would produce different results than resampling the two images to the "appropriate" PPI for each print size.  And, of course, we cannot discount the effects of viewing images on non-calibrated monitors (I've seen more than one comparison where someone claimed the highlights of the image to be blown with several others chiming in that they need to calibrate their monitor).

Thus, comparing images that have different pixel counts at the pixel level is a very poor way to compare the IQ between systems.  However, the closer the pixel counts are, the better such a comparison will approximate the actual differences. For example, it's reasonable to say that a comparison between the 12.1 MP Nikon D700, 12.1 MP Nikon D3, 12.3 MP Nikon D300, and the 12.7 MP Canon 5D would be easily "close enough" without resampling.  But when comparing the 10.1 MP Canon 40D, 10.1 MP 1DIII, or the 10.1 MP Olympus E3 to the aforementioned cameras at the pixel level, we are beginning to stretch a bit (12% difference in linear pixel count), and we are certainly stretching when comparing the 1DsIII to any of the above cameras at the pixel level for native image sizes (32% difference in linear pixel count between the 1DsIII and the D3, for example).
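The percentages quoted above are differences in linear pixel count, which scale as the square root of the ratio of the pixel counts. A quick sketch, with a hypothetical function name:

```python
import math

def linear_difference(mp_larger, mp_smaller):
    """Fractional difference in linear resolution between two pixel counts."""
    return math.sqrt(mp_larger / mp_smaller) - 1

print(f"{linear_difference(12.7, 10.1):.0%}")   # Canon 5D vs the 10.1 MP cameras
print(f"{linear_difference(21.0, 12.1):.0%}")   # 1DsIII vs D3
```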

So, while no comparison is without its potential problems, the easiest mistake to avoid is comparing at the pixel level without first carefully resampling the images to a common dimension, and applying NR as necessary when comparing relative noise.

 

 

 

 

 

IMAGE QUALITY

 

The primary attributes of a camera, in no particular order, are:

• IQ (Image Quality)
• Operation (AF speed/accuracy, viewfinder, ease of use, etc.)
• Available Lenses, Flashes, Accessories
• Features (IS, video, liveview, etc.)
• Ergonomics (size, weight, build, etc.)
• Price

While this section is concerned solely with IQ, it is important to note that IQ is but one attribute of a camera system.

But what, exactly, is IQ, and what does it have to do with the "success" of a photo?  The first step in defining "IQ" is to make the distinction between "image quality" and a "quality image".  Many would take it as a given that if we have two photos of the same scene with the same composition then, all else equal as well, the photo with "higher IQ" would be "more successful".  That is, the photo with "higher IQ", for example, would place better in a photo competition, would be more likely to sell, would sell for a higher price, etc.  For sure, this may certainly be true for a large number of photos, such as a landscape photo displayed at a huge size.  But it is important to acknowledge that there is a class of photography where image quality, as opposed to a quality image, is all but irrelevant (please see these outstanding photos, for example).

Furthermore, while one system may yield "higher IQ" than another, those differences may not be large enough to make any significant difference in the appeal of the photo, depending on the QT (quality threshold) of the viewer, the scene itself, the size at which the photo is displayed, how closely it is scrutinized (see here for an interesting example of this point), and the processing applied to the photo.  In other words, it is not merely whether System A has "higher IQ" than System B, but under what conditions it has higher IQ (and, indeed, which has "higher IQ" may flip-flop, depending on the conditions), and if the IQ is "enough higher" to make any significant difference.

For some photographers, IQ may be the most important aspect of photography.  For others, it may play no role at all or simply be an added plus.  But it is time well spent to reflect on just how important IQ is to our own photography, given that IQ is, at best, merely a means to achieving a quality image, and, at worst, completely irrelevant to the photo.

With all the disclaimers said about the relevance of IQ to the "success" of the photo, we can discuss what IQ is.  The attributes of IQ include, but are not limited to:

• Detail
• Contrast
• Color
• Noise
• Dynamic Range
• Tonal Gradation
• Bokeh
• Flare
• Distortion
• Vignetting

Attributes of IQ do not include:  subject, composition, focus accuracy, DOF, etc., which are attributes of system operation, available lenses, artistic design, and/or photographer skill.  Of course, it's important to note that operational differences, such as focus accuracy, can have a substantial effect on the ability to capture a "high IQ" image.  The "overall IQ" of a photo is a function of how the viewer subjectively weighs the individual objective components of IQ, which often depend greatly on the scene and how large the photo is displayed.

The subjective element of IQ is an important point to discuss.  For example, let's take vignetting, which is considered by many a drawback that distracts from an image.  However, some people even add vignetting artificially in PP (post-processing) to "enhance" an image.  Another hotly debated element of IQ is noise.  While low relative noise is almost universally hailed as high IQ, once again, noise is sometimes added to an image as an artistic effect.  More than that, it is not merely the quantity of noise, but the quality of the noise, that is important.  Simply because one image has more noise than another per some mathematical measure does not mean that it has the more pleasing appearance, even given that low noise is desired.

Another critical factor that needs to be mentioned is in-camera processing and PP (post-processing).  For example, comparing images from different systems based on in-camera jpgs tests the in-camera jpg engine (firmware) as much as, if not more than, it does the camera hardware.  For people who loathe PP, comparing systems on the basis of in-camera jpgs, of course, makes the most sense.  But such a comparison will have less to do with the IQ potential of a system and more to do with operational convenience.  However, for people looking to get the most out of their hardware, the "appropriate" format is RAW.  To this end, it is important that we choose a RAW conversion that portrays each system at its best.  Unfortunately, we are right back to the subjective with what looks "best", but the different conversions here demonstrate just how much of an impact the RAW converter can have.

Thus, rather than say that the IQ of one system is "higher" than another, which only has any meaning if everyone is on the same page as to what constitutes "higher", it's better to be far more specific.  That is, we should instead say that A is sharper than B, or B has smoother bokeh than A, or A is less noisy than B for the same level of detail, or B has less distortion than A, etc.  In other words, we simply cannot assign point values to each criterion and get an average score, as not all criteria will be given the same weight by all people, and some people will even feel exactly the opposite on some points (color, for example).

For the most part, the individual components of IQ are objective.  The subjective nature of IQ comes from how we value the various objective measures of IQ.  For example, few people would dispute that sharper means "higher" IQ or that one image with "better" bokeh than another would have "higher" IQ.  However, let's say we have two images, one slightly sharper but with a less pleasing bokeh, and the other less sharp but with a more pleasing bokeh.  Which image has the "higher" IQ?  How we value these different objective elements of IQ is where the subjective comes in.

That said, let's discuss the elements of the equipment that affect IQ (keeping in mind that the artistic, photographic, and processing skills of the photographer are, by far, the most important elements).  In no particular order, they are:

• The lens
• The sensor and supporting hardware
• The camera's internal involuntary image processing and/or jpg engine
• IS (image stabilization)

where the sensor and supporting hardware can be further broken down into the following:

• Sensor Size
• Pixel Count
• Microlens Efficiency (percentage of light directed into the pixel)
• QE (quantum efficiency -- percentage of light falling on pixel that is recorded)
• Read Noise (additional noise added by the sensor and supporting hardware)
• CFA (color filter array)
• AA Filter (amount of blur introduced to inhibit moiré)

Depending on the image, various elements of IQ will have varying levels of importance.  For example, relative noise will usually play little role in ISO 100 images, edge sharpness will play basically no role in shallow DOF images, sharpness will play little role in images where motion blur is used for artistic effect, etc., etc.  So, while we can discuss the differences in IQ between systems, we cannot say which elements of IQ are more important than others.  Thus, while one system may have significantly more appeal on the basis of IQ to a vast majority, that does not mean that it will have higher IQ in the eyes of all.  Hence, when comparing the IQ of different systems, as mentioned further above, we are best served comparing specific elements of IQ, rather than trying to speak of "overall" IQ.

So, is it simply a waste of time to compare IQ between systems?  Some believe so, but I disagree.  Some elements of IQ that most people value are predictable and quantifiable on the basis of the sensor and available lenses.  This essay discusses the relationship between the glass and the sensor in how they determine some aspects of IQ, in particular, detail, sharpness, contrast, vignetting, and relative noise.  However, it is also important to note the aspects of IQ that this essay does not discuss, such as bokeh, color, and distortion.

All these qualifiers and disclaimers said, a critical consideration to IQ is the individual's QT (quality threshold), that is, the point at which additional IQ makes no difference to the viewer at a given output size.  For example, System A may satisfy one person's QT at 8x12, but fail to do so at 12x18.  Or, one system may fail to satisfy a viewer's QT at any output size due to factors that are independent of the image dimensions (bokeh, for example).

Regardless, it's still not possible to reach universal agreement that one image, or system, has higher IQ than another.  The reason for this is that images from two different systems are never identical, and whatever differences there are between them may appeal to different people differently, as people value different aspects of IQ differently.  For example, let's say one image is sharper everywhere than another, except in the extreme corners.  Which image has the higher IQ?  Different people will have different answers depending both on the type of photography they do or enjoy, and on the degree to which the differences in sharpness vary in the images.  Another difficulty is when one system shows higher IQ in one circumstance, but lower IQ in another.  Likewise, a sensor with a weaker AA filter will render a sharper image, but be more subject to moiré, so in some instances it will have higher IQ and in other instances lower IQ, depending on the scene.  In other words, there's still a great deal of subjectivity even within this very narrow set of parameters for IQ.

Noise is perhaps the most hotly contested of the IQ parameters.  As mentioned earlier, it is not simply the total amount of relative noise, but the quality of the noise -- the distribution of the noise in the various color channels, the balance of color vs luminosity noise, and the grain of the relative noise (which is a function of the native pixel count of the sensor).  But while noise can even have a pleasing effect in some images, I've never heard of anyone saying the same for pattern noise and banding.  Thus, a noisy image without pattern noise or banding will likely look significantly better than a cleaner image with pattern noise or banding, depending on the pattern, degree of banding, and how large the difference in total relative noise is.  Furthermore, since different cameras will apply NR (noise reduction) to various degrees (some even to RAW files), it is important to recognize that while one image may be more noisy than another, it may also yield more detail, which may well matter more than the relative noise.  If not, then we should apply NR to and/or downsample the more detailed image to match the level of detail of the less detailed image before comparing relative noise.

In addition, the IQ differential, while present, may not always be noticeable.  Let me explain that odd statement, since it would seem obvious that if you can't see a difference in IQ, then there is no difference in IQ.  Well, yes and no.  True, if for a particular image you cannot see a difference, then there is no meaningful difference in IQ.  But depending on how much processing is applied to the image, we may find that one image withstands that processing much better than the other.  In addition, as mentioned earlier, the IQ differential may not show at one print size, but become apparent at another.  Thus, the "hidden" IQ of an image may become apparent only under strong PP or larger prints.  It's for that exact reason that so many shoot RAW instead of jpg.  In many cases, the IQ differential between jpg and RAW conversions is completely insignificant, whereas in some cases, the differences are substantial.  So just as RAW has higher IQ than jpg, one system may have higher IQ than another, but that higher IQ does not always manifest itself.  Hence, while for one person the IQ difference is non-existent, for another, the IQ difference is significant.

Furthermore, it is fair to say that the elements of IQ that can be corrected with PP matter less than the elements of IQ that are resistant to PP.  For example, vignetting and distortion are easy corrections in post (and, in fact, can be automatically "corrected" in some RAW converters, along with even PF), whereas detail and DR are not.  Other attributes are intermediary -- relative noise can be lowered, but this comes at the expense of detail.  Sharpness can be enhanced, but this comes at the expense of artifacts.  Still other effects are primary:  bokeh, flare, and moiré are often beyond the abilities of PP (unless one wishes to painstakingly hand-edit every portion of the image), but these attributes occur in only certain types of photos, and thus may not be important considerations to some people.  Nonetheless, despite the fact that there is no way around the subjective elements of IQ and the narrow definition used in this essay, generalizations about the IQ of different systems can be made.

Lastly, I would like to more thoroughly address the issue of output size, which is a critical consideration in determining what level of IQ, especially in terms of sharpness and detail, really matters.  For many, if not most, the web is their primary venue for displaying images.  Thus, even a 1.3 MP image is good for a 1280x1024 presentation.  This begs the question as to how much resolution is "enough"?  One take on the issue is that the required PPI (pixels per inch) for a "high quality photo" is given by the formula PPI = 3438 / Viewing Distance in inches (click here for the full article), depending, of course, on the quality of the pixels.
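As a quick illustration of the formula, here is a minimal sketch that computes the required PPI for a print viewed from 1.5x its diagonal. The function names are hypothetical. As an aside that is our own reading rather than a claim from the linked article, 3438 is approximately the number of arcminutes in one radian, so the formula amounts to one pixel per arcminute of visual angle.

```python
import math

def required_ppi(viewing_distance_in):
    """PPI for a "high quality photo" per the formula PPI = 3438 / viewing distance."""
    return 3438 / viewing_distance_in

def required_ppi_for_print(short_side_in, long_side_in, diagonals=1.5):
    """Required PPI when the print is viewed from a multiple of its diagonal."""
    diagonal_in = math.hypot(short_side_in, long_side_in)
    return required_ppi(diagonals * diagonal_in)

print(round(required_ppi_for_print(8, 10)))    # 8x10 inch print
print(round(required_ppi_for_print(20, 30)))   # 20x30 inch print
print(math.degrees(1) * 60)                    # arcminutes per radian, close to 3438
```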

And, since I bring up printing, it's no small point that the printer and paper used for the final image are a critical component of the final image.  However, the topic of this essay is comparing camera systems (camera and available lenses), and it is presumed that we are taking care to process the images as best we can and use the same quality printer and paper for both systems.

To that end, let's consider the PPIs for common print sizes (in inches).  The table gives the PPIs for 10, 20, 30, and 40 MP images with a native 3:2 aspect ratio / 4:3 aspect ratio, cropped to the given print dimensions:

 

Print Dimensions (inches) /
PPI for a high quality photo
viewed from 1.5x the diagonal
of the photo                     PPI for 10 MP   PPI for 20 MP   PPI for 30 MP   PPI for 40 MP

8x10 / 179                       323 / 342       457 / 484       559 / 592       646 / 684
8x12 / 159                       323 / 304       457 / 430       559 / 527       646 / 608
11x14 / 129                      235 / 249       332 / 352       407 / 431       470 / 498
12x18 / 106                      215 / 203       304 / 287       372 / 352       430 / 406
13x19 / 100                      199 / 192       281 / 272       345 / 333       398 / 384
16x20 / 89                       161 / 171       228 / 242       279 / 296       322 / 342
16x24 / 79                       161 / 152       228 / 215       279 / 263       322 / 304
18x24 / 76                       143 / 152       202 / 215       248 / 263       286 / 304
20x30 / 64                       129 / 122       182 / 173       223 / 211       258 / 244
24x36 / 53                       108 / 101       153 / 143       187 / 175       216 / 202
30x40 / 46                       86 / 91         122 / 129       149 / 158       172 / 182
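The PPIs in the table can be reproduced with a short sketch. It assumes the image is cropped (never stretched) to the aspect ratio of the print, so the limiting dimension sets the PPI; the function names are hypothetical.

```python
import math

def native_pixels(megapixels, aspect_w, aspect_h):
    """Width and height in pixels for a pixel count and native aspect ratio."""
    total = megapixels * 1e6
    height = math.sqrt(total * aspect_h / aspect_w)
    width = total / height
    return width, height

def print_ppi(megapixels, aspect, print_short_in, print_long_in):
    """PPI after cropping the native frame to the print's aspect ratio."""
    width, height = native_pixels(megapixels, *aspect)
    return min(width / print_long_in, height / print_short_in)

for size in ((8, 10), (12, 18), (20, 30)):
    ppi_3_2 = print_ppi(10, (3, 2), *size)
    ppi_4_3 = print_ppi(10, (4, 3), *size)
    print(f"{size[0]}x{size[1]}: {ppi_3_2:.0f} / {ppi_4_3:.0f}")
```

Since PPI scales with the square root of the pixel count, doubling the PPI of a given print requires four times as many pixels.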

 

As we can clearly see, even 10 MP easily provides "enough" resolution for "high quality" prints viewed from a "normal" distance of 1.5x the display diagonal.  But this presumes that we are talking about "high quality" pixels.  This brings up the concept of "equivalent pixel count".  For example, let's say we have a camera-lens combo that resolves 2000 lw/ph.  This resolves 167 PPI for a 12x18 inch print, which is considerably less than the 215 PPI for a 10 MP file, and is independent of the actual pixel count of the sensor.  Thus, we are likely better served by using the lw/ph of the particular camera-lens combo divided by the display height of the photo to compute effective PPI for a photo, as opposed to the pixel count of the sensor.
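The effective PPI suggested here is a one-line calculation. A sketch, with a hypothetical function name:

```python
def effective_ppi(lw_ph, print_height_in):
    """Effective PPI from the resolved line widths per picture height."""
    return lw_ph / print_height_in

# A camera-lens combo resolving 2000 lw/ph, printed at 12x18 inches
print(round(effective_ppi(2000, 12)))
```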

Other factors, such as noise, also affect the quality of the pixel.  We wouldn't expect 10 MP from a compact to deliver the same quality as 10 MP from a DSLR, for example, nor can we simply upsample a 10 MP image to 20 MP and expect a marked improvement (in fact, the utility of upsampling for the purposes of increasing print quality is of debatable value).  On the other hand, the reality is that for deep DOF photos at base ISO and smaller print sizes (8x12 inches and smaller, and even larger, depending on the scene and QT of the viewer), few will be able to distinguish, or care, about the differences in IQ between most formats. An interesting article on that point is Michael Reichmann's "You've Got to be Kidding -- No, I'm Not".

But, as we all know, "enough" and "high quality" are very subjective, which involve the visual acuity of the viewer, how closely a photo is viewed (not necessarily from 1.5x the display diagonal), and the viewer's personal standards.  Furthermore, it's worth noting that since Bayer arrays record only one color per pixel, the PPIs in the above table may be more accurately represented by pixel counts twice as large as given.  But, even then, 10 MP on a Bayer CFA will still satisfy the requirements for a "high quality" print viewed from 1.5x the display diagonal, lens permitting, of course.

In any event, there are many elements to IQ that matter even at smaller print sizes, such as bokeh and DR.  Thus, even though one system may be able to output a sharper and more detailed image at larger dimensions, these qualities may not be as important as the other qualities of IQ, depending on the image.  Even the artistic consideration of DOF depends greatly on how large we display the photo and how closely we view it.  Since the artistic considerations often outweigh the technical considerations of an image, this brings us full circle back to the distinction between a quality image and image quality.

So what IQ advantages does a larger sensor have?  Typically, the larger sensor system will deliver "higher overall IQ" over smaller sensor systems of the same generation in the following ways:

  • Larger sensor systems of a given generation usually have less noise for the same exposure
  • Larger sensor systems usually allow for the option of a more shallow DOF
  • Larger sensor systems often resolve more detail

This, of course, invites the question as to when smaller sensor systems will have "higher IQ".  This can happen when:

  • The sensor for the smaller format is more efficient and equivalent photos are being taken
  • The lenses designed for the smaller sensor system are superior optics in terms of bokeh, flare, distortion, etc.
  • The smaller sensor system has an operational advantage such as more accurate AF or in-camera IS

If we think about all these situations, it's easy to see how the balance of these advantages and disadvantages plays into the type of photography a person does.  The more narrow the scope of photography, the easier it is for one system to be superior to another for the particular application.  The more broad the scope, the more difficult it is for a single system to be a clear winner overall.

Typically, cameras of the same generation have sensors that are close in terms of efficiency, but sometimes one camera may have a significant advantage over another based on its sensor (see here for a partial list of camera sensors and their efficiencies).  Another important consideration is focus accuracy -- if the photo is OOF (out-of-focus), resolution is a moot point.  In addition, in-camera IS is sometimes a very powerful plus.  While many argue that in-lens IS is superior to in-camera IS (but neither is as good as using a chicken -- click here for a demonstration), it is definitely not superior if the lenses you use do not have it.

Regardless of what IQ differences there may be between systems, we have to decide when, if ever, these differences in IQ have any meaning.  For example, a Suzuki GSXR-1000 may significantly outperform a Yamaha R-6 on a track, presuming the driver is skilled enough to make use of the extra performance.  But if all you use the bikes for is traveling back and forth to work or school, the difference in performance between the bikes is meaningless -- it is more a matter of comfort, gas mileage, and other aspects of the bike that matter more by far.

Thus, it is my opinion that for the sizes that most people print (or display on the web), the differences in IQ (and DOF options) between modern systems are insignificant for the vast majority in most situations, just as the performance differences between bullet bikes are insignificant for most riders.  Instead, the primary considerations for most people when choosing a system are the size, weight, price, and operation of the system, depending, of course, on how different the sensor sizes are.

 

 

 

 

 

MYTHS AND COMMON MISUNDERSTANDINGS

 

The motivation behind this essay on "equivalence" was prompted by the many myths about the differences between formats.  In particular, the following myths and misunderstandings are common:

 

1) f/2 = f/2 = f/2

This is perhaps the single most misunderstood concept when comparing formats.  Saying "f/2 = f/2 = f/2" is like saying "50mm = 50mm = 50mm".  Just as the effect of 50mm is not the same on different formats, the effect of f/2 is not the same on different formats.

Everyone knows what the effect of the focal length is -- in combination with the sensor size, it tells us the AOV (diagonal angle-of-view).  Many are also aware that  f-ratio affects both DOF and exposure.  It is important, however, to understand that the exposure (the density of light falling on the sensor -- photons / mm²) is merely a component of the total amount of light falling on the sensor (photons):  Total Light = Exposure x Effective Sensor Area, and it is the total amount of light falling on the sensor, as opposed to the exposure, which is the relevant measure.

Within a format, the same exposure results in the same total light, so the two terms can be used interchangeably, much like mass and weight when measuring in the same acceleration field.  For example, it makes no difference whether I say I weigh 180 pounds or have a mass of 82 kg, as long as all comparisons are done on Earth.  But it makes no sense at all to say that, since I weigh 180 lbs on Earth, I'm more massive than an astronaut who weighs 30 lbs on the moon, since we both have a mass of 82 kg.

The reason that the total amount of light falling on the sensor, as opposed to the density of light falling on the sensor (exposure), is the relevant measure is because the total amount of light falling on the sensor, combined with the sensor efficiency, determines the amount of noise and DR (dynamic range) of the photo.

For a given scene, perspective (subject-camera distance), framing (AOV), and shutter speed, both the DOF and the total amount of light falling on the sensor are determined by the diameter of the aperture.  For example, 80mm on FF,  50mm on 1.6x, and 40mm on 4/3 will have the same AOV (40mm x 2 = 50mm x 1.6 = 80mm).  Likewise, 80mm f/4, 50mm f/2.5, and 40mm f/2 will have the same aperture diameter (80mm / 4 = 50mm / 2.5 = 40mm / 2 = 20mm).  Thus, if we took a pic of the same scene from the same position with those settings, all three systems would produce a photo with the same perspective, framing, DOF, and put the same total amount of light on the sensor, which would result in the same total noise for equally efficient sensors (the role of the ISO in all this is simply to adjust the lightness of the LCD playback and/or OOC jpg).

Thus, settings that have the same AOV and aperture diameter are called "Equivalent" since they result in Equivalent photos.  Hence, saying f/2 on one format is the same as f/2 on another format is just like saying that 50mm on one format is the same as 50mm on another format.
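
The arithmetic in the example above can be sketched in a few lines of Python.  This is only an illustration: the function names and the CROP_FACTORS table are made up for this sketch, and the crop factors are the nominal ones used in the example (2x for 4/3, 1.6x, and 1x for FF).

```python
# Nominal crop factors relative to 35mm FF, as used in the example above.
CROP_FACTORS = {"FF": 1.0, "1.6x": 1.6, "4/3": 2.0}

def equivalent_settings(focal_mm, f_ratio, from_format, to_format):
    """Return (focal length, f-ratio) on to_format that gives the same
    AOV and the same aperture diameter as the settings on from_format."""
    scale = CROP_FACTORS[from_format] / CROP_FACTORS[to_format]
    return focal_mm * scale, f_ratio * scale

def aperture_diameter(focal_mm, f_ratio):
    """Aperture diameter in mm is the focal length divided by the f-ratio."""
    return focal_mm / f_ratio

# Start from 40mm f/2 on 4/3 and convert to each format.
for fmt in CROP_FACTORS:
    focal, f_ratio = equivalent_settings(40, 2.0, "4/3", fmt)
    diameter = aperture_diameter(focal, f_ratio)
    print(fmt, round(focal, 1), round(f_ratio, 2), round(diameter, 1))
```

Each format ends up with a different focal length and f-ratio, but the aperture diameter comes out at 20mm in every case, which is the point of the paragraph above.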
 

2) Larger sensor systems are bulky and heavy

While larger sensor systems usually are more bulky and heavy than smaller sensor systems, this is not necessarily the case.  In fact, sometimes even the exact opposite is true.  The reason is not as much due to the larger sensor as it is due to the fact that the lenses designed for larger sensor systems usually have larger maximum aperture diameters than lenses designed for smaller sensors.  But when equivalent lenses do exist in both systems, such as the 35-100 / 2 on 4/3 vs the 70-200 / 4L IS on 35mm FF, the lenses for the larger sensor systems are usually lighter (but often longer for the telephoto lenses) and less expensive.  There are exceptions, of course, such as the Canon 300 / 2.8L IS on 1.6x vs the Canon 500 / 4L IS on FF.  But if reach is the primary consideration, and light gathering ability secondary, then smaller sensor systems will usually have a size/weight/price advantage, the most extreme example of this being the 12x zooms of compact digicams.  Thus, smaller sensor systems are usually significantly smaller, lighter, and less expensive when compared only for the same AOV, but not when compared for the same AOV and aperture diameter.
 

3) Larger sensor systems have a DOF that is "too shallow"; smaller sensor systems have more DOF

Larger sensor systems usually have the option of a more shallow DOF than smaller sensor systems with existing lenses, since the lenses for larger sensor systems usually have larger maximum aperture diameters for a given AOV.  However, the photographer using the larger sensor system can always stop down to get whatever DOF they need, keeping in mind that for the same perspective and framing, the effects of diffraction softening affect all systems equally at the same DOF, not the same f-ratio.

For photographers who shoot in Auto, P, or Tv mode, the camera may well often choose a more shallow DOF on a larger sensor system than a smaller sensor system, and thus make the smaller sensor system more convenient for photographers who do not want to choose the appropriate f-ratio themselves by using Av or M mode, or by adjusting in the other modes.

However, for the extreme end of the deeper DOFs, the lenses for smaller sensor systems will often, but not always, be able to go deeper.  At such DOFs, the effects of diffraction softening will be severe (although not necessarily apparent for images resized for web display).  For example, the 50/2 macro on 4/3 attains its minimum aperture at f/22 (equivalent to 100mm f/45 on 35mm FF), the 60 / 2.8 macro on 1.6x attains its minimum aperture at f/32 (equivalent to 96mm f/51 on 35mm FF), and the Canon 100 / 2.8 macro on FF attains its minimum aperture at f/32 (but the Sigma 105 / 2.8 macro stops down to f/45).  However, if the FF shooter needs deeper DOFs, they can simply use the same focal length as 4/3 or 1.6x, in conjunction with a 2x or 1.4x TC, respectively.  For example, the 50 / 2.8 macro used on 35mm FF attains its minimum aperture at f/32, and is equivalent to a 100 / 5.6 macro with a maximum f-ratio of f/64 if used with a 2x TC.  It is important to note that at such small apertures, image degradation from the TC is insignificant in comparison to detail loss from diffraction softening.
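
The macro lens figures above can be checked with a short sketch.  The two helper names are hypothetical, and the sketch assumes the usual rule that a teleconverter multiplies both the focal length and the f-ratio by its factor.

```python
def ff_equivalent(focal_mm, f_ratio, crop_factor):
    """35mm FF settings with the same AOV and aperture diameter."""
    return focal_mm * crop_factor, f_ratio * crop_factor

def with_teleconverter(focal_mm, f_ratio, tc_factor):
    """Assumed rule: a TC multiplies focal length and f-ratio alike."""
    return focal_mm * tc_factor, f_ratio * tc_factor

# Minimum apertures of the smaller format macros, expressed in FF terms.
print(ff_equivalent(50, 22, 2.0))   # 50/2 macro on 4/3 at f/22
print(ff_equivalent(60, 32, 1.6))   # 60/2.8 macro on 1.6x at f/32

# The FF 50/2.8 macro with a 2x TC, wide open and fully stopped down.
print(with_teleconverter(50, 2.8, 2))
print(with_teleconverter(50, 32, 2))
```

The first two results land at roughly 100mm f/44 and 96mm f/51, and the TC case runs from 100mm f/5.6 to f/64, in line with the figures quoted above.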

Of course, what's good for the goose is good for the gander.  That is, all systems can use TCs to increase their DOFs.  However, at f-ratios at and beyond f/22 (in terms of 35mm FF equivalents), the effects of diffraction softening are so strong that they overwhelm any other IQ differences between systems, keeping in mind that all systems suffer the effects of diffraction softening equally for the same DOF at a given level of detail.  Thus, at the extreme deep end of the DOF spectrum (f/22 and beyond on FF), there is virtually no difference between systems in terms of IQ.
 

4) Larger sensors require sharper lenses

In fact, the exact opposite is true.  First of all, as discussed in Myth #1, it is important to compare systems at the same DOF when discussing sharpness, since if we don't, the system with the more shallow DOF will have less of the scene within the DOF, and thus appear less sharp.  So, given that we are comparing systems at the same DOF, consider the following scenario:  we have two targets, each with the same number of squares on them covering the entire area.  Since both targets have the same number of squares, the squares on the larger target will be larger than the squares on the smaller target.  Thus, when trying to hit the squares on the smaller target, we need to be more accurate than when aiming at the squares on the larger target.  For example, if the larger target has twice the length and width of the smaller target, then we need to be twice as accurate to hit the smaller squares on the smaller target.  This is why the MTFs for 4/3 lenses use 20/60 as compared to 10/30 for FF.  In the same way, a lens on a larger sensor does not need to be as sharp as a lens on a smaller sensor to resolve the same amount of detail.  Lenses that are able to resolve the same detail on sensors with the same number of pixels on their respective formats have the same relative sharpness.

For example, consider the Zuiko 150 / 2 on 4/3 and the Canon 300 / 4L IS on 135, which are equivalent lenses on their respective formats -- that is, both have the same AOV and maximum aperture diameter.  The 150 / 2 tested at 49 lp/mm wide open, whereas the 300 / 4L IS tested at 36 lp/mm wide open.  Since the 4/3 sensor is 13mm tall, and the 135 sensor is 24mm tall, these figures translate to 49 lp/mm · 13mm/ih = 637 lp/ih for the 150/2 and 36 lp/mm · 24mm/ih = 864 lp/ih for the 300 / 4L IS.  In other words, even though the 150 / 2 is the sharper lens, the 300 / 4L IS outresolves it on the larger sensor.
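
As a small sketch of the conversion used above (the function name is just for illustration), resolution per image height is the lens resolution in lp/mm multiplied by the sensor height in mm:

```python
def lp_per_image_height(lp_per_mm, sensor_height_mm):
    """Convert lens resolution (lp/mm) to resolution over the whole
    image height (lp/ih) for a sensor of the given height."""
    return lp_per_mm * sensor_height_mm

zuiko_150 = lp_per_image_height(49, 13)   # 4/3 sensor, 13mm tall
canon_300 = lp_per_image_height(36, 24)   # 135 sensor, 24mm tall

print(zuiko_150, canon_300)               # 637 864
print(round(canon_300 / zuiko_150, 2))    # relative advantage of the FF system
```

The lens with the lower lp/mm figure still delivers about a third more line pairs across the image height, simply because the image it projects is taller.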

However, since lenses for FF can be used on both crop and FF, the manufacturer MTFs overstate the lens performance on cropped sensors since they are reported at 10/30 instead of 15/45 (1.5x) or 16/48 (1.6x).  Another issue, of course, is that MTF charts usually show only wide open and f/8 performance, which means we are unable to use these charts to compare at the same, or even similar, DOF.  Thus, we need to rely on other tests to make system comparisons.  However, all the web "lens tests" are actually system tests.  That is, they evaluate the performance of the lens on a particular camera.  The problem with this form of testing is that it confounds both the sensor resolution and the effect of the AA filter with the lens performance.  So, while system "lens tests" are more useful for comparing the actual systems tested, they need to be continually updated to reflect current sensor resolutions and AA filter strengths.

Thus, the pixel count of the sensor, along with the AA filter, are both critical to system resolution.  Many subscribe to the myth that system resolution is largely limited by the lens in the case of modern sensors, but the reality is that current pixel densities are far from making systems "lens limited".  More pixels will always resolve more detail, but not necessarily as much more detail as the increase in pixels suggests.  For example, a 20 MP sensor will resolve less than double the detail of a 10 MP sensor (double the detail corresponding to 41% more linear resolution) unless the lens resolution is much greater than the sensor resolution.  As the pixel count continues to increase, the return becomes disproportionately smaller as the sensor resolution approaches the lens resolution.  However, as lenses get updated with newer and sharper versions, the limiting pixel densities will concomitantly increase.  In any event, for lenses with the same relative sharpness, the system with the greater pixel count will resolve more detail.  Thus, the level of detail in an image depends both on how many pixels the sensor has and on how well the glass is able to resolve those pixels.  This is also why FF glass will almost always perform better on FF sensors than on cropped sensors, unless the resolution of the glass is significantly higher than the sensor resolution.  For the scenario when lens resolution is well beyond sensor resolution, the system performance will be primarily a function of the pixel count.
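
The 41% figure above is just the square root of the pixel ratio, since linear resolution scales with the number of pixels along one side rather than with the total count.  A minimal sketch (the function name is made up here), giving the upper bound that applies only when the lens is far sharper than the sensor:

```python
import math

def max_linear_resolution_gain(mp_old, mp_new):
    """Upper bound on the gain in linear resolution when going from
    mp_old to mp_new megapixels on the same format (ideal lens)."""
    return math.sqrt(mp_new / mp_old) - 1

print(f"{max_linear_resolution_gain(10, 20):.0%}")   # 41%
print(f"{max_linear_resolution_gain(10, 40):.0%}")   # 100%
```

So it takes four times the pixels to double the linear resolution, and real systems will fall short of these bounds as the sensor resolution approaches the lens resolution.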

One issue that the lenses for FF sometimes suffer is that the sharpness is not as even across the frame as it is for smaller sensor systems.  This is usually not an issue at larger aperture diameters, except for far off-center composition, since the corners are rarely within the DOF at large aperture diameters.  However, some lenses fare poorly in the corners even at relatively deep DOFs, such as the Nikon 70-200 / 2.8 VR, which is discussed in Myth 5 below.  In these cases, while the FF system may resolve better in the central region of the image, it may resolve worse in the corners.  Thus, while the overall sharpness is often the same or better with FF, the issue of evenness across the frame needs to be considered, and taken on a lens-by-lens basis.

Thus, while smaller sensor systems usually have sharper glass, that does not necessarily give them sharper end results -- they need that extra sharpness just to "break even".  In practice, however, for the same AOV and DOF, the comparable glass for smaller sensors does not appear to hit that break-even point until the edges of the image, where, in some cases (usually UWA), they outperform FF glass in the extreme corners.  But since the larger sensor systems almost always have more pixels and resolve more detail in the central area of the image, if the extreme corners are not satisfactory, you have the option to frame wider and crop.  What is meant by "comparable glass"?  This is tricky, but generally lenses at, or near, the same price-point.  For example, we wouldn't call the Olympus 14-35 / 2 on 4/3 ($1840) or the Nikon 17-55 / 2.8 ($1130) on 1.5x "comparable" to the Canon 24-85 / 3.5-4.5 on FF ($310), since the prices are so different, even though they have nearly the same AOV and range of aperture diameters.  But we could call them comparable to the Canon 24-70 / 2.8L ($1255) even though the Olympus lens still costs significantly more and both have smaller aperture diameters, since we are now comparing the best against the best from each manufacturer that have comparable AOV ranges on their respective systems.  Alternatively, we could also call the Canon 24-105 / 4L IS ($1060) "comparable" since it is also "top glass" for Canon FF, has the same aperture diameter, and a zoom range that includes the AOVs of the aforementioned competitors' lenses.

When comparing systems, then, we must carefully articulate the reasons for choosing the lenses used in the comparison, since those reasons may be "invalid" depending on the use of the lens as it is rare to find two lenses from two different systems that enjoy the same range of AOV, aperture diameters, and price.
 

5) Larger sensor systems have softer edges and more vignetting than smaller sensor systems

Once again, as discussed in Myth #1, this belief is a result of people comparing systems at the same f-ratio rather than the same aperture diameter.  At the same f-ratio, the larger sensor system will have a larger aperture diameter, and thus a more shallow DOF, which will result in the areas of the scene outside the DOF being OOF (out-of-focus), as well as greater vignetting.  A more fair comparison for edge sharpness is to compare at the same DOF, or, often even more appropriate, at the lenses' sharpest settings, since it is rare that edge sharpness plays a role in high ISO photography.  However, it is disingenuous to compare edge sharpness and vignetting by artificially handicapping the larger sensor system with the same f-ratio as the smaller sensor system.

Of course, as we know, glass does not have the same sharpness across the image.  For example, the issue of telecentricity for UWAs causes a sharp drop in the MTF for many UWAs in 35mm FF lenses (the Nikon 14-24 / 2.8 being a remarkable exception).  Thus, the image may be "sharp enough" in the center, but too soft in the corners.  This is what happens when comparing, for example, 4/3 lenses with 35mm FF lenses.  The 4/3 lenses are sharper than the 35mm lenses (on average), but they need to be sharper to resolve the smaller pixels of their sensors.  And while 35mm FF glass is easily "sharp enough" for the center of the image, it is sometimes not "sharp enough" for the extreme corners (for some UWAs) even at the same DOF, despite the larger pixels.  Thus, near the edges, the sharper glass on a smaller sensor may outperform the less sharp glass on a larger sensor, but the amount of the corners where this reversal in sharpness occurs is dependent on which lenses are being compared, and must be taken on a lens by lens basis.  In fact, sometimes the larger sensor system will have the sharper corners (the evidence section of this essay gives examples).

An interesting case is the earlier version of the Nikon 70-200 / 2.8 VR which, according to a test conducted by DPR, performs significantly better in the corners on 1.5x than it does on FF even for the same perspective, FOV, and DOF (although, again, the rest of the image is sharper on FF).  However, the reviewer noted that this is almost certainly since the lens was optimized for 1.5x, since Nikon had no FF DSLRs, or even plans for one, at the time of the introduction of the lens, and is quite different from how Canon's 70-200 / 2.8L IS performs.  Another good example is the Canon 24 / 1.4L II on 1.6x vs the Canon 35 / 1.4L on FF.  The two have nearly identical performance at the same DOF in all areas that www.slrgear.com tests.  Of course, the 24 / 1.4L II is a newer lens and costs significantly more than the 35 / 1.4L, but such a comparison is fair to make since they are both top level lenses with the same FOV for their respective systems.  The question, then, is how would the 24 / 1.4L II on a 50D compare to a 35 / 1.4L on a 5D II?  The answer is that the 5DII image would likely deliver more detail, since it has more (and larger) pixels, as well as deliver an optional more shallow DOF, if desired.

However, back to the UWA situation, there is another angle to this story.  DSLRs with a 3:2 aspect ratio must shoot wider and then crop to match the FOV of a 4:3 aspect ratio, and this cropping all but eliminates the soft corners, if they even exist.  For example, for a Canon 5D to match the perspective, framing, and DOF of an Olympus E3 shooting at 7mm f/4, it would have to shoot at 12.5mm f/7.1 and crop to a 4:3 aspect ratio.  This would leave 10 MP on the 5D image, which would still match the pixel count and framing of the E3 image, while eliminating the extreme corners.  Likewise, the Canon 5DII (FF) has more pixels than the Canon 50D (1.6x), which also gives it more cropping latitude.  However, a 50D has more pixels than a 5D, so the 5D would have no such luxury, except if the lens were unable to sufficiently resolve the smaller pixels of the 50D.  In this case, a cropped image from the 5D, despite having fewer pixels, would likely retain the same, or even more detail, in the instances that we would need to frame wider and crop the corners out.

Stopping the larger system's lens down to normalize the DOF has the additional benefit of increasing the sharpness of the lens (especially in the corners) and reducing vignetting.  Many 4/3 proponents like to cite their glass as being "sharp wide open" with no significant vignetting.  However, "wide open" for 4/3 is "stopped down" for 35mm FF.  For example, let's compare the Leica 25 / 1.4 on 4/3 with the Canon 50 / 1.4 on 35mm FF.  With both lenses at f/1.4, the 4/3 lens will surely have the superior image in terms of sharpness and vignetting, but the 4/3 image will have a DOF that is two stops deeper.  Stopping the Canon lens down to the same DOF (f/2.8) will produce a sharper image (even in the corners) with the same or even less vignetting.  If the 35mm FF system must also raise the ISO two stops to match the shutter speed, all this means is that the 35mm FF system loses its advantage in relative noise, but it is not at a disadvantage for relative noise (for sensors with the same efficiency and images with the same level of detail).
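
The "two stops" in the example above follows from the crop factor.  The sketch below is illustrative only (the function names are hypothetical); it assumes a nominal 2x crop factor for 4/3 and uses the fact that one stop corresponds to a factor of the square root of 2 in f-ratio, or a factor of 2 in ISO.

```python
import math

def stops_between_formats(crop_factor):
    """Stops separating the two formats at the same f-ratio: the
    equivalent f-ratio scales with the crop factor, and each stop is a
    factor of sqrt(2) in f-ratio."""
    return 2 * math.log2(crop_factor)

def equivalent_f_ratio(f_ratio, crop_factor):
    """f-ratio on FF giving the same DOF as f_ratio on the crop format."""
    return f_ratio * crop_factor

def iso_for_same_shutter_speed(iso, crop_factor):
    """ISO on FF needed to keep the shutter speed and lightness once the
    FF lens has been stopped down to the equivalent f-ratio."""
    return iso * crop_factor ** 2

print(stops_between_formats(2.0))            # 2.0 stops
print(equivalent_f_ratio(1.4, 2.0))          # f/2.8
print(iso_for_same_shutter_speed(100, 2.0))  # ISO 400
```

In other words, f/1.4 at ISO 100 on 4/3 pairs with f/2.8 at ISO 400 on FF for the same DOF, shutter speed, and lightness.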

Of course, this is not to say that the corners and vignetting are exactly the same between systems at the same DOF -- it most certainly varies from lens to lens.  However, it makes little sense to compare corners and vignetting at different DOFs, and it is because people compare at the same f-ratio rather than the same DOF, that the myth of 35mm FF having softer edges and greater vignetting exists.
 

6) Assuming "equivalent" means "equal"

It is important to distinguish between "equivalent" and "equal" -- "equal" is a much stronger condition than "equivalent".  As stated in the Definition of Equivalence, Equivalent photos are photos of a given scene that have the:

As a corollary, Equivalent lenses are lenses that produce Equivalent photos on the format they are used on which means they will have the same AOV (angle of view) and the same aperture diameter.

If the images were "equal", then they would also have the same noise, detail, color, bokeh, distortion, etc., etc., etc.  These elements of IQ are what make "equal" a much stronger condition than "equivalent", and are a function of the sensor efficiency, pixel count, lens design, AA filter, CFA, etc., etc., etc.  However, for different systems, equivalent images will often, if not usually, be the images that look most similar in appearance, as the photos here demonstrate.

The most talked about aspect of equivalent images is that the relative noise will be the same.  This notion is predicated on the premise that since equivalent images are made from the same total amount of light, then the photon noise will be the same.  However, differences in sensor efficiencies will affect not only how efficiently the sensor captures the light that falls on it (and thus affect the photon noise), but also how efficiently that noise is processed (read noise).  In practice, differences in sensor efficiencies for sensors of a given generation are much less significant than the total amount of light that falls on the sensor, so equivalent images will have roughly the same total relative noise.

Likewise, if the sensors had the same number of pixels, the same AA filter, and the lenses resolved in proportion to the sensor sizes, then Equivalent photos would also result in the same detail.  Since photos recorded on smaller sensors need to be enlarged more than photos recorded on larger sensors, lenses on smaller formats must be sharper in proportion to this enlargement ratio to resolve as well as lenses on a larger sensor for a given pixel count and AA filter.  For example, an mFT (4/3) lens needs to be twice as sharp as a FF lens and an APS-C lens needs to be 1.5x or 1.6x (depending on the particular APS-C camera) as sharp as a FF lens.

Of course, there are still other elements of IQ that differentiate "equal" from "equivalent", such as color, distortion, etc.  In short, the five parameters of Equivalence are simply five visual aspects of the photo that are completely independent of the technology, but this is not to say that visual elements dependent on the technology may not matter more for any particular photo.  How close other visual elements are for Equivalent photos depends on the particular sensors and lenses being compared.
 

7) Assuming Equivalence is based on equal noise

The most controversial aspect of Equivalence arises because people incorrectly assume that it is based on equal noise.  Equivalence is based on the five principles listed above, which do not include noise, nor any other elements of IQ.  The primary elements in image noise, in order, are:

  • The Total Amount of Light that falls on the sensor (exposure · sensor area)
  • The percent of this light that is recorded by the sensor (QE -- quantum efficiency)
  • The additional noise added by the sensor and supporting hardware (electronic noise)

Other factors, such as ISO and pixel count / size play a minor role in relative noise compared to the above three factors.  Since equivalent photos are made from the same total amount of light, and sensors of the same generation often (but not always) have similar efficiency (see here), equivalent photos from cameras of the same generation will usually have similar relative noise.  People commonly believe that larger sensor systems have less relative noise because they have better sensors, when, in fact, it is instead because they collect more total light for a given exposure.
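
The three factors listed above can be put into a simple noise model.  This is a sketch under stated assumptions, not a measurement: it treats photon noise as the square root of the recorded signal and adds the electronic noise in quadrature, and the exposure, QE, and electronic noise values are hypothetical numbers chosen only for illustration.

```python
import math

def image_snr(exposure_photons_per_mm2, sensor_area_mm2, qe, electronic_noise_e):
    """Signal-to-noise ratio for the whole image: photon (shot) noise
    plus electronic noise, added in quadrature."""
    signal = exposure_photons_per_mm2 * sensor_area_mm2 * qe  # electrons recorded
    noise = math.sqrt(signal + electronic_noise_e ** 2)
    return signal / noise

FF_AREA = 36 * 24          # mm^2
FT_AREA = 17.3 * 13.0      # mm^2 (4/3)

exposure = 1e6             # hypothetical photons / mm^2
qe, e_noise = 0.5, 100.0   # hypothetical, identical for both sensors

# Same exposure (same f-ratio and shutter speed): FF collects more total light.
same_exposure_ff = image_snr(exposure, FF_AREA, qe, e_noise)
same_exposure_ft = image_snr(exposure, FT_AREA, qe, e_noise)

# Equivalent photo: FF stopped down so the total light matches 4/3.
equivalent_ff = image_snr(exposure * FT_AREA / FF_AREA, FF_AREA, qe, e_noise)

print(round(same_exposure_ff / same_exposure_ft, 2))
print(round(equivalent_ff / same_exposure_ft, 2))
```

With identical sensor efficiency, the larger sensor's advantage at the same exposure comes entirely from the extra total light, and it disappears for the equivalent photo.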

Thus, breaking the properties of Equivalence down into the properties of the photo, lens, and sensor:

  • If photos are taken of the same scene from the same position with the same focal point, have the same framing, same display dimensions, and same aperture diameter, they will have the same DOF (and diffraction).
  • If the exposure time is also the same, then the photos will also have the same motion blur / camera shake, as well as be made from the same total amount of light.
  • If the sensors record the same proportion of light falling on them (same QE) and add in the same electronic noise (the noise from the sensor and supporting hardware), then the noise will be the same regardless of pixel count and ISO setting (keeping in mind that sensors of the same, or nearly the same, generation typically record very nearly the same proportion of light falling on them regardless of brand, size, or pixel count (a notable exception would be BSI tech which records a third to half a stop more light for a given exposure than non-BSI tech) and that the electronic noise matters only for the portions of the photo made with very little light).
  • If the lenses resolve proportionally the same on their respective sensors at the same DOF (e.g. if a 25mm lens at f/1.4 on mFT resolved 2000 lp/mm, a 31mm lens on 1.6x at f/1.8 resolved 1600 lp/mm, a 33mm lens on 1.5x at f/1.9 resolved 1500 lp/mm, and a 50mm lens on FF at f/2.8 resolved 1000 lp/mm, or any equivalent relative apertures), the sensors have the same number of pixels, and the AA filter introduces the same blur, then all photos will have the same resolution (lw/ph).

 

8) Larger sensor systems have less noise because they have larger pixels / higher ISOs result in more noise

The reason so many feel that smaller pixels result in more noise is that smaller sensor systems usually have smaller pixels than larger sensor systems.  While smaller pixels, individually, will be more noisy (for a given exposure and sensor efficiency) because they record less light, there are more pixels.  That is, the noise in a photo is not determined by a single pixel, but the combined effect of all the pixels where a greater number of smaller pixels will capture the same total amount of light as a fewer number of larger pixels, and thus have the same photon noise, so long as the QE (Quantum Efficiency -- the proportion of light falling on the sensor that is recorded) is not adversely affected by pixel size, which is the case for sensors of a given generation.

However, the other primary source of noise in a photo is the read noise, which is the additional noise added by the sensor and supporting hardware.  If the read noise of the pixels scales linearly with the size of the pixel, the overall read noise will be the same.  For example, if a 1x1 pixel had half the read noise of a 2x2 pixel, then the combined read noise for four 1x1 pixels would be the same as a single 2x2 pixel (noise adds in quadrature, not linearly, which is discussed in the link on read noise).  While the QE of a pixel tends to be the same across the board for pixel sizes for any given generation of sensors, the read noise per pixel varies dramatically.  If any generalization can be made, it would be that the read noise per pixel also tends to be about the same, regardless of the pixel size (although, as stated, there is tremendous variation, even for sensors of the same generation).
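
The quadrature arithmetic above can be sketched as follows.  The pixel values are hypothetical and the helper name is made up; the only rule used is that independent noise sources combine as the square root of the sum of their squares.

```python
import math

def combine_in_quadrature(noises):
    """Independent noise sources add as the root of the sum of squares."""
    return math.sqrt(sum(n ** 2 for n in noises))

# Photon noise: one 2x2 pixel vs four 1x1 pixels covering the same area.
photons_big = 10000                                  # hypothetical count
photon_noise_big = math.sqrt(photons_big)
photon_noise_small = combine_in_quadrature([math.sqrt(photons_big / 4)] * 4)

# Read noise, case 1: each 1x1 pixel has half the read noise of the 2x2 pixel.
read_big = 4.0                                       # hypothetical electrons
read_small_scaled = combine_in_quadrature([read_big / 2] * 4)

# Read noise, case 2: each 1x1 pixel has the same read noise as the 2x2 pixel.
read_small_same = combine_in_quadrature([read_big] * 4)

print(photon_noise_big, photon_noise_small)          # 100.0 100.0
print(read_small_scaled, read_small_same)            # 4.0 8.0
```

The photon noise over the same area is unchanged by splitting the pixel, while the combined read noise stays the same only in the first case; in the second case it doubles, which is why pixel count starts to matter at very low exposures.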

While the photon noise is usually the dominant source of noise in a photo, as the exposure gets lower and lower, the read noise becomes more and more important compared to the photon noise.  Thus, for very low light scenes where a photographer might be using, say, a setting of ISO 3200 or more, a sensor with more pixels will tend to be more noisy than a sensor with fewer pixels, all else equal, and will become worse as the exposure gets even lower.  How much worse depends very much on the specifics of the differences in read noise per area between the sensors.

Another wildcard is NR (Noise Reduction, or, more properly, Noise Filtering), which is a much more efficient method of dealing with noise than downsampling.  More pixels will record more detail, but how much more detail depends on the specifics of the scene (e.g. motion blur) and the sharpness of the lens.  If the photo with more pixels has "enough" more detail than the photo with fewer pixels, then the application of NR to normalize the detail between the two photos may tip the noise advantage in favor of the photo with more pixels, even if the initial image file were more noisy.

Thus, for fully equivalent images, where both the DOF and shutter speeds are the same, all systems will collect the same amount of light, regardless of the pixel size.  The system that will have the lesser amount of noise will be the system that has the more efficient sensor and/or the system that resolves "enough" more detail since that additional detail can be traded, via NR, for less noise.

Furthermore, the belief that higher ISOs result in more noise is a common misinterpretation as to what is actually taking place.  Yes, higher ISO images usually result in greater noise, but this is because using a higher ISO results in either a faster shutter speed and/or a smaller aperture for a given scene, the effect of which will be less light on the sensor.  It is the lesser amount of light falling on the sensor that results in the greater noise, not the higher ISO per se.  In fact, the higher ISO setting results in slightly less noise for a given exposure (that is, for a given f-ratio and shutter speed, the higher ISO setting will result in less noise) for some sensors.  For example, if we took a photo of a scene at ISO 100 and ISO 1600,  both with the same f-ratio and shutter speed, and pushed the ISO 100 photo four stops in the RAW conversion to achieve the same lightness as the ISO 1600 photo, it would be more noisy than the ISO 1600 photo on some sensors (discussed in more detail here, with some examples linked).  The same would be true if we took a photo at ISO 1600 and pulled it down four stops to match the lightness of the ISO 100 photo, but at the expense of four stops of highlight detail.  In other words, the cause of noise is the amount of light falling on the sensor, and the efficiency of the sensor, not ISO setting, per se.
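
One way to see why the higher ISO setting can give slightly less noise for the same exposure is a two-stage read noise model.  This model is an assumption brought in for illustration, not something taken from the text above: part of the electronic noise is added before the ISO gain and part after it, so the later part shrinks, relative to the signal, as the gain goes up.  All of the numbers are hypothetical.

```python
import math

def snr_at_iso(signal_e, iso, pre_gain_noise_e, post_gain_noise_e, base_iso=100):
    """SNR for a fixed amount of recorded light at a given ISO setting,
    under an assumed two-stage read noise model.  Pushing or pulling the
    file in the RAW conversion scales signal and noise alike, so it does
    not change this ratio."""
    gain = iso / base_iso
    read_noise = math.sqrt(pre_gain_noise_e ** 2 + (post_gain_noise_e / gain) ** 2)
    return signal_e / math.sqrt(signal_e + read_noise ** 2)

shadow_signal = 100   # electrons in a dark part of the scene (hypothetical)

# A sensor whose electronic noise is mostly added after the gain stage.
print(round(snr_at_iso(shadow_signal, 100, 3, 20), 2))
print(round(snr_at_iso(shadow_signal, 1600, 3, 20), 2))

# A sensor with no post-gain noise: the ISO setting makes no difference.
print(snr_at_iso(shadow_signal, 100, 3, 0) == snr_at_iso(shadow_signal, 1600, 3, 0))
```

Under these assumptions the ISO 1600 shot is cleaner than the pushed ISO 100 shot on the first sensor and identical on the second, which matches the "on some sensors" qualification above.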

Thus, while it is not news to anyone that lower exposures result in greater noise, it is news to many that it is not the higher ISO setting that causes more noise, but instead the lesser amount of light falling on the sensor.  To minimize the noise  in a photo, we want to maximize the exposure within the constraints of how much light the sensor can absorb before oversaturating, the DOF / sharpness we wish to achieve, and the shutter speed necessary to offset motion blur and/or camera shake.


9) Comparing images at 100% rather than the same display dimensions

It is common for people to compare images at 100% -- that is, to compare images at the pixel level.  However, such a comparison would only make sense if each image was made from the same number of pixels.  For example, it makes no sense to compare a 4x6 print with an 8x12 print, just as it makes no sense to compare, for example, a 2000 x 3000 pixel image with a 4000 x 6000 image.  Comparing images that have different pixel counts causes a "scaling error".  That is, the photo made from more pixels is being viewed with greater enlargement, which leads to incorrect conclusions about noise and sharpness.

To properly determine which system has less noise, we need to compare so that the common elements are displayed at the same size.  Furthermore, we need to apply NR (noise reduction) to the more detailed image until it matches the amount of detail in the less detailed image, and then display the two photos with the same diagonal dimension.  Otherwise, we may find ourselves in the position of saying that one image is more noisy than the other, but, due to its greater detail, is actually more pleasing.  In other words, "more noisy" could mean "higher IQ" -- that is, the more detailed photo resolves both the inherent detail and noise more clearly, whereas the less detailed photo merely blurs both the noise and detail (see these photos for a demonstration).

The reason we display with the same diagonal dimension, as opposed to the same area, is because if we are comparing photos with different aspect ratios, then the same diagonal display will result in the common elements in both photos being displayed at the same size if the photos were taken of the same scene from the same position with the same AOV.

Some argue that the process of resampling the image with the smaller pixel count to the dimensions of the image with the larger pixel count is unfair to the smaller image since the upsampling introduces a new variable into the comparison.  However, this variable is always introduced regardless.  That is, we either resize an image for web display, or the printer will automatically interpolate the image for printing, regardless of whether we upsample or not.  The most fair method for comparing at 100% is to carefully resample both images to a common dimension, so that neither system is favored.
 

10) Larger sensor systems gather more light and have less noise than smaller sensor systems

For the same AOV, lenses for larger sensor systems often have larger aperture diameters which gather more light than smaller sensor systems, and thus deliver less noisy images even if the sensor for the larger sensor system is less efficient (to a degree).  However, choosing a larger aperture diameter also results in a more shallow DOF, more vignetting, and softer corners.  For fully equivalent images, however, all systems gather the same total amount of light.  Thus, any differences in the relative noise and dynamic range will be due to differences in the sensor efficiencies, and, contrary to popular belief, larger sensors are not necessarily more efficient than smaller sensors.  On the other hand, in situations where motion blur is not an issue, or even desirable, systems that have in-camera IS or IS lenses can gather more light by using a slower shutter speed and achieve an advantage in relative noise over other systems lacking IS when a tripod is not used.

 

 

 

 

 

EXPOSURE, LIGHTNESS,  AND TOTAL LIGHT

 

For a given perspective and framing, the same aperture diameter will result in the same DOF and diffraction.  For a given scene luminance and shutter speed, the same aperture diameter will result in the same total amount of light falling on the sensor.  The concept of Equivalence is controversial because it replaces the paradigm based on exposure, and its agent, relative aperture (f-ratio), with a paradigm based on total light, and its agent, effective aperture (entrance pupil).  The first step in explaining this paradigm shift is to define exposure, lightness, and total light, and, in the following two sections, how these relate to both noise and dynamic range.

This section will answer the following four questions:

  • For a given scene, what is the difference in exposure, if any, between f/2.8 1/200 ISO 400 and f/5.6 1/200 ISO 1600?
  • What role does the ISO setting play?
  • What role does the sensor size play?
  • What does any of this have to do with the visual properties of the photo?

The exposure is the average density of light (total light per area) for a given luminosity function that falls on the sensor while the shutter is open, which is usually expressed as the product of the illuminance of the sensor and the time the shutter is open.  The exposure can be measured as the luminous energy density (lux · seconds -- this is the standard definition and standard units) or as the average photon density (where 1 lux · second = 4.1 billion photons / mm² for green light  -- 555 nm).  The only factors in the exposure are the scene luminance, f-ratio, shutter speed, and transmissivity of the lens (note that neither sensor size nor ISO is a factor in exposure).
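
As a rough check of the photon density figure quoted above, the following Python sketch converts one lux · second of monochromatic 555 nm light into photons per mm².  It assumes the standard luminous efficacy of 683 lm/W at 555 nm and the SI values of Planck's constant and the speed of light, none of which are given in this essay, and the function name is purely illustrative.

```python
# Assumed physical constants (SI values, not taken from this essay).
PLANCK_H = 6.62607015e-34      # J·s
LIGHT_SPEED = 299_792_458.0    # m/s
EFFICACY_555NM = 683.0         # lm/W, valid only for 555 nm light
WAVELENGTH_M = 555e-9          # green light, 555 nm


def photons_per_mm2_per_lux_second():
    """Photons per mm² delivered by 1 lux·s of monochromatic 555 nm light."""
    # 1 lux·s = 1 lm·s/m², which at 555 nm carries 1/683 joules per m².
    energy_per_m2 = 1.0 / EFFICACY_555NM
    # Energy of a single photon at this wavelength: E = h·c / wavelength.
    photon_energy = PLANCK_H * LIGHT_SPEED / WAVELENGTH_M
    photons_per_m2 = energy_per_m2 / photon_energy
    # 1 m² = 1,000,000 mm².
    return photons_per_m2 / 1e6


if __name__ == "__main__":
    print(f"{photons_per_mm2_per_lux_second():.3g} photons / mm² per lux·s")
```

Under these assumptions the result comes out at roughly 4.1 billion photons / mm², in line with the figure above.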

The transmissivity of the lens is the proportion of light falling on the lens that passes through to the sensor since some light is absorbed and/or scattered by the glass elements in the lens.  This difference is often quantified by the t-stop, where a transmission of 79% represents a 1/3 stop loss of light, 63% represents a 2/3 stop loss of light, and 50% represents a one stop loss of light.  For example, if 79% of the light falling on an f/2.8 lens makes it through to the sensor, we would say the lens has a t-stop of f/3.2 (click here to see some examples of this differential for a few lenses for the Nikon system).  For convenience purposes, this essay assumes lenses with equal transmission, so the f-ratio can be used interchangeably with the t-stop, in a comparative sense.
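
The relationship between transmission, lost stops, and t-stop can be sketched in a few lines of Python.  This is a minimal illustration that assumes the usual relations (stops lost = -log2 of the transmission, and t-stop = f-ratio divided by the square root of the transmission); the function names are hypothetical.

```python
import math


def stops_lost(transmission):
    """Light loss in stops for a lens passing the given fraction of light."""
    return -math.log2(transmission)


def t_stop(f_ratio, transmission):
    """T-stop of a lens with the given f-ratio and transmission (0..1)."""
    return f_ratio / math.sqrt(transmission)


if __name__ == "__main__":
    for t in (0.79, 0.63, 0.50):
        print(f"transmission {t:.0%}: {stops_lost(t):.2f} stops lost")
    print(f"f/2.8 at 79% transmission: t-stop of about f/{t_stop(2.8, 0.79):.2f}")
```

With these formulas, an f/2.8 lens at 79% transmission works out to a t-stop of about 3.15, which is conventionally rounded to f/3.2.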

Thus, if we have two photos of the same scene, one at f/2.8 1/200 ISO 100 and another at f/2.8 1/200 ISO 400 (on any system, regardless of format), both will have the same exposure, since the same number of photons per unit area will fall on the sensor, but the ISO 400 photo will appear 4x (2 stops) brighter than the ISO 100 photo.

The lightness of a photo (what people usually think of as "exposure") is how light or dark the photo appears overall. For example, an exposure at an ISO 400 setting will be mapped into the image file such that the photo will appear 4x lighter than if the same exposure were taken at ISO 100.  Alternatively, an exposure at ISO 400 will result in a photo with the same lightness as a photo with 4x the exposure at ISO 100.

The role of the ISO setting on the camera in the exposure is in how the setting indirectly changes the actual exposure by having the camera choose a different f-ratio, shutter speed, and/or flash power.  For example, changing the ISO from 400 to 1600 may result in the camera choosing f/5.6 instead of f/2.8, 1/200 instead of 1/50, f/4 1/100 instead of f/2.8 1/50, etc.  In addition, the ISO control on the camera often results in a greater analog amplification to the signal before it is digitized by the ADC (Analog-to-Digital Converter) for higher ISO settings, which results in less read noise than using base ISO and a digital push for sensors with noisy ADCs (non-ISOless sensors), although a digital push/pull may also be used.

In addition, it is worth noting that "extended" ISO settings are purely a digital push from the highest ISO setting at the high end of the ISO range (e.g. ISO 25600 underexposes by two stops with the same amplification as ISO 6400, and is then digitally pushed by two stops), and purely a digital pull at the low end (e.g. ISO 50 is ISO 100 overexposed by one stop, then digitally pulled back a stop).

The total light is the total amount of light that falls on the portion of the sensor used for the photo during the exposure:  Total Light = Exposure · Effective Sensor Area.  The same total amount of light will fall on the sensor for equivalent photos but, for different formats, this will necessarily result in a different exposure on each format, since the same total light distributed over sensors with different areas will result in a lower density of light on the larger sensor.  Using the example above, photos of the same scene at f/2.8 1/200 on mFT (4/3) and f/5.6 1/200 on FF will result in the same total light falling on each sensor, but the exposure will be 4x (2 stops) greater for the mFT photo, and thus the FF photographer would usually use a 4x (2 stops) higher ISO setting to get the same lightness for the LCD playback and/or OOC (out-of-the-camera) jpg.

Lastly, the Total Light Collected is the amount of light that is converted to electrons by the sensor, which is the product of the Total Light that falls on the sensor during the exposure and the QE (Quantum Efficiency of the sensor -- the proportion of light falling on the sensor that is recorded).  For example, if QE = 1, then all the light falling on the sensor is recorded (for reference, the Olympus EM5, Canon 5D3, and Nikon D800 all have a QE of approximately 50% in the green channel).

For low f-ratios, the efficiency of the microlens covering plays a very important role in how much light that falls on the sensor is directed into the active area of the pixel.  DxOMark's article, F-Stop Blues, created something of a stir over this matter.  To direct all the light into the active portion of the pixel, the f-ratio of the microlens must be smaller than the product of the f-ratio of the lens and the proportion of the active area of the pixel.  For example, for an f/2 lens with 50% of the pixel area being active, the microlens needs to be f/2 · 0.5 = f/1.  Of course, it quickly becomes obvious why there are problems getting all the light into the pixel for lenses faster than f/2.  This issue is being addressed on newer sensors with technologies such as light-pipes, stacking microlenses, and BSI (backside illumination).

A consequence of inefficient microlenses is that a larger sensor system will have an advantage over a smaller sensor system for a given microlens efficiency.  Specifically, larger sensor systems will use a higher f-ratio for the same DOF, resulting in more light reaching the photosensitive area of the pixel.  For example, FF will suffer less light loss at f/2.8 than 1.6x at f/1.8, which in turn will suffer less loss than mFT (4/3) at f/1.4, assuming the same microlens efficiency.

In terms of IQ, the total light collected is the relevant measure, because both the noise and DR (dynamic range) of a photo are a function of the total amount of light that falls on the sensor (along with the sensor efficiency, all discussed, in detail, in the next section).  That is, noise is determined by the total amount of light falling on the sensor and the sensor efficiency, not the ISO setting on the camera, as is commonly believed (the ISO setting is simply a matter of processing the signal, discussed in more detail here).  In other words, the less light that falls on the sensor, the more noisy and darker the photo will be.  Increasing the ISO setting simply brightens the captured photo making the noise more visible.

For a given scene, perspective, and framing, the total light depends only on the aperture diameter and shutter speed (as opposed to the f-ratio and shutter speed for exposure).  Fully equivalent images on different formats will have the same lightness and be created with the same total amount of light.  Thus, the same total amount of light on sensors with different areas will necessarily result in different exposures on different formats, and it is for this reason that exposure is a meaningless measure in cross-format comparisons.

Mathematically, we can express these four quantities rather simply for a given luminosity function:

  • Exposure  (photons / mm² or lux · seconds) = Sensor Illuminance (photons / mm² / s or lux) · Time (s)
  • Total Light (photons or lumen · seconds)     = Exposure (photons / mm² or lux · seconds) · Effective Sensor Area (mm²)
  • Total Light Collected (electrons)                  = Total Light (photons) · QE (electrons / photon)
  • Lightness (RGB value)                                 = Total Light Collected mapped into a particular color space and recorded in the image file
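
The first three of these relations can be written out as a short Python sketch.  The numbers in the example are hypothetical (the sensor illuminance values, the 50% QE, and the idealized sensor areas of 216 mm² and 864 mm² that differ by exactly 4x are assumptions chosen only to illustrate the arithmetic), and lightness is left out because it depends on the camera's particular mapping.

```python
def exposure(sensor_illuminance, time_s):
    """Exposure (photons / mm²) = sensor illuminance (photons / mm² / s) · time (s)."""
    return sensor_illuminance * time_s


def total_light(exposure_value, effective_area_mm2):
    """Total light (photons) = exposure (photons / mm²) · effective sensor area (mm²)."""
    return exposure_value * effective_area_mm2


def total_light_collected(total_light_value, qe):
    """Total light collected (electrons) = total light (photons) · QE."""
    return total_light_value * qe


if __name__ == "__main__":
    # Hypothetical scene: the f/2.8 photo receives 4x the illuminance of the f/5.6 photo.
    small = total_light(exposure(4.0e9, 1 / 200), 216.0)   # smaller sensor at f/2.8
    large = total_light(exposure(1.0e9, 1 / 200), 864.0)   # 4x larger sensor at f/5.6
    print(f"small sensor: {small:.3g} photons, {total_light_collected(small, 0.5):.3g} electrons")
    print(f"large sensor: {large:.3g} photons, {total_light_collected(large, 0.5):.3g} electrons")
```

In this idealized case the smaller sensor has 4x the exposure, but both sensors receive the same total light and, for equal QE, record the same number of electrons.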

So, we can now answer the questions posed at the beginning of the section:

The exposure (light per area on the sensor) at f/2.8 1/200 ISO 400 is 4x as great as at f/5.6 1/200 ISO 1600 for a given scene luminance, regardless of the focal length or the sensor size.  However, the lightness for the two photos will be the same, since the 4x lower exposure is brightened 4x as much by the higher ISO setting.  If the sensor that the f/5.6 photo was recorded on has 4x the area of the sensor that the f/2.8 photo was recorded on (e.g. FF vs mFT), then the same total amount of light will fall on both sensors, which will result in the same noise for equally efficient sensors (discussed in the next section).

There is a small "niggle" involving different aspect ratios.  A 3:2 FF sensor, for example, does not have 4x the area of a 4:3 mFT sensor, but rather 3.84x the area.  This 4% difference, however, is inconsequential.  What it means is that the 4:3 mFT sensor records 4% more of the scene due to the more square aspect ratio for the same perspective and framing, but the corresponding portions of the scene on the 3:2 FF sensor do have exactly 4x the area as they do on the 4:3 mFT sensor.  For instance, if we took a photo of a person from the same position at 50mm on FF and 25mm on mFT, the area of the face of the person on the FF sensor would be exactly 4x the area of the face of the person on the mFT sensor.

Let us now go into more detail on these points:

The ISO setting on the camera determines the mapping from the signal to the image file.  For example, at ISO 1600, the mapping will result in a value that will be represented 16x as bright as it would have been for the same scene luminance, f-ratio, and shutter speed at ISO 100.  That is, if the signal were 1000 electrons, at ISO 1600 it would be recorded in the image file as if the signal were 16000 electrons.  It is important to note that not all cameras apply the same amount of amplification for the same ISO setting.  For example, f/5.6 1/100 ISO 400 on one camera may not show the same lightness as f/5.6 1/100 ISO 400 on another camera.  The ISO standards allow the manufacturers a lot of latitude in their implementation of ISO standards which is what results in the differences in "highlight headroom" between cameras.

The total light is simply the total amount of light that is used to make up the photo, and is measured in  lumen·seconds, or, equivalently, photons.  The effective sensor area refers to the portion of the sensor that is being used for the final photo, or, when comparing photos with different aspect ratios, the area that both have in common.  For example, if we are cropping a FF sensor (36mm x 24mm) to 1:1, then the effective sensor area is 24 mm x 24 mm = 576 mm².  Alternatively, if we were comparing a 3:2 FF photo to a 4:3 mFT photo, the framing will be slightly different.  In this case, assuming the photos were taken with the same perspective and AOV, the effective sensor areas for the purposes of a noise comparison would be 831 mm² for the FF sensor and 208 mm² for the mFT sensor, since those areas are the respective areas of the sensors that cover the common scene in both photos.
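
The effective sensor areas for the FF vs mFT comparison can be estimated with the following Python sketch.  It assumes that "same AOV" means the same diagonal angle of view, and it assumes nominal mFT sensor dimensions of 17.3 mm x 13 mm, which are not stated in this essay; the function name is illustrative.

```python
import math


def common_scene_areas(w1, h1, w2, h2):
    """Areas (mm²) on sensor 1 and sensor 2 that cover the scene both frames share.

    Assumes both photos are taken from the same position with the same
    diagonal angle of view, so the frames are matched on their diagonals.
    """
    # Scale factor that maps sensor 2 onto sensor 1 (ratio of the diagonals).
    scale = math.hypot(w1, h1) / math.hypot(w2, h2)
    # The shared part of the scene is limited by the narrower frame in each direction.
    common_w = min(w1, w2 * scale)
    common_h = min(h1, h2 * scale)
    area_on_1 = common_w * common_h
    area_on_2 = area_on_1 / scale ** 2
    return area_on_1, area_on_2


if __name__ == "__main__":
    ff_area, mft_area = common_scene_areas(36.0, 24.0, 17.3, 13.0)
    print(f"FF: {ff_area:.0f} mm², mFT: {mft_area:.0f} mm²")
    print(f"full sensor area ratio: {(36.0 * 24.0) / (17.3 * 13.0):.2f}")
```

With these assumed dimensions the common areas come out within about 1 mm² of the 831 mm² and 208 mm² quoted above, and the full sensor areas differ by a factor of about 3.84.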

The role of exposure in digital photography is nothing more than noise and/or highlight control.  A higher exposure will use more light to create the image, resulting in less relative noise.  However, the more light we use to create the image, the more we will run the risk of oversaturation (blowing highlights).  The only way to increase the exposure is by using a slower shutter speed (thus increasing the chance/effects of motion blur and/or camera shake), using a wider aperture (thus decreasing the DOF and reducing the image detail/sharpness, particularly in the corners), or increasing the amount of light on the scene (e.g. flash photography).

Thus, it does not necessarily make sense to compare systems with the same exposure.  For example, if we compare systems at the same f-ratio and shutter speed (same exposure), the larger sensor system will have a more shallow DOF, which may, or may not, be desirable.  If we instead compare systems at the same exposure and DOF, the larger sensor system will have to use a larger f-ratio and a concomitantly lower shutter speed, which will increase the risk of motion blur and/or camera shake.

However, if we instead compare systems with the same total amount of light, and thus different exposures, then the DOF and effects of motion blur / camera shake will be the same.  Any differences in noise will be due to differences in the sensor efficiencies, and not because the larger sensor system will require a higher ISO for the same lightness.  In other words, for equivalent images, the visual properties will be rather similar, but for non-equivalent images, the visual properties may be radically different.  Sometimes this difference will favor the larger sensor system, sometimes it will not -- it depends on the scene.  However, if non-equivalent settings put the larger sensor system at a disadvantage, the photographer can always choose equivalent settings instead.  Indeed, the photographer is served best by choosing the optimal settings for their system, keeping in mind that what constitutes "optimal" is not only subjective, but highly dependent on the scene.  However, while equivalent settings are not necessarily optimal for the larger sensor system, these settings remove the subjectivity from the comparison, and are applicable to all scenes.

It is instructive to understand why the same f-ratio results in the same exposure for the same scene and framing, regardless of the focal length or format.  There are six factors that determine how much light is projected on the sensor:

  • The luminance of the scene
  • The amount of the scene that is recorded
  • The distance from the scene
  • The aperture area
  • The transmissivity of the lens elements
  • The exposure time

The amount of light from the scene depends on how wide we frame -- the wider we frame, the more light we will capture, since we are gathering light from a larger scene.  If we assume a scene with the same average luminance, framing twice as wide, for example, will result in collecting light from four times as much area, and thus four times as much light reaching the aperture.

The amount of light from the scene reaching the aperture also depends on how far we are from the scene -- the further away we are, the less of that light reaches the lens.  For example, if we are twice as far away, only 1/4 as much light will fall on the lens in any given time interval.  It should be noted that the inverse square law is exact only for point sources, and becomes "less exact" the wider we frame.  The reason is that the distance from the camera to the center of the focal plane is not the same as the distance to the other portions of the frame.  So, when we increase the distance from the scene, the distances to the other portions of the frame do not increase in the same proportion as the distance to the center.  However, for most situations, the difference is trivial.

Furthermore, the amount of light from the scene falling on the aperture is proportional to the area of the aperture.  For example, if we double the diameter of the aperture, the area will quadruple, so four times as much light can pass through in any given time interval.  However, some light is lost as it travels through the lens, and how much is lost depends greatly on the lens (click here for some examples).  Lastly, the amount of the light passing through the aperture onto the sensor is proportional to the exposure time.  That is, halve the exposure time (double the shutter speed), and you halve the amount of light falling on the sensor.

Let's work a few examples, ignoring the effects of light lost from the elements of the lens, keeping in mind that the exposure is the total light per area falling on the sensor, not the total amount of light.  In other words, we can express the exposure as the quotient of the total amount of light falling on the sensor and the area of the sensor.

Say a photographer takes a "properly exposed" photo of a subject 10 ft away at 50mm f/2 1/100 (aperture diameter = 50mm / 2 = 25mm).  If they step back to 20 ft away and use 100mm f/2 1/100, the aperture diameter has doubled (100mm / 2 = 50mm) and the aperture area has quadrupled (area is proportional to the square of the diameter).  However, the amount of light from the scene reaching the lens is 1/4 as much since they're twice as far away. Since the aperture area is four times as much, it exactly compensates, and the same amount of light will pass through the aperture onto the sensor, and, since the sensor has not changed size, the exposure will also be the same.

Alternatively, let's say they don't step back, but instead remain at 10 ft and shoot at 100mm f/2 1/100.  At 100mm, the framing will be twice as tight, and thus record only 1/4 the light of the scene as they would at 50mm (assuming, of course, a uniformly lit scene).  Thus, despite the fact that 1/4 as much light is reaching the lens, since the aperture area is four times as great, it exactly compensates once again.  An excellent video on this can be seen here.
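
The two examples above can be checked with a small Python sketch that multiplies the factors together.  It is only a rough model under stated assumptions: a uniformly lit scene, a distant subject (so the framed scene area scales with the square of distance / focal length), no transmission losses, and arbitrary relative units.  The function name and the FF sensor area default are illustrative.

```python
import math

FT_TO_MM = 304.8


def relative_light_on_sensor(distance_mm, focal_length_mm, f_ratio, time_s,
                             sensor_area_mm2=864.0):
    """Relative amount of light passing through the aperture onto the sensor."""
    # The area of the scene that is framed grows with (distance / focal length)².
    scene_area = sensor_area_mm2 * (distance_mm / focal_length_mm) ** 2
    # Aperture diameter = focal length / f-ratio.
    aperture_area = math.pi * (focal_length_mm / f_ratio / 2) ** 2
    # Light reaching the aperture falls off with the square of the distance.
    return scene_area * aperture_area / distance_mm ** 2 * time_s


if __name__ == "__main__":
    cases = {
        "50mm f/2 at 10 ft": relative_light_on_sensor(10 * FT_TO_MM, 50, 2, 1 / 100),
        "100mm f/2 at 20 ft": relative_light_on_sensor(20 * FT_TO_MM, 100, 2, 1 / 100),
        "100mm f/2 at 10 ft": relative_light_on_sensor(10 * FT_TO_MM, 100, 2, 1 / 100),
    }
    for label, light in cases.items():
        print(f"{label}: {light:.4f} (relative units)")
```

In this simplified model the distance and focal length cancel out, leaving only the f-ratio, exposure time, and sensor area, which is why all three cases give the same total light and, on the same sensor, the same exposure.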

The above two examples demonstrate how the same f-ratio and shutter speed results in the same total light and exposure for a given scene, regardless of focal length, on the same format.  However, for different formats, the same exposure does not result in the same total amount of light falling on the sensor.

Let's now consider a photographer using 50mm f/2 1/100 on mFT and another photographer with FF shooting the same scene from the same distance at 100mm f/4 1/100.  In both cases, the framing is the same (ignoring the minor differences in aspect ratio between the systems, which amounts to a mere 4% difference), the aperture diameters are the same (50mm / 2 = 100mm / 4 = 25mm), the distances from the scene are the same, and the shutter speeds are the same.  Thus, the same amount of light will pass through the apertures onto the sensors.

On the other hand, if the FF photographer shot the same scene from the same position at 100mm f/2 1/100, the aperture diameter would now be 100mm / 2 = 50mm as opposed to 25mm.  Since the aperture diameter is twice the size, the aperture area is four times as large, and four times as much light will fall on the sensor.  But, since the sensor has four times the area, the density of the light on the sensor would be the same as the mFT sensor, so the exposures would be the same.

Thus, for a given scene, perspective, framing, aperture diameter, shutter speed, and transmissivity of the lens, the same amount of light will pass through the aperture onto the sensor for all systems.  So, for equally efficient lenses and sensors, f/2 = f/2 = f/2 in terms of exposure, regardless of format, but in terms of total light (and DOF and diffraction), f/2 (on 4/3) is equivalent to f/2.5 on 1.6x which is equivalent to f/4 on FF.  However, due to limitations in current microlens efficiencies, these equivalences for total light become less clear cut for f-ratios below f/2.
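
The equivalent f-ratios quoted above follow from scaling by the ratio of the crop factors, as this Python sketch illustrates.  The crop factors (2.0 for 4/3, 1.6 for 1.6x, 1.0 for FF) are the nominal values implied by the format names, and the helper names are hypothetical.

```python
CROP_FACTORS = {"4/3": 2.0, "1.6x": 1.6, "FF": 1.0}  # nominal values, relative to FF


def equivalent_f_ratio(f_ratio, from_format, to_format):
    """F-ratio on to_format giving the same aperture diameter for the same AOV."""
    return f_ratio * CROP_FACTORS[from_format] / CROP_FACTORS[to_format]


def equivalent_focal_length(focal_length_mm, from_format, to_format):
    """Focal length on to_format giving the same AOV."""
    return focal_length_mm * CROP_FACTORS[from_format] / CROP_FACTORS[to_format]


if __name__ == "__main__":
    for fmt in ("4/3", "1.6x", "FF"):
        n = equivalent_f_ratio(2.0, "4/3", fmt)
        fl = equivalent_focal_length(50.0, "4/3", fmt)
        # The aperture diameter (focal length / f-ratio) is the same in every case.
        print(f"{fmt}: {fl:.1f}mm f/{n:.1f}, aperture diameter {fl / n:.1f}mm")
```

Because both the focal length and the f-ratio scale by the same factor, the aperture diameter, and hence the total light, DOF, and diffraction, stay the same across the three formats.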

The reason, once again, that total light is so important, is because it is the total light that makes up the image (along with sensor efficiency) that determines the noise, not the exposure.  Of course, for a given format, the distinction between exposure and total light is not necessary to make.  But when comparing different formats, the distinction is crucial.

Hence, the same f-ratio will result in the same exposure for the same scene and framing regardless of format.  However, two different formats cannot simultaneously have the same exposure and same total amount of light for the same perspective and framing, since the same amount of light is being distributed on different areas.  There is one exception:  if we use the same perspective and focal length on both formats, and then crop the larger sensor image to the same framing as the smaller sensor image, then we will have the same exposure and total light (as well as the same DOF) if we use the same f-ratio.  However, in practice, the only time I know of when an image from a larger format uses the same perspective and focal length, and is subsequently cropped to the same framing as the smaller format, is when the larger format is focal length limited, or for greater apparent magnification (macro).

In other words, the exposure matters only inasmuch as it is a component of the lightness and total light -- it is not an important measure in and of itself.  That is, when we look at a photo, we can see how bright or dark it appears (lightness), the noise, and DR.  But we cannot see the exposure itself, so it is not important except as a means to an end:  Total Light Collected = Exposure · Effective Sensor Area · QE.  Knocking exposure from its time-honored pedestal and replacing it with Total Light is a radical statement that many have difficulty coming to terms with, but this is a key point to understanding Equivalence.

The bottom line is that we want to maximize the total amount of light the camera records within the constraints of DOF / sharpness, motion blur / camera shake, and noise / blown highlights.  For a given DOF and shutter speed, the same total amount of light will fall on the sensor for all systems, and thus any differences in noise will be due to differences in sensor efficiencies.  The advantage for larger sensor systems comes when they are able to gather more light than smaller sensor systems, either by using a longer shutter speed for a given DOF at base ISO (the smaller sensor would oversaturate) or using a wider aperture (entrance pupil) for a given AOV and exposure time, which will necessarily result in a more shallow DOF (which may be reduced to a "necessary evil", depending on the scene and artistic intent).

In short, exposure is relevant only inasmuch as it is a component of the total amount of light recorded by the sensor.  While we can use exposure and total light interchangeably for a given format, the distinction is essential when comparing across formats.

 

 

 

 

 

NOISE

 

Let's begin by considering a couple of experiments you can do at home.  Take a photo of a scene at, say, 35mm f/5.6 1/200 ISO 3200 and 70mm f/5.6 1/200 ISO 3200.  Crop the 35mm photo to the same framing as the 70mm photo, and display both photos at the same size.  These photos have the same exposure, and are taken on a camera using the same sensor, thus the same efficiency and pixel size.  Which photo is more noisy and why?

Here's another experiment.  Take a photo of a scene using RAW that is "properly exposed" at f/5.6 1/200 ISO 3200.  Now take the same photo at f/5.6 1/200 ISO 100, push the photo 5 stops in the RAW conversion, and display the photos at the same size.  Again, we have the same exposure (recall from the section above that the ISO setting does not affect the exposure for a given scene luminance, f-ratio, shutter speed, and/or flash power; rather, it affects the internal processing of the photo) and the same sensor.  Which photo is more noisy and why?

Answers are at the end of this section.

Noise.  That's where the controversy over Equivalence begins.  People think that the "equivalence argument" is based on the presumption that Equivalent photos are equally noisy.  Even different cameras from the same format will not be exactly as noisy, either in quantity or quality, so, clearly, neither will Equivalent photos from different formats.  However, for sensors of the same generation, Equivalent photos will usually be fairly similar in terms of noise (see these examples), and certainly more similar than photos from different formats with the same exposure (which will result in the larger sensor collecting more total light).  Furthermore, it is of utmost importance to distinguish between noise at the pixel level and noise at the image level.  For example, we would not compare one pixel from a 12 MP photo to one pixel from a 24 MP photo, but to two pixels from a 24 MP photo (which will be discussed in more detail further down).

When people refer to the noisiness of a photo, what they mean is the density of the noise in the photo (NSR -- noise-to-signal ratio), which is often represented as a percent.  Often, we hear the term "SNR" (signal-to-noise ratio), which is the reciprocal of the NSR (SNR = 1 / NSR).  For example, if the SNR = 5:1, then the NSR = 1/5 = 20%.  However, since a noisy photo has a high NSR, and a not-so-noisy photo has a low NSR, whereas it is exactly opposite for SNR, in my opinion, it is less confusing to think in terms of NSR than SNR with regards to photography.

There are two principal types of noise in a photo:  luminance noise and chroma (color) noise.  Luminance noise is a function of the total amount of light falling on the sensor, and the efficiency of the sensor.  The photon noise (often referred to as "shot" noise) is determined by how much light the sensor records.  This, in turn, is determined by the total amount of light falling on the sensor (Total Light = Exposure · Effective Sensor Area) and the QE (Quantum Efficiency) of the sensor, which is the proportion of light falling on the sensor that is recorded (for example, a QE of 50% means half the light falling on the sensor is recorded -- Total Light Collected = Total  Light · QE).  The other factor in luminance noise is the electronic (read) noise, which is the additional noise added by the sensor and supporting hardware.

The chroma noise is a result of the color filters in the CFA (Color Filter Array) both blocking light which they are supposed to admit (AE -- Absorption Error), and admitting light they are supposed to block (TE -- Transmission Error).  For example, if a red photon fails to make it through the red filter, this results in an Absorption Error, and if a red photon makes it through a green filter, this results in a Transmission Error.

The more transmissive the color filters (weaker color filter), the greater the QE, and thus the lesser the luminance noise, but the greater the chroma noise.  The less transmissive the color filter (stronger color filter), the less the chroma noise, but lesser QE, and thus greater luminance noise.  Different manufacturers choose a different balance in the transmissivity of their color filters resulting in a different balance between the luminance and chroma noise, which results in a different quality of noise in the photos, even if the quantity of noise remains the same.  Of course, it is worth noting that if two sensors have the same QE, but one sensor has a stronger CFA than the other, it will have the same luminance noise but less chroma noise, and thus less noise overall.

The photon noise is the primary source of noise in the midtones and highlights of a photo.  It is an inherent characteristic of incoherent light (the kind of light in almost all situations -- see the diagrams at the bottom of this page), and unavoidable -- one of those "Laws of Physics" things, as opposed to "an engineering challenge".  Light has the properties of both a particle (photon) and wave, and the noise is measured in terms of its particle characteristics.  The photons are collected and focused by the lens onto the sensor, where they are converted into electrons, and the signal is processed and recorded.  The only role the sensor plays in the photon noise is what proportion of the photons falling on the sensor are converted into electrons (QE), since the electrons are the source of the electrical current that is processed by the hardware.  The electronic (read) noise, discussed in more detail further down, is how much noise is added when collecting and processing the signal produced by the photons.

The photon noise is proportional to the square root of the signal (N = sqrt S) so the photon relative noise is inversely proportional to the square root of the signal (NSR = 1/ sqrt S).  Thus, if we quadruple the signal (the total amount of light recorded), we halve the relative noise.
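As a quick numerical check of the square-root relationship above, here is a minimal Python sketch.  The function names are illustrative and not part of the essay, and the signal is assumed to be measured in recorded electrons.

```python
import math

def photon_noise(signal):
    """Photon (shot) noise for a signal measured in electrons: N = sqrt(S)."""
    return math.sqrt(signal)

def relative_noise(signal):
    """Noise-to-signal ratio for photon noise alone: NSR = 1 / sqrt(S)."""
    return photon_noise(signal) / signal

# Each step quadruples the signal, which doubles the noise and halves the NSR.
for s in (100, 400, 1600):
    print(f"S = {s:5d}  N = {photon_noise(s):5.1f}  NSR = {relative_noise(s):.1%}")
```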

There are, of course, other sources of noise, such as thermal noise, which plays a central role in long exposures, and PRNU (Pixel Response Non-Uniformity) noise, which plays an important role in the highlights of the image.  So noise is, of course, even more complicated than this essay makes it appear, and for some specific forms of photography (such as astrophotography) we may find that the noise is very different for equivalent images in some situations, much in the same way that corner sharpness is very different for equivalent images in some situations.

Let's begin with an analogy to traffic to understand photon noise.  Imagine we drew a line across a busy freeway, and counted cars crossing the line.  The number of cars crossing the line in any given time interval represents the total light falling on the sensor during an exposure.  If we are talking about short intervals of time, like seconds, or, at most, minutes, then we can safely assume that there is a constant average flow of traffic.  But we also know that it is very unlikely that any two equal time intervals will contain exactly the same number of cars.  This variation from the "true average" is what we call "noise".  The larger the time interval, the larger the noise will be, but the less significant it will be in terms of the total number of cars counted (relative noise).

For example, let's say that, on average, 10 cars pass by our line every second, and let's say we take three one-second samples that come up with 8, 11, and 13 cars.  These three "photographs" would have a noise of 2, 1, and 3 cars, respectively, which correspond to a relative noise of 20%, 10%, and 30%.  Now, let's say we extend the time interval to ten seconds.  The expected number of cars would now be 100 cars (10 cars / second · 10 seconds = 100 cars).  Let's say that, once again, we take three "photographs" and count 93, 98, and 112 cars.  The noise is now 7, 2, and 12 cars -- much more than before.  But the relative noise is 7%, 2%, and 12% -- much less than before.

So, the question is, then, how do we know what the "true average" number of cars is?  Well, we don't.  But what we do know is that the arrival of photons for incoherent light is described by a Poisson Distribution, that the standard deviation of a phenomenon described by a Poisson Distribution is equal to the square root of the mean (average), and that the standard deviation is the photon noise (often called the Shot Noise, which is a more general term).  Thus, since the magnitude of the noise is equal to the square root of the number of recorded photons, the noise increases with more light.  But since the NSR is the ratio of the noise and the recorded signal, the relative noise decreases with more light.  For example, whenever we double the amount of light, the photon noise increases by 41% (a factor of sqrt 2), but the relative photon noise decreases by 29% (a factor of 1 / sqrt 2).
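The traffic analogy can be simulated directly.  The sketch below is only an illustration: it draws Poisson-distributed counts using Knuth's multiplication method (my choice of sampler, not something the essay specifies) and compares the measured spread with the square root of the mean.  The trial count and seed are arbitrary.

```python
import math
import random
import statistics

def poisson_sample(mean, rng):
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_counts(mean, trials, seed=1):
    """Return simulated counts, e.g. cars per interval or photons per exposure."""
    rng = random.Random(seed)
    return [poisson_sample(mean, rng) for _ in range(trials)]

def summarize(mean, trials=5000):
    """Return (noise, relative noise) measured over many simulated intervals."""
    counts = simulate_counts(mean, trials)
    noise = statistics.pstdev(counts)
    return noise, noise / statistics.fmean(counts)

if __name__ == "__main__":
    for mean in (10, 100):
        noise, nsr = summarize(mean)
        print(f"mean {mean:3d}: noise ~ {noise:.2f} "
              f"(sqrt of mean = {math.sqrt(mean):.2f}), relative noise ~ {nsr:.1%}")
```

With ten times the cars (or light), the measured noise should come out roughly sqrt 10 times larger while the relative noise shrinks by about the same factor.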

A good way to visualize the role the lens and sensor play in photon noise is to think of rain falling on a flat surface through an opening.  The size of the opening corresponds to the aperture area, and length of time the rain falls corresponds to the shutter speed.  If a lot of rain is falling (lots of light), then the surface will quickly be covered in water having a very smooth appearance (low relative noise).  However, let's imagine a much lighter rain (low light).  At first, we will see splotches of water here and there in a random and irregular pattern (noisy image).  As more water falls (the total amount of light increases), either by letting more time pass (longer shutter speed) and/or by making the opening larger (larger aperture area), the pattern becomes smoother (less noisy).

Let's now pack a large number of cups on the surface to collect the water.  The cups are analogous to the pixels on the sensor.  Larger cups will collect more water than smaller cups, but smaller cups will give us a better idea of the pattern of the rain.  If we compare an array of cups covering the same area, the water in the larger cups will be more uniformly filled (less noisy) than the smaller cups.  This is why photos from sensors with larger pixels appear less noisy than sensors with smaller pixels.  However, the amount of water in the smaller cups gives us a much better idea of the pattern of rain that has fallen (resolve greater detail).  If the resulting "image" formed by the water in the cups is too noisy to our liking with the smaller cups, we could replace the smaller cups with larger cups and pour the water collected in the smaller cups into the larger cups, achieving the same smoothness we had if we had used the larger cups from the beginning (binning).  Alternatively, we could siphon water from one cup and add it to an adjacent cup to smooth the appearance of the "image".  This method of smoothing (NR -- noise reduction) would retain more of the detail of the original pattern than if the smaller cups were just poured into bigger cups (binning).

There are, of course, important considerations.  How closely are the cups packed together?  How deep is each cup?  How thick is the glass in each cup?  These concerns are all analogous to the efficiency of the pixel, and have much to do with whether or not a sensor with more pixels can accurately achieve the lower noise of a sensor with fewer pixels via binning.  For the same technology, it appears as though this is very much the case.  Thus, smaller pixels offer more options of detail vs noise than do larger pixels.  For photos composed with a large amount of light, smaller pixels basically resolve more detail, and, while more noisy, are still "clean enough" that the additional detail likely contributes far more to the IQ of the photo than the greater noise detracts from it.  It is only for images, or portions thereof, that are created with little light that sensors with smaller pixels must choose, via post processing, between greater detail with more relative noise, or less detail with less relative noise.

Of course, this choice only comes into play if we display the photo large enough, and/or view it closely enough, that we can resolve the additional detail afforded by more pixels.  Otherwise, we do not need to do anything at all -- the photos would look the same, whether they came from sensors with large, or small, pixels.

However, it is not merely the total amount of light that falls on the sensor that determines the noise in the photo, but also how efficient the sensor is.  The primary attributes of sensor efficiency are the QE (Quantum Efficiency -- the proportion of photons falling on the sensor that are converted into electrons) and the read noise (the additional noise added by the sensor and supporting hardware), although it should be noted that a manufacturer may use a weaker CFA to increase the QE, which will reduce the luminance noise at the expense of increasing chroma noise.  For example, two sensors may have the same QE and read noise, but one sensor may have a weaker CFA than the other, making it a more noisy sensor.  Sensorgen is an excellent resource for sensor QE and read noise, although it does not give information about the CFAs.

While the QE and electronic (read) noise are the primary attributes of sensor efficiency, there is more to the story, such as the Bayer CFA (color filter array) which almost all cameras use.  The Bayer CFA is a color filter covering the sensor in an RGGB pattern.  This means that 25% of the pixels are covered with red filters, 50% are covered with green filters, and 25% are covered with blue filters.  Of course, the filters actually accept a range of colors, which overlap (otherwise, yellow photons, for example, would never make it through the color filters).  How much the filters overlap, and how strong the filters are, also contribute to the sensor efficiency.

For example, the green filter may only admit 60% of the green light that falls on it, but also admit 10% of other colors that fall on it.  If we use a weaker filter to increase the transmissivity, we will reduce the luminance noise (more total light will pass through the filter and onto the sensor), but also increase the transmission error by concomitantly allowing a greater percentage of other colors to also pass through.  Thus, different manufacturers may strike different balances between luminance noise vs color noise in their choice of color filters.  For example, it has been argued that Canon has been steadily "weakening" their CFAs to increase the QE of their sensors.  The 5D has a QE of 25%, the 5D2 has a QE of 33%, and the 5D3 has a QE of 49%, as the metamerism index has steadily declined from 84 to 80 to 74, respectively.

In addition, the read noise often varies considerably as a function of the camera's ISO setting, usually getting progressively less with higher ISOs ("ISOless" sensors have a relatively constant read noise as a function of the camera's ISO setting).  Furthermore, some sensors have a base ISO of 100, whereas others have a base ISO of 200.  Sensors with a base ISO of 100 can absorb twice as much light at that setting as sensors with a base ISO of 200.  This brings up the saturation capacity of the pixel, which is the number of electrons that it can hold.  Both read noise and saturation capacity are only meaningful, in terms of comparative IQ of the photos between systems, when taken on a per area basis.  For example, consider a 10 MP and 40 MP sensor of the same size and with the same QE.  If the pixels of the 40 MP sensor have 25% the saturation capacity and 50% the read noise, then they would be "equally efficient", since a four pixel block (same area as a single pixel of the 10 MP sensor) would record the same number of photons that landed on it, have the same total saturation, and the same total read noise.
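The "equally efficient" claim can be checked with a few lines of Python.  This is a sketch under stated assumptions: the per-pixel figures (40,000 electrons saturation, 8 electrons read noise) are hypothetical values chosen only for illustration, and read noise is assumed to be independent from pixel to pixel so that it combines as the square root of the sum of the squares.

```python
import math

def block_totals(pixels_in_block, saturation_per_pixel, read_noise_per_pixel):
    """Combine identical pixels covering one patch of sensor area.

    Saturation capacities add linearly; independent read noise adds in quadrature.
    """
    total_saturation = pixels_in_block * saturation_per_pixel
    total_read_noise = math.sqrt(pixels_in_block) * read_noise_per_pixel
    return total_saturation, total_read_noise

# Hypothetical 10 MP pixel: 40,000 e- saturation, 8 e- read noise.
big_pixel = block_totals(1, 40_000, 8.0)
# 40 MP pixels with 25% of the saturation and 50% of the read noise, four per block.
small_pixels = block_totals(4, 40_000 * 0.25, 8.0 * 0.5)
print(big_pixel, small_pixels)
```

Both patches end up with the same total saturation and the same total read noise, which is the sense in which the two sensors are equally efficient per area.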

Lastly, the issue of banding needs to be discussed.  Banding is not noise, since it represents a systematic bias in the output signal, whereas noise is completely random.  In that sense, banding is usually significantly more a detriment to IQ than is noise, depending, of course, on how serious the banding is.

For example, the Canon 6D has a QE of 50%, the Olympus EM5 has a QE of 53%, and the Nikon D800 has a QE of 56% (see here, but keep in mind that these figures are for the green channel only, and I am assuming that there is no significant difference in the other channels, and that the transmission error for the color filters is "close").  The maximum possible QE is, of course, 100%, which is just under a stop more than the cameras above.  Of course, that doesn't mean that a 100% QE will record all the light falling on the sensor -- the Bayer CFA, recording only one color per pixel, will only pass, at best, 1/4 of the red and blue light, and 1/2 the green light (since most Bayer CFA filters are RGGB).  There is also the matter of light absorption by the color filter itself, as well as allowing the wrong color light to pass through (color noise).
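To put the "just under a stop" remark in numbers, the following sketch converts a QE into the number of stops separating it from a QE of 100%.  The function name is illustrative; the QE figures are the green-channel values quoted above.

```python
import math

def stops_below_perfect(qe):
    """Stops of light that a sensor with this QE gives up relative to a QE of 100%."""
    return math.log2(1.0 / qe)

# Green-channel QE figures quoted in the text.
for camera, qe in (("Canon 6D", 0.50), ("Olympus EM5", 0.53), ("Nikon D800", 0.56)):
    print(f"{camera}: {stops_below_perfect(qe):.2f} stops")
```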

The read noise (R), is the sum of all the sources of noise due to the sensor and supporting hardware.  The total noise (N) is the sum of the photon noise (P) and the read noise (R).  Since noise is a standard deviation, it does not sum linearly -- that is, N ≠ P + R.  For independent random phenomena, like noise, it is instead the variances (the squares of the standard deviations) that sum linearly:  N² = P² + R².  Thus, the total noise is the square root of the sum of the squares:  N = sqrt (P² + R²).  Since the photon noise is the square root of the signal (S), we can represent the total noise as:  N = sqrt (S + R²).

For example, if 800 photons fell on a pixel that had a QE of 50%, the signal would be 400 electrons, and thus the photon noise would be sqrt 400 = 20 electrons.  If the pixel had a read noise of 10 electrons, then the noise for that pixel would be:  N = sqrt (20² + 10²) = 22.4 electrons, which is not much different than the photon noise alone.  Let's now consider a much lower level of light -- say 50 photons fell on the pixel.  Then the signal would be 25 electrons, and the photon noise would be sqrt 25 = 5 electrons.  This would make the pixel noise to be:  N = sqrt (5² + 10²) = 11.2 electrons, which is not much different than the read noise.

So, we can see that for very low light, the read noise is dominant, whereas the rest of the photo is dominated by photon noise.  In addition, we notice that the relative noise (NSR) for the signal of 400 electrons is 22.4 / 400 = 5.6%, whereas the relative noise for the lower signal of 25 electrons is 11.2 / 25 = 45%.  This is why low light results in noisy photos and why pushed shadows are more noisy than other portions of the photo.
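The two worked cases can be reproduced with a short Python sketch.  It assumes, as the text does, that photon noise and read noise are independent and therefore add in quadrature; the function name is illustrative.

```python
import math

def pixel_noise(photons, qe, read_noise):
    """Return (signal, total noise, NSR) for one pixel, in electrons.

    Photon noise and read noise are treated as independent, so they add in quadrature.
    """
    signal = photons * qe
    total = math.sqrt(signal + read_noise ** 2)
    return signal, total, total / signal

for photons in (800, 50):
    signal, total, nsr = pixel_noise(photons, qe=0.5, read_noise=10)
    print(f"{photons} photons: S = {signal:.0f} e-, N = {total:.1f} e-, NSR = {nsr:.1%}")
```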

A common myth is that higher ISOs cause more noise.  The effect of the ISO setting on exposure is to indirectly change the f-ratio, shutter speed, and/or flash power, depending on the metering mode we are in, as well as to adjust the lightness of the LCD playback and/or OOC jpg.  Higher ISOs result in more narrow apertures, faster shutter speeds, and/or less flash power than lower ISOs for a given scene luminance, which results in less light falling on the sensor, and thus more photon noise.  In other words, it is the lesser amount of light falling on the sensor at higher ISOs than lower ISOs that results in greater noise at higher ISOs, not the higher ISO setting, per se.

Aside from the indirect effect on exposure based on how the ISO setting results in the choices the camera uses for the f-ratio, shutter speed, and/or flash power, higher ISO settings increase the lightness of the photo.  That is, the ISO setting will map the DN (digital number) from ADU (Analog to Digital Unit) in a preprogrammed manner (which, incidentally, can vary from camera to camera and still be within the standard).  Specifically, ISO 100 will map the corresponding DN from an exposure of 0.1 lux · seconds to a value of 118 in an sRGB image file, ISO 200 will map the corresponding DN from an exposure of 0.05 lux · seconds to a value of 118 in an sRGB image file, etc.
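The two data points above (ISO 100 at 0.1 lux · seconds, ISO 200 at 0.05 lux · seconds) fit the simple relation exposure = 10 / ISO.  The sketch below just extends that relation to other settings, which is an assumption on my part; as noted, individual cameras can differ and still be within the standard.

```python
def exposure_for_mid_grey(iso):
    """Exposure (lux-seconds) that the given ISO setting maps to the mid-grey value.

    Follows the two data points in the text (ISO 100 -> 0.1, ISO 200 -> 0.05),
    i.e. exposure = 10 / ISO.  Extending it to other settings is an assumption.
    """
    return 10.0 / iso

for iso in (100, 200, 400, 800):
    print(f"ISO {iso}: {exposure_for_mid_grey(iso):.4f} lux-seconds -> sRGB value 118")
```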

Unlike film, the sensitivity of the sensor is fixed -- that is, the ISO setting does not affect the efficiency of the sensor.  However, for sensors with noisy ADUs (Analog to Digital Units), higher ISO settings result in less electronic noise than lower ISO settings digitally amplified and/or mapped.  For example, the read noise for the Canon 5D3 at base ISO (100) is 33.1 electrons, drops to 18.2 electrons at ISO 200, and continues to drop until it finally levels off at around 3 electrons at ISO 3200.  On the other hand, some sensors, like the Sony Exmor sensor in the Nikon D7000 and D800, have the same read noise throughout the entire ISO range.  These types of sensors are referred to as "ISOless", although it's worth noting that even "non-ISOless" sensors  become ISOless after some point in the ISO range (usually between ISO 800 and ISO 3200).

Let's work an example of how this works.  Let's assume that the electronic noise before digital amplification is 2 electrons / pixel, and the digital noise after digital amplification is 27 electrons / pixel.  At ISO 100, this would give us a combined read noise of  R = sqrt (2² + 27²) = 27.1 electrons / pixel.  At ISO 1600, the read noise would be R = sqrt [ (16 · 2)² + 27² ] = 41.9 electrons / pixel.  If the amplification had instead been performed at the end of the imaging chain, the read noise would be R = 16 · 27.1 electrons / pixel = 433 electrons / pixel.  Since the effective signal is 1/16 as much at ISO 1600 as it is at ISO 100, the effective read noise at ISO 1600 would be 41.9 / 16 = 2.6 electrons / pixel, relative to the effective signal.  Conversely, a perfectly ISOless sensor would have zero digital noise after amplification and thus have the same read noise / pixel relative to the effective signal throughout the ISO range (2 electrons / pixel in this example).
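The arithmetic in this example can be sketched in Python as follows.  The function names are illustrative, and the model (one noise source before the gain stage, one after, both independent) is simply the one the example above assumes.

```python
import math

def read_noise(pre_amp_noise, post_amp_noise, gain):
    """Read noise in output units when gain is applied between the two noise sources."""
    return math.sqrt((gain * pre_amp_noise) ** 2 + post_amp_noise ** 2)

def effective_read_noise(pre_amp_noise, post_amp_noise, gain):
    """The same read noise referred back to the (gain-times smaller) signal."""
    return read_noise(pre_amp_noise, post_amp_noise, gain) / gain

for iso, gain in ((100, 1), (1600, 16)):
    total = read_noise(2, 27, gain)
    effective = effective_read_noise(2, 27, gain)
    print(f"ISO {iso}: {total:.1f} e-/pixel, {effective:.1f} e-/pixel relative to the signal")

# Amplifying only at the end of the chain multiplies the whole ISO 100 figure by 16.
print(f"amplified at the end: {16 * read_noise(2, 27, 1):.0f} e-/pixel")
```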

For a camera with an ISOless sensor, the only reason for using higher ISOs is operational convenience, as no cameras currently have an ISOless UI (user interface).  For example, let's say a photo of a scene at f/2.8 1/100 ISO 800 resulted in the desired output lightness.  If we instead took the photo at f/2.8 1/100 ISO 100 and pushed the file three stops in a RAW conversion, the resulting files would be the same, whereas if we did the same with a camera using a non-ISOless sensor, the ISO 800 photo would be less noisy than the ISO 100 photo (see here and here for demonstrations).

The disadvantage, then, of shooting at higher ISOs with a camera using an ISOless sensor is that the recorded file may very well oversaturate (blow) portions of the scene since portions well within the saturation limits of the pixel will be pushed outside the bit depth of the recorded file, whereas using the appropriate tone curve in the processing of a photo taken at base ISO could retain much more of the highlights (see here for a demonstration).  However, since neither the output jpg lightness nor the LCD playback are tied to the camera's meter, operationally, using a camera with an ISOless sensor in an ISOless manner is very inconvenient.

It's more than worthwhile to note that putting more light on the sensor will always result in less noise than using a higher ISO setting to get less read noise with a camera that uses a non-ISOless sensor.  The photon noise dominates except in the shadows.  But, for a given f-ratio and shutter speed, as the light lessens, more and more of the photo becomes shadow (even though the output file may not look like shadow, since it was pushed either by a gain applied by using a higher ISO, or in processing), so the read noise becomes more and more important.  For example, f/2.8 1/100 ISO 400 will always be less noisy than f/2.8 1/200 ISO 800 -- that is, don't use an artificially higher ISO than needed to get less read noise on a non-ISOless sensor, since the increase in photon noise caused by less light falling on the sensor will always outweigh the decrease in read noise.
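As a rough check of this claim, the sketch below compares a full exposure at the lower ISO against half the exposure at the next ISO up.  It uses the only read noise figures quoted in this essay, the Canon 5D3 at ISO 100 (33.1 electrons) and ISO 200 (18.2 electrons), rather than the ISO 400 / ISO 800 pair in the example, for which no figures are given.  The signal levels are arbitrary illustrative values.

```python
import math

def nsr(signal, read_noise):
    """Noise-to-signal ratio with photon and read noise added in quadrature."""
    return math.sqrt(signal + read_noise ** 2) / signal

# Read noise figures quoted earlier for the Canon 5D3 (electrons).
READ_NOISE_ISO_100 = 33.1
READ_NOISE_ISO_200 = 18.2

def more_light_wins(signal_at_iso_100):
    """Compare the full exposure at ISO 100 with half the exposure at ISO 200."""
    full = nsr(signal_at_iso_100, READ_NOISE_ISO_100)
    half = nsr(signal_at_iso_100 / 2, READ_NOISE_ISO_200)
    return full < half

# Signals from deep shadow (10 e-) up to bright tones (40,000 e-), for illustration.
for signal in (10, 100, 1_000, 10_000, 40_000):
    print(signal, more_light_wins(signal))
```

For these figures the exposure with more light has the lower relative noise at every signal level tried, even in the deepest shadows where read noise dominates.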

Many systems deal with shadow noise simply by clipping them earlier, which, of course, removes the detail as well.  However, much more important than the shadow noise itself is banding, which is most noticeable and distracting in the shadows due to the fact that banding is a regular pattern as opposed to the random nature of noise.  For example, a sensor with low read noise and banding may well produce a significantly worse image in the shadows than a sensor with more read noise without the banding.  That said, the issue of banding is separate from the issue of noise, and not discussed in this essay as banding has nothing to do with sensor size.

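In short, the primary sources of noise in a photo are the following three factors:

  • The total amount of light that makes up the photo
  • The proportion of the light falling on the sensor that is recorded (QE)
  • The additional noise added by the sensor and supporting hardware (read noise)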

(The effect of pixel size, discussed below, does play a minor role in the overall noise in a photo.)

For a given scene, perspective, and framing, the total amount of light is determined solely by the aperture diameter (where aperture diameter = focal length / f-ratio) and the shutter speed.  The role the sensor size plays in total light is that a larger sensor can absorb more light before oversaturating.  By using a larger aperture, we also force a more shallow DOF, because the aperture plays an integral role for both total light and DOF.  So, if a more shallow DOF is not desirable (and/or the concomitant effects that can occur, such as softer corners and more vignetting), then the only way to decrease noise is to use a longer exposure time (a slower shutter speed), at the risk of overexposing more of the image.  This technique is known as ETTR (expose to the right), and works best for images with a narrow DR (dynamic range).  The efficiency by which this light is captured is a function of many variables, primarily the CFA, the efficiency of the microlens covering, the percentage of the light transmitted by the color filters, and the QE (quantum efficiency) of the sensor.  The efficiency of the signal amplification is a function of not only the sensor, but the supporting hardware, such as the ADCs (Analog-to-Digital Converters).
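The relationship between aperture diameter, exposure time, and total light can be sketched as below.  The function names and the example settings (50mm at f/2 and f/4, 1/100 s) are illustrative assumptions, and the result is only a relative quantity: it is meaningful when comparing photos of the same scene with the same perspective and framing.

```python
def aperture_diameter(focal_length_mm, f_ratio):
    """Aperture diameter in mm: focal length / f-ratio."""
    return focal_length_mm / f_ratio

def relative_total_light(focal_length_mm, f_ratio, exposure_time_s):
    """Quantity proportional to total light: aperture area times exposure time."""
    return aperture_diameter(focal_length_mm, f_ratio) ** 2 * exposure_time_s

# Illustrative settings (not from the text): 50mm f/2 vs 50mm f/4 at 1/100 s.
wide = relative_total_light(50, 2, 1 / 100)
narrow = relative_total_light(50, 4, 1 / 100)
print(aperture_diameter(50, 2), wide / narrow)
```

Halving the aperture diameter quarters the total light, which is the two stops separating f/2 from f/4; doubling the exposure time twice would make up the difference.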

In terms of the total signal, the amount of light lost by traveling through the glass of the lens is insignificant, so there is no improvement to be made there.  The microlens covering over the sensor, which directs the light that falls on the sensor into the pixels, is also near 100% efficiency for modern cameras, so, again, there is no improvement to be made there, either (although there are still efficiency issues when using f-ratios faster than f/2).  However, for Bayer sensors, each pixel records only one color (usually 25% red, 25% blue, and 50% green), which results in around another stop of light lost.  Modern cameras have a QE of about 50%, so there is about another stop more gain that can be made there.  Thus, modern cameras with Bayer CFAs are around two stops away from the maximum possible improvement in photon noise, with one more stop for a scheme that does not require color filters.

While alternative schemes to Bayer may one day be able to give us this extra 1-2 stops improvement in photon noise and possibly render double the detail for a given pixel count as well, so far these technologies have run into significant problems of their own, such as the Foveon sensor where each pixel records three colors as opposed to one.  While ostensibly a very good idea, and one that may eventually take hold, in its current implementation it is actually more noisy than a Bayer CFA.  Another scheme is to use prisms or dichroic mirrors to direct the different colors to three sensors ("3CCD" video cameras, for example), but these alternatives currently have significant technical problems of their own to overcome.

Many, if not most, believe that smaller pixels (for a given sensor size) result in more noise for the following two reasons:

  • For a given pixel count, larger sensors have larger pixels.
  • Less light falls on smaller pixels than larger pixels, all else equal.

The first misunderstanding is easily explained by noting that more light falls on larger sensors for a given exposure, and this is the reason for the lower noise, not the larger pixel size.  The second misunderstanding comes from comparing photos pixel-for-pixel when the sensors have different pixel counts (e.g. one pixel from a 12 MP sensor to one pixel from a 48 MP sensor as opposed to four pixels).  The fact of the matter is that the relationship between pixel size and noise, for a given sensor size, depends on two factors:

  • How the pixel size affects the QE of the pixel (the proportion of light falling on the pixel that is recorded)
  • How the pixel size affects the electronic (read) noise of the pixel (the additional noise added by the sensor and supporting hardware)

For sensors of the same generation, the QE is remarkably consistent across brands and the entire spectrum of pixel sizes.  That said, it is important to note that new technologies are sometimes introduced in smaller sensors first, and there will be times that the smaller sensor of a given generation will have a substantial advantage over larger sensors using the older technology (e.g. BSI technology in cell phones and compacts).  The read noise, however, is a more complicated issue.

Let's work out an example with two sensors of the same generation and size, one with pixels twice the size (4x the area) of the other (e.g. sensors of the same size with 12 MP and 48 MP).  The QE for the sensors will be the same, so four pixels from the 48 MP sensor will collect the same amount of light as one pixel from the 12 MP sensor, and thus the photon noise will also be the same.  Let's now consider two extremes in terms of the read noise.  If the read noise per pixel is the same, four pixels from the 48 MP sensor will have twice the read noise of one pixel from the 12 MP sensor.  On the other hand, if the read noise scaled in proportion to the area of the pixel, this would result in four pixels of the 48 MP sensor having half the read noise of a single pixel from the 12 MP sensor.  A middle case is if the read noise of a pixel scales with the linear dimensions of the pixel.  This would result in four pixels from the 48 MP sensor having the same read noise as a single pixel from the 12 MP sensor.
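The three scaling scenarios can be compared with a short sketch.  Read noise from independent pixels is assumed to add in quadrature, and the "scales with area" case is read here as read noise proportional to pixel area, which is the reading that yields the "half" figure.  The large pixel's read noise is set to 1 in arbitrary units.

```python
import math

def block_read_noise(per_pixel_noise, pixels_in_block):
    """Read noise of a block of identical pixels (independent noise adds in quadrature)."""
    return math.sqrt(pixels_in_block) * per_pixel_noise

LARGE_PIXEL_NOISE = 1.0   # read noise of one 12 MP pixel, in arbitrary units
AREA_RATIO = 4            # a 12 MP pixel has 4x the area of a 48 MP pixel

# Read noise of one small (48 MP) pixel under each scaling assumption.
scenarios = {
    "same read noise per pixel": LARGE_PIXEL_NOISE,
    "read noise proportional to pixel width": LARGE_PIXEL_NOISE / math.sqrt(AREA_RATIO),
    "read noise proportional to pixel area": LARGE_PIXEL_NOISE / AREA_RATIO,
}

for name, small_pixel_noise in scenarios.items():
    ratio = block_read_noise(small_pixel_noise, AREA_RATIO) / LARGE_PIXEL_NOISE
    print(f"{name}: four small pixels have {ratio:g}x the read noise of one large pixel")
```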

Reality lies between these two extremes, but, as a general rule, much closer to the read noise per pixel being the same regardless of the pixel size (at least for the higher ISOs).  Since the photon noise is dominant for all of the photo except the portions receiving very little light, this means that, for a given sensor size and generation, sensors with more, smaller pixels will tend to be more noisy than sensors with fewer, larger pixels, and the difference would be noticeable at, say, light levels where the photographer would be using ISO 3200 or higher, or when pushing shadows heavily at lower ISO settings.

Let's work an example with existing cameras.  As already discussed, there are two primary sources of noise in a photo:

  • Photon noise (noise from the light itself -- more light, less noisy photo)

  • Electronic noise (noise from the sensor and supporting hardware)

The only aspect of the sensor that affects photon noise is the QE (Quantum Efficiency -- the proportion of light falling on the sensor that is recorded). For sensors of the same generation, the QE is remarkably consistent regardless of sensor size, pixel size, or brand. There are exceptions, of course, such as sensors using BSI tech, but, for the most part, it is remarkably consistent.  The electronic noise, however, can vary a lot based on a great number of variables, including the ISO setting (and, unlike what most believe, it is *less* at higher ISO settings, not more) and pixel count.  However, the electronic noise is small compared to the photon noise except for the portions of the photo made with very little light, such as deep shadows at base ISO and progressively more of the photo as the ISO rises.

For example, consider the D810, D750, and D4s. The QEs (the proportion of light recorded by the sensors) for the sensors are 47%, 51%, and 52%, respectively -- essentially identical. Thus, no difference in photon noise.  In terms of the electronic noise, we need to normalize it to the same proportion of the photo. I like to use the µphoto (millionth of a photo) as it is convenient computationally (it also represents one pixel of a photo displayed at 1200x800 pixels on a monitor, which is a common display size for a photo). Thus, one µphoto represents 36 pixels for the D810, 24 pixels for the D750, and 16 pixels for the D4s.

Noise is the standard deviation of the recorded signal from the mean signal, and standard deviations add in a peculiar manner called a quadrature sum. For example, if we have four pixels with an electronic noise of 2 electrons/pixel, the combined noise isn't 2 + 2 + 2 + 2 = 8 electrons, but rather sqrt (2² + 2² + 2² + 2²) = 4 electrons. So, the electronic noise / µphoto for the D810, D750, and D4s, respectively, at ISO 6400 is 15.6, 11.8, and 7.6 electrons.
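A quadrature sum is easy to express in code.  In the sketch below the helper names are illustrative, and the second call uses a hypothetical per-pixel figure (2 electrons on a sensor with 36 pixels per µphoto) purely to show how a per-pixel value scales up to a µphoto; the essay quotes only the per-µphoto results.

```python
import math

def quadrature_sum(noises):
    """Combine independent noise sources: square root of the sum of the squares."""
    return math.sqrt(sum(n ** 2 for n in noises))

def noise_per_microphoto(per_pixel_noise, pixels_per_microphoto):
    """Electronic noise of the pixels making up one millionth of the photo."""
    return quadrature_sum([per_pixel_noise] * pixels_per_microphoto)

# The four-pixel example from the text.
print(quadrature_sum([2, 2, 2, 2]))
# Hypothetical: 2 e-/pixel on a sensor with 36 pixels per microphoto.
print(noise_per_microphoto(2, 36))
```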

Let's discuss the significance of this in terms of the noise we see in the photo. The saturation/µphoto for the D810, D750, and D4s at ISO 6400 is 28548, 29040, and 29472 electrons/µphoto, respectively. These are almost identical because they should be almost identical, as the sensors have the same area. The photon noise is the square root of the signal, so let's use 29000 electrons as the max signal and tabulate the photon noise (in electrons) starting at max saturation and at one-stop intervals below full saturation:

  • Max saturation: 170

  • 1 stop down: 120

  • 2 stops down: 85

  • 3 stops down: 60

  • 4 stops down: 43

  • 5 stops down: 30

  • 6 stops down: 21

  • 7 stops down: 15

  • 8 stops down: 11

  • 9 stops down: 8
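The list above follows directly from halving the signal at each stop and taking the square root.  A minimal sketch, assuming the 29000 electron maximum used in the text (the function name is illustrative):

```python
import math

MAX_SIGNAL = 29_000  # electrons per microphoto at saturation, as used in the text

def photon_noise_by_stop(max_signal, stops):
    """Photon noise (electrons) at 0..stops stops below saturation."""
    return [math.sqrt(max_signal / 2 ** stop) for stop in range(stops + 1)]

for stop, noise in enumerate(photon_noise_by_stop(MAX_SIGNAL, 9)):
    label = "Max saturation" if stop == 0 else f"{stop} stop(s) down"
    print(f"{label}: {noise:.0f}")
```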

What do we see here? The electronic noise matters as much as the photon noise 7 stops down from full saturation with the D810, 8 stops down from full saturation with the D750, and 9 stops down with the D4s. Of course, the electronic noise is a factor long before it is at parity with the photon noise, but we can clearly see that the electronic noise is worse for sensors with more pixels, although we are talking about ISO 6400 here. At lower ISO settings, we have to go progressively further down the DR before the electronic noise is a factor (and, conversely, it becomes a factor earlier as we go to higher ISO settings still).

Let's be more specific still and compute the NSR (Noise-to-Signal Ratio) by adding the photon noise and electronic noise (keeping in mind that the photon noise and electronic noise will add in quadrature, as described above). The numbers in the table below are the NSRs for the D810 / D750 / D4s:

  • Max saturation: 0.6% / 0.6% / 0.6%

  • 1 stop down: 0.8% / 0.8% / 0.8%

  • 2 stops down: 1.2% / 1.2% / 1.2%

  • 3 stops down: 1.7% / 1.7% / 1.7%

  • 4 stops down: 2.5% / 2.4% / 2.4%

  • 5 stops down: 3.7% / 3.6% / 3.4%

  • 6 stops down: 5.8% / 5.4% / 5.0%

  • 7 stops down: 9.6% / 8.4% / 7.4%

  • 8 stops down: 16.7% / 14.0% / 11.5%

  • 9 stops down: 30.6% / 24.7% / 18.9%
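The NSR figures can be recomputed from the numbers already quoted: a maximum signal of 29000 electrons per µphoto and electronic noise of 15.6, 11.8, and 7.6 electrons per µphoto for the D810, D750, and D4s at ISO 6400.  The sketch below adds photon and electronic noise in quadrature at each stop; the function and variable names are illustrative.

```python
import math

MAX_SIGNAL = 29_000  # electrons per microphoto at saturation
# Electronic noise per microphoto at ISO 6400, as quoted in the text (electrons).
ELECTRONIC_NOISE = {"D810": 15.6, "D750": 11.8, "D4s": 7.6}

def nsr(signal, electronic_noise):
    """Noise-to-signal ratio with photon and electronic noise added in quadrature."""
    return math.sqrt(signal + electronic_noise ** 2) / signal

def nsr_table(stops=9):
    """Return {stops_down: {camera: NSR}} for 0..stops stops below saturation."""
    table = {}
    for stop in range(stops + 1):
        signal = MAX_SIGNAL / 2 ** stop
        table[stop] = {cam: nsr(signal, noise) for cam, noise in ELECTRONIC_NOISE.items()}
    return table

for stop, row in nsr_table().items():
    print(stop, " / ".join(f"{value:.1%}" for value in row.values()))
```

Near saturation the three cameras are indistinguishable because photon noise swamps the electronic noise; the columns only separate in the last few stops.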

As we can see, the electronic noise begins to matter only for the portions of the photo made with lower and lower light. You can clearly see that for the very dark portions of the photo (deep shadows at ISO 6400), there is a clear disadvantage to more pixels with regards to noise.  Further compounding the problem is the lack of detail in those shadows, meaning that even if noise filtering were applied, it would be of limited utility.

So, are more pixels more noisy than fewer pixels? For sensors of the same generation, this is often, if not usually, the case. Is it significant? Not until higher ISO settings (say, ISO 6400+ FF equivalent) or if heavily pushing shadows at base ISO. Can noise filtering (as opposed to downsampling) tip the balance in favor of the sensor with more pixels? Yes, it can, if the scene is such that the sensor with more pixels is able to record more detail and the light is not so low that the electronic noise dominates the photon noise, which we will now discuss.

All else equal, more pixels resolve more detail.  This means that the photo made with more pixels, despite being more intrinsically noisy, will have more detail to work with when applying NR (noise reduction).  Thus, depending on how much more detailed the photo is, the photo with the higher pixel count can be processed to be less noisy with the same detail as the photo with the lower pixel count.  To this end, there are three primary issues to consider, assuming accurate focus and negligible camera shake:

•  The sharpness of the lens
•  How far the lens is stopped down (diffraction softening)
•  Motion in the scene vs shutter speed (motion blur)

Thus, depending on these variables, 48 MP, for example, will result in up to twice the resolution as 12 MP under ideal conditions, which gives NR a lot to work with (see here and here for some outstanding examples).

In other words, we cannot discuss noise without considering the detail of the image.  Just as it makes sense to compare the sharpness and detail of images at the same output size, it only makes sense to compare the noise in images at the same level of detail.  That is, it makes no sense to say that one image has less noise than another when it also has less detail, since NR (noise reduction) can be applied to the more detailed image to get a cleaner image at the expense of detail.

All that said, while we often speak of the relative noise of Camera A vs Camera B, what many overlook is that it is not merely the quantity of the noise, but the quality of the noise, that is important.  It is not only possible, but likely, that one image may have more relative noise than another, yet have a much more pleasing look due to the quality of the noise.  For example, color noise is usually much more distracting than luminance noise.  In addition, higher frequency relative noise (finer grain) with the accompanying greater detail is usually considered much more appealing than lower frequency relative noise (coarse grain) with less detail.  In other words, a more noisy image with a finer "grain" may well look better than a less noisy image with a clumpier "grain", depending on how close the overall quantities of noise are and the differences in detail rendered (an excellent demonstration of this is given here and here).

To this end, having more pixels, even at the expense of more noise, can lead to a more appealing overall image, but this is most certainly subjective.  Of course, if the more detailed image has the same, or even less, relative noise than the less detailed image after NR (noise reduction) is applied to match the level of detail, then the system with the more detailed, yet more noisy, image will have a substantial IQ advantage by being able to better balance noise and detail in post.  Regardless, it is important to consider the types of images where noise is even an issue.  This, of course, depends greatly on both the QT (quality threshold) of the viewer which is strongly influenced by print size and the viewer's "noise floor" -- that is, the point at which less noise has no noticeable impact on the IQ of the image.  For example, while an ISO 100 image from FF has less noise than an ISO 100 image from mFT, the advantage in noise of FF may be unnoticeable to the viewer at ISO 100.  Of course, the "noise floor" is likely a function of the processing, print size, and viewing distance as well.  For example, the noise in an image may not be distracting in a 5x7 print, but become an issue in a heavily processed 12x18 print.  Furthermore, the impact of the noise is greatly dependent on the scene.  The noise may go overlooked in areas with lots of detail, but stick out in areas with low detail, such as sky noise.

In addition to the mere quantity of noise, we have to consider the balance of noise in the different color channels, which is a function of the CFA (color filter array) that is used on the sensor.  One image may be less noisy than another overall, but exhibit significantly more noise in one of the color channels which will give it a less appealing overall look.  Furthermore, both photon and read noise are completely random which makes for a significantly more pleasing appearance than banding which has a regular pattern.  In other words, while noise is most certainly an important consideration in the IQ of an image, the quantity of the relative noise most likely is less important than the quality of the noise.

Lastly, we have to take into account that different RAW converters and JPG engines will deal with noise differently, so the noise/detail present in the final image is not necessarily an accurate representation of the actual hardware.  NR (noise reduction) can be applied even with a setting of "0", blacks can be clipped early to hide shadow noise (albeit at the cost of erasing detail), etc.  In fact, even the same RAW converter might use different settings for different cameras since the programmers decided that such-and-such a look was "better".  So, for sure, it is the final image that matters.  But the final photo is not necessarily representative of the capability of the hardware.  In fact, many compare photos on the basis of OOC (out-of-the-camera) jpgs, which, of course, is a very poor way in which to compare the hardware performance.  But it is the best way to compare if you shoot OOC jpgs!

Thus, the advantage in noise of larger sensor systems is limited to situations when they can use a lower shutter speed than the smaller sensor systems, such as good light, tripod use where motion blur is not a factor, flash photography when the balance of the light from the flash and the ambient light is not an issue, or when a more shallow DOF is used by trading f-ratio for ISO.  And, once again, all these factors only matter if we are talking about sensors that have the same, or nearly the same, efficiency.  Regardless, it is likely that it is the quality of the noise, more so than quantity of the noise, that is the primary factor in distinguishing between the IQ of Equivalent images in terms of noise.  Just as with any element of IQ, noise is very subjective, and different people will reach different conclusions about which image is more pleasing, even if the numbers clearly point to one image or the other as having more overall noise.

So, let's answer the questions posed at the beginning of the section.  The first of the two experiments was to take a photo of a scene at, say, 35mm f/5.6 1/200 ISO 3200 and 70mm f/5.6 1/200 ISO 3200.  Crop the 35mm photo to the same framing as the 70mm photo, and display both photos at the same size.  These photos have the same exposure, and are taken on a camera using the same sensor, thus the same efficiency and pixel size.  The question was "Which photo is more noisy and why?"

The answer is that the cropped photo is more noisy, because it was made with only 1/4 the light as the uncropped photo, and thus twice the photon noise.  However, the read noise will be half as much since it is made with 1/4 the number of pixels.  The lower the light the photo is made with, the more the read noise will matter compared to the photon noise.  So, if we performed the experiment in good light where we were using ISO 100 as opposed to ISO 3200, the photon noise would dominate, and the cropped photo would be twice as noisy.  On the other hand, if we performed the experiment in very low light so that we were using, say, ISO 102400, where the read noise is much more of a factor, we would find a much smaller difference in noise between the two photos.

The second experiment was to take a photo of a scene using RAW that is "properly exposed" at f/5.6 1/200 ISO 3200.  Now take the same photo at f/5.6 1/200 ISO 100, push the photo 5 stops in the RAW conversion, and display the photos at the same size.  The same question, "Which photo is more noisy and why?" was asked.  The answer is that, in both cases, the same amount of light falls on the sensor, so the photon noise is the same.  For a camera using an ISOless sensor, there would be no difference.  However, for a camera using a non-ISOless sensor, the read noise would be significantly greater in the pushed photo, and thus it would be more noisy.

 

 

 

 

 

 

DYNAMIC RANGE

 

The Dynamic Range (DR) tells us the maximum range of light levels where we can record detail, and as this section will show, the DR, by itself, can be very misleading as a measure of the IQ of a photo, since it tells us only the number of stops of light levels where detail can be recorded, but nothing about the quality of the levels within the DR since it does not take photon noise into account.  Thus, to make DR more meaningful in terms of the IQ of a photo, we need to understand the role that both photon noise and electronic (read) noise play, and how the NSR (Noise-to-Signal Ratio) compares for each level within the DR of the photo.

The DR is usually presented as the number of stops from the noise floor to the saturation point:  DR = log2 (Saturation Point / Noise Floor).  The saturation point is the maximum number of electrons that can be released over a predetermined area, and the noise floor is the smallest number of electrons that can be released before recorded detail is no longer useful due to noise.  The area over which the DR is being measured is given as a per-pixel measure, but, like noise, the per-pixel measure is only meaningful in terms of comparing photos if the photos are made from the same number of pixels (this essay uses the electronic noise for the noise floor and the µphoto, millionth of a photo, as the area over which the DR is measured).

It's also important to understand that the saturation limit is fixed as the product of the saturation limit of each pixel and the number of pixels over the area we are measuring the DR.  However, the noise floor and the area over which the DR is being measured, need to be specified.  The usual default for the noise floor is either the read noise or the 100% NSR, where the 100% NSR is the number of electrons that results in equal parts noise and signal for a read noise, R:  100% NSR = [1 + sqrt (1 + 4·R²) ] / 2.  However, levels of the DR near the read noise or 100% NSR may be quite noisy and considered "unusable".  Thus, for example, a particular photographer might choose a 20% NSR as the noise floor for "usable DR".

Sensorgen measures DR per-pixel using the electronic (read) noise as the noise floor.  DxOMark's "Screen" tab gives the per-pixel measure of DR, whereas the "Print" tab gives the per-pixel DR for a photo normalized to 8 MP.  In both cases, DxOMark uses a 100% NSR as the noise floor.  For large read noises, the 100% NSR is essentially the same as the electronic (read) noise.  For low electronic noise (say, below 2 electrons), the distinction begins to become significant.  Furthermore, it is important to realize that the electronic noise often varies considerably as a function of the ISO setting (compare and contrast the read noise at each ISO setting on the D800 vs the 6D, for example, where the D800 has an ISOless sensor in contrast to the non-ISOless sensor of the 6D).
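
As a rough illustration of how the 100% NSR noise floor relates to the read noise, here is a minimal Python sketch of the 100% NSR formula given above.  The function name and the sample read noise values are chosen only for illustration; the noise model (photon noise and read noise adding in quadrature) is the one the formula assumes.

```python
import math

def signal_at_100pct_nsr(read_noise):
    """Signal (electrons) at which the total noise equals the signal.

    Solves S = sqrt(S + R^2), i.e. S^2 - S - R^2 = 0, for the positive root.
    """
    return (1 + math.sqrt(1 + 4 * read_noise ** 2)) / 2

# Illustrative read noise values (electrons), not measurements of any camera.
for r in (1.0, 2.0, 7.0, 20.0):
    s = signal_at_100pct_nsr(r)
    print(f"R = {r:5.1f} e-  ->  100% NSR at S = {s:6.2f} e-  (S / R = {s / r:.3f})")
```

For a read noise of 20 electrons the two noise floors differ by only a few percent, whereas for a read noise of 1 electron the 100% NSR sits more than 60% above the read noise.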

In addition, many confuse the bit depth of the image file with DR -- we need to distinguish between the DR of the scene itself, the DR that the sensor can capture, and how the bit depth of the image file affects the encoding of this DR.  For example, a scene might have 20 stops of DR, the sensor might be able to record 12 stops of that DR, and this DR is then encoded in a jpg file (8 bits) or a RAW file (12 or 14 bits, depending on the camera and settings).  If the DR of the scene is greater than the DR of the sensor, then the photographer needs to expose for the portions of the DR they wish to capture.  If the DR of the sensor is greater than the bit depth, then either not all levels of the DR will be recorded (levels will be clipped) or not all levels of the captured DR will be distinguishable.  On the other hand, if the bit depth is greater than the DR of the sensor, more memory is being used than necessary to store the photo.  The tone curve applied to the photo maps the linear capture of the RAW file into a processed image file that may, or may not, reflect the DR of the initial capture.

This brings us to how DR fares for jpg vs RAW.  Since jpgs are 8 bits, they can display at most 256 levels of the DR.  The tone curve decides which levels of the DR are represented in the jpg file.  For example, if a sensor were able to capture 12 stops of DR, and all levels were equally represented by the tone curve, then each of the 256 levels of DR in the jpg will represent 16 levels of the DR.  Having multiple levels of DR represented by a single level can cause significant issues with posterization, especially if the file undergoes strong processing.  The advantage of RAW over in-camera jpgs is not only that RAW allows the photographer to choose the "appropriate" tone curve after the fact (which is often reason enough), but that if the RAW is converted into a file with a larger bit-depth, like a tiff or png file, then it can withstand much more editing before the effects of posterization become apparent.  However, if the in-camera jpg has the desirable tone curve and further processing is not required, then the RAW file has little to no advantage over an OOC (out-of-the-camera) jpg since most display media (prints and computer monitors) can only display 256-512 levels.
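
The arithmetic behind the "16 levels" figure can be sketched in a few lines of Python.  This assumes, as the example above does, a linear capture in which 12 stops of DR correspond to 2^12 distinct levels and every level is given equal weight by the tone curve; the variable names are illustrative only.

```python
# Assumption: linear capture, one bit per stop, all levels weighted equally.
captured_stops = 12
jpg_bits = 8

captured_levels = 2 ** captured_stops   # distinct levels in the capture
jpg_levels = 2 ** jpg_bits              # distinct levels an 8-bit jpg can hold
levels_per_jpg_value = captured_levels // jpg_levels

print(f"{captured_levels} captured levels / {jpg_levels} jpg levels "
      f"= {levels_per_jpg_value} captured levels per jpg level")
```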

In a sense, RAW vs jpg is analogous to Auto vs M mode.  In Auto mode, the camera chooses the f-ratio and shutter speed for the capture; in M mode, the photographer chooses.  Just as some photographers may feel that the camera makes the right choices in Auto mode, some photographers may feel that the tone curve applied in-camera is the right choice.  If the camera has multiple options for tone curves, this is analogous to shooting in an AE (auto exposure) mode such as Av, where the photographer chooses the f-ratio, but the camera chooses the shutter speed, or Tv mode, where the camera chooses the f-ratio, but the photographer chooses the shutter speed.  The analogy breaks down, however, in that the photographer cannot change the f-ratio and/or shutter speed after the fact, but the photographer can choose the tone curve after the fact if they shoot RAW.

An excellent example of applying the "appropriate" tone curve after the fact is given by the two photos here:

http://forums.dpreview.com/forums/read.asp?forum=1034&message=36903045

Note how the bottom photo has compressed the DR of the scene by revealing details in the highlights that were blown in the top photo, since the higher ISO of the bottom photo shifted the DR of the capture four stops lower.  So, even though the medium (jpg displayed on a computer monitor) cannot display all 14 stops of the DR of the initial RAW capture, the DR of the capture can be mapped into the fewer levels of the display medium by an "appropriate" tone curve to produce a better photo.  This is not unlike how we see.  Our eyes take several exposures of scenes, and our brain merges these exposures into an HDR-like memory of what we "saw".

Here's a step by step method to compute the DR for a photo:

•  Decide the area of the photo where you want to measure the DR (I like the µphoto -- millionth of a photo)
•  Choose the NSR (noise-to-signal ratio) beyond which you consider the noise to be "unacceptable".
•  Compute the signal that results in the NSR noise floor over the area chosen above: S = [1 + sqrt (1 + 4N²R²P) ] / [2N²].
•  Compute the saturation limit of the sensor over the area chosen above: SL = P·C.
•  Compute the DR as the number of stops from the noise floor to the saturation limit:  DR = log2 (SL / S).

Let's define the variables in the above formulas:

•  S = signal (electrons)
•  N = NSR (noise-to-signal ratio) for the noise floor
•  R = electronic (read) noise / pixel
•  C = saturation capacity / pixel (electrons)
•  P = number of pixels that cover the area over which we are measuring the DR
•  SL = saturation limit over the area (electrons)

Let's now work an example for the EM5:

•  Let's choose the area over which we are measuring the DR as a millionth of the photo (µphoto), which is 16 pixels.
•  Let's choose a noise floor of 20% (i.e. more than 20% noise is "unusable") at ISO 200.
•  With a read noise of 6.5 electrons / pixel, a noise floor of 20% has a signal of S = [1 + sqrt (1 + 4(0.2)²(6.5²)(16)) ] / [2(0.2)²] = 143 electrons.
•  The saturation per µphoto at ISO 200 is 16 x 25041 = 400656 electrons.
•  DR / µphoto with a 20% noise floor at ISO 200 = log2 (400656 / 143) = 11.5 stops.
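
The steps above can be sketched in Python as follows.  This is only a minimal illustration of the method, using the EM5 figures from the worked example (6.5 electrons read noise and 25041 electrons saturation per pixel at ISO 200, 16 pixels per µphoto); the function and parameter names are made up for this sketch.

```python
import math

def noise_floor_signal(nsr, read_noise_px, pixels):
    """Signal (electrons) over `pixels` pixels at which noise / signal equals `nsr`.

    Solves nsr^2 * S^2 = S + pixels * read_noise_px^2 for the positive root.
    """
    discriminant = 1 + 4 * nsr ** 2 * read_noise_px ** 2 * pixels
    return (1 + math.sqrt(discriminant)) / (2 * nsr ** 2)

def dynamic_range_stops(nsr, read_noise_px, saturation_px, pixels):
    """Stops from the chosen noise floor to the saturation limit over the area."""
    floor = noise_floor_signal(nsr, read_noise_px, pixels)
    saturation_limit = pixels * saturation_px
    return math.log2(saturation_limit / floor)

# EM5 at ISO 200, 20% NSR noise floor, area = 1 µphoto = 16 pixels.
floor = noise_floor_signal(0.2, 6.5, 16)
dr = dynamic_range_stops(0.2, 6.5, 25041, 16)
print(f"noise floor = {floor:.0f} electrons, DR = {dr:.1f} stops / µphoto")
```

Changing the first argument (for example to 1.0 for a 100% NSR) shows how strongly the quoted DR depends on the noise floor that is chosen.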

It is also important to note that two systems with the same DR may look rather different, since DR does not take photon noise into account.  Thus, DR, by itself, doesn't tell us about the quality of the resulting photo.  What we need is to see the noise levels at each level in the DR.

For example, consider two sensors.  One is 24 MP with a saturation of 80000 electrons/pixel and an electronic (read) noise of 20 electrons/pixel at a particular ISO setting.  The other sensor is 16 MP with a saturation of 30000 electrons/pixel and an electronic (read) noise of 7 electrons/pixel at the same ISO setting.  The DR of the 24 MP sensor is log2 (80000 / 20) = 12.0 stops/pixel.  The DR of the 16 MP sensor is log2 (30000 / 7) = 12.1 stops/pixel, which is just a hair more DR/pixel than the 24 MP sensor.

Now we normalize the DR measurement so that we are measuring the DR over the same area of the photo by computing the DR/µphoto for each sensor.  For the 24 MP sensor and 16 MP sensor, one µphoto is 24 pixels and 16 pixels, respectively.  This gives saturation points of 80000 x 24 = 1.92 million electrons/µphoto and 30000 x 16 = 480000 electrons/µphoto, respectively.  The 24 MP sensor has 4x the saturation limit of the 16 MP sensor, which means it is twice the size (four times the area) for a given tech.  The read noises are 20 x sqrt 24 = 98 electrons/µphoto and 7 x sqrt 16 = 28 electrons/µphoto, respectively, which would make the smaller 16 MP sensor more efficient, assuming the QE (Quantum Efficiency -- the proportion of light falling on the sensor that releases electrons) was roughly the same for both sensors.

We can now compute the DR as log2 (1.92 million / 98) = 14.3 stops/µphoto for the 24 MP sensor and log2 (480000 / 28) = 14.1 stops/µphoto for the 16 MP sensor.  The ordering has reversed:  the 16 MP sensor now has just a hair less DR than the 24 MP sensor when we compare over the same area of the photo.
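
The per-pixel and per-µphoto figures for these two hypothetical sensors can be reproduced with a short Python sketch.  It uses the read noise as the noise floor, as the text does here, and takes one µphoto to contain as many pixels as the sensor has megapixels; the function names are illustrative only.

```python
import math

def dr_per_pixel(saturation_px, read_noise_px):
    """DR in stops for a single pixel, read noise as the noise floor."""
    return math.log2(saturation_px / read_noise_px)

def dr_per_microphoto(saturation_px, read_noise_px, megapixels):
    """DR in stops over one millionth of the photo.

    Saturation adds linearly over the pixels; read noise adds in quadrature.
    """
    pixels = megapixels  # pixels per µphoto
    saturation_area = saturation_px * pixels
    read_noise_area = read_noise_px * math.sqrt(pixels)
    return math.log2(saturation_area / read_noise_area)

sensors = {
    "24 MP": (80000, 20, 24),   # saturation / pixel, read noise / pixel, megapixels
    "16 MP": (30000, 7, 16),
}
for name, (sat, read, mp) in sensors.items():
    print(f"{name}: {dr_per_pixel(sat, read):.2f} stops / pixel, "
          f"{dr_per_microphoto(sat, read, mp):.2f} stops / µphoto")
```

Normalizing to the same area adds half a stop for every doubling of the pixel count, which is why the sensor with more pixels moves ahead once the comparison is made per µphoto.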

However, the DR is not the whole story by a long shot.  Let's look at the NSR (Noise-to-Signal Ratio) at one stop increments within the DR for each sensor:

 

| Signal (electrons / µphoto) | NSR / µphoto for the 16 MP sensor | NSR / µphoto for the 24 MP sensor |
| --- | --- | --- |
| 25 | 114% | 393% |
| 50 | 59% | 197% |
| 100 | 30% | 99% |
| 200 | 16% | 50% |
| 400 | 8.6% | 25% |
| 800 | 5.0% | 13% |
| 1600 | 3.1% | 6.6% |
| 3200 | 2.0% | 3.5% |
| 6400 | 1.3% | 2.0% |
| 12800 | 0.91% | 1.2% |
| 25600 | 0.63% | 0.73% |
| 51200 | 0.45% | 0.48% |
| 102400 | 0.31% | 0.33% |
| 204800 | 0.22% | 0.23% |
| 409600 | 0.16% | 0.16% |
| 819200 | 0.77 stops oversaturated (blown) | 0.11% |
| 1638400 | oversaturated (blown) | 0.08% |
| 3276800 | oversaturated (blown) | 0.77 stops oversaturated (blown) |

 

Now, this is a remarkable thing, is it not?  Two sensors with essentially the same DR displaying very different characteristics.  It is not a coincidence that the 16 MP sensor, with the lower read noise, performs better at the lower levels of the DR.  Nor is it a coincidence that as we climb up through the levels of the DR the photon noise begins to dominate and the read noise becomes unimportant.
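
The NSR values in the table above follow from adding the photon noise and the read noise in quadrature over the µphoto.  A minimal Python sketch, assuming that noise model and the read noises of the two hypothetical sensors (7 and 20 electrons / pixel over 16 and 24 pixels per µphoto), is given below; the names are illustrative only.

```python
import math

def nsr(signal, read_noise_area):
    """Noise-to-signal ratio for a signal and a read noise measured over the same area."""
    photon_noise_sq = signal  # photon (shot) noise variance equals the signal in electrons
    total_noise = math.sqrt(photon_noise_sq + read_noise_area ** 2)
    return total_noise / signal

READ_16MP = 7 * math.sqrt(16)    # 28 electrons / µphoto
READ_24MP = 20 * math.sqrt(24)   # about 98 electrons / µphoto

signal = 25
while signal <= 409600:          # one-stop steps up to the last unsaturated level of both
    print(f"{signal:7d}  {nsr(signal, READ_16MP):8.2%}  {nsr(signal, READ_24MP):8.2%}")
    signal *= 2
```

At low signals the read noise term dominates and the two columns differ widely; at high signals the photon noise term dominates and the columns converge.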

In addition, at every signal level the 16 MP sensor is less noisy, but the highlights oversaturate (get blown) two stops earlier than the 24 MP sensor, since the 24 MP sensor is twice the size (4x the area) of the 16 MP sensor in this example, and can thus absorb 4x (two stops more) light before oversaturating.  Thus, if we maximized the exposure on each sensor, the 24 MP sensor would get 4x (2 stops more) total light (and thus 2 stops more light per µphoto) than the 16 MP sensor, and we'd get something like this:

 

| Signal (electrons / µphoto) for the 16 MP sensor | Signal (electrons / µphoto) for the 24 MP sensor | NSR / µphoto for the 16 MP sensor | NSR / µphoto for the 24 MP sensor |
| --- | --- | --- | --- |
| 25 | 100 | 114% | 99% |
| 50 | 200 | 59% | 50% |
| 100 | 400 | 30% | 25% |
| 200 | 800 | 16% | 13% |
| 400 | 1600 | 8.6% | 6.6% |
| 800 | 3200 | 5.0% | 3.5% |
| 1600 | 6400 | 3.1% | 2.0% |
| 3200 | 12800 | 2.0% | 1.2% |
| 6400 | 25600 | 1.3% | 0.73% |
| 12800 | 51200 | 0.91% | 0.48% |
| 25600 | 102400 | 0.63% | 0.33% |
| 51200 | 204800 | 0.45% | 0.23% |
| 102400 | 409600 | 0.31% | 0.16% |
| 204800 | 819200 | 0.22% | 0.11% |
| 409600 | 1638400 | 0.16% | 0.08% |
| 819200 | 3276800 | 0.77 stops oversaturated (blown) | 0.77 stops oversaturated (blown) |

 

Now we see that the 24 MP sensor oversaturates (blows out) the same number of stops from the bottom of the DR as the 16 MP sensor, but has less noise at each level of the DR.  This is because the photon noise is dominant throughout the DR and the 24 MP sensor receives 4x as much light per µphoto as the 16 MP sensor:  the systems in the above NSR table are receiving the same exposure, and the 24 MP sensor has 4x the area of the 16 MP sensor (Total Light = Exposure x Sensor Area).

An important aside is how we go about maximizing the exposure.  First of all, maximum exposure occurs only at the camera's base ISO setting, since higher ISO settings simply apply larger and larger amplifications to lower and lower signals.  In addition, we need to consider the difference between the t-stop and the f-ratio.  For example, if the lens for System A has a t-stop of t/4.5 at f/4, and System B has a t-stop of t/5 at f/4, then System B will receive a third of a stop less light per area on the sensor than System A at f/4 for a given scene luminance and shutter speed.

Lastly, the QE (Quantum Efficiency -- the proportion of light falling on the sensor that releases electrons) of the sensor plays an important role in maximizing exposure.  For example, if System A has a QE of 50% and System B has a QE of 40%, then System B will record one third of a stop less light per area than System A.

So, putting this all together, if we assume that the 16 MP system has a lens that is t/4.5 at f/4 and a sensor with a QE of 50%, whereas the 24 MP system has a lens that is t/5 at f/4 and a sensor with a QE of 40%, then, at f/4, the 16 MP sensor would record 2/3 of a stop more light per area on the sensor for a given scene luminance and shutter speed.  Thus, if the sensors have the same saturation limit per area on the sensor and f/4 1/200 maximized the exposure for the 16 MP sensor, we would use f/4 1/125 to maximize the exposure on the 24 MP sensor.
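
The lens and sensor contributions to this gap can be checked with a small Python sketch.  It assumes that transmitted light scales as 1 / t² and that recorded light scales linearly with QE, and it uses the t/4.5 vs t/5 and 50% vs 40% QE figures from the examples above; the function names are invented for illustration.

```python
import math

def tstop_gap_stops(t_faster, t_slower):
    """Stops of light lost by the lens with the larger t-number (light scales as 1 / t**2)."""
    return 2 * math.log2(t_slower / t_faster)

def qe_gap_stops(qe_higher, qe_lower):
    """Stops of recorded light lost by the sensor with the lower quantum efficiency."""
    return math.log2(qe_higher / qe_lower)

lens_gap = tstop_gap_stops(4.5, 5.0)
sensor_gap = qe_gap_stops(0.50, 0.40)
total_gap = lens_gap + sensor_gap

# If the more efficient system maximizes its exposure at 1/200 s, the other
# needs a proportionally longer shutter time to record the same signal per area.
shutter_denominator = 200 / 2 ** total_gap

print(f"lens: {lens_gap:.2f} stops, sensor: {sensor_gap:.2f} stops, total: {total_gap:.2f} stops")
print(f"matching shutter speed: about 1/{shutter_denominator:.0f} s")
```

Each contribution is close to a third of a stop, and the matching shutter speed lands next to the standard 1/125 setting.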

A common myth is that more, smaller pixels for a given sensor size result in less DR, so let's spend a few moments debunking this misconception with the classic example of the Nikon D600 (24 MP FF) and the Nikon D800 (36 MP FF).  These sensors are of the same generation and design and a textbook example that smaller pixels do not adversely affect sensor efficiency (within limits, of course).  At base ISO, the D600 has an electronic (read) noise of 7.4 electrons / pixel = 7.4 sqrt 24 = 36.3 electrons / µphoto, and a saturation of 76231 electrons / pixel = 24 · 76231 = 1829544 electrons / µphoto.  For the D800, we have an electronic (read) noise of 2.7 electrons / pixel = 2.7 sqrt 36 = 16.2 electrons / µphoto, and a saturation of 44972 electrons / pixel = 36 · 44972 = 1618992 electrons / µphoto.  With this in hand, let's create an NSR / level table:

 

| Signal (electrons / µphoto) | NSR / µphoto for the D600 | NSR / µphoto for the D800 |
| --- | --- | --- |
| 25 | 147% | 68% |
| 50 | 74% | 35% |
| 100 | 38% | 19% |
| 200 | 19% | 11% |
| 400 | 10% | 6.4% |
| 800 | 5.8% | 4.1% |
| 1600 | 3.4% | 2.7% |
| 3200 | 2.1% | 1.8% |
| 6400 | 1.4% | 1.3% |
| 12800 | 0.93% | 0.89% |
| 25600 | 0.64% | 0.63% |
| 51200 | 0.45% | 0.44% |
| 102400 | 0.31% | 0.31% |
| 204800 | 0.22% | 0.22% |
| 409600 | 0.16% | 0.16% |
| 819200 | 0.11% | 0.11% |
| 1638400 | 0.08% | 0.08% |
| 3276800 | oversaturated (blown) | oversaturated (blown) |

 

Here, we see a clear advantage going to the D800 at the lower levels of the DR in a classic example of smaller pixels resulting in superior performance for equally efficient sensors, as well as a classic example that smaller pixels do not compromise sensor efficiency (within the limits of the technology).
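
To see where the D800's advantage fades, one can convert the per-pixel figures quoted above to per-µphoto figures and find the signal at which the photon noise equals the read noise (the photon noise variance equals the signal, so the two match at S = R²).  The Python sketch below is only an illustration of that arithmetic under the quadrature noise model; the helper names are hypothetical.

```python
import math

def per_microphoto(read_noise_px, saturation_px, megapixels):
    """Convert per-pixel read noise and saturation to per-µphoto values."""
    pixels = megapixels                              # pixels per µphoto
    read_noise_area = read_noise_px * math.sqrt(pixels)
    saturation_area = saturation_px * pixels
    return read_noise_area, saturation_area

def crossover_signal(read_noise_area):
    """Signal (electrons) at which photon noise equals read noise."""
    return read_noise_area ** 2

d600 = per_microphoto(7.4, 76231, 24)
d800 = per_microphoto(2.7, 44972, 36)

for name, (read, sat) in (("D600", d600), ("D800", d800)):
    print(f"{name}: read noise {read:.1f} e- / µphoto, saturation {sat} e- / µphoto, "
          f"photon noise dominates above ~{crossover_signal(read):.0f} e-")
```

Under these assumptions the read noise stops being the larger contributor at roughly 1300 electrons / µphoto for the D600 and roughly 260 electrons / µphoto for the D800, which is consistent with the two columns of the table converging as the signal rises.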

Next, we'll explore the effect of sensor size on DR at base ISO for sensors of radically different sizes and relatively similar pixel counts.  In this situation, the differences between systems on the basis of sensor size are significant, since larger sensors can absorb more light than smaller sensors.  However, if we are shooting at base ISO, it means there is enough light that we can shoot with the desired DOF and a sufficient shutter speed that we can maximize the exposure on the sensor, and the larger sensors will enjoy the advantage of being able to absorb more total light.  Hence, neither differences in QE nor shutter speeds between the systems play any role for a base ISO DR comparison.  On the other hand, Equivalent photos will result in the same total amount of light on the sensor for all systems, so the QE of the sensor becomes a significant factor, and is addressed further down in another comparison.

Let's now do a comparison with existing cameras from three different formats at base ISO:

•  Canon G15 (4.65x sensor -- DR = 13.3 stops / µphoto)
•  Olympus EM5 (mFT -- 2x -- DR = 13.9 stops / µphoto)
•  Canon 6D (FF -- 1x -- DR = 13.6 stops / µphoto):

 

| Signal (electrons / µphoto) for the G15 | Signal (electrons / µphoto) for the EM5 | Signal (electrons / µphoto) for the 6D | NSR / µphoto for the G15 | NSR / µphoto for the EM5 | NSR / µphoto for the 6D |
| --- | --- | --- | --- | --- | --- |
| 6.25 | 33.9 | 130 | 144% | 79% | 93% |
| 12.5 | 67.7 | 260 | 75% | 40% | 47% |
| 25 | 135 | 520 | 40% | 21% | 24% |
| 50 | 271 | 1041 | 22% | 11% | 12% |
| 100 | 542 | 2081 | 13% | 6.5% | 6.2% |
| 200 | 1083 | 4162 | 8.3% | 3.9% | 3.3% |
| 400 | 2167 | 8325 | 5.4% | 2.5% | 1.8% |
| 800 | 4334 | 16649 | 3.7% | 1.6% | 1.1% |
| 1600 | 8668 | 33299 | 2.6% | 1.1% | 0.66% |
| 3200 | 17335 | 66597 | 1.8% | 0.77% | 0.43% |
| 6400 | 34671 | 133195 | 1.3% | 0.54% | 0.29% |
| 12800 | 69341 | 266389 | 0.89% | 0.38% | 0.20% |
| 25600 | 138683 | 532778 | 0.63% | 0.27% | 0.14% |
| 51200 | 277365 | 1065557 | 0.44% | 0.19% | 0.10% |
| 102400 | 554731 | 2131113 | 0.78 stops oversaturated (blown) | 0.46 stops oversaturated (blown) | 0.46 stops oversaturated (blown) |

 

So, what does this tell us?  First of all, even though the DR of all three cameras is all but identical at base ISO (13.3 stops / µphoto for the G15, 13.9 stops / µphoto for the EM5, and 13.6 stops / µphoto for the 6D) using the read noise as the noise floor, if we use a 20% NSR for the noise floor, the EM5 and 6D have a stop more DR than the G15.  We also note that the EM5 will have about half the noise of the G15 throughout the entire range of the DR, and the 6D will have slightly worse noise than the EM5 in the bottom two stops of the DR, about the same noise for the next three stops of the DR, and then increasingly less noise for each level of the DR past that point, culminating with half the noise in the highlights.

There is one last comparison to do -- Equivalent photos in lower light (same DOF and shutter speed).  Usually, when we talk about the DR of this or that camera, it is the base ISO DR, as discussed above.  However, while we can see that the single measure of DR can be quite misleading in terms of the IQ of the photo, the NSR table for the noise at each level of the DR gives us a wealth of information.  The NSR table at higher ISOs is no less useful.  To that end, let us compare the EM5 at ISO 800 to the D700 and 6D at ISO 3200.  This would be a valid comparison, for example, if we were shooting a low light scene with motion at, say, 12mm f/2 1/100 ISO 800 on the EM5 vs 24mm f/4 1/100 ISO 3200 on D700 / 6D (for example, a lower light, indoor, wide angle scene at a wedding, with people walking about where we needed a deeper DOF and faster shutter speed).

The reason for introducing the D700 into the comparison here is that it's an older generation FF DSLR with around the same base ISO DR as the EM5 and 6D, but whose sensor has a significantly lower QE (Quantum Efficiency), which means it records a lower signal for a given shutter speed than would a more modern and efficient sensor.  This is not an issue at base ISO, as we can simply use a longer shutter speed -- only the read noise and saturation capacity of the sensor are important here.  But, for a fixed shutter speed comparison, the QE will play a central role.

So, let's introduce our cameras for the comparison (ISO 800 for the EM5, ISO 3200 for the D700 and 6D):

•  Olympus EM5 (mFT -- 2x -- DR = 13.9 stops / µphoto, QE = 53%)
•  Nikon D700 (FF -- 1x -- DR = 13.9 stops / µphoto, QE = 38%)
•  Canon 6D (FF -- 1x -- DR = 13.6 stops / µphoto, QE = 50%)

 

| Signal (electrons / µphoto) for the EM5 | Signal (electrons / µphoto) for the D700 | Signal (electrons / µphoto) for the 6D | NSR / µphoto for the EM5 | NSR / µphoto for the D700 | NSR / µphoto for the 6D |
| --- | --- | --- | --- | --- | --- |
| 10 | 6.9 | 9.1 | 116% | 305% | 118% |
| 20 | 13.8 | 18.1 | 60% | 154% | 62% |
| 40 | 27.5 | 36.2 | 32% | 78% | 33% |
| 80 | 55.1 | 72.5 | 18% | 40% | 18% |
| 160 | 110 | 145 | 11% | 21% | 11% |
| 320 | 220 | 290 | 6.6% | 12% | 6.9% |
| 640 | 441 | 580 | 4.3% | 5.1% | 4.5% |
| 1280 | 881 | 1159 | 2.9% | 4.1% | 3.1% |
| 2560 | 1763 | 2318 | 2.0% | 2.7% | 2.1% |
| 5120 | 3526 | 4637 | 1.4% | 1.8% | 1.5% |
| 10240 | 7051 | 9274 | 0.99% | 1.2% | 1.0% |
| 20480 | 14103 | 18548 | 0.70% | 0.86% | 0.74% |
| 40960 | 28205 | 37096 | 0.49% | 0.60% | 0.52% |
| 81920 | 56411 | 74192 | 0.35% | 0.37 stops oversaturated (blown) | 0.71 stops oversaturated (blown) |
| 163840 | 112822 | 148383 | 0.72 stops oversaturated (blown) | oversaturated (blown) | oversaturated (blown) |

 

The first thing we notice is that the D700 and 6D oversaturate (blow out) a stop before the EM5.  This is because Olympus overstates the ISO rating by a stop for the EM5 (note the discrepancy between the nominal ISO value and the measured ISO setting here).  In short, the sensor and supporting hardware apply a gain to the signal equal to one stop lower than the ISO setting, and then an additional digital push is applied to the signal for the LCD lightness and OOC jpg.

For example, the image file from the EM5 at ISO 800 has a 4x gain applied by the sensor and supporting hardware (this is what the RAW file records, and RAW converters automatically adjust for), followed by a 2x digital push for the LCD playback and OOC jpg.  Thus, in terms of sensor performance, we should compare the EM5 at ISO 1600 against the D700 and 6D at ISO 3200 for equivalent photos in terms of the NSR:

 

| Signal (electrons / µphoto) for the EM5 | Signal (electrons / µphoto) for the D700 | Signal (electrons / µphoto) for the 6D | NSR / µphoto for the EM5 | NSR / µphoto for the D700 | NSR / µphoto for the 6D |
| --- | --- | --- | --- | --- | --- |
| 10 | 6.9 | 9.1 | 109% | 305% | 118% |
| 20 | 13.8 | 18.1 | 57% | 154% | 62% |
| 40 | 27.5 | 36.2 | 30% | 78% | 33% |
| 80 | 55.1 | 72.5 | 17% | 40% | 18% |
| 160 | 110 | 145 | 10% | 21% | 11% |
| 320 | 220 | 290 | 6.5% | 12% | 6.9% |
| 640 | 441 | 580 | 4.3% | 5.1% | 4.5% |
| 1280 | 881 | 1159 | 2.9% | 4.1% | 3.1% |
| 2560 | 1763 | 2318 | 2.0% | 2.7% | 2.1% |
| 5120 | 3526 | 4637 | 1.4% | 1.8% | 1.5% |
| 10240 | 7051 | 9274 | 0.99% | 1.2% | 1.0% |
| 20480 | 14103 | 18548 | 0.70% | 0.86% | 0.74% |
| 40960 | 28205 | 37096 | 0.49% | 0.60% | 0.52% |
| 81920 | 56411 | 74192 | 0.72 stops oversaturated (blown) | 0.37 stops oversaturated (blown) | 0.71 stops oversaturated (blown) |

 

Without the manufacturer "ISO games" being played, we can see that, for equivalent photos (same total amount of light on the sensor), the sensor efficiency is the primary player.

Let us consider one more situation -- a hypothetical FF sensor made from the exact same pixels as the EM5 sensor (this would mean the FF sensor would have 62 MP compared to the 16 MP of the EM5 -- again, a little off from 4x the pixel count due to 4:3 vs 3:2).  We will do this in two parts -- with both sensors at base ISO, and with both sensors at equivalent settings (ISO 800 vs ISO 3200):

 

 

| Signal (electrons / µphoto) for the EM5 at base ISO | Signal (electrons / µphoto) for the FF sensor at base ISO | NSR / µphoto for the EM5 | NSR / µphoto for the FF sensor |
| --- | --- | --- | --- |
| 10 | 38 | 263% | 135% |
| 20 | 77 | 132% | 67% |
| 40 | 154 | 67% | 34% |
| 80 | 307 | 34% | 18% |
| 160 | 615 | 18% | 9.2% |
| 320 | 1229 | 9.9% | 5.0% |
| 640 | 2459 | 5.7% | 2.9% |
| 1280 | 4917 | 3.5% | 1.8% |
| 2560 | 9835 | 2.2% | 1.1% |
| 5120 | 19670 | 1.5% | 0.76% |
| 10240 | 39339 | 1.0% | 0.52% |
| 20480 | 78678 | 0.71% | 0.36% |
| 40960 | 157356 | 0.50% | 0.25% |
| 81920 | 314713 | 0.35% | 0.18% |
| 163840 | 629425 | 0.25% | 0.13% |
| 327680 | 1258851 | 0.17% | 0.09% |
| 655360 | 2517701 | 0.70 stops oversaturated (blown) | 0.70 stops oversaturated (blown) |

 

Here, we see the FF sensor made of EM5 pixels has half the NSR at each level in the DR, which is a substantial advantage.  Now let's see what happens for Equivalent photos at a higher ISO (ISO 800 for the EM5, ISO 3200 for the FF sensor):

 

 

Signal (electrons / µphoto) for the EM5 at ISO 800 Signal (electrons / µphoto) for the FF sensor at ISO 3200 NSR / µphoto for the EM5 NSR / µphoto for the FF sensor
       
10 9.6 116% 231%
       
20 19 60% 118%
       
40 38 32% 60%
       
80 77 18% 31%
       
160 154 11% 16%
       
320 307 6.6% 9.2%
       
640 614 4.3% 5.4%
       
1280 1229 2.9% 3.4%
       
2560 2458 2.0% 2.2%
       
5120 4915 1.4% 1.5%
       
10240 9830 0.99% 1.0%
       
20480 19661 0.70% 0.72%
       
40960 39322 0.49% 0.51%