
Image sensors: Modulation transfer function

The modulation transfer function (MTF) specifies the contrast reduction as a function of the spatial frequency. It is often expressed in line pairs per millimeter (lp/mm) in image space and is commonly specified for lenses. An MTF value of 1 is perfect; an MTF value of 0 means all contrast is lost and the image is perfectly blurred. A line pair means a sinusoidal cycle with a dark and a bright half, not a sharp dark and a sharp bright line, because a sharp edge contains further frequencies. The MTF at a given frequency may therefore differ from the contrast measured with a USAF-1951 resolution target. Strictly speaking, the concept of an MTF does not apply to a pixel sensor, because the result depends on the subpixel shift of the image, whereas a PSF does not, but for lack of a better concept it is still used. This gets extreme at the Nyquist frequency, where the integrated signal depends only on the phase alignment between image and pixel grid.
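For reference, a standard way to express this (not specific to this article): the modulation of a sinusoidal pattern is $\text"M" = {\text"I"_\text"max" - \text"I"_\text"min"} / {\text"I"_\text"max" + \text"I"_\text"min"}$, and the MTF at a given frequency is the ratio of the modulation in the image to the modulation in the object:

$$\text"MTF"(\text"f") = \text"M"_\text"image"(\text"f") / \text"M"_\text"object"(\text"f")$$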

The Nyquist rate and the Airy disk

The Nyquist frequency of an image sensor is the spatial frequency of one cycle per two pixels, i.e. half a cycle per pixel. The other way round, the Nyquist rate is two pixels per cycle. It suffices to sample spatial signals below the Nyquist frequency. Should the spatial signal contain the Nyquist frequency or higher frequencies, aliasing will occur, commonly visible as Moiré patterns. So if the Airy disk radius spans two pixels, there is no more information in the signal and more pixel resolution does not help, right? If that sounds good to you, you are on the wrong track.
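A minimal sketch of the problem at exactly the Nyquist rate (names and numbers are illustrative): point-sampling a sinusoid at two samples per cycle yields an amplitude that depends entirely on the phase alignment. This sketch point-samples; real pixels integrate, which is the topic further below.

    import numpy as np

    # Point-sample a sinusoid at exactly the Nyquist rate: two samples per
    # cycle. The recovered amplitude depends entirely on the phase shift.
    pixels = np.arange(16)          # sample positions, one per pixel
    f = 0.5                         # Nyquist frequency in cycles/pixel
    for phase in (0.0, np.pi / 4, np.pi / 2):
        s = np.cos(2 * np.pi * f * pixels + phase)
        print(f"phase={phase:.2f} rad: peak-to-peak = {s.max() - s.min():.2f}")
    # Prints 2.00, 1.41 and 0.00: full contrast, reduced contrast, no signal.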

It is a common misconception to decide which contrast is needed at which spatial frequency and then make sure that the lens MTF delivers that, assuming that as long as the spatial frequency does not exceed the sensor Nyquist frequency, the camera will deliver this contrast, or at least almost, because reality is always a little worse than theory. An indication that something is wrong here is the number of heated discussions on the net about why sampling at a higher rate than the Nyquist rate delivers surprisingly better results, which appears to contradict the theory.

Simple mistakes

Things start with the fact that the diagonal pixel pitch is larger than the pixel pitch along the main axes, so the Nyquist frequency measured in lp/mm is lower in the diagonal direction. Put the other way round, to keep the Nyquist rate in the diagonal, a higher rate along the main axes is needed: The resolution of a digital image sensor is direction dependent.
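A worked example with an assumed pixel pitch of 3.45 µm: along the main axes the Nyquist frequency is $1 / {2 · 0.00345 \text" mm"} ≈ 145 \text" lp/mm"$, but the diagonal pitch is $0.00345 \text" mm" · √2 ≈ 0.00488 \text" mm"$, which lowers the diagonal Nyquist frequency to about 102 lp/mm.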

A diffraction limited lens has its cutoff frequency where the Airy disks of two points cannot be separated any more, and the corresponding separation is less than the radius of a single Airy disk!

$$\text"f"_\text"cutoff" = \text"diameter" / { λ · \text"focal length"} $$

This formula looks similar to the reciprocal of the Airy disk radius, but lacks the constant 1.21966 that marks the first minimum. The Rayleigh resolution criterion places the center of the next point at that minimum, which yields a visually well separable dip, but it is possible to move the points closer if a contrast reduction is acceptable, which increases the frequency in the image, up to the cutoff frequency, which yields no contrast any more. This diagram shows the Rayleigh resolution criterion and the cutoff frequency, and how their point source pairs create them:

[Figure: Cut-off period]

Should the object contain arbitrarily sharp features, there will be no aliasing only if the cutoff frequency is used as the Nyquist frequency. Often the object frequency spectrum is lower, though, in which case the limiting frequency is not determined by the lens.
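A worked example with assumed numbers: for green light with $λ = 0.00055 \text" mm"$ and an f/2.8 lens (focal length divided by diameter is 2.8, so diameter divided by focal length is 1/2.8), the formula above gives

$$\text"f"_\text"cutoff" = 1 / {0.00055 \text" mm" · 2.8} ≈ 650 \text" lp/mm"$$

which is far above the Nyquist frequency of common pixel pitches.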

Taking the diagonal direction and the cutoff frequency into account, sampling at more than the Nyquist rate still offers a benefit, though.

Pixels do not perform Nyquist-Shannon sampling

This apparent contradiction has a reason: The Nyquist-Shannon sampling theorem samples with a Dirac comb function, but pixels are anything but a comb. Instead, they integrate the signal over their area. In math terms, that is a convolution of the signal, acting as a low pass filter. It is important to understand that the Nyquist rate is determined by the pixel pitch, and the convolution is determined by the pixel shape. Both effects are independent physical limits, which means real sensors cannot be better; due to effects like diffusion and reflections, they can be worse, even in a wavelength dependent way. So what's the limit?
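This integration is easy to check numerically (a minimal sketch; the function name and sample count are mine): averaging a sinusoid over a box pixel of width 1 reproduces the sinc contrast loss that is derived in the next paragraphs.

    import numpy as np

    # Average a sinusoid of frequency f (cycles/pixel) over one box pixel
    # of width 1, using midpoint samples; compare with the sinc function.
    def pixel_average(f, n=4096):
        x = (np.arange(n) + 0.5) / n - 0.5     # midpoints across one pixel
        return np.mean(np.cos(2 * np.pi * f * x))

    for f in (0.25, 0.5, 1.0):
        print(f"f={f} cycles/pixel: integrated contrast = {pixel_average(f):.4f}, "
              f"sinc(f) = {np.sinc(f):.4f}")
    # The integrated contrast matches sinc(f): 0.9003, 0.6366 and 0.0000.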

Square pixels can be modelled with a box function in the horizontal and vertical direction and with a triangle function in the diagonal direction. Another way of thinking about it is sampling with the convolution of that function and the Dirac comb function. In the frequency domain, the convolution of two functions is the product of their Fourier transforms, and we are interested in the contrast at a given frequency anyway. The Fourier transform of the Dirac comb function is again a Dirac comb function, so no change there. The normalised Fourier transform of the box function is:

$$\text"sinc"(\text"frequency")$$

[Figure: sinc(frequency)]

At 0 cycles/pixel the signal is constant and the MTF is 1, which is perfect. At 1 cycle/pixel the function is 0, which means no contrast at all, but that is only the case for a rectangular pixel, because other shapes cause a different convolution and may need other frequencies for perfect blur. In between lies the Nyquist frequency of 0.5 cycles/pixel, where the MTF is $1/(π / 2) = 0.64$ (cross mark). This is important: For gapless, perfectly rectangular pixels, the MTF at the Nyquist frequency in the horizontal and vertical direction is only 64%, because a pixel is not a point but an area, and that area convolutes the spatial signal. Note that every integral frequency (2, 3, 4 etc.) also has a contrast of 0, and between those zeros the MTF alternates in sign. Anything beyond the Nyquist frequency shows aliasing, which can look blurred or show Moiré patterns. A negative contrast is valid and indicates a phase inversion, which is visible in a Siemens star as a blurred ring where the MTF first crosses 0 and as inverted stripes beyond, but by convention MTFs are shown as absolute values:

[Figure: MTF of main axis for square pixels]

The range between 0 and 0.5 cycles/pixel is the most interesting, and it is easier to imagine what it means in terms of the period length, measured in pixels per cycle instead of cycles per pixel, i.e. $\text"sinc"(1 / \text"period")$ or $\text"sinc"(1 / ( \text"pixels" / \text"cycle"))$, where the interesting range is from 2 pixels/cycle upwards:

[Figure: MTF of main axis for square pixels, as a function of pixels per cycle]
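Reading a few values off that curve (my own evaluation, relying on numpy's normalised sinc convention $sin(πx)/(πx)$):

    import numpy as np

    # Box-pixel MTF for a few period lengths, measured in pixels per cycle.
    for period in (2, 3, 4):
        print(f"{period} pixels/cycle: MTF = {np.sinc(1 / period):.2f}")
    # 2 pixels/cycle (Nyquist): 0.64, 3 pixels/cycle: 0.83, 4 pixels/cycle: 0.90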

Obviously more pixels do help to some extent.

Coming back to the diagonal axis, the Nyquist frequency is lowered by $√2$ for square pixels, but in addition the pixels look triangular. The normalised Fourier transform of the triangle function is $\text"sinc"(\text"frequency") ^ 2$. Note that the frequency has a different meaning here: Along the main axis, a pixel is related to the unity box function, which is 1 between -0.5 and +0.5, but in the diagonal, a pixel is related to the unity triangle function, which ranges from -1.0 to +1.0. For that reason, the frequency must be halved. Since the diagonal is longer, the halved frequency is then multiplied by $√2$, leading to:

$$\text"sinc"( \text"frequency" / √2 ) ^ 2$$

For that reason, it takes a higher spatial frequency in the diagonal direction until everything is blurred, yet the Nyquist frequency is still lower. That sounds counterintuitive, but a triangle causes less convolution than a rectangle of the same size.

[Figure: sinc(π/(pixels/cycle)) and sinc(π/(pixels/cycle))²]

Again converted to the period: now 2.8 pixels/cycle are mandatory, but they also yield a good MTF of 0.81, and 4 pixels/cycle are great.

[Figure: sinc(π/(pixels/cycle)) and sinc(π/(pixels/cycle))², as a function of pixels per cycle]
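A quick cross-check of those diagonal numbers (my own arithmetic, with the frequency measured in cycles per main-axis pixel): at the diagonal Nyquist frequency of $0.5 / √2$ cycles/pixel, i.e. 2.8 pixels/cycle, the formula above indeed gives 0.81.

    import numpy as np

    # Diagonal MTF of square pixels: sinc(f / sqrt(2))^2, evaluated at the
    # diagonal Nyquist frequency 0.5 / sqrt(2) cycles per main-axis pixel.
    f_nyquist_diagonal = 0.5 / np.sqrt(2)
    mtf = np.sinc(f_nyquist_diagonal / np.sqrt(2)) ** 2
    print(f"MTF at diagonal Nyquist: {mtf:.2f}")   # 0.81 at 2.8 pixels/cycle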

Sensors have used microlenses for a long time now, so how about round pixels? The normalised Fourier transform of the unity circle function is ${\text"J"_1(2 · π · \text"frequency") } / { π · \text"frequency" }$. The unity circle function extends from -1.0 to +1.0, similar to the triangle function, which requires halving the frequency like for the triangle. With that change, the normalised Fourier transform is:

$${2 · \text"J"_1( π · \text"frequency") } / { π · \text"frequency" }$$

The MTF at the Nyquist frequency is 0.72 for the main axis and 0.85 for the diagonal axis (due to the lower Nyquist frequency there). The curve itself does not change in the diagonal direction, because round pixels are direction independent; only the Nyquist frequency changes, due to the increased pixel pitch in the diagonal direction.
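Both values can be checked with a few lines (a sketch of my own, using scipy's Bessel function $\text"J"_1$):

    import numpy as np
    from scipy.special import j1

    # Round-pixel MTF 2*J1(pi*f)/(pi*f), evaluated at the main axis and
    # diagonal Nyquist frequencies (f in cycles per main-axis pixel).
    def round_pixel_mtf(f):
        return 2 * j1(np.pi * f) / (np.pi * f)

    print(f"main axis, f = 0.5:        MTF = {round_pixel_mtf(0.5):.2f}")              # 0.72
    print(f"diagonal,  f = 0.5/sqrt2:  MTF = {round_pixel_mtf(0.5 / np.sqrt(2)):.2f}") # 0.85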

Converted to the period, the range to be expected gets obvious:

[Figure: MTF of square and round pixels, as a function of pixels per cycle]

Real pixels are somewhere between round and square, as a compromise between area/fill factor/sensitivity and convolution. The MTF in the shaded area ranges from 0.64 to 0.81 for square pixels, with round pixels at 0.72 and 0.85, which can be considered good.

The convolution depends only on the pixel size and shape. Designers of sensors with a very good QE can trade a little pixel area against larger gaps to increase the sensor MTF at the Nyquist frequency, leading to sharper looking edges, but not to more resolution, because the resolution depends only on the pixel pitch that determines the Nyquist frequency.
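To put numbers on this trade (a sketch with assumed aperture fractions, not any real sensor design): narrowing a square aperture relative to the pixel pitch raises the MTF at the Nyquist frequency, at the cost of collected light.

    import numpy as np

    # MTF at the Nyquist frequency for a square aperture that covers only a
    # fraction of the pixel pitch (the fraction values are assumed examples).
    for fill in (1.0, 0.8, 0.5):
        print(f"aperture = {fill:.1f} x pitch: MTF at Nyquist = {np.sinc(0.5 * fill):.2f}")
    # 1.0: 0.64, 0.8: 0.76, 0.5: 0.90 - sharper edges, but less light per pixel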

As a consequence, a color sensor delivers sharper color subframes than a monochrome sensor with twice the pixel size, because the subframe pixels cover only part of their effective pitch.

A low pass filter reduces the MTF at the Nyquist frequency and above to avoid annoying Moiré patterns. There are other effects that may lower the MTF further.

Summary

The theoretical sensor MTF at the Nyquist frequency is much lower than 1.0, and the Nyquist frequency along the diagonal is $√2$ lower than along the main axes. Sampling with more pixels than the Nyquist rate requires helps to achieve a better MTF, but beyond 3.5–4 pixels per cycle there is little further benefit, unless the sensor delivers much less than physically possible.

If the lens delivers frequencies above the Nyquist frequency, aliasing will occur. In that case a low pass filter helps, which is why many DSLRs have one. Since pixels keep getting smaller and extreme frequencies do not occur often, current DSLRs are often available with or without a low pass filter, depending on the needs.

This discussion of pixel areas causing a convolution also applies to ADCs that integrate rather than sample and hold, and to digital line sensors for spectrometers.