• Members 557 posts
    Oct. 26, 2024, 10 a.m.

    This blog post by Aaron Hertzmann is the first online tutorial I have found that gives a well-informed discussion of perspective distortion in photographs.

    As I have said many times before, there is a lot of nonsense talked about perspective distortion. It is nice to see something available online that gets it right.

  • Members 187 posts
    Oct. 26, 2024, 11:18 a.m.

    Absolutely, 100%, and I would encourage us to read it without trying to fit it within our own preconceived ideas, frameworks or perceptions. He also leaves the door open at the end. But I must make a couple of points relevant to our previous discussions.

    Please note that the author makes no attempt to link human vision to the maths of linear perspective, or any attempt to describe what we see in terms of the maths of linear perspective. He doesn't say "exactly the same", quite the opposite:

    In principle, linear perspective requires you to have only one eye open, and to have it located right at the COP, in order for the picture to “look right.”

    With "telephoto compression" (we use the common label for continuity) and wide-angle distortion, he also indicates that the former arises when you look at a photo of a distant object and the latter when you look at a photo of a close object. It is not the singular effect of the COP, that is, of a single photo being viewed first from in front of and then from behind the COP. It is the effect of the correct linear-perspective mapping locked into the photo being misinterpreted by the human eye, as a result of looking from a comfortable viewing distance that is in front of or behind the COP respectively.
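
    For anyone who wants to put numbers on "in front of or behind the COP", here is a minimal sketch of the purely geometric reading only, which is exactly the reading this post argues is not the whole story. It assumes a lens focused at a distance (so the COP of the print sits at focal length times enlargement) and assumes the viewer takes objects to have their true width. The function names and the example numbers are made up for illustration.

```python
def cop_distance_mm(focal_length_mm, enlargement):
    """Distance from the print to its COP, assuming focus near infinity."""
    return focal_length_mm * enlargement


def geometric_depth_scale(viewing_distance_mm, focal_length_mm, enlargement):
    """Depth scaling implied by ray geometry alone.

    A value below 1 means depths read as compressed (viewer in front of
    the COP); above 1 means depths read as stretched (viewer behind it).
    This is not a model of what a person actually perceives.
    """
    return viewing_distance_mm / cop_distance_mm(focal_length_mm, enlargement)


if __name__ == "__main__":
    # Hypothetical case: a 10x enlargement viewed from 500 mm.
    for focal in (24, 50, 200):
        scale = geometric_depth_scale(500, focal, 10)
        print(f"{focal} mm lens: COP at {cop_distance_mm(focal, 10)} mm, "
              f"geometric depth scale {scale:.2f}")
```

    With these numbers the 50 mm print is viewed from its COP (scale 1), the 24 mm print from behind its COP (scale above 1) and the 200 mm print from in front of it (scale below 1).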

    The last paragraph on the subject is revealing:

    Throughout art history and perception research, there is a long history of assuming that linear perspective is the “right” way to make pictures, and any distortions or misperceptions must be “user error,” i.e., mistakes made by the artist or the brain. To me, this seems backwards. If human vision doesn’t perceive pictures according to the “rules” of linear perspective, then the “rules” aren’t quite right.

    Or to put it another way: in mathematical terms, when using the human eye, it is the photo that shows telephoto compression that is closer to correct linear perspective, and the one that "looks right" that is the distortion of linear perspective. If we keep trying to link linear perspective to the "correct" answer and use that as the basis of "what we see", then we quite literally get it back to front: we treat what is in fact the variable as the constant, and vice versa.

    Sorry, but in all our conversations we always fail at one simple point, one simple mathematical truth. It is your desire to rationalise and understand perspective, and you choose to do it in the language you are familiar with and understand, but read the quote above. We do not see the world as a series of still images that we can ray trace. It is quite impossible for the human brain to do this, and it would be counterproductive, as it wouldn't present a workable or understandable solution. So quite simply:

    Linear perspective is never equal to what you see.

    Any attempt to equate one with the other without also allowing for the nature of human vision will be wrong. It's a blanket truth.

  • Members 187 posts
    Oct. 28, 2024, 9:22 a.m.

    J A C S has just posted a link to the full article in this thread's sister thread on the other DP:

    Toward a theory of perspective perception in pictures

    Part of the abstract:

    "Many past theories and experimental studies focus solely on linear perspective. Yet, these theories fail to explain many important perceptual phenomena, including the effectiveness of nonlinear projections. Indeed, few classical paintings strictly obey linear perspective, nor do the best distortion-avoidance techniques for wide-angle computational photography. The hypotheses here employ a two-stage model for 3D human vision. When viewing a picture, the first stage perceives 3D shape for the current gaze. Each fixation has its own perspective projection, but, owing to the nature of foveal and peripheral vision, shape information is obtained primarily for a small region of the picture around the fixation. As a viewer moves their eyes, the second stage continually integrates some of the per-gaze information into an overall interpretation of a picture. The interpretation need not be geometrically stable or consistent over time."

    It's worth a read.

    Just to expand on one of the points made here regarding human vision and the projection of close peripheral objects as predicted by the maths of linear projection: the apparent distortion of the shape of peripheral objects is absolutely correct according to linear perspective.

    However, the human eye's sharp focus is limited to a narrower field of vision to the fore. We simply don't have peripheral vision to match linear perspective (this may explain why it looks odd to us). And so, to build a complete picture, we move our heads and eyes. We "pan" the scene, and what we see is a composite. It's worth pointing out again that:

    If you view the resulting, correct projection of that "wide angle scene" from the COP, then you can do much the same. You can turn your head (pan the camera, as you would have to in the original scene to match human vision) and look directly at the corners, at which point the oblique viewing angle appears to compress the perspective into the shape you are familiar with, the straight-on view. The illusion of the skull in Holbein's The Ambassadors is a good example of this.
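
    As a rough check on that geometry, here is a minimal sketch, assuming an ideal flat-image (rectilinear) projection and a small object kept at a fixed distance from the COP. The function names are made up for illustration. It compares how much the object's image is stretched on the print at an off-axis angle with how large it appears when you look straight at that patch of the print from the COP.

```python
import math


def stretch_on_print(theta_deg):
    """Stretch of a small object's image at theta degrees off-axis,
    relative to the same object on-axis, for a flat image plane.

    Returns (radial, tangential): radial is along the line from the
    image centre, tangential is perpendicular to it.
    """
    c = math.cos(math.radians(theta_deg))
    return 1.0 / c**2, 1.0 / c


def apparent_size_from_cop(theta_deg):
    """Angular size of that image patch seen from the COP, relative to
    the on-axis case. The patch is 1/cos(theta) farther from the eye
    and tilted by theta to the line of sight."""
    c = math.cos(math.radians(theta_deg))
    radial, tangential = stretch_on_print(theta_deg)
    # Radial: shrunk by the extra distance and by the oblique tilt.
    # Tangential: shrunk by the extra distance only.
    return radial * c * c, tangential * c


if __name__ == "__main__":
    for theta in (0, 20, 40, 60):
        on_print = stretch_on_print(theta)
        from_cop = apparent_size_from_cop(theta)
        print(f"{theta:2d} deg: on print {on_print[0]:.2f} x {on_print[1]:.2f}, "
              f"from COP {from_cop[0]:.2f} x {from_cop[1]:.2f}")
```

    On the print the stretch grows towards the corners, and more so radially than tangentially, which is why spheres turn into ellipses. Seen from the COP, the two factors cancel at every angle. This is ray geometry only and says nothing about how the brain integrates successive fixations.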

    But back to the original question, and this is important:

    When looking at the image, at which point (from the COP or behind) are you seeing the "correct" perspective as defined by linear geometry, and which point represents a distortion of the correct linear perspective?

    It's important because theories so far have centered on one assumption: that when we view from the COP and the perspective in the picture matches our view of the real world, linear perspective is correctly preserved, and that when we view from anywhere other than the COP, the image appears distorted.

    Another quote from the preamble:

    "The first set of proposed hypotheses state that viewers understand the 3D shape and structure within any small region of a picture independent of the rest of the picture, with very specific exceptions. When a viewer fixates on a picture, they interpret shape and space around that fixation point, primarily in a radius related to the size of the fovea. This 3D interpretation does not change after subsequent fixations in a picture, regardless of the content around the region, except after high-level changes in object recognition, such as in bistable imagery."

    By "fixation" I assume the meaning is "when we first look, and the assumptions/determinations we make at that point". "Bistable imagery" is something like the skull in The Ambassadors, drawn as a distortion and designed to resemble reality from one position only, so that your interpretation switches to another stable interpretation at that point.

    But the import is clear: if we make an interpretation that holds steady regardless of subsequent viewing positions, then at none of those other positions does our interpretation match the reverse-engineered linear perspective. And if it doesn't hold true in those positions, then it's highly unlikely that it holds true in the initial fixation, because if "angle subtended" is a primary source of interpretation in the initial view, then it should hold at least some import in subsequent viewing positions.