In fact, depth perception is quite well understood. It depends on a large number of different cues in the image, but many of those cues behave in a mathematically predictable way when you enlarge the image.
If you observe two adult people, one 100m away and the second 200m away, the first person has an angular height of about 1 degree and the second about 0.5 degrees. If you look through a 10x telescope, the first person appears 10 degrees tall and the second 5 degrees. That is the same as you would see looking (with the naked eye) at one person 10m away and another 20m away. So your brain tends to perceive the distances as 10m and 20m when looking through the telescope and as 100m and 200m when looking without it. The distance between the two people appears to be compressed from 100m to 10m by the telescope.
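For anyone who wants to check the arithmetic, here is a rough sketch in Python. The 1.75m adult height is my assumption (the post only says "about 1 degree"), the function names are made up for illustration, and the telescope is modelled simply as multiplying the angular size, which is only an approximation.

```python
import math

PERSON_HEIGHT_M = 1.75  # assumed adult height; not a figure from the post


def angular_size_deg(height_m, distance_m):
    """Angle (in degrees) subtended by an object of given height at a distance."""
    return math.degrees(2 * math.atan(height_m / (2 * distance_m)))


def distance_for_angular_size_m(height_m, angle_deg):
    """Distance at which an object of given height subtends the given angle."""
    return height_m / (2 * math.tan(math.radians(angle_deg) / 2))


def apparent_distance_m(distance_m, magnification, height_m=PERSON_HEIGHT_M):
    """Distance the object seems to be at when its angular size is magnified."""
    magnified_angle = angular_size_deg(height_m, distance_m) * magnification
    return distance_for_angular_size_m(height_m, magnified_angle)


near = apparent_distance_m(100, 10)  # person at 100m seen through a 10x telescope
far = apparent_distance_m(200, 10)   # person at 200m seen through a 10x telescope
print(f"naked-eye sizes: {angular_size_deg(PERSON_HEIGHT_M, 100):.2f} and "
      f"{angular_size_deg(PERSON_HEIGHT_M, 200):.2f} degrees")
print(f"apparent distances: {near:.1f}m and {far:.1f}m, gap {far - near:.1f}m")
```

Under those assumptions the apparent distances come out at roughly 10m and 20m, so the 100m gap shrinks to roughly 10m.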
However, when we look at a photograph, we know it is a photo and we usually don't know what focal length was used or how much it has been cropped, so we tend to automatically reserve judgement on how far away things might be in the image. So, it is often only in the most extreme cases that we are confident in saying that the perspective looks compressed.
All depth perception is just an optical illusion and, as with most illusions, thinking about it too much can sometimes make it harder to see. To see depth in an image you need to imagine that you are looking at a real scene rather than at a flat image.
It's interesting that you don't think so. I see plenty of evidence for it myself. But, of course, it all depends on one's perception of depth in a two-dimensional image.
OK, what do we really see when looking at these two images at the same time?
The second image is simply the centre of the first image magnified 4 times.
Everything in the second image is 4 times larger than it is in the first image.
Hence every object in the second image appears 4 times closer than it appears in the first. In other words, it appears one quarter as far away. That applies to every individual object in the image.
All the distances in the second image appear to be compressed by a factor of 4, in comparison with the first image. That is telephoto compression.
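The same reasoning can be written out as a small Python sketch. The three scene distances are hypothetical numbers I picked for illustration, not measurements from the two images, and the helper names are my own.

```python
def apparent_distances(true_distances_m, magnification):
    """Every object looks `magnification` times larger, so it seems that much closer."""
    return [d / magnification for d in true_distances_m]


def separations(distances_m):
    """Gaps between consecutive objects along the line of sight."""
    return [b - a for a, b in zip(distances_m, distances_m[1:])]


scene = [20.0, 60.0, 140.0]  # hypothetical object distances in metres
magnified = apparent_distances(scene, 4)

print("apparent distances:", magnified)
print("true gaps:", separations(scene))
print("apparent gaps:", separations(magnified))
```

Because each apparent distance is divided by 4, each gap between objects is divided by 4 as well, which is the compression being described.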
I think your consciousness is probably overriding your instinctive depth perception, but I really don't know.
I would be interested to know what you see when you look at the videos in this article.
If your depth perception really isn't seeing any difference when the image is magnified, then you will not see the illusion of the train apparently moving more slowly when the image is magnified (by zooming the lens).
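A minimal sketch of why the train would seem slower, assuming a fixed magnification and a train travelling straight along the line of sight (the videos may well differ on both counts). The function name and the numbers are purely illustrative.

```python
def apparent_speed_m_s(start_distance_m, speed_m_s, magnification, dt_s=1.0):
    """Speed the train seems to have if all apparent distances shrink by `magnification`."""
    apparent_start = start_distance_m / magnification
    apparent_end = (start_distance_m - speed_m_s * dt_s) / magnification
    return (apparent_start - apparent_end) / dt_s


# Hypothetical train 400m away, approaching at 20 m/s
unmagnified = apparent_speed_m_s(400.0, 20.0, magnification=1)
zoomed = apparent_speed_m_s(400.0, 20.0, magnification=4)
print(unmagnified, zoomed)
```

In this simple model the ground the train appears to cover each second shrinks by the same factor as the magnification, so the train looks as if it is crawling.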
Tom, what you propose is described by Bruce MacEvoy as a "peep show".
For me that explains the difference between what you see and what I see.
My brain probably just extrapolates what I would see if I were looking from that 'center of projection', while the image remains a 2D object depicting a 3D scene whose perspective does not change relative to its frame.
It's good to know that you are normal in so far as you can see the illusion of the train appearing to run more slowly. Crop your hands if you prefer; it makes no difference. You are seeing telephoto compression when the image is magnified.