There's nothing complicated here: if you separate a 3-channel raw image into 3 greyscale images, that is exactly what you are seeing. Nothing is interpreted, color-wise, in the creation of the raw file; the file simply also carries metadata that tells a converter how to derive realistic color from the differences between those 3 greyscale images.
Some might find that a bit puzzling - especially "the differences between". I think it has more to do with the ratios between channel levels, as extracted by matrices or LUTs.
Example matrix, sensor outputs to XYZ:

        camRed    camGreen    camBlue
X      0.14759    0.65183    -0.13036
Y     -0.83358    2.43696    -0.93664
Z      0.60299   -2.61102     2.78690
These figures are from a Foveon-based camera - for a CFA-based camera, the coefficients would be much less extreme.
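As a rough sketch of how such a matrix gets applied (a minimal Python example of my own, assuming linear, white-balanced camera levels; the numbers are just the ones quoted above):

    import numpy as np

    # The camera-to-XYZ matrix quoted above (Foveon example)
    CAM_TO_XYZ = np.array([
        [ 0.14759,  0.65183, -0.13036],   # X row
        [-0.83358,  2.43696, -0.93664],   # Y row
        [ 0.60299, -2.61102,  2.78690],   # Z row
    ])

    def camera_rgb_to_xyz(cam_rgb):
        """Map linear camera (R, G, B) levels to CIE XYZ via the 3x3 matrix."""
        return CAM_TO_XYZ @ np.asarray(cam_rgb, dtype=float)

    # The strong negative coefficients mix the three greyscale planes
    # against each other - the 'ratios between levels' in action.
    print(camera_rgb_to_xyz([0.5, 0.5, 0.5]))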
The posted image was converted from a TIFF, which looked no different on my computer. The TIFF was a raw composite with no conversion whatsoever. Therefore, I fail to see your point, sorry.
I would say that there is a step before understanding the data, which is understanding the information - that is, what the data represents. I would also argue with the concept of 'actual colours'. Colours don't exist in the physical world; there are only photon energies (wavelengths of light). Colours are entirely part of human perception.

This is an instance of an often misunderstood concept, about which I bang on frequently, and the misunderstanding ultimately underpins many of the fallacies surrounding photography. The misunderstanding is that the input and output of the photographic process are both about light. This is wrong. The input is about light - for colour photography, light separated into different energy bands. The output is about human perception - it is a recipe that specifies how a human being should perceive the scene.

It's on this point that one can argue that a raw file doesn't encode 'colour' - rather, it encodes light energy in different frequency bands. We generally use three frequency bands (though that's not a necessity), but they don't contain all of the information necessary to create the output image. That is because human perception is adaptive to lighting conditions, so we can't translate to an XYZ (perceptual) space without some information about those conditions. To some extent that information will be in the metadata, as a 'colour temperature' - but this is a very rough and ready indicator. Essentially, it assumes that the lighting is provided by a black-body radiator at a particular temperature. Many actual illuminants don't follow those rules, and thus we get metamerism errors in many translations.
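To make that black-body assumption concrete, here is a minimal sketch of Planck's law (just the radiator itself; turning the spectrum into a white point would additionally need the CIE colour-matching functions, which I've left out):

    import math

    # Physical constants (SI units)
    H = 6.62607015e-34   # Planck constant, J*s
    C = 2.99792458e8     # speed of light, m/s
    K = 1.380649e-23     # Boltzmann constant, J/K

    def blackbody_spd(wavelength_nm, temp_k):
        """Spectral radiance of a black body at temp_k, per Planck's law."""
        lam = wavelength_nm * 1e-9
        return (2 * H * C**2 / lam**5) / math.expm1(H * C / (lam * K * temp_k))

    # Compare a 'daylight-ish' 5500 K radiator with a 2800 K tungsten bulb,
    # normalised at 550 nm - the hotter source is relatively far stronger
    # at the blue end, the cooler one at the red end:
    for nm in (450, 550, 650):
        print(nm,
              round(blackbody_spd(nm, 5500) / blackbody_spd(550, 5500), 2),
              round(blackbody_spd(nm, 2800) / blackbody_spd(550, 2800), 2))

Real illuminants - fluorescent tubes, LEDs - have spiky spectra that sit nowhere on this smooth curve, which is exactly where those metamerism errors come from.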
Not quite, because the gamut of perceptual hues is larger than that of the spectral hues. Magenta, for instance, is a hue that doesn't occur in the spectrum. The reason is that human vision is not measuring the wavelength of light; it's just looking at the magnitude of a stimulus in three different (and overlapping) wavebands. It's made somewhat worse because mammalian tri-stimulus vision is a later adaptation of bi-stimulus vision, whereby the lower-frequency band split into two, leaving us with an 'S' band well separated from the largely overlapped 'M' and 'L' bands. Birds (and presumably dinosaurs before them) have quadri-stimulus vision separated into nice, evenly spaced bands. That's one reason dinosaurs never invented colour photography.
Isn't it also because the S band is sensitive to some red? So in effect, even as you move to the ends of the spectrum, you are still creating a signal across two types of cone.
That was in the days before the eye had even developed a lens. The question to bear in mind is how colour vision developed, and it's because it gives us a greater chance of survival this way, not because light does so and so.
The S cone's response is essentially 0 at 550nm and above, so no, it has no significant response in 'red'. The L cone's response peters out at 400nm at the short end, well into the 'violets'. Magenta is caused by an essentially equal stimulation of the S and L cones, with only minor stimulation of the M cones - impossible with a single wavelength.
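Here's a toy demonstration of that, using single-Gaussian stand-ins for the cone sensitivities (the peaks and widths are ballpark figures of my own, not real cone fundamentals, and this ignores the L cone's small secondary response in the violets):

    import math

    def gauss(nm, peak, width):
        return math.exp(-((nm - peak) / width) ** 2)

    # Crude single-Gaussian approximations of the S, M and L cone responses
    def S(nm): return gauss(nm, 440, 25)
    def M(nm): return gauss(nm, 540, 40)
    def L(nm): return gauss(nm, 565, 45)

    # Magenta needs S and L both strongly stimulated while M stays low.
    # Scan the visible range: at every single wavelength, whichever of S
    # and L is weaker is always beaten by M, so no pure wavelength is magenta.
    assert all(min(S(nm), L(nm)) < M(nm) for nm in range(390, 701))

    # Even at the best single-wavelength candidate (where S and L overlap
    # most), M dominates - that stimulus reads as blue-green, not magenta:
    best = max(range(390, 701), key=lambda nm: min(S(nm), L(nm)))
    print(best, round(S(best), 3), round(M(best), 3), round(L(best), 3))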
I'm not sure that without the lens (single or compound) it's 'vision' as we'd understand it. There's an interesting discussion here.
I actually do not find this an extreme example, though I have had trouble with reds from certain raw files when processed in a particular program, and am therefore sensitised to the red/orange problem. I have looked carefully at traffic lights, and the reds and greens really are quite strident in real life, probably due to the LED illumination used for them these days. I find the same with the rear lights of modern cars. In the picture above, the red jacket stands out anyway, because everyone else is dressed in rather drab colours.
In fixing the problem I referred to above, what the strident red/orange issue needed was a reduction in luminance, not saturation.
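For what it's worth, a minimal sketch of that kind of fix (the hue range and darkening factor are hypothetical, and this works per-pixel via Python's standard colorsys module rather than through any particular converter's controls):

    import colorsys

    def tame_reds(r, g, b, hue_width=30/360, factor=0.85):
        """Reduce luminance, not saturation, for pixels whose hue is near red."""
        h, l, s = colorsys.rgb_to_hls(r, g, b)   # note the HLS argument order
        if min(h, 1.0 - h) < hue_width:          # hue distance from pure red (h = 0)
            l *= factor                          # darken; saturation is untouched
        return colorsys.hls_to_rgb(h, l, s)

    print(tame_reds(0.95, 0.15, 0.10))  # a strident red, brought down a touch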
Not really sure what you're trying to say here, so excuse me if I get it wrong.
There is no absolute truth about an image, and photography is never about the simple equation of 'capture that absolute truth, then display it and the eye will see the same'. That approach may give you a symmetry and a framework that allow a comparison with the real scene in terms of absolute physics, but it has little to do with what actually happens.
How do you recreate a 3D scene and its illumination exactly? By transferring it to a piece of paper and holding it under a balanced LED light? How does that work with the additive system of your computer screen?
There is no truth in the data captured by the camera; the truth about colour only exists in the nature of human perception. Even the computer colour model is based on this, not on the physics of light. Even in a fully calibrated system, the baseline is only maintaining the perception of colour under standard illumination, as calibrated against the standard human eye.

Do you really think that equal quantities of R, G and B actually produce grey? Of course they don't, just as colour is not made up of R, G and B components. It's just a perceptual model that's easy to work with. Real grey is the lack of a dominant hue; in RGB colour space it is represented as equal values, and the monitor profile in the monitor driver converts that into the actual voltages that produce the same sensation of grey to the standard human eye. There is no preservation of actual data, only the preservation of perceived colour under controlled lighting.
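To put a number on the 'equal quantities' point, the standard Rec.709/sRGB luminance weights (applied to linear light values) show just how unequal the three channels' contributions to perceived brightness are:

    # Rec.709 / sRGB luminance weights: grey is *coded* as R = G = B, but the
    # three primaries contribute very unequally to perceived brightness.
    def rel_luminance(r, g, b):
        return 0.2126 * r + 0.7152 * g + 0.0722 * b

    print(rel_luminance(1, 0, 0))  # pure red   -> 0.2126
    print(rel_luminance(0, 1, 0))  # pure green -> 0.7152
    print(rel_luminance(0, 0, 1))  # pure blue  -> 0.0722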