Great if we agree. I possibly interpreted what you said differently from how you intended. I wasn't talking about a CFA with the same spectral sensitivities as our own, though that would be the only way to get entirely 'accurate' colour - simply because any other response would sift the spectral energies differently, and thus sort the photons differently from the way a human viewer would. The LIM conditions refer to a model that differs from the ground truth of human perception for reasons of mathematical convenience. The model is close enough to have served us well for 90+ years, but shouldn't be confused with the 'purist' actuality.
Actually, it does. It doesn't appear to do so because our display devices are designed to work with typical interior illumination conditions. View them in different conditions and the colours go off.
Sure, I didn't say that you did. I was just pointing out that what we call 'colour science' - the CIE work - was originally about prints and printing.
I can understand why your DPReview interlocutor quibbled about it, because I think that's quite an idiosyncratic definition. On the 'basic linear algebra', the point I made above applies: it is true only of the CIE model, not of the ground truth - though, as I said there, the CIE model has served us well.
I have a real problem understanding your point sometimes; there was a half sentence a while back that I misinterpreted. I also missed the point about your narrow laser sources - my knowledge about screens needs updating. If we are talking about the perception of red rather than the whole, then yes. If we are talking about the perception of colour, with the accuracy of the system defined by the reference colour as viewed rather than by measurement and by preserving the actual recorded wavelengths, then yes. Accurate colour then becomes a necessity of the theoretical model rather than a truth that must be preserved on output - yes, let the eye take care of the resulting inconsistencies in the system. You see, I'm not a purist and don't think accurate colour has any real importance outside corporate logos. I'm also deliberately avoiding any mathematical model of colour, as I would much rather develop a more abstract connection. Not really sure what the second sentence means.
Actually yes, we are arguing about terminology. Look, I just want to expand my limited knowledge, and if the meaning of your words contradicts what I knew previously, then I have a question: is the problem with your words or with my understanding? That's why I asked for an explanation, and from your explanation I concluded that you were talking about metamerism errors.
(About device metamerism - I agree I used the term 'metamerism' in a limited sense; this doesn't change the basis of my confusion.)
Do you mean 'single' wavelength as a dimension? Even then I still don't get the idea of your second example.
Algebra is one of my weakest points. All those 'infinite-dimensional property spaces' and similar things don't speak to me :(
Where did I say that? Did you infer it from the "not the same" part? That was not my intention.
How do we construct such a set of channels (say, an RGB screen) then? I know that we have screens with quite good colour reproduction, but can those devices (screens) be made ideal - that is, create exactly the same eye response as the recorded scene does, at least within their gamut?
I think you mean that although you can't generate exactly the same light as the eye saw, you can certainly trigger a similar response in the eye. But the limitations of the capture device, or the way it differs from the human eye, combined with the limitations of the display device, introduce limits that don't quite match those of the eye. It is only possible to reproduce the colour within the limits of the devices - their colour gamut.
[EDIT] If you are talking about the light from the actual scene then you can only talk about an approximation, because colours in combination can produce non-linear responses in the eye, and real scenes are often lit by several sources with varying white balance rather than the single global one assumed.
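To make the gamut point concrete, here's a minimal sketch (assuming an sRGB display and the standard XYZ-to-linear-sRGB matrix; the two test colours are made up). A colour can be reproduced only if all three display channel values land between 0 and 1:

import numpy as np

# Standard CIE XYZ (D65) -> linear sRGB matrix.
XYZ_TO_SRGB = np.array([
    [ 3.2406, -1.5372, -0.4986],
    [-0.9689,  1.8758,  0.0415],
    [ 0.0557, -0.2040,  1.0570],
])

def in_srgb_gamut(xyz, tol=1e-6):
    # True if an sRGB display can reproduce this XYZ stimulus.
    rgb = XYZ_TO_SRGB @ np.asarray(xyz, dtype=float)
    return bool(np.all(rgb >= -tol) and np.all(rgb <= 1.0 + tol)), rgb

print(in_srgb_gamut([0.4, 0.4, 0.4]))      # a mid grey: inside the gamut
print(in_srgb_gamut([0.042, 0.5, 0.060]))  # near-spectral green: negative red, out of gamut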
Colour is always multiple wavelengths, never single ones.
Some terminology for the non-mathematicians
'interval' is a set of values across which you're looking at the function. So 'on whatever fixed interval' means we're comparing the functions over the same range.
'Linearly independent' - none of them can be derived from the others by multiplying by fixed constants and adding.
Conceptually, colour vision could have as many 'channels' as you want. For humans it is 3, hence a '3D subspace'. For most mammals it would be 2 (a 2D subspace). For most birds it would be 4 (4D subspace). For some birds and turtles it would be 5 (5D subspace). The dimensionality of the space reflects the number of distinct colour receptor types in the organism for which you're trying to provide colour reproduction.
Usually called the 'Luther-Ives-Maxwell' (LIM) conditions, sometimes just 'LI', as they were independently derived by these three scientists. They say that you can use any three functions, so long as the ones you actually want can be derived from the ones you have using linear combinations (simple multiplication by constants and addition).
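A toy numeric illustration of this (the Gaussian curves and the mixing matrix below are invented, not real cone curves or camera channels): if the camera channels really are fixed linear mixes of the functions you want, a single 3x3 matrix recovers those functions exactly from the camera responses.

import numpy as np

wl = np.arange(400, 701, 5, dtype=float)                  # wavelengths, nm
gauss = lambda mu, sig: np.exp(-0.5 * ((wl - mu) / sig) ** 2)

# Hypothetical 'target' sensitivities (stand-ins for cone/CMF curves).
target = np.stack([gauss(570, 40), gauss(545, 40), gauss(445, 30)])

# Camera channels built as fixed linear mixes of the targets,
# so they satisfy the LIM condition by construction.
M_true = np.array([[0.9, 0.2, 0.0],
                   [0.1, 1.0, 0.1],
                   [0.0, 0.1, 0.8]])
camera = M_true @ target

# Recover, by least squares, the matrix that maps the camera curves back.
M_rec, *_ = np.linalg.lstsq(camera.T, target.T, rcond=None)
print(np.max(np.abs(M_rec.T @ camera - target)))          # ~1e-15: exact recovery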
It's not 'horribly spiky'. It's spiky by design, and works much better than would a non-spiky curve that satisfied the LIM conditions (these spiky primaries don't satisfy them, but they don't need to).
The point about this spectrum is that each of the three stimuli should excite as nearly as possible a single type of cone in the human eye. The response of the eye looks like this (from Wikipedia)
The best stimulus would be a monochromatic (hence 'spiky') source targeting a point on the cone response curve which selects only one kind of cone. This is why the best DLP projectors use laser light sources. It's relatively easy to stimulate (more or less) only the S cones, because that response is well separated from the others. The other two are more problematic, because they overlap. On the other hand, that overlap means that in real-world use they will be stimulated together by most sources, so the aim is to provide maximum differential control of the stimuli. The stimulus for the M cones fits the peak response, but that for the L cones doesn't - it's shifted to longer wavelengths to get more separation.
In short, the spectrum of the exciting illuminant need have no direct relationship to that in the original scene. All that matters is that the correct cones are stimulated. All the brain knows is which cones are stimulated and how much - it has no way of measuring the exact spectrum of the applied stimulus.
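A small numeric sketch of that last point, using invented Gaussian stand-ins for the S, M and L cone curves (not the real cone fundamentals): a smooth broadband spectrum and a mix of three narrow 'laser-like' primaries can produce exactly the same three cone responses, even though the two spectra are completely different.

import numpy as np

wl = np.arange(400, 701, 1, dtype=float)                  # wavelengths, nm
gauss = lambda mu, sig: np.exp(-0.5 * ((wl - mu) / sig) ** 2)

# Toy stand-ins for the S, M and L cone sensitivities.
cones = np.stack([gauss(445, 25), gauss(540, 45), gauss(565, 50)])

# A smooth, broadband 'scene' spectrum and the cone responses it produces.
scene = 1.0 + 0.5 * np.sin(wl / 40.0)
scene_response = cones @ scene

# Three narrowband ('laser-like') display primaries.
primaries = np.stack([gauss(450, 2), gauss(530, 2), gauss(620, 2)]).T

# Solve for primary intensities that give the same cone responses.
w = np.linalg.solve(cones @ primaries, scene_response)
display_response = cones @ (primaries @ w)

print(np.allclose(display_response, scene_response))      # True: a metamer
print(w)                                                   # per-primary drive levels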
In terms of colour reproduction, the device recording the scene responds to real wavelengths, but this information is not preserved. Somewhere along the line it is transformed into the sensation of colour. Is this the maths you are talking about - how the colour recorded by the sensor is transformed into a model of absolute colour, and then how that is transformed into the output? It is not about collecting light and preserving the absolute wavelength, or really about absolute wavelength at all past a certain point; it's about preserving the sensation of colour.
Question (and I hope I use your terms correctly): if we moved to a 4D system (4 sensors on a camera) and had 4 colours on an output screen, would this increase accuracy, or really just push the gamut closer to saturated colour? And given most people's confusion between accurate and pleasing colour, would it really be noticed by the human eye?
I've always been confused by some photographers' perception, even on camera forums, that what the camera recorded is the real colour of the scene: "SOOC, I've not changed the colours!" There is a thread on another photo forum about unexpected "natural colours revealed" in highly processed images taken at night. I know about night vision and the nature of moonlight, but you have to see the image. I found the whole concept quite absurd. Just wondering what your impression is of the general understanding of colour on photo forums?
It's probably best working this one back from first principles. One of the issues with 'colour science' is that people come at it from the middle, with a lot of abstruse maths. Devoid of context, it's hard to understand.
Ultimately, what we want to do is provide the same data to the visual cortex as would be the case if the eyes were actually looking at the scene. The data comes from the rods and cones in the eye, so what we want to do is to 'fire†' the rods and cones in the same way as would be done by the eye looking at the actual scene. To do this we need to arrange for the eye to be looking at a device which models the relative light intensity of the scene, and also ensure that the light has a wavelength characteristic which triggers the same set (L, M or S) of cones as would looking at the scene. All that matters to our perception is that the right set of cones gets fired, not whether the spectral pattern of the light firing them is the same as in the original scene. So, in a light-emitting display (as opposed to a reflective print) the best way to do this is to choose narrow-band light sources with wavelengths chosen to give maximum differentiation between the three different sets of cones‡.
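As a rough illustration of that 'maximum differentiation' point (again with invented Gaussian cone curves, so the numbers are only indicative): well-separated narrowband primaries give a well-conditioned cone-response matrix, while primaries bunched into the M/L overlap region give a nearly singular one, i.e. almost no independent control of the three cone types.

import numpy as np

wl = np.arange(400, 701, 1, dtype=float)
gauss = lambda mu, sig: np.exp(-0.5 * ((wl - mu) / sig) ** 2)
cones = np.stack([gauss(445, 25), gauss(540, 45), gauss(565, 50)])  # toy S, M, L

def control_matrix(primary_wavelengths, width=2.0):
    # Response of each cone type to each narrowband primary (3 x 3).
    primaries = np.stack([gauss(p, width) for p in primary_wavelengths]).T
    return cones @ primaries

good = control_matrix([450, 530, 630])   # well-separated primaries
poor = control_matrix([540, 555, 570])   # bunched into the M/L overlap

# A lower condition number means the three drive levels control the three
# cone types more independently.
print(np.linalg.cond(good), np.linalg.cond(poor))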
So, to capture a scene and provide the correct stimuli, we need to 'record' how a retina would have reacted to the light - hence we use three channels. There are some confounding issues. One is that the reception pattern of cones varies between individuals - but not too much, and it turns out that an average works for the vast majority of people. The second is that it's very difficult to manufacture dyes or colour separators which have exactly the same spectral profile as the human cones. Again, we can approximate - and that is what has been done. The CIE XYZ '31 model ('31 because it was released in 1931) provides a model of human colour vision good enough to work for most people. It's deliberately inaccurate, because it's been designed to be mathematically convenient, of which more later.
For engineering reasons, using colour channels that replicate the eye is not the best solution. It doesn't make best use of the optical capture technology available. To get the best quality images (in terms of noise) we want to make most use of the wavelengths at which our capture devices (these days, silicon sensors) are most efficient. One of the advantages of using a mathematically convenient (rather than strictly accurate) model of colour perception is that we can use simple mathematical operators to translate between them. The LIM conditions are quite strict - they mean that we must be able to generate XYZ simply by multiplying the three channels by some constants and then adding them together. This puts quite a constraint on the colour channels of the camera. There's a looser condition whereby we can get to XYZ by multiplying by some translation functions - which vary by wavelength - rather than constants, and this is what Foveon sensors use to get good colour from 'filters' that are determined by physics rather than design.
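In code, the LIM-style conversion really is just constants multiplied and added - one fixed 3x3 matrix per camera. The matrix below is invented for illustration and doesn't come from any real camera profile.

import numpy as np

# Illustrative only: a made-up 'colour matrix' of the kind a raw converter
# applies per pixel. Real cameras ship individually profiled matrices.
CAM_TO_XYZ = np.array([
    [0.65, 0.28, 0.07],
    [0.27, 0.69, 0.04],
    [0.02, 0.12, 0.86],
])

def camera_rgb_to_xyz(rgb):
    # Multiply by constants and add - exactly what the LIM condition requires.
    return CAM_TO_XYZ @ np.asarray(rgb, dtype=float)

print(camera_rgb_to_xyz([0.5, 0.4, 0.1]))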
So, in answer to your question: three channels are enough so long as they are a good three channels (LIM or equivalent). In the case that you can't get a 'good' three channels it might be possible to use a fourth 'helper' channel to generate useful XYZ. It has been attempted with four-channel modifications to Bayer, but it didn't provide a clear advantage and no-one's doing it now.
†The word 'fire' here suggests a binary on/off response. This isn't quite true. In fact it appears that there is a mixture of binary-type cones and ones with a more gradual, graded response - see here.
‡ narrow band sources cannot satisfy the LIM conditions because at most visible wavelengths the emittance is zero, and however much you multiply a zero, you won't get a spectral contribution to match the XYZ spectra. But this doesn't matter, because LIM applies to capture and not reproduction. For reflective (print) reproduction it is important to have pigments which satisfy LIM, since we can't control the illumination, which is why modern colour printers end up depositing so many different inks.
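A quick numeric check of the ‡ point, with smooth toy curves standing in for the XYZ matching functions: no constant 3x3 combination of three narrowband sensitivities gets anywhere near them, so the best least-squares fit leaves a large residual.

import numpy as np

wl = np.arange(400, 701, 1, dtype=float)
gauss = lambda mu, sig: np.exp(-0.5 * ((wl - mu) / sig) ** 2)

# Smooth stand-ins for the XYZ-style matching functions (toy curves only).
cmf = np.stack([gauss(595, 60), gauss(555, 50), gauss(450, 30)])

# Three narrowband 'sensitivities', zero at most wavelengths.
narrow = np.stack([gauss(450, 2), gauss(530, 2), gauss(620, 2)])

# Best possible constant 3x3 mix of the narrow curves, by least squares.
M, *_ = np.linalg.lstsq(narrow.T, cmf.T, rcond=None)
residual = np.linalg.norm(M.T @ narrow - cmf) / np.linalg.norm(cmf)
print(residual)   # close to 1: the smooth curves simply can't be reached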
Equivalent, and I was trying to provide an explanation for non-mathematicians. I thought that I might lose them at 'there exists no nontrivial linear combination of the vectors that equals the zero vector'.
No harm in clarifying for people who missed the presumption.
The emphasis that I would put on it is different from 'it can still create nice images'. It creates nice images precisely because it's like that. Monochromatic sources are optimal for an emissive display. They control the stimulation of the required sets of cones as precisely as is possible, given the overlap of the bands.
One of the great advantages of mathematical parlance is that it can convey precise concepts concisely. When trying to communicate with non-mathematicians a lot of words might be necessary if they are to understand what you're saying.
Yes, but again that leads to a discussion that possibly most non-mathematicians would not follow.
Again, monochrome is spiky, but it's not the only kind of spiky.
Design is always a technological compromise. Clearly the design intent was a single spike in each channel. Frankly, knowing how this technology works, and how old the paper was, I'm very surprised that it's that good.
Anyhow, I'm getting the impression that you're seeing this more as a mutual urination contest between you and me than as an attempt to clarify this whole matter for people who perhaps don't have the mathematical background that you do.
Actually, I do know what a linear transformation is, what 'linearly independent' means, and so on; I just get a mental block every time I read about infinite-dimensional spaces :) Or super unitary groups and similarly abstract algebraic concepts.
I'll attempt to explain in a few words what I gathered from this discussion (assuming a linear model for all operations).
(1) The image (including colour information) is recorded by the sensor in three channels, whose sensitivities are designed to [approximately] satisfy the LI condition (being linear combinations of the cone responses); this process is subject to metamerism errors (or 'metamerism' according to JACS terminology).
(2) The recorded data is converted into some RGB space using linear transformations.
(3) The RGB data is used to produce the output (screen or print), which (within the device gamut) generates a similar eye response to the one the original scene produces.
In the non-linear case the conversions in (2) are a bit more difficult, but the general idea is the same.
'Colour science differences' are mostly (a) different metamerism errors in different cameras/sensors in step (1), and (b) the manufacturers' preferred colour-tweaking models in step (2).
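For what it's worth, here is a compact sketch of steps (1)-(3), assuming everything is linear and folding the intermediate RGB working space into the XYZ step; both matrices are placeholders for illustration, not real device profiles.

import numpy as np

CAM_TO_XYZ = np.array([[0.65, 0.28, 0.07],    # step (1)->(2): camera RGB to XYZ
                       [0.27, 0.69, 0.04],     # (a made-up matrix; metamerism
                       [0.02, 0.12, 0.86]])    #  errors live in how well it can work)
XYZ_TO_DISPLAY = np.array([[ 3.2406, -1.5372, -0.4986],   # step (3): XYZ to
                           [-0.9689,  1.8758,  0.0415],    # linear sRGB output
                           [ 0.0557, -0.2040,  1.0570]])

def reproduce(camera_rgb):
    # Camera RGB -> XYZ -> display RGB; clipping stands in for the gamut limit.
    xyz = CAM_TO_XYZ @ np.asarray(camera_rgb, dtype=float)
    display = XYZ_TO_DISPLAY @ xyz
    return np.clip(display, 0.0, 1.0)

print(reproduce([0.5, 0.4, 0.1]))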
(I recall reading an article by a Nokia designer, who described tweaking Nokia's colours for months to get a really pleasant image.)
PS. About Foveon - to a first approximation the image data is processed using linear transformations; you can see this in the dcraw code, for example. (I've coded Foveon software for the older cameras, after all :))