The embedded JPEG in the raw file would have to be low quality or reduced resolution, whatever the other factors. With Canon raws, the embedded JPEGs are full-res and high quality, so a stand-alone JPEG will never be larger than a raw file from the same exposure. If other manufacturers use low-res, lower-quality embedded JPEGs, then it could be possible, depending on raw bit depth, exposure levels, and compression.
BTW, lossless compression of raw data in cameras is designed for speed, not for size. In any raw where only 5 or 6 bits are actually used in most of the photo, third-party compression can make lossless raws much smaller than OOC raws. Convert to uncompressed DNG and apply max compression in something like 7-Zip, and the compressor will see right through the CFA mosaic and find that the individual color channels are more compressible than the full mosaic.
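To make the channel-versus-mosaic point concrete, here is a minimal sketch that compresses a synthetic CFA mosaic two ways: interleaved as-is, and split into its four same-color planes. It is only an illustration under loud assumptions: the explicit split is done by hand here and is not a description of what 7-Zip does internally, `lzma` from the Python standard library stands in for "max compression", and the mosaic, the `gains` values, and all function names are hypothetical. Real raws may behave differently, so the two sizes are printed rather than predicted.

```python
import lzma
import numpy as np


def split_cfa_planes(mosaic):
    """Split a 2x2-pattern CFA mosaic into its four same-color planes."""
    return [mosaic[r::2, c::2] for r in (0, 1) for c in (0, 1)]


def merge_cfa_planes(planes, shape):
    """Inverse of split_cfa_planes: rebuild the mosaic exactly (lossless)."""
    mosaic = np.empty(shape, dtype=planes[0].dtype)
    for plane, (r, c) in zip(planes, [(0, 0), (0, 1), (1, 0), (1, 1)]):
        mosaic[r::2, c::2] = plane
    return mosaic


def compressed_size(arrays):
    """Total LZMA-compressed size in bytes of a list of arrays."""
    return sum(
        len(lzma.compress(np.ascontiguousarray(a).tobytes(), preset=9))
        for a in arrays
    )


# Synthetic stand-in for a dark raw that only uses about 6 bits (values 0..63).
rng = np.random.default_rng(0)
h, w = 256, 256
ramp = np.linspace(8, 40, w)[None, :] * np.ones((h, 1))
gains = np.array([[0.6, 1.0], [1.0, 0.45]])  # hypothetical per-color responses
mosaic = (
    (ramp * np.tile(gains, (h // 2, w // 2)) + rng.normal(0, 1.0, (h, w)))
    .round()
    .clip(0, 63)
    .astype(np.uint16)
)

planes = split_cfa_planes(mosaic)
print("interleaved mosaic:", compressed_size([mosaic]), "bytes")
print("four color planes: ", compressed_size(planes), "bytes")
```

Because `merge_cfa_planes` restores the mosaic bit for bit, the split costs nothing in terms of losslessness; whether it pays off in size depends on the image and the compressor.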
I have solutions. Noise should be measured as a spectrum, and if you're only going to report noise at one frequency, make it a lower frequency than that of the original pixels. Taking the standard deviation of raw images at the pixel level lets cooked raws seem less noisy than they actually are. Cooked raws use filters with very small radii, usually just softening the contrast between neighboring pixels, which has almost no effect on the same pixels binned to, say, 4x4. A 4x4 bin is actually a more accurate gauge of image-level noise than the original pixels are. Somehow, pixel-level standard deviation has become a sacred cow in the world of noise metrics, when in fact that frequency is the one most likely to be corrupted by any kind of filtering or resampling, or by line-based offset noise (fine horizontal banding noise is still present in modern sensors to various degrees).
Every website that measures noise still has the original raw files and could add "binned" data, which is actually more closely related to the visual experience of noise than pixel-level noise is, especially at higher pixel counts. No new tests would be needed; just a script that re-interprets the original files at a lower frequency that is unaffected by raw cooking.
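A script along those lines could be quite small. The sketch below is one possible version, not any site's actual method: it reports the standard deviation of a flat patch at the pixel level and again after 4x4 binning. The names `bin_mean`, `noise_report`, and `soften` are made up for illustration, the patch is assumed to be a single color plane from a uniformly lit area, and `soften` is only a stand-in for raw cooking (a small [1, 2, 1]/4 neighbor blur), since the filters cameras actually apply are not known here.

```python
import numpy as np


def bin_mean(patch, n):
    """Average non-overlapping n x n blocks; rows/columns that do not
    fill a whole block are dropped."""
    h = (patch.shape[0] // n) * n
    w = (patch.shape[1] // n) * n
    blocks = patch[:h, :w].reshape(h // n, n, w // n, n)
    return blocks.mean(axis=(1, 3))


def noise_report(patch, n=4):
    """Standard deviation at the pixel level and after n x n binning."""
    patch = np.asarray(patch, dtype=np.float64)
    return {
        "pixel_std": float(patch.std()),
        "binned_std": float(bin_mean(patch, n).std()),
    }


def soften(patch):
    """Hypothetical stand-in for raw cooking: separable [1, 2, 1]/4 blur
    between neighboring pixels (wraps around at the edges)."""
    out = np.asarray(patch, dtype=np.float64)
    for axis in (0, 1):
        out = 0.25 * np.roll(out, 1, axis) + 0.5 * out + 0.25 * np.roll(out, -1, axis)
    return out


# Synthetic flat patch: constant level plus white noise (sigma = 10).
rng = np.random.default_rng(1)
patch = rng.normal(512.0, 10.0, (512, 512))

clean = noise_report(patch)
cooked = noise_report(soften(patch))

print("pixel std,  cooked / clean:", cooked["pixel_std"] / clean["pixel_std"])
print("binned std, cooked / clean:", cooked["binned_std"] / clean["binned_std"])
```

On this synthetic patch the small-radius blur cuts the pixel-level figure far more than the binned one, which is the gap a pixel-only noise metric hides. For real files, the patch would come from the stored raws, one color plane at a time.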