Well, if we were dealing with complete beginners and we had photon-counting cameras with only two exposure/brightness-related controls, exposure time and aperture size, we might not even need to mention "equivalence". Equivalence as a concept is a way of correcting myths that thrive in a world where people conflate noise with ISO and/or sensor size, angle of view with focal length, and f-number with pupil size.
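To make the f-number vs. pupil size distinction concrete, here is a minimal sketch of the usual equivalence arithmetic. It assumes the common convention of multiplying both focal length and f-number by the crop factor; the function names and the 25mm f/1.4 example on a 2x-crop sensor are only illustrative choices of mine, not anything from a particular camera or tool.

```python
def pupil_diameter_mm(focal_length_mm, f_number):
    """Entrance pupil diameter: focal length divided by f-number."""
    return focal_length_mm / f_number


def equivalent_settings(focal_length_mm, f_number, crop_factor):
    """FF-equivalent focal length and f-number for a lens on a cropped sensor.

    Assumption: both values scale by the crop factor, which keeps the
    angle of view and the pupil diameter the same.
    """
    return focal_length_mm * crop_factor, f_number * crop_factor


# Illustrative example: 25mm f/1.4 on a 2x-crop sensor.
small_focal, small_f_number = 25.0, 1.4
ff_focal, ff_f_number = equivalent_settings(small_focal, small_f_number, crop_factor=2.0)

print("FF equivalent:", ff_focal, "mm at f/", ff_f_number)
print("pupil on small sensor:", pupil_diameter_mm(small_focal, small_f_number), "mm")
print("pupil on FF equivalent:", pupil_diameter_mm(ff_focal, ff_f_number), "mm")
```

The two pupil diameters come out the same even though the f-numbers differ, which is the point: the f-number alone tells you nothing about DOF, diffraction, or total light until you know the sensor size.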
One answer would be that the FF camera has more potential to reach a range of visible photographic qualities that come from having a larger pupil: less diffraction, shallower DOF, and less photon noise. If you don't go into that range, which requires lenses that are generally more expensive, heavier and larger, then the size of the FF sensor no longer gives you anything extra. If you have a high-MP FF, though, you may have more total pixels than are available with smaller sensors, and your base-ISO shooting when light is abundant will capture more light (regardless of pixel count). Of course, the lower diffraction potential of FF is often moot, because the fastest lenses for FF are generally aberration-heavy wide open, which makes them soft when you go for that shallower DOF and extra light.
Most of the situations in which a smaller sensor gives better final IQ (through its higher pixel density) are ones where you aren't going to use the entire FF sensor area anyway. You simply need to extend the idea of equivalence to "sensor area actually used for the final composition" rather than "sensor size".
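As a rough sketch of what "sensor area actually used" means in numbers, the snippet below treats a crop as a smaller sensor. It assumes the crop keeps the same aspect ratio, that exposure (f-number and shutter time) is unchanged, and that total captured light scales with the area used; the function names are hypothetical.

```python
import math


def effective_crop_factor(sensor_crop_factor, area_fraction_used):
    """Crop factor of the part of the sensor that ends up in the final image.

    Assumption: the crop keeps the sensor's aspect ratio, so linear size
    scales with the square root of the area fraction.
    """
    if not 0 < area_fraction_used <= 1:
        raise ValueError("area_fraction_used must be in (0, 1]")
    return sensor_crop_factor / math.sqrt(area_fraction_used)


def light_loss_stops(area_fraction_used):
    """Stops of total light given up by cropping, at the same exposure."""
    return -math.log2(area_fraction_used)


# Illustrative example: an FF frame (crop factor 1.0) cropped to a quarter of its area.
print(effective_crop_factor(1.0, 0.25))  # behaves like a 2x-crop sensor
print(light_loss_stops(0.25))            # two stops less total light
```

Under those assumptions, an FF shot cropped to a quarter of its area is working with the same light as an uncropped 2x-crop sensor at the same exposure, and the smaller sensor will usually put more pixels on the subject.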