@jameslo the I.04 example operates on the magnitude as well. The only difference is that the powers are filtered by the noise-mask before the root.
Yes, there is theory. In this case (amplitude measurement) it's plain signal theory. But you can think about it in less technical terms, e.g. the RMS thing is just a signal that is ring-modulated by itself. For a sine you end up with 2 frequency bands, one at double the original frequency and one at 0 Hz! The lowpass filters the high-frequency band out, and what's left is the 0 Hz signal, i.e. the DC offset, which is the mean power. Take the root of that and you have the RMS amplitude...
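If it helps to see that outside of a patch, here is a rough Python sketch of the square, lowpass, root chain. It is only an illustration, not how any particular object is implemented: the one-pole lowpass, the 5 Hz cutoff, and all names (`rms_follower`, `SAMPLE_RATE`, etc.) are my own assumptions.

```python
import math

def rms_follower(signal, sample_rate, cutoff_hz):
    """Square the input, smooth it with a one-pole lowpass, then take the root."""
    # One-pole lowpass coefficient (a common textbook approximation, assumed here).
    coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    out = []
    for x in signal:
        squared = x * x                      # signal times itself: DC plus a band at 2*f
        state += coeff * (squared - state)   # lowpass keeps the DC part (mean power)
        out.append(math.sqrt(state))         # root of mean power = RMS
    return out

SAMPLE_RATE = 44100   # hypothetical test values
FREQ_HZ = 440.0
AMPLITUDE = 0.5

# One second of a plain sine wave.
sine = [AMPLITUDE * math.sin(2.0 * math.pi * FREQ_HZ * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]

envelope = rms_follower(sine, SAMPLE_RATE, cutoff_hz=5.0)
settled_rms = envelope[-1]
expected_rms = AMPLITUDE / math.sqrt(2.0)   # RMS of a sine is A / sqrt(2)
```

After the filter has settled, `settled_rms` should sit close to `expected_rms`, with a little leftover ripple from the 2*f band. Lowering the cutoff reduces the ripple but makes the follower slower, which is exactly the compromise being discussed.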
Now there is something like the hilbert~ allpass, which, roughly speaking, magically gives you two copies of your signal whose phases are 90 degrees apart. That's very handy because if you multiply each of these signals by itself, the phases of the two high-frequency bands are 180 degrees apart, so if you add them together they cancel each other out and you get the amplitude instantly (after the root), without the need for a lowpass filter. (The magnitude calculation in the FFT examples works similarly.)
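Again just as a sketch: the Python below does not implement the hilbert~ allpass itself. It fakes the two 90-degree-apart outputs by generating a cosine and a sine directly (the `quadrature_pair` helper and the constants are hypothetical), which is enough to show the cancellation.

```python
import math

SAMPLE_RATE = 44100   # hypothetical test values
FREQ_HZ = 440.0
AMPLITUDE = 0.5

def quadrature_pair(n):
    """Stand-in for the two allpass outputs: same sine, 90 degrees apart."""
    phase = 2.0 * math.pi * FREQ_HZ * n / SAMPLE_RATE
    return AMPLITUDE * math.cos(phase), AMPLITUDE * math.sin(phase)

def instantaneous_amplitude(a, b):
    # a*a and b*b each contain a band at 2*f, but 180 degrees apart,
    # so the bands cancel in the sum and only the constant part remains.
    return math.sqrt(a * a + b * b)

amplitudes = [instantaneous_amplitude(*quadrature_pair(n)) for n in range(1000)]
max_error = max(abs(v - AMPLITUDE) for v in amplitudes)
```

Here every sample of `amplitudes` equals the sine's peak amplitude (up to floating point error), with no smoothing filter and therefore no lag. With a real allpass pair the 90 degree shift is only approximate, so expect some small ripple.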
I think there are no rules once you have a basic understanding of what you're doing in DSP; it's all about artistic decisions and compromises. Do you need instantaneous amplitude detection, i.e. fast attacks of your envelope follower in a vocoder, even if it's very CPU-expensive? Maybe it sounds cool. But maybe avg~ is just good enough.