fixed point is taught wrong. instead of saying "to multiply two fixed point values you have to multiply them and then shift the result back into position" it should really be "when you multiply two fixed point values, the result's fractional precision in bits equals the sum of the fractional precisions of the two operands"
this not only makes it easy to see *why* you have to shift (the fractional precision increased, so if you want a lower precision you have to shift back down) but it also makes it clear how multiplication between two fixed point types with different fractional precisions works

if you multiply a fixed point value with 2 bits of fractional precision by one with 4 bits, you wind up with a result that has 6 bits of fraction. if you want a result with 3 bits of fraction, you shift right by 6-3=3 bits

this is easily understood, but only if the underlying concept is properly explained from the start

this also makes it easier to see that when you're transforming one fixed point type into another via multiplication (like position * position = edgeweight in a rasterizer) you may not want to shift at all, so the resulting fixed point type represents the transformation exactly, with no rounding
@eniko I think I've often seen fixed point multiplication described as X:Y * X:Y = X:Y, i.e. same precision in and out. Increasing the precision of the result means actually changing the type (which is fine, like you say) but it is another operation. Maybe that is where the confusion comes from in the places that describe it. I like your view of it of course, and it builds intuition.
@eniko Another interesting topic: if you multiply a 32-bit value by a 32-bit value you get a 64-bit result. But older machines didn't have 64-bit arithmetic, so you sometimes had to shift before doing the multiplication. And you don't have to shift both factors by the same amount.