RFC 6189, published in April 2011, describes using the video channel as a hardening measure for authentication: lip-syncing live video to a voice-spoofed SAS (Short Authentication String) was considered infeasible with the machine learning algorithms available at the time.

https://datatracker.ietf.org/doc/html/rfc6189#page-77