I ported my fork of #z80ai to #ZXSpectrum. This time it really did run at 3.5MHz (the CP/M version on #ZXSpectrumNext must have been running at 28MHz). This simple convo took 4.5 minutes :)
(Optimizations are surely possible. I also pessimized it a bit by adding border colors, just so I wouldn't get bored waiting for a reply.)
Grab the source and .tap file here: https://github.com/RCL/z80ai/tree/main/examples/tinychat/build_tap

