From my amateur perspective, machines/robots/AI are really great at things humans find hard to develop, and really bad at natural/innate human abilities. High-level maths and high-resolution photography are easy for machines, but “where is the bird in this picture” and “is this person sad” are incredibly difficult concepts for non-biological systems.
If we expand on “what’s hard for humans is easy for AI and vice versa”, high-level communication is probably the most recent development for the human species after agriculture and automated manufacturing. I’m not surprised communication is the first thing being mastered, but I think the gap between being able to use language and being able to properly use a body is much more significant than we expect.