Someone finally tested AI on real freelance jobs. Not benchmarks. Real paid work from Upwork.
240 projects. Graphic design, video, architecture, game dev. The best AI model completed 2.5% of them. Gemini managed 0.8%.
AI delivered an 8-second video when asked for 8 minutes. A house that changed shape depending on the viewing angle. Corrupted files.
The benchmarks were never the real world.
https://arxiv.org/abs/2510.26787
#AI #FutureOfWork #RemoteWork #Benchmarks #Freelancing #Automation
