Flying Squid Jan 23, 2025

AI struggles to understand human history and fails miserably when tested

AI struggles to understand human history and fails miserably when tested - Lemmy.World

> LLMs performed best on questions related to legal systems and social complexity, but they struggled significantly with topics such as discrimination and social mobility.
>
> “The main takeaway from this study is that LLMs, while impressive, still lack the depth of understanding required for advanced history,” said del Rio-Chanona. “They’re great for basic facts, but when it comes to more nuanced, PhD-level historical inquiry, they’re not yet up to the task.”
>
> Among the tested models, GPT-4 Turbo ranked highest with 46% accuracy, while Llama-3.1-8B scored the lowest at 33.6%.