This is super cool. It does feel a little limited, though: you want to be able to test reasoning, not only answers, and I'm not sure how to build a task description that requires the LLM to give the right answer for the right reasons. Still, it's a step in the right direction, for sure.

#FutureLaw #LegalTech #LawFedi

https://github.com/HazyResearch/legalbench

GitHub - HazyResearch/legalbench: An open science effort to benchmark legal reasoning in foundation models
