🤖 ProText: A Benchmark Dataset for Measuring (Mis)gendering in Long-Form Texts
"We introduce ProText, a dataset for measuring gendering and misgendering in stylistically diverse long-form English texts. ProText spans three dimensions: Theme nouns (names, occupations, titles, kinship terms), Theme category (stereotypically male, stereotypically female, gender-neutral/non-gendered), and Pronoun category (masculine, femini…"
https://machinelearning.apple.com/research/protext-gender-bias-benchmark