You've heard of "AI red teaming" frontier LLMs, but what is it? Does it work? Who benefits?
Questions for our #CSCW2024 workshop! The team includes OpenAI's red team lead, Lama Ahmad, and Microsoft's red team lead, Ram Shankar Siva Kumar.
Cite paper: http://arxiv.org/abs/2407.07786
Apply to join: http://bit.ly/airedteam


The Human Factor in AI Red Teaming: Perspectives from Social and Collaborative Computing
Rapid progress in general-purpose AI has sparked significant interest in "red teaming," a practice of adversarial testing originating in military and cybersecurity applications. AI red teaming raises many questions about the human factor, such as how red teamers are selected, biases and blind spots in how tests are conducted, and harmful content's psychological effects on red teamers. A growing body of HCI and CSCW literature examines related practices, including data labeling, content moderation, and algorithmic auditing. However, few, if any, have investigated red teaming itself. This workshop seeks to consider the conceptual and empirical challenges associated with this practice, often rendered opaque by non-disclosure agreements. Future studies may explore topics ranging from fairness to mental health and other areas of potential harm. We aim to facilitate a community of researchers and practitioners who can begin to meet these challenges with creativity, innovation, and thoughtful reflection.
Our new preprint shows the first detailed public opinion data on digital sentience:
76% agree torturing sentient AIs is wrong;
69% support a ban on sentient AI;
63% support a ban on AGI; and
a median forecast of 5 years to sentient AI and only 2 to AGI! https://arxiv.org/abs/2407.08867

Perceptions of Sentient AI and Other Digital Minds: Evidence from the AI, Morality, and Sentience (AIMS) Survey
Humans now interact with a variety of digital minds: AI systems that appear to have mental faculties such as reasoning, emotion, and agency. Public figures are discussing the possibility of sentient AI. We present initial results from 2021 and 2023 for the nationally representative AI, Morality, and Sentience (AIMS) survey (N = 3,500). Mind perception and moral concern for AI welfare were surprisingly high and significantly increased: in 2023, one in five U.S. adults believed some AI systems are currently sentient, and 38% supported legal rights for sentient AI. People became more opposed to building digital minds: in 2023, 63% supported banning smarter-than-human AI, and 69% supported banning sentient AI. The median 2023 forecast was that sentient AI would arrive in just five years. The development of safe and beneficial AI requires not just technical study but understanding the complex ways in which humans perceive and coexist with digital minds.
The key point is: a lot of people are just too optimistic about AI ethics and safety right now. However, there is a ton of surface area for more contextualized, adaptive approaches! You can read our HEAL
#CHI2024 paper on ArXiv:
https://arxiv.org/abs/2406.03198 We hope you find it useful!

The Impossibility of Fair LLMs
The rise of general-purpose artificial intelligence (AI) systems, particularly large language models (LLMs), has raised pressing moral questions about how to reduce bias and ensure fairness at scale. Researchers have documented a sort of "bias" in the significant correlations between demographics (e.g., race, gender) in LLM prompts and responses, but it remains unclear how LLM fairness could be evaluated with more rigorous definitions, such as group fairness or fair representations. We analyze a variety of technical fairness frameworks and find inherent challenges in each that make the development of a fair LLM intractable. We show that each framework either does not logically extend to the general-purpose AI context or is infeasible in practice, primarily due to the large amounts of unstructured training data and the many potential combinations of human populations, use cases, and sensitive attributes. These inherent challenges would persist for general-purpose AI, including LLMs, even if empirical challenges, such as limited participatory input and limited measurement methods, were overcome. Nonetheless, fairness will remain an important type of model evaluation, and there are still promising research directions, particularly the development of standards for the responsibility of LLM developers, context-specific evaluations, and methods of iterative, participatory, and AI-assisted evaluation that could scale fairness across the diverse contexts of modern human-AI interaction.
Moreover, AI-assisted alignment may be the only path to long-term success. We conclude our big-picture discussion with implications for specific LLM practices: curating training data, instruction tuning, prompt engineering, personalization, and interpretability. (Section 5.2)
So are we morally doomed? Not quite! Our preprint dashes hopes for a silver bullet of AI ethics or safety, but the case for incremental fairness remains strong! We argue 3 principles: focus on context, hold LLM developers responsible, and iterate with stakeholders. (Section 5.1)
But, you reply, at least we can enforce fairness in individual cases (e.g., sanitized datasets for each task) and combine those models into a general-purpose AI system! Unfortunately, as Dwork and Ilvento (2019) showed quite explicitly, fairness does not compose. (Section 4.3)
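A tiny numerical illustration of that non-composition result (toy data I made up, not from Dwork and Ilvento's paper): two screening classifiers that each satisfy group parity on their own, whose AND-composition does not:

```python
def rate(bits):
    """Selection rate: fraction of candidates accepted."""
    return sum(bits) / len(bits)

# Hypothetical accept/reject decisions for four candidates in each of
# two groups, A and B. Each classifier alone selects 50% of each group.
c1 = {"A": [1, 1, 0, 0], "B": [1, 1, 0, 0]}
c2 = {"A": [1, 1, 0, 0], "B": [0, 0, 1, 1]}

# Compose the "fair" parts: a candidate passes only if BOTH accept.
both = {g: [x & y for x, y in zip(c1[g], c2[g])] for g in c1}

print({g: rate(c1[g]) for g in c1})     # {'A': 0.5, 'B': 0.5}
print({g: rate(c2[g]) for g in c2})     # {'A': 0.5, 'B': 0.5}
print({g: rate(both[g]) for g in both}) # {'A': 0.5, 'B': 0.0} -- parity lost
```

The trick is correlation: the two classifiers agree on group A and disagree on group B, so composing two individually parity-satisfying stages wipes out group B entirely.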
Worse, every LLM has a multitude of sensitive attributes at play. There are no robust techniques to excise even one from a dataset—much less all of them—and "unbiasing" for some tasks would remove essential information for other tasks like medical prediction. (Section 4.2)
What about "group fairness" (e.g., group parity: hiring decisions are uncorrelated with race, gender, disability, etc.)? No luck. Again, with general-purpose AI, fairness cannot be guaranteed across populations, and LLMs have no explicit target population: city, industry, etc. (Section 4.1)
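For contrast, here is a minimal sketch (hypothetical data and function names) of what a group-parity check looks like for a single, well-scoped decision like hiring, which is exactly the scoping an LLM lacks:

```python
from collections import defaultdict

def selection_rates(decisions, groups):
    """Selection rate (fraction hired) per demographic group."""
    hired, total = defaultdict(int), defaultdict(int)
    for d, g in zip(decisions, groups):
        total[g] += 1
        hired[g] += d
    return {g: hired[g] / total[g] for g in total}

def parity_gap(decisions, groups):
    """Max difference in selection rates across groups; 0 means exact parity."""
    rates = selection_rates(decisions, groups).values()
    return max(rates) - min(rates)

# Hypothetical hiring decisions (1 = hire) and group labels.
decisions = [1, 0, 1, 1, 0, 0, 1, 0]
groups    = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(parity_gap(decisions, groups))  # 0.75 - 0.25 = 0.5
```

The check is trivial once you have a fixed decision, a fixed population, and labeled groups; an LLM serving open-ended prompts to an unspecified population has none of the three.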
Recommendation systems scholars define fairness as equity between stakeholders, such as content creators. But if OpenAI/Google could consume the internet and serve it up with an LLM instead of redirecting to third parties, producers may never get their fair share! (Section 3.2)
With ML models like those used to sentence criminals or screen job applicants, you might impose a constraint like "fairness through unawareness" (e.g., your model doesn't take race/gender as input), but not with LLMs or any general-purpose model built on unstructured data. (Section 3.1)
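For a tabular model, "fairness through unawareness" can be as simple as dropping sensitive columns before training. A toy sketch (hypothetical field names) that also shows why it has no analogue for LLMs: free text has no column to drop.

```python
# Sensitive attributes we want the downstream model never to see.
SENSITIVE = {"race", "gender"}

def drop_sensitive(record: dict) -> dict:
    """Remove sensitive columns from one tabular training record."""
    return {k: v for k, v in record.items() if k not in SENSITIVE}

applicant = {"years_exp": 7, "degree": "BSc", "race": "X", "gender": "Y"}
print(drop_sensitive(applicant))  # {'years_exp': 7, 'degree': 'BSc'}

# An LLM training example is unstructured text: the same attributes are
# woven through the prose, and there is no key to filter on.
web_text = "She spent 7 years as an engineer after her BSc..."
```

Even for tabular data this is a weak guarantee (proxies like zip code leak the dropped attributes); for unstructured training text, there isn't even a column boundary to enforce it at.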