I know quote tweets aren't a thing here but I feel strongly about this, so here goes!

It makes me sad to see the below take pop up so often. Because they're trained directly on data and adopt the perspective of those data, the epistemology of ML/AI/LLMs is, IMO, perfectly aligned with the situated knowledges perspective. Feminists, this is actually *our* moment to shine! We can do so much with these methods! We can absolutely use these methods in a way that aligns with a #feminist epistemology.

@LauraNelson I don't think LLMs are able to do this, though, because they require so much data to train from scratch? Maybe there's a way to fine-tune towards the situated knowledges perspective, though.

The big question is whether pre-training also embeds biases, which does seem to be the case.

@alex Of course the pre-training has embedded biases, because it's trained on social data and society has biases. But we can leverage that fact, if we acknowledge that it's part of the LLM package. One problem is that those working on these models often come at it from the view-from-nowhere perspective. What if we started from the situated knowledge perspective? If we did that we would be approaching LLMs very differently, with, in my view, powerful potential.
@LauraNelson I think that's fair. I'm curious how this would work in practice, though?
@alex Yeah I think that's the exciting part! How would that look in practice? I have lots of ideas. The first is to be more deliberate about what each LLM is trained on and to carefully define/describe exactly what perspective it captures. And also to be more specialized. Not one LLM to rule them all, but more targeted LLMs that are, again, more deliberate and calibrated. We can still go big, but also be more precise.
@LauraNelson @TedUnderwood Yeah, I mean this is something @emilymbender has been explicit about -- we just don't know enough about what this stuff has been trained on to ascribe a particular perspective to them, so we basically have to assume they have this hegemonic view-from-nowhere perspective.

@LauraNelson @TedUnderwood @emilymbender If we just try to point to the Common Crawl or The Pile, the only way people have characterized this is through the lens of bias. While that gives us some information, I don't think it's enough to call it a "viewpoint."

Also, cf. part I of our paper https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4217148

@alex @LauraNelson @TedUnderwood My take on the OP isn't that LLMs can legitimately be said to have a "view from nowhere", but rather that the people who think they have knowledge at all* are the types who think that "view from nowhere" is possible and that surely massive scale would deliver it.

*as opposed to information about word form distribution in some specific dataset, which is what they have.
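To make that footnote concrete, here is a toy sketch (my own illustration, not from the thread; the tiny corpus is invented): a bigram language model literally *is* a table of word-form co-occurrence counts from one specific dataset, with no perspective attached beyond what that dataset contains.

```python
from collections import Counter, defaultdict

# Toy corpus: the "specific dataset" whose word-form distribution the model captures.
corpus = "the cat sat on the mat the cat saw the dog".split()

# A bigram model is nothing but counts of adjacent word-form pairs.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word_probs(word):
    """What the model 'knows': which word forms followed `word`, and how often."""
    counts = bigrams[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))  # {'cat': 0.5, 'mat': 0.25, 'dog': 0.25}
```

Everything the model "has" here is traceable to the corpus, which is the point: characterizing the training data is characterizing the model.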

@emilymbender @TedUnderwood @LauraNelson @alex As the OP, that is what I meant, yes.

@alex @LauraNelson @TedUnderwood @emilymbender And to be clear, in the sequel I talked about how I think viewing LLMs as having knowledge is just going to exacerbate the cultural and epistemological hegemony problems we already have with large-scale information management.

It turns the problem of the Texas school board deciding what goes in high school textbooks across the country into a more global and more insidious issue, basically.

@left_adjoint @alex @TedUnderwood @emilymbender I see what you're saying here (and I appreciate your perspective!). But I do think we can view LLMs as having knowledge (I don't think we need to broach the topic of whether they understand), but it's situated knowledge. And that's exciting. But y'all are right here: we need more information about data provenance first to know what view an LLM captures.