I know quote tweets aren't a thing here but I feel strongly about this, so here goes!

It makes me sad to see the below take pop up so often. Because they're trained directly on data and adopt the perspective of those data, the epistemology of ML/AI/LLMs is, IMO, perfectly aligned with the situated knowledges perspective. Feminists, this is actually *our* moment to shine! We can do so much with these methods! We can absolutely use these methods in a way that aligns with a #feminist epistemology.

@LauraNelson I don't think LLMs are able to do this, though, because they require so much data to train from scratch? Maybe there's a way to fine-tune towards the situated knowledges perspective, though.

The big question is whether pre-training also carries embedded biases, which does seem to be the case.

@alex Of course the pre-training has embedded biases, because it's trained on social data and society has biases. But we can leverage that fact, if we acknowledge that it's part of the LLM package. One problem is those working on these models often come at it from the view-from-nowhere perspective. What if we started from the situated knowledge perspective? If we did that we would be approaching LLM very differently, with, in my view, powerful potential.
@LauraNelson I think that's fair. I'm curious how this would work in practice, though?
@alex Yeah I think that's the exciting part! How would that look in practice? I have lots of ideas. First is to be more deliberate about what it's trained on and carefully define/describe exactly what perspective each LLM captures. And also be more specialized. Not one LLM to rule them all, but more targeted LLMs that are, again, more deliberate and calibrated. We can still go big, but also be more precise.
@LauraNelson @TedUnderwood Yeah, I mean this is something @emilymbender has been explicit about -- we just don't know enough about what this stuff has been trained on to ascribe a particular perspective to them, so we basically need to assume they have this hegemonic view-from-nowhere perspective
@alex @TedUnderwood @emilymbender I guess I don't see the leap from "we don't know enough about them to know the perspective" to "we need to assume a hegemonic view from nowhere." The hegemonic view *is* a view from somewhere. And that can tell us a lot about society. Maybe we start there?
@LauraNelson @TedUnderwood @emilymbender I see what you're saying. I think one _could_ do something with that, but I don't know what it'd tell us without knowledge of what the data is. Like, I don't need to prod a model to tell me that most of the text is racist, sexist, ableist, and Western-centric. But I wish I had more information about data provenance to discuss it _as_ a viewpoint. (e.g. this is why your work is so cool, Laura)
@alex @LauraNelson @TedUnderwood @emilymbender
If people want to collaborate on a paper about this, it'd be great. I've been wanting to write about tools like the following from a humanist perspective but don’t have time to do it solo: https://github.com/jalammar/ecco. This doesn’t tell us exactly what data these models are trained on, but it can help in understanding where certain values are “situated” in them.
GitHub - jalammar/ecco: Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTA, T5, and T0).
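Ecco's visualizations build on attribution techniques such as input saliency and occlusion: perturb part of the input and watch how the model's prediction shifts. As a self-contained toy sketch of that underlying idea (not Ecco's actual API; the mini-vocabulary, random weights, and the "the doctor said" example are all invented for illustration):

```python
import numpy as np

# Toy occlusion-style attribution. Everything here is made up:
# a random embedding table and output projection stand in for a
# real language model.
rng = np.random.default_rng(0)
vocab = ["the", "nurse", "doctor", "said", "she", "he"]
V, D = len(vocab), 8
E = rng.normal(size=(V, D))   # token embeddings
W = rng.normal(size=(D, V))   # output projection

def logits(token_ids):
    # Mean-pool the input embeddings, then project to vocabulary logits.
    return E[token_ids].mean(axis=0) @ W

def occlusion_saliency(token_ids):
    # Drop each input token in turn and measure how much the top
    # next-token logit falls -- a crude view of which parts of the
    # input the prediction is "situated" in.
    base = logits(token_ids)
    top = int(np.argmax(base))
    drops = [base[top] - logits(token_ids[:i] + token_ids[i + 1:])[top]
             for i in range(len(token_ids))]
    return top, np.array(drops)

ids = [vocab.index(w) for w in ["the", "doctor", "said"]]
top, scores = occlusion_saliency(ids)
print(vocab[top], scores.round(3))
```

Ecco does the real version of this inside actual transformers (gradients, attention, neuron activations), which is what makes it promising for locating values in a trained model rather than in its undisclosed training data.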