I know quote tweets aren't a thing here but I feel strongly about this, so here goes!

It makes me sad to see the below take pop up so often. Because they're trained directly on data and adopt the perspective of those data, the epistemology of ML/AI/LLMs is, IMO, perfectly aligned with the situated knowledges perspective. Feminists, this is actually *our* moment to shine! We can do so much with these methods! We can absolutely use these methods in a way that aligns with a #feminist epistemology.

@LauraNelson hey, so when i was first getting into nlp, language models, as used around my research group, modelled snapshots of language use in a particular setting. the interpretation read into them was that they "captured linguistic conventions in a community" where these conventions were contingent on the community and subject to change across time; an unfulfilled dream was to figure out ways of measuring how exactly individual language-users could cause these conventions to evolve.

@LauraNelson e.g., here's my phd advisor's paper on linguistic change in online communities -- really, beer-review forums. https://www.cs.cornell.edu/~cristian/Linguistic_change_files/linguistic_change_lifecycle.pdf

here, language models are bespoke and bigram-level, used to model month-by-month patterns of language used by this group of people who reviewed craft beers on the internet. baked into the research question is the intuition that these patterns change over time.
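for concreteness, here's a toy sketch of what a "snapshot" bigram model might look like — fit on one month of reviews, then used to score how surprising a new review is under that month's conventions. everything here (the add-alpha smoothing, the example reviews, the tiny vocab) is made up for illustration and simplified; it's not the paper's actual setup.

```python
import math
from collections import Counter

def bigram_lm(tokens, vocab, alpha=0.1):
    """Fit an add-alpha-smoothed bigram model P(w2 | w1) on one snapshot of text."""
    pairs = Counter(zip(tokens, tokens[1:]))
    starts = Counter(tokens[:-1])
    V = len(vocab)

    def prob(w1, w2):
        return (pairs[(w1, w2)] + alpha) / (starts[w1] + alpha * V)

    return prob

def cross_entropy(tokens, prob):
    """Mean surprisal (bits per bigram) of a text under a snapshot model."""
    bigrams = list(zip(tokens, tokens[1:]))
    return -sum(math.log2(prob(a, b)) for a, b in bigrams) / len(bigrams)

# pretend this is every review posted in one month
month = "hoppy aroma with a malty finish great hoppy taste".split()
vocab = set(month) | {"citrus", "notes"}
lm = bigram_lm(month, vocab)

# a review that matches the month's conventions is less surprising
# (lower cross-entropy) than one that doesn't
typical = "hoppy aroma with a malty finish".split()
atypical = "citrus notes citrus notes citrus".split()
print(cross_entropy(typical, lm) < cross_entropy(atypical, lm))
```

the point being: "adherence to community norms" gets operationalized as a number — how well this month's model predicts your words — which is what makes the month-by-month comparisons possible.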

@LauraNelson the cool finding is that as you write more beer reviews, you might start to settle into your habitual ways of writing these reviews, even as the community moves on without you.

(side note, i feel maybe a variant or reversal of this about my current mastodon instance? anyway, not important.)

@LauraNelson so, a few thoughts here. first, at least in the data-science-y space, measurements of "adherence to a community norm" have a tendency to get baked into ways of sorting and ranking people. i don't love the question of whether an individual "uses language in a typical way" _even if_ you add lots of qualifications to the word "typical". "you use language weirdly, therefore you must be thinking of leaving this community, let's somehow fix that."

@LauraNelson but maybe there's something here? i.e., in the vein of, "here we map the gradual ossification of bureaucratic-speak in this organization, showing the development of linguistic devices that help it shirk responsibility for the various crises it's implicated in."

@LauraNelson second, the "LMs have knowledge" trend is totally bizarre to me, given the "community norm" interpretation that i was socialized into. maybe that's why people make distinctions between NLP and comp-soc-sci/text-as-data.

(i mentioned both interpretations to an anthro professor recently, who seemed a bit weirded out by the idea of writing down a unified interpretation of "what it means to find patterns in text", so...that too.)

@LauraNelson third, LMs -- even small-ish ones -- rely on having sufficient data. there are only so many ways you can slice a corpus by time and space until you fall through the statistical ice. i think that leads to lots of the "future qualitative work could..." paragraphs in papers.
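a back-of-envelope version of the "statistical ice" point: a bigram model over a vocabulary of V words has on the order of V^2 cells to estimate, and every extra way you slice the corpus (by month, by subforum, ...) divides the tokens available per slice. all the numbers below are invented for illustration.

```python
V = 5_000                  # hypothetical vocabulary size
params = V ** 2            # bigram cells to estimate per snapshot model

corpus_tokens = 10_000_000  # hypothetical total corpus size
for n_slices in (1, 12, 120, 1_200):  # whole corpus, months, month x forum, ...
    tokens_per_slice = corpus_tokens // n_slices
    print(f"{n_slices:>5} slices: {tokens_per_slice:>10,} tokens "
          f"for {params:,} bigram cells "
          f"({tokens_per_slice / params:.5f} tokens/cell)")
```

even before any slicing, there's less than one token per cell here; a hundredfold slicing makes most estimates pure smoothing. hence all the "future qualitative work could..." paragraphs.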

@LauraNelson i guess that's where the idea of fine-tuning comes in, but i'm not convinced -- if you think that approach leads to better understanding of a smaller-data setting (because the soup of LLM training data gets you at least partway there), aren't you sort of inheriting the view-from-nowhere-esque assumption that you're trying to problematize?

@tisjune Ahh these thoughts are all fascinating! First, "LMs have knowledge" should maybe instead be "LMs contain knowledge" or "LMs encode information that could be transformed into knowledge" (that last one is very clunky tho). And also, really large LMs might be usable to distinguish between perspectives contained within them (provided there's enough data for a particular perspective). I'm sure I don't know the tech well enough, but maybe that could be a direction to take them?