Interested in #LLMs for scalable zero-shot text & image annotation & analysis? The Baltic #DH summer school published recordings, mine here: https://www.youtube.com/watch?v=Fm7mJgI0MfU
- Intro to LLMs
- The quantitizing analytics framework
- Assessing error rates
- Using OpenAI APIs
- Running your own open LLM in free Colab
BSSDH 2024 | Rapidly annotating and analyzing textual and visual data with LLMs | Andres Karjus

@andreskarjus
Thank you for sharing! I would mainly be interested in training an LLM with as low energy/environmental impact as possible. Any takes on that?
@elmerot @andreskarjus Training is the most resource-hungry step, so rather than training a new LLM, using one that is already trained seems much preferable.
@locha
Do you know if OpenAI gets hold of the data you enter?
@andreskarjus
@elmerot @locha Well, by definition, if you use a cloud service, you are sending your data somewhere. Whether they use it for their own purposes is another question. The current OpenAI ToS says they do not.
@andreskarjus @elmerot If it's a matter of resource use, you could consider hosting the model locally (e.g. Llama 3). But it's not for everyone or every kind of use.
@elmerot Use an already pretrained model as a base and just fine-tune it for your use case, or train a LoRA adapter - depending on the use case, this takes very little in the way of resources (to the point that you might be able to do it on free Google Colab GPU credits). Training a competitive model from scratch is prohibitively expensive and usually wasteful.
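To illustrate why a LoRA adapter is so cheap to train compared to full fine-tuning: instead of updating a frozen weight matrix, you train a low-rank update on top of it. A minimal pure-NumPy sketch below (illustrative only, not the actual PEFT library API; the dimensions and rank are made-up example values):

```python
import numpy as np

def lora_forward(x, W, A, B, scale=1.0):
    """y = W x + scale * B (A x); W stays frozen, only A and B are trained."""
    return W @ x + scale * (B @ (A @ x))

d_in, d_out, r = 4096, 4096, 8          # rank r << min(d_in, d_out)
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection (init to zero)

full_params = d_out * d_in              # params updated in a full fine-tune
lora_params = r * (d_in + d_out)        # params updated with LoRA
print(f"full fine-tune: {full_params:,} params")  # 16,777,216
print(f"LoRA (r={r}):   {lora_params:,} params")  # 65,536 (~0.4%)

x = rng.normal(size=d_in)
y = lora_forward(x, W, A, B)
# With B initialized to zero, the adapted model starts out
# identical to the frozen pretrained model.
assert np.allclose(y, W @ x)
```

Gradients (and optimizer state) are only needed for A and B, which is why adapter training fits on a single free-tier GPU where full fine-tuning would not.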
@andreskarjus
Thank you, I've been thinking about something along those lines.
@andreskarjus I've been looking for this for a while, thanks a lot! Hopefully it will help me past a block.