Mastodawn

Terence Eden 1d ago

I think I have a genuine need for an #LLM. Can someone tell me if this is possible?

@openbenches contains ~40k text inscriptions.

Someone wants to know how many are dedicated to men, how many to women.

"To Grandma Sylvia" is obvious.
"To R Smith" is not.

Could an AI give a rough estimate of the gender of a subject?

Could it ignore text relating to who the inscription is from? "To Granny from Dave and Alice".

What would be the most accurate / cheapest / fastest / easiest tool to work with?

Jade 1d ago

@Edent @openbenches You could probably use something like a BERT-based zero-shot classifier to do this, with much lower resource usage than an LLM.

Terence Eden 1d ago

@JadedBlueEyes Hit me up with a link / tutorial?

Jade

@Edent This seems like a reasonably good example of label-based classification: https://huggingface.co/blog/Ihor/refreshing-zero-shot-classification

Here's another example: https://jaketae.github.io/study/zero-shot-classification/

There are a variety of models; BERT (and derivative) models are usually the easiest to play with. Keywords you probably want to look for are NLI, zero-shot or one-shot, and entailment.
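
A minimal sketch of what NLI-based zero-shot classification could look like for the inscriptions, using the Hugging Face `transformers` zero-shot pipeline. The model name (`facebook/bart-large-mnli`, a commonly used NLI checkpoint rather than a BERT derivative), the labels, the hypothesis template, and the 0.75 threshold are all assumptions chosen for illustration, not something recommended in the thread; any NLI checkpoint could be swapped in. Inscriptions where the top label scores below the threshold (for example, an initial like "R Smith") are reported as "unknown" instead of being forced into a category.

```python
# Assumed for illustration: labels, template, threshold, and model name.
LABELS = ["a man", "a woman"]
TEMPLATE = "This memorial inscription is dedicated to {}."


def estimate_gender(inscriptions, classifier, threshold=0.75):
    """Return (inscription, label) pairs; label is 'unknown' below threshold.

    `classifier` is any callable that behaves like the transformers
    zero-shot pipeline: it returns a dict whose 'labels' and 'scores'
    lists are sorted from most to least likely.
    """
    results = []
    for text in inscriptions:
        out = classifier(
            text,
            candidate_labels=LABELS,
            hypothesis_template=TEMPLATE,
        )
        top_label, top_score = out["labels"][0], out["scores"][0]
        results.append((text, top_label if top_score >= threshold else "unknown"))
    return results


def load_classifier(model_name="facebook/bart-large-mnli"):
    """Build the zero-shot pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily; needs `pip install transformers torch`

    return pipeline("zero-shot-classification", model=model_name)


if __name__ == "__main__":
    clf = load_classifier()
    samples = [
        "To Grandma Sylvia",
        "To R Smith",
        "To Granny from Dave and Alice",
    ]
    for text, label in estimate_gender(samples, clf):
        print(f"{label:8} {text}")
```

The template asks who the inscription is dedicated to, which may help the model discount the "from Dave and Alice" part, but that would need checking against a hand-labelled sample before trusting counts over the full ~40k inscriptions.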

Refreshing zero-shot classification with ModernBERT

A Blog post by Stepanov on Hugging Face