I think I have a genuine need for an #LLM. Can someone tell me if this is possible?

@openbenches contains ~40k text inscriptions.

Someone wants to know how many are dedicated to men, how many to women.

"To Grandma Sylvia" is obvious.
"To R Smith" is not.

Could an AI give a rough estimate of the gender of a subject?

Could it ignore text relating to who the inscription is from? "To Granny from Dave and Alice".

What would be the most accurate / cheapest / fastest / easiest tool to work with?

@Edent If the inscriptions are in string form rather than image form, this should be trivial. It depends on your acceptable error rate, and on the balance you'll accept between withholding an answer when unsure and giving an erroneous one.

For 40k text inscriptions you can even use sub-1B-parameter models. Here's a test with Facebook's BART zero-shot classifier (which is ancient by today's LLM standards).

EDIT: model link https://huggingface.co/facebook/bart-large-mnli

@openbenches
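
A minimal sketch of that approach, using Hugging Face's transformers zero-shot pipeline with the facebook/bart-large-mnli model linked above. The candidate labels and example inscriptions are my own illustrative choices, not a tested prompt set:

```python
# Zero-shot gender classification of bench inscriptions with
# facebook/bart-large-mnli (requires the transformers package).
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

inscriptions = [
    "To Grandma Sylvia",
    "To R Smith",
]
# Including an "unclear" label lets the model abstain on initials.
labels = ["dedicated to a man", "dedicated to a woman", "gender unclear"]

for text in inscriptions:
    result = classifier(text, candidate_labels=labels)
    # result["labels"] is sorted by score, highest first.
    print(text, "->", result["labels"][0], round(result["scores"][0], 2))
```

You can then set a confidence threshold on the top score and bucket anything below it as "unclear" rather than guessing.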

@budududuroiu Yes, that's part of it. Could it also be used to extract a "to" and "from" classification?

@Edent Definitely. Many modern LLMs are trained (and then constrained during inference) to produce structured JSON output, so you can prompt one with something like "tell me the gender of the person the plaque is dedicated to and, if available, who dedicated it to them", and provide the JSON schema you want in return.

This webpage gives a 10,000 ft view of structured output for LLMs: https://openrouter.ai/docs/guides/features/structured-outputs

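
A hedged sketch of that idea: define a JSON schema for the fields you want back (the dedicatee, their likely gender, and who the bench is from, if stated), then attach it to an OpenAI-compatible chat request such as the ones OpenRouter accepts. The model name and field names here are assumptions for illustration; this only builds and prints the request, it doesn't send it:

```python
import json

# Schema describing the structured answer we want for each inscription.
plaque_schema = {
    "type": "object",
    "properties": {
        "dedicated_to": {"type": "string"},
        "dedicated_to_gender": {
            "type": "string",
            "enum": ["male", "female", "unknown"],
        },
        "dedicated_from": {"type": ["string", "null"]},
    },
    "required": ["dedicated_to", "dedicated_to_gender"],
}

def build_request(inscription: str) -> dict:
    """Build an OpenAI-compatible chat request enforcing the schema."""
    return {
        "model": "openai/gpt-4o-mini",  # assumption: any schema-capable model
        "messages": [{
            "role": "user",
            "content": (
                "From this bench inscription, extract who it is dedicated "
                "to, their likely gender, and (if stated) who it is from: "
                + inscription
            ),
        }],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "plaque", "schema": plaque_schema},
        },
    }

req = build_request("To Granny from Dave and Alice")
print(json.dumps(req, indent=2))
```

With schema enforcement the response is guaranteed to parse, so the "from Dave and Alice" part lands in its own field instead of polluting the gender classification.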
@budududuroiu Interesting, thanks!

@Edent Yup. Even though all 40k inscriptions could probably fit in a newer model's context window, I'd suggest calling the LLM separately for each plaque text.
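
The one-call-per-plaque loop might look like this. The classify_inscription function here is a deliberately trivial keyword stand-in, not a real model call; swap in the zero-shot classifier or a structured-output LLM request:

```python
# Tally gender classifications over inscriptions, one call per plaque.
from collections import Counter

FEMALE_HINTS = {"grandma", "granny", "mum", "mother", "wife", "sister"}
MALE_HINTS = {"grandad", "grandpa", "dad", "father", "husband", "brother"}

def classify_inscription(text: str) -> str:
    """Placeholder classifier: keyword lookup, not a real model."""
    words = {w.strip(".,") for w in text.lower().split()}
    if words & FEMALE_HINTS:
        return "female"
    if words & MALE_HINTS:
        return "male"
    return "unknown"

inscriptions = [
    "To Grandma Sylvia",
    "To R Smith",
    "In memory of Grandad Joe, from Dave and Alice",
]

# One classification per inscription, tallied at the end.
counts = Counter(classify_inscription(t) for t in inscriptions)
print(counts)
```

Per-plaque calls also make it easy to checkpoint progress and retry failures, which matters over 40k items.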