@simon I'm pondering a tool to help OpenStreetMappers where I could take a picture of an opening hours sign and ask some combination of LLM/image detection to give me the OSM opening hours format. ChatGPT is great at taking arbitrary hours text and outputting the correct string, but I wonder if you can suggest the right tool/API(s) for finding the hours in an image.
@ian @simon The first thing I would try (because it's easy) is feeding images to GPT-4 and getting structured data back with the instructor library. Give it examples. If it doesn't work well, maybe see if a two-stage approach works better: 1. Get a natural language description from an image; 2. Convert that to the structured form.
instructor/examples/vision at main · jxnl/instructor

structured outputs for llms . Contribute to jxnl/instructor development by creating an account on GitHub.

GitHub
@lemonodor @simon good call. This definitely didn't work last time I tried (before GPT-4o)
@ian @lemonodor worth experimenting with Claude and Gemini too if GPT-4o doesn’t seem to catch everything, but it looks like GPT-4o might be what you need already
@simon @ian In case anyone's reading this and hasn't used instructor before, here's a working example: https://gist.github.com/wiseman/71571f12e4961faab7425f32a6afe74e
Get store hours from images

Get store hours from images. GitHub Gist: instantly share code, notes, and snippets.

Gist
@lemonodor @simon thanks again! i got this working and I'm looking forward to trying it out on my walk tomorrow: https://social.modest.com/@ian/112964343293247868
Ian Dees (@[email protected])

Attached: 1 image Thanks for the pointers everyone! I ended up putting together a thing that uses OpenAI's API with gpt-4o-mini to suggest OSM tags for a picture: https://github.com/iandees/image-to-osm I'll try putting it up on the web somewhere if there's interest. I spent about 15¢ testing it a few dozen times:

Modest Mastodan

@ian @lemonodor @simon How did it go? Are you collecting a library of input photos?

Next thing I'd do to that instructor script is add Field descriptions to the pydantic models.