On April 9th, OpenAI released an updated gpt-4-turbo model:

“GPT-4 Turbo with Vision: The latest GPT-4 Turbo model with vision capabilities. Vision requests can now use JSON mode and function calling.”

In updating my Recipe Ingestor project to use the new model (and eliminate an ugly workaround), I found a pair of new Vision examples in the OpenAI Cookbook, which show vision working with function calling. Worth a look if this is your kind of thing.

https://cookbook.openai.com/examples/multimodal/using_gpt4_vision_with_function_calling