What Does A Robot See In The Mirror?

I couldn't resist comparing the #MoonDream 0.5b and 2b models looking at images of Dave or Wali.

The 2b model recognized Dave as a Minion character and as a robot.

The 2b model recognized that Wali is a robot but did not recognize the WALL-E character.

The 0.5b model simply hallucinates stuff, just like Ollama local models did for text queries.

https://forum.dexterindustries.com/t/moondream-vision-language-assistant-on-gopigo3-robot-kilted-dave/10750/2?u=cyclicalobsessive

#MoonDream #GoPiGo3 #TurtleBot4lite #Robot #Vision #LanguageModels #RaspberryPi5_8gb

Just had to try the #MoonDream Vision Language Assistant on my 4GB #RaspberryPi4 #GoPiGo3 #robot and my 8GB #RaspberryPi5 #TurtleBot4lite robot:

https://forum.dexterindustries.com/t/moondream-vision-language-assistant-on-gopigo3-robot-kilted-dave/10750?u=cyclicalobsessive

TL;DR: The Pi5 with the 2B model might be useful, but the 0.5B model will not be useful whether it runs on the Pi4 or the Pi5.

MoonDream Vision Language Assistant on GoPiGo3 Robot Kilted-Dave

Couldn’t resist trying the MoonDream vision language assistant on Dave and Wali. Now WaLI sports an 8GB Pi5 and Dave only has a 4GB Pi4, but why not.

Here is the pic:

So first we ask Dave “What do you see?” (using the MoonDream 0.5b model, 693 MB)

(moondream_006_venv) ubuntu@kilteddave:~/KiltedDave/systests/moondream/examples_with_API_006$ ./see_wali_and_dave.py
Using moondream-0.5b model with moondream 0.0.6 Python API
Model Load Time: 19.13 seconds
Image Load and Encode Time: 36.69 seconds
Qu...
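For anyone wanting to reproduce that kind of timing on their own robot, here is a minimal sketch. It is not the see_wali_and_dave.py script from the forum post, which is not shown here. It assumes the moondream Python package exposes md.vl(model=...), encode_image() and query() as its local-model API does in my recollection of the 0.0.6 client, and that Pillow is installed. The function names timed and ask_about_image, the model file name, and the printed labels are my own placeholders.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def ask_about_image(model_path: str, image_path: str, question: str) -> str:
    """Load a local Moondream model, encode one image, ask one question.

    Assumption: md.vl(), encode_image() and query() behave as in the
    moondream client library; check them against your installed version.
    """
    # Imported lazily so timed() can be used without these packages.
    import moondream as md
    from PIL import Image

    model, load_s = timed(md.vl, model=model_path)
    print(f"Model loaded in {load_s:.2f} seconds")

    # Time opening the file and encoding it together.
    encoded, encode_s = timed(lambda: model.encode_image(Image.open(image_path)))
    print(f"Image loaded and encoded in {encode_s:.2f} seconds")

    reply, answer_s = timed(model.query, encoded, question)
    print(f"Question answered in {answer_s:.2f} seconds")
    return reply["answer"]
```

Calling ask_about_image() with the path to a downloaded model file (for example a hypothetical moondream-0_5b-int8.mf), an image path, and "What do you see?" prints the three timings and returns the model's answer, which makes it easy to compare the 0.5b and 2b models on the same picture.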


Oh no, my #ROS2Jazzy #robot discovered #MoonDream #VisionAssistant (v0.0.6 with local model #Python API) and found an image of "a bunch of models enjoying each others company sitting on a ledge" in its filesystem.

I hope it doesn't go all obsessive and start googling them.

Certainly not letting it near social media.

#TurtleBot4

Moondream enables live video analysis, recognizing objects and actions in real time! The demo requires no login, and the open-source code is available on GitHub. 🚀

#AI #VideoAnalysis #Moondream #LocalLLaMA #PhânTíchVideo #TríTuệNhânTạo #CôngNghệ

https://www.reddit.com/r/LocalLLaMA/comments/1oty5a9/realtime_video_analysis_with_moondream/

Very impressive image recognition model.

VRAM requirements: <3GB

https://github.com/vikhyat/moondream

#LLM #selfhost #moondream

GitHub - vikhyat/moondream: tiny vision language model
