

Radical Delphism in the AI Era: Connecting AI Assistants to OpenCV and FFmpeg via MCP
Technology has moved on, and we now live in the era of large language models and autonomous AI agents. Several agentic systems already work with computer vision and cameras. Intelligent video agents process video streams in real time, recognize objects, analyze human behavior, flag violations, and act autonomously. For the most part these are ready-made commercial AI platforms for video surveillance (for example, Lumana, VisionPlatform.ai, Spot AI). To build your own solution, you can set up frame capture (via Frame Forwarder) and pass the frames to visual processing models. You can also build the logic on Amazon Bedrock Agents or on AI-agent frameworks (LangChain, CrewAI, AutoGen), where the camera acts as a perception "tool" (take_snapshot()). There are also more specialized solutions: VisionAgent (from Landing AI), Microsoft AutoGen, LlamaIndex (Multimodal Agents). But is there a simpler way? One that uses whatever is at hand? And plugs into "everyday" agent systems? Let's give it a try...
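To make the "camera as a tool" idea concrete, here is a minimal sketch of what a take_snapshot() tool could look like when it simply shells out to FFmpeg. Everything beyond the name take_snapshot() is an assumption for illustration: the helper build_snapshot_command, the Linux V4L2 device path /dev/video0, the output file name, and the timeout are all hypothetical, and this is not the article's actual implementation.

```python
import subprocess
from typing import List


def build_snapshot_command(device: str = "/dev/video0",
                           output_path: str = "snapshot.jpg") -> List[str]:
    """Build an FFmpeg command that grabs a single frame from a camera.

    Assumes a Linux machine where the camera is exposed through V4L2.
    """
    return [
        "ffmpeg",
        "-y",               # overwrite the output file without asking
        "-f", "v4l2",       # read from a Video4Linux2 capture device
        "-i", device,       # hypothetical device path
        "-frames:v", "1",   # stop after one video frame
        output_path,
    ]


def take_snapshot(device: str = "/dev/video0",
                  output_path: str = "snapshot.jpg",
                  timeout_s: float = 10.0) -> str:
    """Capture one frame and return the path of the saved image.

    Raises subprocess.CalledProcessError if FFmpeg exits with an error
    and subprocess.TimeoutExpired if the camera does not respond in time.
    """
    cmd = build_snapshot_command(device, output_path)
    subprocess.run(cmd, check=True, capture_output=True, timeout=timeout_s)
    return output_path
```

An agent framework would then register take_snapshot as a callable tool and hand the returned file path to a vision model. Keeping the command builder separate from the subprocess call makes the tool easy to check without a camera attached.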

DroneBlocks provides a complete educational platform for STEM educators, combining drones and robotics to bring cutting-edge technology into the classroom. See how you can elevate the learning experience for students or yourself on this episode of OpenCV Live. Join our Patreon for just $2/mo to watch ad-free live streams and get DRM-free downloads of every […]
I've been rescuing old robot code from the archives — skittlebot (2017), armBot (2014), and a MicroPython music project are now on GitHub. Nearly 12 years of history recovered from zip files and broken git repos 🤖
https://orionrobots.co.uk/2026/06/18/18-rescuing-old-robot-code-from-the-archives.html

Over the years I’ve accumulated a fair amount of robot source code — old projects in zip files, folders with broken .git histories, repos that were once on a long-gone GitLab instance. I’ve been working through these archives recently, using AI tooling to help analyse branches, spot untracked files, and...

OpenCV 5.0 is the library's first major release since 2018, and it's not a tidy-up. The layer-by-layer DNN path is replaced by a graph-based engine with shape inference, constant folding, and operator fusion. ONNX operator coverage rose from ~22% to over 80%, and the DNN module can now run language and vision-language models like Qwen 2.5, Gemma 3, and PaliGemma. The legacy C API is gone, and C++17 is the floor. Does running LLMs belong inside a computer-vision library?
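For readers unfamiliar with the term, constant folding means precomputing any part of the graph whose inputs are all known ahead of time, so that work is not repeated on every inference. The toy sketch below illustrates the idea on a tiny expression graph. It is purely illustrative: the Const, Input, and Op classes and the fold_constants function are hypothetical names invented here, and nothing in it reflects how OpenCV's engine is actually implemented.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: float          # a value known before inference starts


@dataclass(frozen=True)
class Input:
    name: str             # a value only known at run time


@dataclass(frozen=True)
class Op:
    kind: str             # "add" or "mul" in this toy graph
    left: "Node"
    right: "Node"


Node = Union[Const, Input, Op]

# Toy kernels used to evaluate an operator on two constants.
_KERNELS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def fold_constants(node: Node) -> Node:
    """Return an equivalent graph with constant-only subgraphs precomputed."""
    if not isinstance(node, Op):
        return node
    left = fold_constants(node.left)
    right = fold_constants(node.right)
    if isinstance(left, Const) and isinstance(right, Const):
        # Both operands are known, so evaluate the operator once, up front.
        return Const(_KERNELS[node.kind](left.value, right.value))
    return Op(node.kind, left, right)


# The graph for x * (2 + 3): the addition does not depend on x.
graph = Op("mul", Input("x"), Op("add", Const(2.0), Const(3.0)))
folded = fold_constants(graph)
```

After folding, the graph is equivalent to x * 5: the addition node has been replaced by a single constant, while the multiplication stays because it still depends on the run-time input.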
A new version of OpenCV 5 has been released, incorporating an advanced artificial intelligence engine. This update brings significant improvements in image processing and analysis capabilities. 🤖
OpenCV 5 release – New DNN engine with enhanced ONNX and LLM/VLM support, Intel, Arm, and RISC-V hardware optimizations

The OpenCV 5 open-source computer vision library has recently been released with a brand-new DNN (Deep Neural Network) engine that provides better ONNX coverage and enables LLM/VLM support. The fifth version of the popular CV library also adds support for Intel, Arm, Qualcomm, and RISC-V hardware acceleration, improved 3D vision, and various new core features such as new data types, real N-dimensional and scalar support, and performance improvements.

OpenCV 5's DNN Engine

OpenCV 4.x supports about 22% of ONNX operators, and the new DNN engine in OpenCV 5 brings coverage to over 80%. That means models with dynamic shapes that used to fail on OpenCV 4.x should now work, as the 5.x engine was rebuilt around a typed operation graph with proper shape inference, constant folding, and operator fusion. The table below shows the main differences between OpenCV 4.x and OpenCV 5. Since it's quite a big change, to make sure