田中義弘 | taziku CEO / AI × Creative (@taziku_co)

This post introduces AgenticLab, a platform that measures robots and VLMs (vision-language models) in terms of the real action loop (see → decide → move → see again). It emphasizes that, unlike static benchmarks such as VQA, it evaluates the continuous perception-action loop on physical robots.

https://x.com/taziku_co/status/2029419731258843415

#robotics #vlm #vqa #agenticlab


Measuring robots × VLMs head-on. "Seeing AI" is evaluated with benchmarks like VQA. But with a robot, the loop see → decide → move → see again runs continuously. AgenticLab is a platform that evaluates this whole sequence on physical robots. Details in 🧵

FOSS Advent Calendar - Door 21: See What AI Sees with BLIP

Meet BLIP, the versatile open source AI that bridges vision and language. It's not just another image recognition tool; it's a unified model that can understand images and generate human-like text about them, performing tasks like visual question answering, image captioning, and even searching images based on natural language queries.

Its strength lies in its multifaceted design. Trained on web-scale image-text pairs, BLIP excels at both understanding the content of an image and generating accurate, nuanced descriptions. This makes it incredibly useful for creating accessible alt-text, organizing large photo libraries with intelligent search, or building interactive applications where AI can "see" and "talk" about visual content. Everything runs locally, keeping your visual data private.

Whether you're automating metadata generation, building an educational tool, or adding smart visual analysis to your project, BLIP provides a powerful, all-in-one solution to make your applications see and describe the world.

Pro tip: Use BLIP to automatically caption your image datasets, or combine it with a TTS model like Coqui to create a system that describes images out loud.
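For the dataset-captioning half of that tip, here's a minimal sketch using the Hugging Face `transformers` port of BLIP. The checkpoint name and the `photos/` folder are assumptions for illustration, not something the post prescribes:

```python
# Sketch: auto-caption a folder of images with BLIP via Hugging Face transformers.
# Assumes `pip install transformers torch pillow`; the checkpoint below is one of
# the published BLIP captioning checkpoints (an assumption, pick your own).
from pathlib import Path

from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

CHECKPOINT = "Salesforce/blip-image-captioning-base"
processor = BlipProcessor.from_pretrained(CHECKPOINT)
model = BlipForConditionalGeneration.from_pretrained(CHECKPOINT)

def caption(image_path: str) -> str:
    """Return a short generated caption for a single image."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Hypothetical photo library; everything runs locally once the weights are cached.
    for path in sorted(Path("photos").glob("*.jpg")):
        print(path.name, "->", caption(str(path)))
```

Feeding each caption string into a local TTS engine such as Coqui, as the post suggests, would turn this into a spoken image describer.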

Link: https://github.com/salesforce/BLIP

How will you give your projects better vision? Automating alt-text, creating a visual Q&A chatbot, or organizing a decade of unsorted photos?

#FOSS #OpenSource #BLIP #ComputerVision #AI #Accessibility #AltText #ImageCaptioning #VQA #VisionAndLanguage #LocalAI #DeepLearning #MultimodalAI #Fediverse #TechNerds #AdventCalendar #adventkalender #adventskalender #KI #FOSSAdvent #Adventskalender #ArtificialIntelligence #KünstlicheIntelligenz
Ready for the #Riders game. Got an amazing #BC #VQA #wine by #TheAudacityOfThomasGBright. It's a white, and normally I don't care for white wine, but this one is great. Plus, for the Riders we got a #Saskatchewan red, a #haskap. It's our first try of this wine; hopefully it's good.
#CFLOM #CFL game day, wine day, let's go #Roughriders

The incubator project VISQAM from @ifgimuenster.bsky.social presented at #NFDI4EarthPlenary2025 helps computers to understand maps. 🗺️

The team builds an open dataset of annotated thematic maps and will open source their baseline model for map QA.

#NFDI4EarthIncubatorLab #ComputerVision #VisualQuestionAnswering #VQA

#Canadian #wine drinkers remember to buy #VQA for 100% Canadian wine.

[Translation] OpenAI's reasoning CV models couldn't count coins

OpenAI's new multimodal models o3 and o4-mini are positioned as "reasoning" models. However, hands-on testing on practical tasks such as object counting and text recognition revealed unexpected gaps in their performance, in some cases falling behind even non-reasoning models. Find out exactly which tests the new models failed and where they performed confidently.

https://habr.com/ru/articles/909052/

#ai #computervision #multimodal_llm #openai #llm #testing #evaluation #VQA #ocr


Hi everyone! My name is Alexander, and I'm COO at a SaaS data-analytics platform. For the past year I've been actively studying the adoption of AI solutions in cross-functional processes. I'm sharing useful materials that...
