Mastodawn

OpenCV 5 is here, and it’s fucking amazing! Come by booth 515 in the CVPR Exhibition Hall to hear all about the new features and get some exclusive stickers or a postcard! #OpenCV #CVPR

Pedro Lopes Jun 4

Our Embodied AI (https://embodied-ai.tech/) is also an art piece at the #CVPR2026 art gallery!

Watch the backend representation of an AI system that moves a dancer's body with muscle stimulation.

Video: https://youtu.be/iRg6_SKpKrI
(See all art pieces at #CVPR: https://thecvf-art.com/archive.php?year=2026)

José Oramas Jun 4

RE: https://sigmoid.social/@jaom7/116640689544313929

I will be at #CVPR2026 these days presenting our #TriLite method for weakly-supervised object localization. Get in touch if you would like to discuss #WSOL or #AI #Interpretability in general.

#ML #ComputerVision #CVPR

Alex Chan Jun 2

If you are at #CVPR this week, come check out our highlight paper on CHIRP: a task-diverse dataset for bird monitoring in the wild :) @[email protected] @[email protected] [paper] arxiv.org/abs/2603.25524 [dataset] github.com/alexhang212/... youtu.be/uZdewa2WuOc

"Frame by Frame" official trac...

CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild

Long-term behavioral monitoring of individual animals is crucial for studying behavioral changes that occur over different time scales, especially for conservation and evolutionary biology. Computer vision methods have proven to benefit biodiversity monitoring, but automated behavior monitoring in wild populations remains challenging. This stems from the lack of datasets that cover a range of computer vision tasks necessary to extract biologically meaningful measurements of individual animals. Here, we introduce such a dataset (CHIRP) with a new method (CORVID) for individual re-identification of wild birds. The CHIRP (Combining beHaviour, Individual Re-identification and Postures) dataset is curated from a long-term population of wild Siberian jays studied in Swedish Lapland, supporting re-identification (re-id), action recognition, 2D keypoint estimation, object detection, and instance segmentation. In addition to traditional task-specific benchmarking, we introduce application-specific benchmarking with biologically relevant metrics (feeding rates, co-occurrence rates) to evaluate the performance of models in real-world use cases. Finally, we present CORVID (COlouR-based Video re-ID), a novel pipeline for individual identification of birds based on the segmentation and classification of colored leg rings, a widespread approach for visual identification of individual birds. CORVID offers a probability-based id tracking method by matching the detected combination of color rings with a database. We use application-specific benchmarking to show that CORVID outperforms state-of-the-art re-id methods. We hope this work offers the community a blueprint for curating real-world datasets from ethically approved biological studies to bridge the gap between computer vision research and biological applications.

arXiv.org

lalmei Mar 5

Our Agriculture Vision Workshop will be at CVPR 2026!
Speakers include:
Girish Chowdhary, Gary Bradski, Shenlong Wang, Soumik Sarkar, and Jason Corso, along with accepted papers (Deadline March 9th!)
#CVPR #AgTech #ComputerVision #Robotics #RemoteSensing #CVPR2026

David Culley Sep 22, 2025

I'm thinking back to 2021 when I was writing my thesis in university and had to read academic research papers every day.

The authors (all Israeli) of the deep learning papers I had to read in 2021, shared this in June 2024.

#genocide #palestine #cvpr #deeplearning #computervision

GripNews May 13, 2025

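🌗 GitHub - apple/ml-fastvlm: FastVLM: Efficient Vision Encoding for Vision Language Models
➤ New progress in vision language models: faster image processing
✤ https://github.com/apple/ml-fastvlm
This GitHub repository provides the official code implementation of the paper "FastVLM: Efficient Vision Encoding for Vision Language Models" (CVPR 2025). FastVLM introduces a novel hybrid vision encoder, FastViTHD, which outputs fewer tokens and greatly reduces the encoding time for high-resolution images. It outperforms existing models and is optimized for Apple Silicon and Apple devices. The repository also provides model downloads, inference instructions, and citation information for the paper.
+ FastVLM looks promising: being able to process high-resolution images with limited resources is very important for mobile devices.
+ I'm curious how this model performs in real-world applications; the optimization for Apple devices in particular is worth watching.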
#ArtificialIntelligence #MachineLearning #VisionLanguageModels #CVPR 2025

GitHub - apple/ml-fastvlm: This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025

GitHub

OpenCV May 8, 2025

We're live with a packed guest list to talk about VAND 3.0, the Anomaly Detection Challenge @ CVPR 2025. Watch along for your chance to win a free OpenCV University course. https://youtube.com/live/ddLyq14D-Lc #OpenCV #ComputerVision #AI #CVPR

Anomaly Detection Challenge, VAND 3.0 @ CVPR - OpenCV Live! 171

YouTube

OpenCV Mar 3, 2025

The OpenCV Perception Challenge for Bin-Picking is in full swing, with teams submitting solutions to the leaderboard. If you're in the robotics or AI space, this is a great way to test your skills, win a share of the $60,000 in prizes, and, most excitingly, be part of an official CVPR Workshop at CVPR 2025 in Nashville!

Sign up to participate: https://bpc.opencv.org/web/challenges/challenge-page/1/overview
See the leaderboard: https://bpc.opencv.org/web/challenges/challenge-page/1/leaderboard/1

#OpenCV #ComputerVision #AI #BPC2025 #Robotics #Competition #CVPR

Perception Challenge for Bin-picking

A competition focused on the real-world robustness of 6DoF solutions for the industry. This will involve estimating the position and orientation of objects in scenes typical for bin-picking. An automated system will rank teams on accuracy and efficiency.

Adam Cook Feb 4, 2025

One of the dumbest talks that I have ever heard was given by @karpathy.bsky.social at #CVPR when he still worked at #Tesla. An absurd slide deck obviously written by Musk to justify Tesla dumping radar sensors some weeks earlier - which was only done because of COVID supply chain issues.

karpathy (@karpathy.bsky.social)

AI @ OpenAI, Tesla, Stanford

Bluesky Social