Did you know that you can follow our (pre-)recorded conference talks on our FIZ ISE YouTube channel? For example, you can listen to @epoz presenting "The Art Historian's Bicycle Becomes an E-Bike" about recent research results around #iconclass from the VisArt workshop at #ECCV2022
video: https://www.youtube.com/watch?v=gfIYaIZQ9DI
paper: https://zenodo.org/records/7225425
FIZ ISE YouTube channel: https://www.youtube.com/@ISEFIZKarlsruhe

#arthistory #computervision @fiz_karlsruhe #digitalhumanities #culturalheritage #embeddings

Due to requests at #ECCV2022, and to make our #MapFreeReloc dataset useful for more tasks, we are making the SfM reconstructions of our training set publicly available.

🔥460 SfM models of outdoor scenes all around the world 🔥
https://research.nianticlabs.com/mapfree-reloc-benchmark/dataset

Want to train 460 NeRFs? Go ahead.

Each scene was captured by non-expert users with two independent scans, sometimes months apart. We reconstructed them with COLMAP and aligned them to the original phone trajectories.

Thus, all models are in metric scale.
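
If you want to poke around in the downloaded models, here is a minimal sketch with pycolmap; it assumes each scene ships as a standard COLMAP model folder, and the local path is hypothetical:

```python
# Minimal sketch: inspecting one downloaded scene with pycolmap. Assumes
# each scene ships as a standard COLMAP model folder (cameras / images /
# points3D); the local path here is hypothetical.
import pycolmap

rec = pycolmap.Reconstruction("mapfree_train/s00000/sfm")
print(rec.summary())

# The models are aligned to the original phone trajectories, so pose
# translations are in meters (metric scale).
for image in rec.images.values():
    print(image.name, image.cam_from_world.translation)
    break
```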

The most impactful paper that I was (co-)first author on was “VQGAN-CLIP: Open domain image generation and editing with natural language guidance.” This paper was about a methodology that @rivershavewings, @Adverb, and others developed in the summer of 2021, but no paper was ever written about it at the time. I ran the systematic experiments that were never done when the method came out and wrote most of the text myself. I had a blast presenting it at #ECCV2022 this October

https://arxiv.org/abs/2204.08583

VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance

Generating and editing images from open domain text prompts is a challenging task that heretofore has required expensive and specially trained models. We demonstrate a novel methodology for both tasks which is capable of producing images of high visual quality from text prompts of significant semantic complexity without any training by using a multimodal encoder to guide image generations. We demonstrate on a variety of tasks how using CLIP [37] to guide VQGAN [11] produces higher visual quality outputs than prior, less flexible approaches like DALL-E [38], GLIDE [33] and Open-Edit [24], despite not being trained for the tasks presented. Our code is available in a public repository.
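
For intuition, here is a heavily condensed sketch of the core loop, not the paper's actual code: the VQGAN latent is the only optimization variable, and CLIP's image-text similarity provides the gradient. `load_vqgan` is a hypothetical stand-in for the taming-transformers loading code, and the real method also applies random cutouts, augmentations, and CLIP input normalization.

```python
# Heavily condensed sketch of the VQGAN-CLIP idea: optimize a VQGAN latent
# so the decoded image matches a text prompt under CLIP. load_vqgan() is a
# hypothetical stand-in for the taming-transformers loading code; the real
# method also uses random cutouts/augmentations and CLIP input normalization.
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
perceptor, _ = clip.load("ViT-B/32", device=device)
vqgan = load_vqgan().to(device)  # hypothetical helper

text = clip.tokenize(["a watercolor painting of a lighthouse"]).to(device)
with torch.no_grad():
    text_emb = perceptor.encode_text(text).float()

# The latent z is the only thing optimized; both networks stay frozen.
z = torch.randn(1, 256, 16, 16, device=device, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.1)

for step in range(300):
    image = vqgan.decode(z)  # (1, 3, H, W), roughly in [-1, 1]
    image = F.interpolate((image + 1) / 2, size=224, mode="bilinear")
    img_emb = perceptor.encode_image(image).float()
    loss = 1 - torch.cosine_similarity(img_emb, text_emb).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```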

Did you miss #ECCV2022 & #DIRA2022? Did you go, but want to relive the experience?

Thanks to the Web Science and Digital Libraries Research Group blog for posting my trip report that covers keynotes, some interesting papers, and my work at #ECCV2022 & the #DIRA2022 workshop.

https://ws-dl.blogspot.com/2022/12/2022-12-23-eccv-2022-and-dira-2022-trip.html

#ComputerVision #InformationRetrieval #ComputerScience #Conference

Slides and videos from our #eccv2022 tutorial "Self-Supervision on Wheels: Advances in Self-Supervised Learning from Autonomous Driving Data" are now available:
- video: https://www.youtube.com/watch?v=RhNZUyOubfE
- webpage: https://gidariss.github.io/ssl-on-wheels-eccv2022/

Tonight was #ECCV2022 paper-reading night, in particular this tracking paper: "Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories"

Encoder, feature correlation, sinusoidal encoding, and a transformer-like MLP: a delightful mix.
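
Roughly, the correlation step looks like this. A toy sketch under my own assumptions, not the authors' code:

```python
# Toy sketch of the feature-correlation step in point trackers like PIPs
# (my own illustration, not the authors' code): the tracked point's feature
# vector is compared to the frame's feature map by dot product, yielding a
# correlation (cost) map.
import torch
import torch.nn.functional as F

B, C, H, W = 1, 128, 64, 64
fmap = torch.randn(B, C, H, W)  # encoder features for one frame
query = torch.randn(B, C)       # feature of the tracked point

corr = torch.einsum("bc,bchw->bhw", query, fmap) / C ** 0.5
# corr[b, y, x] is high where the frame resembles the tracked point; the
# transformer-like MLP consumes such maps, together with sinusoidal
# encodings of the trajectory, to predict per-step displacements.
prob = F.softmax(corr.flatten(1), dim=1).view(B, H, W)
print(prob.shape)  # torch.Size([1, 64, 64])
```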

https://particle-video-revisited.github.io/

By Adam W. Harley & team

#AI #ML #Tracking #OpticalFlow #CarnegieMellon

The major search engines Baidu, Bing, Google, and Yandex support "reverse image search" -- where the user can upload an image and view pages that contain that image or pages that have similar images. #ECCV2022 #DIRA2022
Because Wikipedia is well indexed by search engines, we acquired abstract (diagrams) and natural (photos) images from Wikimedia Commons. We submitted these 380 images to each search engine and recorded how often the search engine returned the same image. #ECCV2022 #DIRA2022
#ECCV2022 #DIRA2022 Google and Yandex perform better with natural images than with abstract ones, with a difference in retrievability as high as 54% between the two categories. These results indicate a clear difference in capability among search engines.
#ECCV2022 #DIRA2022 Many reasons exist for users to conduct image searches: protecting intellectual property, building datasets, providing evidence, or justifying funding. The disadvantage abstract images face hurts users who rely on search engines for these use cases.
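
For the arithmetic behind that 54% figure, here is a toy sketch; the returned counts are made up for illustration, and only the 380-image setup comes from the thread:

```python
# Illustrative sketch of the retrievability gap; the returned counts are
# made up, only the 380-image setup comes from the thread.
returned = {"natural": 150, "abstract": 47}    # hypothetical counts
submitted = {"natural": 190, "abstract": 190}  # 380 images total

retrievability = {k: returned[k] / submitted[k] for k in submitted}
gap = retrievability["natural"] - retrievability["abstract"]
print(f"natural: {retrievability['natural']:.0%}, "
      f"abstract: {retrievability['abstract']:.0%}, gap: {gap:.0%}")
# -> natural: 79%, abstract: 25%, gap: 54%
```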