Mastodawn

Jaleed Khan Feb 28, 2025

Our paper "KnowZRel: #CommonSense Knowledge-based Zero-Shot Relationship Retrieval for Generalised Scene Graph Generation" is now published in IEEE Transactions on #AI! 📜🤖🎯🎉
https://ieeexplore.ieee.org/document/10897903
@johnbreslin
#artificialintelligence #sceneunderstanding #visualreasoning #computervision #knowledgegraphs #deeplearning #NeurosymbolicAI

Jaleed Khan Apr 1, 2024

Our paper "A Survey of #Neurosymbolic #VisualReasoning with Scene Graphs and #CommonSense Knowledge" has been accepted to Neurosymbolic #AI Journal ➡️ https://neurosymbolic-ai-journal.com/paper/survey-neurosymbolic-visual-reasoning-scene-graphs-and-common-sense-knowledge-0 📜🎯🎉 @johnbreslin #artificialintelligence #sceneunderstanding #computervision #knowledgegraphs #deeplearning #research


Hermann Blum Jun 8, 2023

It takes a while to make fancy #NeRF animations, so I am very happy we can now share our upcoming #CVPR paper with video and code release:
A big debate in #ContinualLearning is how to scale to many experiences. This work shows how well NeRF-based compression can scale to store robotic experiences over many consecutive deployments, much better than storing checkpoints of your model.

website: https://ethz-asl.github.io/ucsa_neural_rendering/
#Robotics #SceneUnderstanding

Unsupervised Continual Semantic Adaptation through Neural Rendering

J. de Curtò Feb 27, 2023

Our paper "Semantic Scene Understanding with Large Language Models on Unmanned Aerial Vehicles" has just been accepted for publication in MDPI Drones. Read more about our work here: https://mdpi.com/2504-446X/7/2/114 #UAV #LargeLanguageModels #SceneUnderstanding #Drones

Semantic Scene Understanding with Large Language Models on Unmanned Aerial Vehicles

Unmanned Aerial Vehicles (UAVs) are able to provide instantaneous visual cues and a high-level data throughput that could be further leveraged to address complex tasks, such as semantically rich scene understanding. In this work, we built on the use of Large Language Models (LLMs) and Visual Language Models (VLMs), together with a state-of-the-art detection pipeline, to provide thorough zero-shot literary text descriptions of UAV scenes. The generated texts achieve a Gunning Fog median grade level in the range of 7–12. Applications of this framework could be found in the filming industry and could enhance user experience in theme parks or in the advertisement sector. We demonstrate a low-cost, highly efficient, state-of-the-art practical implementation of microdrones in a well-controlled and challenging setting, in addition to proposing the use of standardized readability metrics to assess LLM-enhanced descriptions.
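
For readers unfamiliar with the readability metric mentioned in the abstract, here is a minimal Python sketch of the standard Gunning Fog formula, 0.4 × (words per sentence + 100 × share of complex words), plus a median over several descriptions. It is not the authors' code: the function names `count_syllables`, `gunning_fog`, and `median_fog` are hypothetical, and the vowel-group syllable counter is a rough assumption that ignores the usual exclusions (proper nouns, common suffixes), so scores may differ from those of dedicated readability tools.

```python
import re
from statistics import median


def count_syllables(word: str) -> int:
    """Approximate syllables as runs of consecutive vowels (heuristic assumption)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def gunning_fog(text: str) -> float:
    """Gunning Fog index: 0.4 * (words/sentences + 100 * complex_words/words)."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    # "Complex" words are approximated here as words with three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences) + 100 * len(complex_words) / len(words))


def median_fog(descriptions: list[str]) -> float:
    """Median Gunning Fog grade level over a collection of generated descriptions."""
    return median(gunning_fog(d) for d in descriptions)
```

Calling `median_fog` on a list of generated scene descriptions would give a single grade-level figure comparable in spirit to the 7–12 range reported above, subject to the syllable heuristic.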
