CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation

Authors: Shangning Xia, Hongjie Fang, Hao-Shu Fang, Cewu Lu

pre-print -> https://arxiv.org/abs/2410.14974
website -> https://cage-policy.github.io

#robotics #deep_learning #manipulation #outofdistribution #vision


Generalization in robotic manipulation remains a critical challenge, particularly when scaling to new environments with limited demonstrations. This paper introduces CAGE, a novel robotic manipulation policy designed to overcome these generalization barriers by integrating a causal attention mechanism. CAGE utilizes the powerful feature extraction capabilities of the vision foundation model DINOv2, combined with LoRA fine-tuning for robust environment understanding. The policy further employs a causal Perceiver for effective token compression and a diffusion-based action prediction head with attention mechanisms to enhance task-specific fine-grained conditioning. With as few as 50 demonstrations from a single training environment, CAGE achieves robust generalization across diverse visual changes in objects, backgrounds, and viewpoints. Extensive experiments validate that CAGE significantly outperforms existing state-of-the-art RGB/RGB-D approaches in various manipulation tasks, especially under large distribution shifts. In similar environments, CAGE offers an average 42% increase in task completion rate. While all baselines fail to execute the task in unseen environments, CAGE manages to obtain a 43% completion rate and a 51% success rate on average, making a huge step towards practical deployment of robots in real-world settings. Project website: cage-policy.github.io.
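
For readers unfamiliar with the term, here is a minimal sketch of what "causal attention" means in general: each token attends only to itself and to earlier tokens. This is not CAGE's implementation, which the abstract does not spell out; the function name `causal_attention` and the toy inputs are assumptions made purely for illustration.

```python
import math

def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position i sees only positions <= i.

    Hypothetical helper for illustration; not taken from the CAGE codebase.
    """
    dim = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        # Causal mask: only keys at positions 0..i are visible to query i.
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
                  for j in range(i + 1)]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the visible value vectors.
        outputs.append([sum(w * values[j][k] for j, w in enumerate(weights))
                        for k in range(len(values[0]))])
    return outputs

# Toy sequence of three 2-dimensional tokens (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = causal_attention(tokens, tokens, tokens)
```

Because of the mask, changing a later token never alters the outputs at earlier positions, which is the property the word "causal" refers to.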


Seems like Stable Diffusion wasn't in the training corpus of #galactica, yet the model confidently explains what it believes Stable Diffusion to be.

We need to teach these models to say "I don't know".

Link to prompt: https://galactica.org/?prompt=describe+how+stable+diffusion+works

#OpenWorld #OutOfDistribution
