Mahi Shafiullah

8 Followers
14 Following
14 Posts

Teaching robots how to do all my chores @NYU @MetaAI.

Previously @MIT.

🏛️New York University
🏫Massachusetts Institute of Technology
🎓https://scholar.google.com/citations?user=vAOw6aQAAAAJ&hl=en
🌐https://mahis.life/
We can train a CLIP field from scratch in under an hour, including automated labeling, thanks to advances in the NeRF literature such as instant-NGP. Our trained model can then be used on a robot to find "blue book with a house on the cover" or a place to "throw out my trash". https://t.co/VuYoMPdfHx
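To make the query step concrete, here is a minimal sketch of how a text query could be matched against a field, assuming the field has already been sampled into (position, embedding) pairs. The 3-dimensional vectors, the sample positions, and the helper names are hypothetical stand-ins; a real system would use embeddings from the trained model and a text encoder such as CLIP, not hand-written numbers.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def best_location(query_embedding, field_samples):
    """Return the (xyz, score) pair whose embedding best matches the query."""
    scored = ((xyz, cosine(query_embedding, emb)) for xyz, emb in field_samples)
    return max(scored, key=lambda pair: pair[1])

# Hypothetical field samples: (world position, toy embedding).
FIELD_SAMPLES = [
    ((0.5, 1.0, 0.8), [0.9, 0.1, 0.0]),  # stand-in for a point near a bookshelf
    ((2.0, 0.2, 0.1), [0.0, 0.2, 0.9]),  # stand-in for a point near a trash can
]

# Hypothetical embedding of a query like "throw out my trash".
QUERY = [0.1, 0.1, 1.0]

xyz, score = best_location(QUERY, FIELD_SAMPLES)
print(xyz, round(score, 3))
```

The robot would then navigate to the returned position. Taking a single argmax is only the simplest choice; averaging over the top-scoring points is an equally plausible variant.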
For real-world experiments, we collect RGB-D data using an iPhone 13 Pro and pre-process it using open-label detection/segmentation models like Detic and LSeg.
We then convert the data to world coordinates and to semantic/visual representations using Sentence-BERT and CLIP on the labels and bounding boxes. https://t.co/9POmdCnyp0
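The "convert the data to world coordinates" step can be illustrated with a standard pinhole back-projection followed by a camera-to-world transform. This is a sketch under stated assumptions, not the project's actual code: the intrinsics, the pose, and the function names below are hypothetical, and a real pipeline would read them from the capture device.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with metric depth to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def camera_to_world(point, rotation, translation):
    """Apply a camera-to-world pose: p_world = R * p_cam + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

# Hypothetical intrinsics (focal lengths and principal point, in pixels).
FX, FY = 500.0, 500.0
CX, CY = 320.0, 240.0

# Hypothetical pose: no rotation, camera 1 m along the world x-axis.
ROTATION = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
TRANSLATION = (1.0, 0.0, 0.0)

p_cam = backproject(570, 240, 2.0, FX, FY, CX, CY)
p_world = camera_to_world(p_cam, ROTATION, TRANSLATION)
print(p_cam, p_world)
```

Each world-space point can then be paired with the embedding of the label or bounding box it came from, which is the kind of (position, embedding) training target the posts describe.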
How can we train data-efficient robots that can respond to open-ended queries like “warm up my lunch” or “find a blue book”?
Introducing CLIP-Field, a semantic neural field trained w/ NO human labels & only w/ web-data pretrained detectors, VLMs, and LLMs http://mahis.life/clip-fields https://t.co/CffqCZfRuW
CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory

Teaching robots in the real world to respond to natural language queries with zero human labels — using pretrained large language models (LLMs), visual language models (VLMs), and neural fields.
