HierVL is a new #AI approach that injects semantics into visual representations by learning hierarchical video-language embeddings. It captures both long-term and short-term associations, covering both the 'what' and the 'why' of human actions. 🤯

https://vision.cs.utexas.edu/projects/hiervl

#ComputerVision #DeepLearning #CVPR2023

HierVL: Learning Hierarchical Video-Language Embeddings