HierVL is a new #AI approach that injects semantics into visual representations by capturing both long-term and short-term associations in video-language embeddings, covering both the 'what' and the 'why' of human actions. 🤯