Seven Waves, by Suzanne Ciani

7 track album

Compositional Datalog on SQL: Relational Algebra of the Environment

I spent some time making Datalogs that translate into SQL. https://www.philipzucker.com/tiny-sqlite-datalog/

Hey There Buddo!
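
For flavor, here is a minimal sketch of the general idea, not the post's actual translation: facts become rows, and a recursive Datalog rule becomes an INSERT ... SELECT join iterated to a fixpoint (naive evaluation).

```python
import sqlite3

# A minimal sketch, not the post's actual encoding. Rules being compiled:
#   path(x, y) :- edge(x, y).
#   path(x, z) :- path(x, y), edge(y, z).

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE edge (x INTEGER, y INTEGER);
    CREATE TABLE path (x INTEGER, y INTEGER, UNIQUE (x, y));
    INSERT INTO edge VALUES (1, 2), (2, 3), (3, 4);
""")

# Base rule: every edge fact is a path fact.
con.execute("INSERT OR IGNORE INTO path SELECT x, y FROM edge")

# Recursive rule: join path against edge, repeat until no new tuples.
while True:
    cur = con.execute(
        "INSERT OR IGNORE INTO path "
        "SELECT p.x, e.y FROM path p JOIN edge e ON p.y = e.x"
    )
    if cur.rowcount == 0:  # no new facts derived: fixpoint reached
        break

print(con.execute("SELECT * FROM path ORDER BY x, y").fetchall())
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```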
One of my favourite things in #modular #sound #synthesisers is the negative envelope and the #inverted #VCA. The inverted VCA is a technique unique to modular; no other instrument offers it. A traditional VCA opens the volume (or filter) when a key is pressed, so we hear the sound. An inverted VCA closes when a key is pressed or a signal is received, so we hear nothing until the key is released. It's a good #compositional tool for opening up space to add other #instruments or #noises.
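
A tiny numeric sketch of the difference (illustrative values and a made-up one-pole envelope, not any particular module): the inverted VCA just multiplies by one minus the envelope.

```python
import numpy as np

# Illustrative sketch of a normal vs. inverted VCA; rates and the
# envelope slew are invented for the example, not any real module.

sr = 48_000
t = np.arange(sr) / sr                   # one second of audio
signal = np.sin(2 * np.pi * 110 * t)     # oscillator: 110 Hz sine

gate = (t < 0.5).astype(float)           # "key held" for 0.5 s
env = np.empty_like(gate)                # slewed gate = envelope
level = 0.0
for i, g in enumerate(gate):
    level += (g - level) * 0.001         # crude attack/release slew
    env[i] = level

normal_vca = signal * env                # opens while the key is held
inverted_vca = signal * (1.0 - env)      # opens only after release
```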

Cool new work linking #cognitive science and #AI

Investigating #Compositional and #Syntactic #Understanding in Video Retrieval #Models

by Avinash Madasu and Vasudev Lal

https://arxiv.org/abs/2306.16533

@cogsci #NeuroAI #CognitiveAI #ML #compositionality #cogsci #cog

ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models

Video retrieval (VR) involves retrieving the ground-truth video from a video database given a text caption, or vice versa. The two important components of compositionality, objects & attributes and actions, are joined using correct syntax to form a proper text query. These components (objects & attributes, actions, and syntax) each play an important role in distinguishing among videos and retrieving the correct ground-truth video. However, it is unclear what effect these components have on video retrieval performance. We therefore conduct a systematic study to evaluate the compositional and syntactic understanding of video retrieval models on standard benchmarks such as MSRVTT, MSVD, and DiDeMo. The study is performed on two categories of video retrieval models: (i) models pre-trained on video-text pairs and fine-tuned on downstream video retrieval datasets (e.g. Frozen-in-Time, Violet, MCQ); (ii) models that adapt pre-trained image-text representations like CLIP for video retrieval (e.g. CLIP4Clip, XCLIP, CLIP2Video). Our experiments reveal that actions and syntax play a minor role compared to objects & attributes in video understanding. Moreover, video retrieval models that use pre-trained image-text representations (CLIP) have better syntactic and compositional understanding than models pre-trained on video-text data. The code is available at https://github.com/IntelLabs/multimodal_cognitive_ai/tree/main/ICSVR

arXiv.org
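
To make the protocol concrete, here is a minimal sketch of the text-to-video retrieval evaluation the abstract describes. The random vectors are placeholders for real caption and video embeddings from a model such as CLIP4Clip; only the cosine-similarity ranking and Recall@K bookkeeping are meant literally.

```python
import numpy as np

# Sketch of text-to-video retrieval evaluation. Random embeddings stand
# in for encoder outputs; caption i's ground-truth video is video i.

rng = np.random.default_rng(0)
n, d = 1000, 512                           # 1000 caption/video pairs
text_emb = rng.normal(size=(n, d))
video_emb = rng.normal(size=(n, d))

def l2norm(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# cosine similarity between every caption and every video
sims = l2norm(text_emb) @ l2norm(video_emb).T

# rank of the ground-truth video = how many videos score higher
gt = sims[np.arange(n), np.arange(n)]
ranks = (sims > gt[:, None]).sum(axis=1)

for k in (1, 5, 10):
    print(f"Recall@{k}: {(ranks < k).mean():.3f}")
```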

Researchers From Stanford Introduce Locally Conditioned Diffusion: A Method For Compositional Text-To-Image Generation Using Diffusion Models

The Triangle Agency
@hirokisayama there are a couple of emerging (no pun intended...) areas that for now I would put inside the big circles, but that in another 10 years might become the labels for the big circles themselves: #compositional #gametheory and categorical #systemstheory/categorical #cybernetics. Basically, #categorytheory applied to both macro-areas.