Deep Learning with Structured Representations.
| Website | https://loewex.github.io/ |
| Scholar | https://scholar.google.ch/citations?user=lZZIP9UAAAAJ |
| Twitter | https://twitter.com/sindy_loewe |
On the simple, grayscale datasets that we consider, it even achieves performance competitive with or better than SlotAttention, a state-of-the-art object discovery method.
⚡And it’s lightning fast - compared to SlotAttention, the CAE trains 10-100 times faster! ⚡
4/5
We implement this coding scheme by augmenting all activations in an autoencoder with a phase dimension. By training the CAE to reconstruct the input image (left), it learns to represent the disentangled object identities in the phases without supervision (right).
This simple setup works surprisingly well! The CAE learns to create object-centric representations, and to segment objects accurately, as highlighted in the predictions below.
3/5
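A minimal sketch of the phase-augmented layer idea described above, assuming NumPy and illustrative weights (this is not the paper's exact architecture, just the general shape of a layer whose activations carry a phase in addition to a magnitude):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative real-valued weights and bias for one encoder layer
# (16 input units -> 8 output units); these names are assumptions.
W = rng.standard_normal((8, 16)) / 4.0
b = rng.standard_normal(8)

def complex_layer(z):
    """Linear map on a complex input, followed by a phase-preserving
    nonlinearity: the magnitude goes through a ReLU, the phase is
    passed through unchanged."""
    pre = W @ z                                # complex pre-activation
    mag = np.maximum(np.abs(pre) + b, 0.0)     # ReLU on the magnitude
    phase = np.angle(pre)                      # phase is preserved
    return mag * np.exp(1j * phase)

# A grayscale input enters with zero phase; training lets the network
# rotate phases so that features of the same object end up aligned.
x = rng.random(16) * np.exp(1j * 0.0)
h = complex_layer(x)
print(h.shape)  # (8,)
```

Reconstructing the input only constrains magnitudes, so the phases are free to self-organize into per-object groupings.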
🧠 In the brain, objects are theorized to be represented through temporal spiking patterns: a neuron’s firing rate represents whether a feature is present; and if neurons fire in sync, their respective features are bound together to represent one object.
🤖 We employ a similar mechanism by using complex-valued activations: a neuron's magnitude represents whether a feature is present; and if neurons have similar phases, their respective features are bound together to represent one object.
2/5
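The magnitude/phase binding idea can be sketched in a few lines of NumPy (the values and the phase-threshold grouping here are illustrative assumptions, not the paper's readout procedure):

```python
import numpy as np

# Each neuron's activation is one complex number:
# magnitude = is the feature present?  phase = which object is it bound to?
magnitudes = np.array([1.0, 0.9, 0.0, 0.8, 1.0])   # feature presence
phases     = np.array([0.1, 0.15, 0.0, 2.0, 2.05]) # object assignment
activations = magnitudes * np.exp(1j * phases)

# Features with similar phases are "bound" into the same object.
# Toy readout: greedily group active neurons whose phases are close.
active = magnitudes > 0.5
objects = {}
for i in np.where(active)[0]:
    for members in objects.values():
        if abs(phases[i] - phases[members[0]]) < 0.5:
            members.append(int(i))
            break
    else:
        objects[len(objects)] = [int(i)]

print(objects)  # {0: [0, 1], 1: [3, 4]} - two objects recovered
```

Neurons 0 and 1 share a phase near 0.1 and bind into one object; neurons 3 and 4 share a phase near 2.0 and bind into another, while the inactive neuron 2 is ignored.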
Excited to share the Complex AutoEncoder (CAE):
✨ The CAE decomposes images into objects without supervision by taking inspiration from the temporal coding patterns found in biological neurons. ✨
Now accepted at TMLR!
📜 arxiv.org/abs/2204.02075
with @phillip_lippe, Maja Rudolph, and Max Welling
1/5