Mastodawn

🚀 Excited to share my latest paper!

What is a diffusion model actually doing when it turns noise into a photograph? The usual story involves score matching and stochastic differential equations, but there is a deeper geometric explanation hiding underneath.

It turns out that denoising diffusion is a contractive fractal system in which the noise schedule literally sculpts an attractor toward your data distribution 🌀 Partitioned iterated function systems grew out of Michael Barnsley's 1980s work on fractal image compression, a technique later used in the Microsoft Encarta encyclopedia.

I showed in particular that:
🔹 The noise schedule of denoising diffusion controls the Hausdorff/Kaplan-Yorke dimension of that attractor;
🔹 The fractal geometry framework explains the well-known two-phase structure observed in the reverse chain as expansion, crossover, contraction;
🔹 The theory gives principled, geometry-driven guidance for schedule design and immediately explains the working of several popular heuristics, such as the cosine offset.
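
For readers who have not met the cosine offset mentioned in the last bullet, here is a minimal sketch of the cosine noise schedule as it is commonly written, with a small offset `s` inside the cosine. The function names, the default `s=0.008` and the choice of `T=1000` steps are illustrative assumptions and are not taken from the paper; the sketch only shows how the offset changes the signal level at the first steps, not the geometric argument for why it works.

```python
import math

def cosine_alpha_bar(t, T, s=0.008):
    """Cumulative signal level alpha_bar(t) of the cosine schedule.

    s is the small offset that keeps the first noise increments from
    becoming vanishingly small near t = 0 (assumed default: 0.008).
    """
    def f(u):
        return math.cos((u / T + s) / (1.0 + s) * math.pi / 2.0) ** 2
    return f(t) / f(0)

def log_snr(alpha_bar):
    """Log signal-to-noise ratio, defined for 0 < alpha_bar < 1."""
    return math.log(alpha_bar / (1.0 - alpha_bar))

if __name__ == "__main__":
    T = 1000  # illustrative number of diffusion steps
    for t in (1, 250, 500, 750, 999):
        with_offset = cosine_alpha_bar(t, T)
        without_offset = cosine_alpha_bar(t, T, s=0.0)
        print(t, round(log_snr(with_offset), 3), round(log_snr(without_offset), 3))
```

Comparing the two printed columns shows that the offset mostly matters at the very first steps, where it lowers the starting logSNR; later in the chain the two schedules nearly coincide.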

A math dive with huge practical payoff!
📄 Full paper: https://arxiv.org/abs/2603.13069

#GenAI #ML #image #DenoisingDiffusion #FractalGeometry

Would love to hear thoughts from anyone working on generative models, dynamical systems or geometric deep learning! 👇

Fractals made Practical: Denoising Diffusion as Partitioned Iterated Function Systems

What is a diffusion model actually doing when it turns noise into a photograph? We show that the deterministic DDIM reverse chain operates as a Partitioned Iterated Function System (PIFS) and that this framework serves as a unified design language for denoising diffusion model schedules, architectures, and training objectives. From the PIFS structure we derive three computable geometric quantities: a per-step contraction threshold $L^*_t$, a diagonal expansion function $f_t(\lambda)$ and a global expansion threshold $\lambda^{**}$. These quantities require no model evaluation and fully characterize the denoising dynamics. They structurally explain the two-regime behavior of diffusion models: global context assembly at high noise via diffuse cross-patch attention and fine-detail synthesis at low noise via patch-by-patch suppression release in strict variance order. Self-attention emerges as the natural primitive for PIFS contraction. The Kaplan-Yorke dimension of the PIFS attractor is determined analytically through a discrete Moran equation on the Lyapunov spectrum. Through the study of the fractal geometry of the PIFS, we derive three optimal design criteria and show that four prominent empirical design choices (the cosine schedule offset, resolution-dependent logSNR shift, Min-SNR loss weighting, and Align Your Steps sampling) each arise as approximate solutions to our explicit geometric optimization problems, turning theory into practice.
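
As a rough illustration of the Kaplan-Yorke dimension named in the abstract, the sketch below applies the standard textbook definition to a given Lyapunov spectrum. It is not the paper's discrete Moran equation, and the function name and the example spectrum (approximate Lorenz-system exponents) are assumptions chosen only for illustration.

```python
import numpy as np

def kaplan_yorke_dimension(exponents):
    """Standard Kaplan-Yorke (Lyapunov) dimension of a spectrum.

    With exponents sorted in descending order, let j be the largest index
    whose partial sum is still non-negative. The dimension is then
    j + (sum of the first j exponents) / |exponent j+1|.
    """
    lam = np.sort(np.asarray(exponents, dtype=float))[::-1]
    partial = np.cumsum(lam)
    nonneg = np.nonzero(partial >= 0.0)[0]
    if nonneg.size == 0:
        return 0.0              # every direction contracts
    j = int(nonneg[-1]) + 1     # count of exponents in the last non-negative partial sum
    if j == lam.size:
        return float(lam.size)  # no net contraction: attractor fills the space
    return float(j + partial[j - 1] / abs(lam[j]))

if __name__ == "__main__":
    # Approximate Lorenz exponents, used purely as a familiar example.
    print(kaplan_yorke_dimension([0.906, 0.0, -14.572]))
```

A spectrum with stronger contraction in the trailing exponents gives a lower dimension, which is the sense in which a quantity controlling the exponents also controls the dimension of the attractor.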

arXiv.org

jobook.raw 📸 Oct 3, 2023

from the series of trying, building and generating random stuff with www.ideogram.ai powered by tech like Denoising Diffusion Models, Imagen: Google’s text-to-image system, Imagen Video for video synthesis, WaveGrad for speech synthesis, neural speech recognition, neural machine translation, contrastive learning for learning visual representations, and generative adversarial imitation learning.

prompt: "classic earthship adobe sand translucent glass conservatory crystal rock concrete cathedral in neviges built by gottfried böhm, brutalist cathedral of neviges, organic high definition hip bone femer texture geodesic polygonal generative design skeleton bone structure macro bone texture the azure supported by bamboo scaffolding under construction built in grey concrete by Uglješa Bogunović, Slobodan Janjić, and Milan Krstić brutalist architecture like The Avala TV Tower building bunker black and white analog photography with construction crane on top, greyscale black and white photography in style of artist jeanloup sieff"

#earthships #earthship #architecture #ideogram #ai #aiart #unstablediffusion #stablediffusion #denoisingdiffusion #neuralnetwork #aiarchitecture #generativedesign #imagen #generativeadversarialimitationlearning