Mastodawn

Chris Offner Dec 29, 2023

Here's a more clearly visible demonstration of the problem I described previously: https://sigmoid.social/@chrisoffner3d/111591367887994819

On the left we see the progression of cross-attention maps extracted via the CPU, on the right we see the same cross-attention maps extracted via the GPU.

This is using the #Keras implementation of #StableDiffusion on an M3 Max.
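
One rough way to put numbers on a comparison like this, as a sketch only: measure how much each cross-attention map changes from one time step to the next, and how far the CPU and GPU maps are from each other at the same step. The function names and the toy data below are hypothetical and not taken from the Keras implementation; real maps would first need to be converted to nested lists or arrays.

```python
from typing import List, Sequence

# One 2D attention map as nested rows of floats (hypothetical representation).
Map = Sequence[Sequence[float]]


def mean_abs_diff(a: Map, b: Map) -> float:
    """Mean absolute element-wise difference between two equally shaped maps."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += abs(x - y)
            count += 1
    return total / count


def step_changes(maps: Sequence[Map]) -> List[float]:
    """Change between each pair of consecutive time steps on one device."""
    return [mean_abs_diff(maps[i], maps[i + 1]) for i in range(len(maps) - 1)]


def device_gap(cpu_maps: Sequence[Map], gpu_maps: Sequence[Map]) -> List[float]:
    """Difference between the CPU and GPU map at each time step."""
    return [mean_abs_diff(c, g) for c, g in zip(cpu_maps, gpu_maps)]


# Made-up toy sequences standing in for maps at three time steps.
smooth = [
    [[0.0, 0.25], [0.5, 0.75]],
    [[0.25, 0.5], [0.75, 1.0]],
    [[0.5, 0.75], [1.0, 1.25]],
]
erratic = [
    [[0.0, 0.25], [0.5, 0.75]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.75], [1.0, 1.25]],
]

print("step changes (smooth): ", step_changes(smooth))
print("step changes (erratic):", step_changes(erratic))
print("per-step device gap:   ", device_gap(smooth, erratic))
```

Small, similar step changes would correspond to a gradual progression, while large or jumpy values would correspond to a discontinuous one. A device gap well above floating-point noise would indicate that the two backends genuinely disagree.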

#TensorFlow #StableDiffusion #Diffusion #Python #MLEngineering #MachineLearning #DeepLearning #GPU #M3Max

Chris Offner (@chrisoffner3d@sigmoid.social)

I'm running into some unexpected and significant non-determinism when running a #Keras diffusion model on my Apple GPU. On the left we see the progression of cross-attention maps for time steps from t = 0 to t = 900 when running the model via the CPU. We see that each cross-attention map undergoes some "refinement" progression as we go from t = 0 to t = 900. On the right we see the same but on the GPU. It's a much more erratic and discontinuous progression. #MLEngineering #DeepLearning #GPU

Sigmoid Social