"solving cartpole... by evolving the raw bytes of a 1.4KB tflite microcontroller serialised model"

microcontroller models are so small you can just run a genetic algorithm directly against the bytes of the serialised model! :D
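the gist, as a hedged sketch in plain python: the blog's real fitness function runs the mutated tflite flatbuffer against cartpole, but here a toy byte-matching objective stands in so the loop is self-contained.

```python
import random

random.seed(0)

# toy stand-in fitness: how many bytes match a target pattern.
# (the real version deserialises the mutated bytes as a tflite model
#  and scores it on cartpole; this objective is purely illustrative.)
TARGET = bytes(range(32))

def fitness(genome: bytes) -> int:
    return sum(1 for a, b in zip(genome, TARGET) if a == b)

def mutate(genome: bytes, rate: float = 0.05) -> bytes:
    # replace each byte with a random value with probability `rate`
    return bytes(random.randrange(256) if random.random() < rate else b
                 for b in genome)

# simple (1+1) evolution strategy directly over the raw bytes:
# keep the child whenever it's at least as fit as the parent
genome = bytes(random.randrange(256) for _ in range(32))
best = fitness(genome)
for _ in range(20_000):
    child = mutate(genome)
    f = fitness(child)
    if f >= best:
        genome, best = child, f
```

no gradients anywhere: the search only ever sees the serialised bytes and a scalar score.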

who needs gradient descent anyway?

http://matpalm.com/blog/evolving_cartpole_flat_buffers/

#oldBlogReposts #machinelearning


the main use case for dropout is as a form of regulariser. but what if we used it instead to make a model that's robust to different combos of its inputs? we could then use a genetic algorithm to trade off input complexity against performance, all with a single model...
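a minimal numpy sketch of the two halves of that idea. the shapes, the `keep_prob` value, and the `accuracy_with_mask` helper are all illustrative assumptions, not the blog's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# training time: randomly zero out entire input channels so the model
# learns to cope with any combination of channels being present
def channel_dropout(x, keep_prob=0.5, rng=rng):
    # x: (batch, height, width, channels)
    mask = rng.random(x.shape[-1]) < keep_prob
    if not mask.any():                       # always keep at least one channel
        mask[rng.integers(x.shape[-1])] = True
    return x * mask                          # broadcasts over leading dims

# search time: a genetic algorithm scores fixed binary channel masks,
# trading off channel count (cost) against validation performance.
# `accuracy_with_mask` is a hypothetical helper standing in for a full
# eval of the single trained model with that mask applied.
def ga_fitness(mask, accuracy_with_mask, channel_cost=0.01):
    return accuracy_with_mask(mask) - channel_cost * mask.sum()
```

the key point is that only one model is ever trained; the GA just searches over which inputs to feed it.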

http://matpalm.com/blog/evolved_channel_selection/

#oldBlogReposts #machinelearning


we usually batch training examples for a single model.

but what about instead passing a single instance through a batch of models?

with #jax this is trivial and means we can train ensemble nets directly.
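a hedged sketch of the trick with `jax.vmap`: vmap over the *params* axis instead of the data axis, so one instance flows through every ensemble member at once. the tiny linear model here is an illustrative assumption, not the blog's actual net.

```python
import jax
import jax.numpy as jnp

def init_params(key, n_in=4, n_out=2):
    kw, _ = jax.random.split(key)
    return {"w": jax.random.normal(kw, (n_in, n_out)) * 0.1,
            "b": jnp.zeros(n_out)}

def forward(params, x):
    # a single tiny linear model
    return x @ params["w"] + params["b"]

# init 8 models by vmapping the initialiser over 8 rng keys...
keys = jax.random.split(jax.random.PRNGKey(0), 8)
ensemble_params = jax.vmap(init_params)(keys)

# ...then vmap forward over the params axis, not the data axis:
# in_axes=(0, None) maps over models while sharing the single x
ensemble_forward = jax.vmap(forward, in_axes=(0, None))

x = jnp.ones(4)
logits = ensemble_forward(ensemble_params, x)
print(logits.shape)  # (8, 2): one output per ensemble member
```

because the whole ensemble is one vectorised function, it can be jitted and trained end to end like any single model.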

http://matpalm.com/blog/ensemble_nets/

#oldBlogReposts #machinelearning


self attention is a super important concept in a lot of neural architectures!

the description that really made it "click" for me was to think of it as a form of learnt map.

here's a 30 min explainer, starting with the first big problem attention solved and finishing with the "transformer" architecture.
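the "learnt map" view in a minimal numpy sketch: keys index the map, values are its contents, and a query does a soft lookup instead of an exact one. the tiny 3-entry example is an illustrative assumption.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v):
    scores = q @ k.T / np.sqrt(k.shape[-1])  # (n_q, n_k) similarities
    weights = softmax(scores)                # each row sums to 1
    return weights @ v                       # weighted blend of values

# a query that exactly matches one key recovers (almost exactly)
# that key's value, like a hard dictionary lookup
k = np.eye(3) * 10.0                         # three well-separated keys
v = np.array([[1., 0.], [0., 1.], [1., 1.]])
q = k[1:2]                                   # query equal to the second key
out = attention(q, k, v)
print(out.round(2))  # ≈ [[0. 1.]]: the second value
```

soften the keys (or blur the query) and the lookup returns a mixture of values instead, which is exactly what makes the map learnable by gradient descent.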

https://youtu.be/7wMQgveLiQ4

#oldBlogReposts #machinelearning
