I finally made a page on my Dish activation function, replacing my deleted Tweet: https://danieldk.eu/Dish-Activation
It's a non-monotonic activation function similar to GELU and SiLU, but it does not require elementary functions such as exp or tanh, making it much faster on various hardware.
I'll leave the empirical evaluation to someone else 😁.
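
To make the comparison concrete, here is a minimal sketch of the two reference activations next to a gate that avoids exp, tanh, and erf. SiLU and GELU are written in their standard forms. `algebraic_gate` is a hypothetical stand-in chosen for illustration (it swaps the logistic sigmoid for the algebraic sigmoid x / sqrt(1 + x^2)); it is an assumption, not a definition taken from this post, so check the linked page for the actual Dish formula.

```python
import math


def silu(x: float) -> float:
    # SiLU: x * sigmoid(x); needs exp.
    return x / (1.0 + math.exp(-x))


def gelu(x: float) -> float:
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF; needs erf.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def algebraic_gate(x: float) -> float:
    # Hypothetical illustration, NOT taken from the post: gate x with the
    # sigmoid-shaped curve 0.5 * (1 + x / sqrt(1 + x^2)). This needs only
    # multiplies, adds, and a square root, with no exp, tanh, or erf.
    return 0.5 * x * (1.0 + x / math.sqrt(1.0 + x * x))


if __name__ == "__main__":
    # Print a few sample points to compare the three curves side by side.
    for x in (-4.0, -1.0, 0.0, 1.0, 3.0):
        print(f"{x:5.1f}  silu={silu(x):8.4f}  gelu={gelu(x):8.4f}  "
              f"gate={algebraic_gate(x):8.4f}")
```

All three curves dip below zero for small negative inputs and return toward zero as the input becomes more negative, which is the non-monotonic shape referred to above. Whether the cheaper gate is actually faster depends on how a given piece of hardware implements square root versus exp or erf.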




