@icing There's a phenomenon called "the curse of dimensionality" and it applies to neural networks. I guess you could say it's like a reverse Moore's Law, but for neural nets. Basically (and this is just my mostly non-technical explanation), a neural net is in effect a huge multi-dimensional classifier, and training it with backpropagation means making small adjustments to localised regions of that classifier space. The curse of adding more dimensions is that it becomes harder and harder to localise those changes: the volume of the space grows exponentially with each dimension, so any fixed amount of training data gets spread ever more thinly, and distances stop being informative because every point ends up roughly as far from a given point as every other point is. This means exponentially higher training costs as these models scale.
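If you want to see the "everything becomes roughly equidistant" effect for yourself, here's a rough numpy sketch I put together to check my own intuition (it's my illustration, not from any paper): it samples random points in d dimensions and measures how much nearer the nearest point is than the farthest.

```python
import numpy as np

rng = np.random.default_rng(0)

for d in [2, 10, 100, 1000, 10000]:
    # 500 points sampled uniformly from the d-dimensional unit cube
    points = rng.random((500, d))
    # Euclidean distances from the first point to all the others
    dists = np.linalg.norm(points[1:] - points[0], axis=1)
    # Relative contrast: how much nearer is the nearest neighbour
    # than the farthest one? Near 0 means "all about the same".
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d={d:>6}: relative distance contrast = {contrast:.3f}")
```

Run it and you'll see the contrast shrink steadily as d grows, which is the distance-concentration effect I was gesturing at above.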
At least, that's how I understand it. I'm not a mathematician, but I've read plenty of material on machine learning over the years (since the 90s) and I think I've got the above right...
