RE: https://birdbutt.com/@colinstu/116338099556672672

meanwhile, my grad student can't run the experiment he needs to do to get the figure I asked him to get, because he doesn't have enough RAM

@lindsey @colinstu did you see any of the leaked Anthropic code or any of the commentary on it? I was like… is this, writ large, the real reason they need every datacenter and then some?
@kaoudis @colinstu oh you better believe I'm reading those @jonny toots like 🍿🍿🍿
@kaoudis @colinstu @jonny but, no, isn't model training the reason they need all those data centers, and isn't the stuff that leaked "just" the front end that runs on users' machines?
@lindsey @colinstu @jonny I think that’s part of it… but I haven’t seen model training code that I thought was particularly efficient, and the general vibe made me wonder how much the people working at big AI prioritize efficiency, if at all. Maybe they do and I’m not looking at the right things. I’ve mainly seen code published with academic papers in that space rather than anything from industry, but my understanding is that the gap between the two isn’t that wide for ML