I’m currently spending my #funemployment time diving into #ai and, as part of that, I’m reading a 2023 paper on a system that improves throughput by 2x or more by using memory paging to efficiently evaluate multiple prompts in parallel.
Memory paging was first implemented in the early 1960s and is a well-known solution to a whole set of problems. Yet, 60ish years later, the AI folk have only just started to use it, after years of saying that large amounts of RAM are required to run LLMs.
😲
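
For anyone who hasn't bumped into paging before, here's a toy sketch of the idea in Python. It's an illustration only, not code from the paper: the `PagedCache` class, its method names, and the block sizes are all made up, and it only tracks which fixed-size blocks belong to which prompt rather than storing any real model state.

```python
class PagedCache:
    """Toy paged allocator: fixed-size blocks handed out on demand.

    Hypothetical illustration; none of these names come from the paper.
    """

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # physical block ids
        self.block_tables = {}  # prompt id -> list of physical block ids
        self.lengths = {}       # prompt id -> number of tokens stored so far

    def append_token(self, prompt_id):
        """Reserve room for one more token; return (physical block, offset)."""
        table = self.block_tables.setdefault(prompt_id, [])
        n = self.lengths.get(prompt_id, 0)
        if n % self.block_size == 0:  # last block is full, or none allocated yet
            if not self.free_blocks:
                raise MemoryError("no free blocks")
            table.append(self.free_blocks.pop(0))
        self.lengths[prompt_id] = n + 1
        return table[n // self.block_size], n % self.block_size

    def release(self, prompt_id):
        """Return a finished prompt's blocks to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(prompt_id, []))
        self.lengths.pop(prompt_id, None)


cache = PagedCache(num_blocks=4, block_size=2)
for prompt in ["a", "b", "a", "a", "b"]:  # two prompts growing side by side
    cache.append_token(prompt)
print(cache.block_tables)
```

The point is that no prompt has to reserve one big contiguous chunk up front. Each prompt grabs a small block only when it needs one, its blocks can sit anywhere in memory, and they go straight back into the pool when the prompt finishes.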
