I want to show the NSF there would be broad support+utility for a "National Deep Inference" service for >100B-parameter LLMs.

If your research would be enabled by an inference service on open LLMs with API access and overrides to internal activations, params, and gradients: please boost this thread!
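
To make the request concrete, here is a rough sketch of the kind of client API I'm imagining. Everything in it is hypothetical: the `ndi` module, the endpoint, the model name, and the layer paths are all invented for illustration, not an existing library.

```python
# Hypothetical client for a shared deep-inference service. The "ndi"
# module, its endpoints, and the layer naming scheme are invented for
# illustration; no such library is assumed to exist.
import ndi

client = ndi.Client("https://inference.example.org", model="open-llm-175b")

with client.trace("The Eiffel Tower is located in") as t:
    # Read an internal activation: the residual stream at layer 20.
    hidden = t.read("transformer.layers.20.output")
    # Override it (here, zero it out) and let the remaining layers
    # run forward on the modified state.
    t.write("transformer.layers.20.output", hidden * 0.0)

result = t.output()  # logits / generated text after the intervention
grads = t.gradients("transformer.layers.20.output")  # server-side backprop
```

The point is that reads, writes, and gradient requests on internals would travel over the same API as ordinary inference, so researchers never need to host the model themselves.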

(I'm also gathering feedback on Twitter; more details here:)

https://twitter.com/davidbau/status/1605609105824964611


@davidbau Great idea! Ideally you would be able to do some fine-tuning as well, and not just an API query.

While language models are the biggest, other generative models are also growing in size, and it would be great to have them as part of such an API.

@boazbaraktcs Yes, agreed: lightweight fine-tuning or techniques like soft prompts need to be accessible. Such iterated or batch use-cases seem like a second tier of service; you might be willing to wait longer for results.
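
To illustrate how light that tier could be: in soft-prompt tuning, only a few prepended "virtual token" embeddings are trained while the hosted model stays frozen. A minimal sketch in plain PyTorch, where the shapes and the `model` handle are placeholders for whatever the service exposes:

```python
import torch

# Minimal soft-prompt sketch: learn k "virtual token" embeddings prepended
# to the input while the large model stays frozen. `model` stands in for
# any Hugging Face-style causal LM whose parameters have requires_grad=False.
k, d_model = 10, 4096
soft_prompt = torch.nn.Parameter(torch.randn(k, d_model) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

def step(model, input_embeds, labels):
    batch = input_embeds.shape[0]
    prefix = soft_prompt.unsqueeze(0).expand(batch, -1, -1)
    embeds = torch.cat([prefix, input_embeds], dim=1)
    # Pad labels with -100 so the loss ignores the prefix positions.
    ignore = torch.full((batch, k), -100,
                        dtype=labels.dtype, device=labels.device)
    loss = model(inputs_embeds=embeds,
                 labels=torch.cat([ignore, labels], dim=1)).loss
    loss.backward()  # only soft_prompt accumulates gradient updates
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Since the server only needs to return gradients with respect to a handful of embedding vectors, batches of such jobs could queue behind interactive inference traffic.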

@davidbau Yes, and apart from these models being big, I think any model that doesn't fit on one GPU starts to be complicated for researchers to use, even if they do have enough GPUs to fit it.

Using a central engineering resource that all academics can share would be a game changer.
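
For a sense of the current baseline: sharding a checkpoint across local GPUs with Hugging Face's device_map="auto" is about as easy as it gets today, and it still assumes you own enough hardware. The model name below is just an example; substitute any large checkpoint.

```python
# One common way to shard a large checkpoint across several local GPUs
# (Hugging Face transformers + accelerate). The model name is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neox-20b"  # example; substitute any large LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # split layers across available GPUs / CPU
    torch_dtype="auto",  # load in the checkpoint's native precision
)

inputs = tokenizer("Large models need", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

Even with this, the researcher is still managing memory limits, precision, and checkpoint plumbing themselves, which is exactly what a shared service could absorb.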

@boazbaraktcs Yes, I think so too: just having good machinery for the practical aspects of working with unwieldy large models in a research setting will be important. It will make a big difference if that becomes easy and routine.