Dolly 2.0 is a really big deal: https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm

"The first open source, instruction-following LLM, fine-tuned on a human-generated instruction dataset licensed for research and commercial use"

My notes so far on trying to run it: https://til.simonwillison.net/llms/dolly-2

@simon It surprises me that, for something as performance-critical as LLMs, people use an inefficient language like #Python, where everything, especially GPU access, goes through multiple abstraction layers.
@fell IDK how familiar you are with Python or ML libraries in Python, but for "real" applications (not just learning the basics), all of the actual computation of the model is pushed down to native code. Python remains useful as the glue language, as it always has been
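To illustrate the "glue language" point: here's a minimal sketch of Python handing work to native code via the standard library's ctypes, calling libm's `sqrt` directly. Libraries like NumPy and PyTorch use the same pattern at vastly larger scale, dispatching whole tensor operations to C/C++/CUDA kernels. (This assumes a system libm is findable, which is true on typical Linux/macOS installs.)

```python
# Python as "glue": the actual computation runs in a native library;
# Python just locates the library and marshals the arguments.
import ctypes
import ctypes.util

# Load the system math library (libm) -- assumed present on this system
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

# The sqrt itself executes as compiled native code, not Python bytecode
print(libm.sqrt(2.0))
```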
@2ck Forgive me if this sounds rude, but don't see the point in leaning and using a "glue" language when you could simply use C/C++ straight away. It allows compiler optimisations throughout the entire program, direct access to operating system features like memory mapping and just less wasted instruction cycles overall. ML is the most intense computing application I can think of, and I really don't get why Python of all things became the de facto standard.
@fell ah, if you mean why Python is in the position it's in: I think it's mostly cultural rather than technical, and to some extent historical accident. A few motivated ML folks who also liked Python built libraries that were easy to play around with, which others in the community picked up and built on.
@simon @fell this has been bothering me, too, but I assume Python is serving as a control language for optimized machine code (or should we say β€œshaders” for GPU) and not where efficiency matters.
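The "control language" framing can be demonstrated without any GPU: the interpreter's overhead only matters when Python itself executes the inner loop. A rough sketch, comparing a pure-Python summation loop against the C-implemented builtin `sum()` over the same data (exact timings will vary by machine):

```python
# Sketch: per-iteration interpreter overhead vanishes once the loop
# is pushed down into native code -- here, CPython's C-level sum().
import timeit

data = list(range(1_000_000))

def py_loop():
    # Every iteration runs through the bytecode interpreter
    total = 0
    for x in data:
        total += x
    return total

t_py = timeit.timeit(py_loop, number=5)
t_c = timeit.timeit(lambda: sum(data), number=5)  # loop runs in C
print(f"pure Python loop: {t_py:.3f}s, builtin sum: {t_c:.3f}s")
```

The same logic scales up: one Python-level call into a GPU kernel that does billions of FLOPs costs microseconds of interpreter overhead, which is noise next to the native work it triggers.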