@kirakira I would recommend:
https://github.com/karpathy/nanoGPT
(It's an older project, "left up for posterity" after its author switched focus to a chat interface.)
@kirakira What you're looking for is a "foundation model". Look for them on Hugging Face and run them with llama.cpp. If you want to train from scratch, I think llama.cpp has training scripts.
I unfortunately had a local LLM experimentation stint in early 2023. As much as I hate the fact that I somehow ever thought these things had any merit to them, I do appreciate that the technical understanding I acquired makes it very easy to smell the bullshit.
@kirakira Here you go, Kira: here's the Llama 2 foundation model, 7 billion parameter version, quantized for running on local hardware.
This is an early quantization, from back when TheBloke was the main person doing quants, so it isn't as high quality as modern quants, but honestly it doesn't make a damn difference.
@kirakira You can use this with llama.cpp. Get the 4-bit version (Q4_K_M or Q4_K_S, not Q4_0!) and plop it in.
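For anyone following along, "plop it in" looks roughly like this: a sketch assuming you have a llama.cpp build with the llama-cli binary available, the huggingface-cli tool installed, and TheBloke's Llama-2-7B-GGUF repo naming (file names are assumptions based on that repo's layout):

```shell
# Fetch the 4-bit K-quant of the base (non-chat) model.
# File name assumed from TheBloke's repo layout.
huggingface-cli download TheBloke/Llama-2-7B-GGUF \
  llama-2-7b.Q4_K_M.gguf --local-dir .

# Raw text completion: this is a foundation model, so there's no chat
# template -- it just continues whatever text you hand it.
llama-cli -m llama-2-7b.Q4_K_M.gguf \
  -p "Once upon a time" -n 64
```

Since it's a base model, don't expect it to answer questions; it only predicts a continuation of the prompt, which is exactly what makes it useful for poking at the underlying model rather than a fine-tuned assistant persona.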
Take a moment to note that by the time Llama 2 dropped, TheBloke was already being sponsored by a16z. Nobody in our anti-AI circles ever talks about this, but the local LLM/GenAI movement is just as funded and boosted by fascists as the OpenAI/Anthropic side of things.
It's shit and dirt all the way up, but it's shit and dirt all the way down too.