Good morning folks
As expected, all that weeding has left my back hurting.
Eep.
Let me start up #Barkscrolling for ya
(edit: it already has posts? Nice!)
And some paperbark
Putting together my own LLM dataset(s) has been verrrry educational in a lot of directions.
First, the importance of keeping anything time-sensitive out ('X TV show is on now', 'Y thing just happened', even things like 'you can buy Z for $4') - this is waaaay more pervasive than you might expect.
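For anyone curious what that kind of filtering might look like, here is a rough sketch, not what I actually run. The pattern list and the names `TIME_SENSITIVE_PATTERNS`, `is_time_sensitive`, and `keep_evergreen` are all made up for illustration, and a real dataset would need a much longer list plus manual spot checks.

```python
import re

# Hypothetical patterns for illustration only; a real list would be far longer.
TIME_SENSITIVE_PATTERNS = [
    r"\b(?:right now|is on now|tonight|today|tomorrow|yesterday)\b",
    r"\bjust (?:happened|announced|released|dropped)\b",
    r"\bthis (?:week|weekend|month|year)\b",
    r"[$£€]\s?\d+(?:\.\d{2})?",  # prices go stale quickly
]
TIME_SENSITIVE_RE = re.compile("|".join(TIME_SENSITIVE_PATTERNS), re.IGNORECASE)


def is_time_sensitive(text: str) -> bool:
    """Return True if the text matches any of the (assumed) patterns."""
    return TIME_SENSITIVE_RE.search(text) is not None


def keep_evergreen(lines):
    """Keep only the lines that do not look time-sensitive."""
    return [line for line in lines if not is_time_sensitive(line)]
```

A regex pass like this only catches the obvious cases, which is sort of the point: the pervasive stuff is whatever slips past it.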
Also: 700+ MB of raw text, all in one file?
Fuuuuuuuuuck thaaaaaaaaat
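One way around the giant-single-file problem is to stream it line by line into smaller shards. This is only a sketch under assumptions: the function name `split_text_file`, the `shard_00000.txt` naming, and the 50 MB default are invented here, not anything from my actual setup.

```python
from pathlib import Path


def split_text_file(src, out_dir, max_bytes=50 * 1024 * 1024):
    """Split src into shards of roughly max_bytes each, breaking only on line ends.

    Reads one line at a time, so the whole file never sits in memory.
    A single line longer than max_bytes still gets its own oversized shard.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    shard_paths = []
    out = None
    written = 0
    with open(src, "rb") as f:
        for line in f:
            # Start a new shard if none is open or this line would overflow it.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                path = out_dir / f"shard_{len(shard_paths):05d}.txt"
                shard_paths.append(path)
                out = open(path, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return shard_paths
```

Working in bytes avoids any encoding guesswork, and joining the shards back together in order gives you the original file.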