Zac Yu 🌈

@zacyu
software engineer in pittsburgh
Host: https://zacyu.com
Accept-Language: en-US,en,zh,ja
i hope this email finds you well
Projects only have 2 states
No, I didn't "forget your name", I didn't store it in the first place, it's called being GDPR compliant.
hi

"AI companies claim their tools couldn't exist without training on copyrighted material. It turns out, they could — it's just really hard. To prove it, AI researchers trained a new model that's less powerful but much more ethical. That's because the LLM's dataset uses only public domain and openly licensed material."

tl;dr: If you use public domain data (i.e. you don't steal from authors and creators) you can train an LLM just as good as what was cutting edge a couple of years ago. What makes it difficult is curating the data, but once the data has been curated, in principle everyone can use it without having to go through the painful part again.
So the whole "we have to violate copyright and steal intellectual property" argument is (as everybody already knew) total BS.

https://www.engadget.com/ai/it-turns-out-you-can-train-ai-models-without-copyrighted-material-174016619.html?src=rss

It turns out you can train AI models without copyrighted material

It's just a pain in the ass.

Engadget
Every website in 2025.
Pronouns in DNS

This is my PhD thesis

I did not ask for this

I did not consent to this

I did not approve of this

I was not compensated for this

I would not have advised this

I do not like this

And worst of all, the number of people who've read my thesis has still not increased.