I mean, it is possible to train LLMs on well-curated, licensed datasets where it's well known both what goes in and what the limits are. But those are not the models that $1T is being "invested" in.
@knud @Pionir @scott they're spending the investor money on datacenters and power plants and yachts and rockets and things… when you're buying all that, you can't afford to spend any money paying to license the human creativity that makes the product possible in the first place :(