GitHub has a setting "Allow GitHub to use my data for AI model training" which defaults to Enabled. You might want to turn it off, though it's probably too late and likely won't stop other bots crawling your code.
@dougbinks example: https://github.com/search?q=%22ipc.h+-+v0.2%22&type=code
It has its own repo, but there are 30 copies of it under other people's projects.
@sol_hsa @dougbinks I know you just meant this as an example, but public domain:
1. maybe has a higher distribution of people just copying it into their code bases
2. implicitly has permission for AI training
so I think you need an example that's not PD to make this argument convincing.
@nothings @dougbinks welll.. true. I've just seen it done to all sorts of libraries. People even copy the whole of Boost under their project, which is insane.
I don't think ipc.h is even that popular (especially compared to stb_image.h =), but I was still surprised how many times it's been copied..
I guess I could spend the evening looking up different repos that are copied under other repos but I think I'd rather watch the trees sway in the wind..
@dougbinks To make things complex, I don't have a fundamental problem with a Chinese AI lab that releases the resulting weights for me to use for free. It works, and it's one of the best models currently available.
But giving the data to Microslop, so they can keep the weights closed up tight and sell them back to me? Yeah, no thanks.
@dougbinks I think this refers to your *Copilot* input/output/context, not your repos. I disabled that setting just to be sure, but I think as long as I'm not using Copilot it won't make a difference.
Pretty sure public repos have already been, and will always be, used for training AI models.
@Doomed_Daniel It states "Allow GitHub to use my data for AI model training", which seems pretty clear.
There is another setting for "Repository access: Choose which repositories Copilot coding agent should be enabled in." which is here:
https://github.com/settings/copilot/coding_agent