GitHub has a setting, "Allow GitHub to use my data for AI model training", which defaults to Enabled. You might want to turn it off, though it's probably too late and likely won't stop other bots from crawling your code.

https://github.com/settings/copilot/features

Build software better, together

GitHub is where people build software. More than 150 million people use GitHub to discover, fork, and contribute to over 420 million projects.

GitHub

@dougbinks I think this refers to your *Copilot* input/output/context, not your repos. I disabled that setting just to be sure, but I think as long as I'm not using Copilot it won't make a difference.

Pretty sure public repos have already been, and will always be, used for training AI models.

@Doomed_Daniel It states "Allow GitHub to use my data for AI model training", which seems pretty clear.

There is another setting for "Repository access: Choose which repositories Copilot coding agent should be enabled in." which is here:
https://github.com/settings/copilot/coding_agent

@dougbinks as it's a Copilot setting, and the text below the heading you cited refers to "Inputs, Outputs, and associated context", I'd assume this is about data in Copilot.
@Doomed_Daniel I think you might be right, but the use of the term "data" makes me think it's potentially more than that. The linked document clarifies nothing, sadly.
@dougbinks true, probably makes sense to assume the worst