"GitHub’s Copilot will use you as AI training data, but you can opt out"

"...if you’ve used the code completion in Visual Studio Code, asked Copilot a question on the GitHub website, or used another related AI feature, your interactions and code snippets could be harvested...."

https://www.howtogeek.com/githubs-copilot-will-use-you-as-ai-training-data-but-you-can-opt-out/

#ai #microsoft #copilot

@ai6yr I’ve assumed all along that ALL the code stored in GitHub has been used to train their LLM. Does anyone believe that is not the case?

@patmikemid @ai6yr It was always the case, because early versions of Copilot would happily suggest other people’s API keys etc. if they had been accidentally committed.

@SecureOwl @patmikemid LOL there are so many keys in GitHub. I imagine people are already automatically scraping them for nefarious purposes.

@ai6yr @patmikemid Oh yeah, 100% - when I was running security for an IoT platform (yes, we had security), I used to scrape defensively as well and reach out to people who had committed API keys to our platform by accident, before they could be used by bad actors.

GitHub has a program that will autodetect them too, but you have to commit to using a unique key format so their regexes can be more reliable.
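To illustrate why a unique key format helps: with a distinctive prefix, detection is a plain regex match instead of an error-prone entropy heuristic. A minimal sketch, assuming a hypothetical `acme_` prefix plus 32 hex characters (not GitHub's actual patterns):

```python
import re

# Hypothetical key format: a fixed "acme_" prefix followed by 32 hex
# characters. A distinctive prefix like this is what makes automated
# scanning reliable -- generic "high-entropy string" checks produce
# far more false positives.
KEY_PATTERN = re.compile(r"\bacme_[0-9a-f]{32}\b")

def find_keys(text: str) -> list[str]:
    """Return every substring of `text` that matches the key format."""
    return KEY_PATTERN.findall(text)

sample = 'config = {"api_key": "acme_0123456789abcdef0123456789abcdef"}'
print(find_keys(sample))  # the leaked key is flagged
```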

@SecureOwl @ai6yr @patmikemid

It is called "secret scanning" on GitHub (and Azure DevOps).

You can achieve a similar, and perhaps better, result with gitleaks and a git pre-commit setup.
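A minimal sketch of that hook, assuming gitleaks v8 is installed locally (the `protect --staged` subcommand scans staged changes in v8; newer releases may rename it, so check `gitleaks --help` for your version):

```shell
#!/bin/sh
# .git/hooks/pre-commit -- abort the commit if gitleaks finds a secret
# in the staged changes. Assumes gitleaks v8 is on PATH.
gitleaks protect --staged --verbose
exit $?
```

Make the file executable (`chmod +x .git/hooks/pre-commit`), or use a hook manager so the whole team gets it, since `.git/hooks/` isn't committed to the repo.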

If your environment has pipelines/runners, ALSO add a job (or whatever your CI variant calls it) that triggers on commits to run gitleaks.

That won't stop secrets from being committed, but you'll get a warning that they are being stored.
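For the pipeline side, one sketch using GitHub Actions: the gitleaks project publishes an official action, though the exact workflow below is an assumption-laden example (verify the action name and version against the gitleaks README, and adapt for GitLab CI, Jenkins, etc.):

```yaml
# .github/workflows/gitleaks.yml -- scan every push and pull request.
name: gitleaks
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, so earlier commits are scanned too
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

`fetch-depth: 0` matters here: a shallow checkout would only scan the latest commit, missing secrets buried earlier in the branch history.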