So I just learned what "The Stack" is today: an aggregation of GitHub repos for machine learning from which I can opt out.

But I won't.

I won't because they scraped some hot garbage I wrote in bash and Python that would make you faint. Bottom-of-the-barrel throw-away scripts full of coding crimes. Stuff like

find | grep | awk | xargs | ugh

...invoked via subprocess.run() then fed into more garbage.

I want "artificial intelligence" to learn this. It's going to be fantastic.

tired: opt out of AI training datasets
wired: enthusiastically opt in all the garbage that's sitting on your disk

I wonder if I could cook up a script that turns Star Trek erotic fan fiction into Rust code, then upload *that* to GitHub

@gabrielesvelto link to such fiction please 😂

@derickr the Archive of Our Own has 100k+ such works, carefully labeled with genre, warnings, etc...

https://archiveofourown.org/tags/Star%20Trek/works

Ironically this stuff did end up in many machine-learning training datasets, creating one of those typical "what could go wrong?" scenarios.

@gabrielesvelto "Hello, Spock" instead of "Hello, world"

@elithebearded it's more like "Hello, Spock 😉 😉"

@gabrielesvelto Consider: Markdown

Then the same thing but rendered as HTML, just in case