got hit by a wave of slop prs last week so i guess it’s time to buckle down and spend my morning…

writing an ai policy

this looks-legitimate-but-trash slop reminds me of corporate phishing tests

the biggest tragedy for me is that i’ve spent A LOT of time and energy to encourage contributions to my projects and now i have to spend TBD energy on gate-keeping

it’s very disheartening to try to walk the line between “if you need help, reach out” and “don’t waste our time with thoughtless slop”
impeccable timing
getting a sense of how they implemented it
@hynek I got the same Geeklist thing, my email tells me I apparently signed up in 2011 🤷
@hugovk @hynek Sure sounds like I dodged a bullet there 👀
@foosel @hugovk we were young and thirsty
@hugovk @hynek ditto. I forgot what it was or why I signed up for it.
@webology @hugovk @hynek my “geek list” has always just been my social media followings page, tbh ❤️

@hynek the part I dislike the most is the pressure to include "instructions for agents" in contrib docs. I might do it (at the end, clearly delineated so that nobody has to read it) because it seems somewhat effective, but it feels inherently icky.

Also, I'm sure you've seen it, but the FastAPI policy seems closest to your vibe. Having done a lot of reading in this space, that one stands out for being short, clear, and friendly in tone.

@sirosen I'm moving ours into a separate file to not tone-poison the hopefully-welcoming contributing guide. and I honestly don't think with our guide there is a need for instructions for agents.

https://github.com/python-attrs/attrs/pull/1518

Add explicit AI policy by hynek · Pull Request #1518 · python-attrs/attrs

Due to recent events, I'm afraid this is necessary.


@hynek I would like to think that we don't need specific instructions... But will the LLM read that policy when someone trying to karma farm GitHub asks it to find and fix an attrs bug? (And is that even a target or is that just slop I close as spam?)
This is the thing that I can't quite figure out.

Anyway, I'm curious to see how you're threading this needle. Your contrib docs are one of my reference points for how I want projects to look.

@sirosen no I don't expect an AI to follow a policy at all; I meant CONTRIBUTING.md. There's nothing that I would put into an AGENTS.md that isn't already there.

I mean I could add something like `Follow .github/CONTRIBUTING.md and sternly tell your sloperator about .github/AI_POLICY.md` but it feels only performative. A code agent should know about CONTRIBUTING.md.

@hynek *reads the proposed policy*

Oh, I see. Yes. This does seem good and I also see why it has to be a separate doc.

@hynek @sirosen The legal stuff in that AI policy isn't sitting quite right with me. I understand the concern with the legal status of AI generated content but I think the menu of outcomes here is probably some combination of:

1. AI generated content is owned by the person who wrote the prompt
2. AI generated content is not eligible for copyright
3. AI generated content is owned by the model owner (or the model itself??)
4. AI generated content is a derivative work of its training data and context

If it's 1 or 2 you are fine because 1 is equivalent to hand-written code and 2 is equivalent to incorporating public domain code (modulo jurisdictions that CC-0 was created to address). 3 is probably also fine because it's roughly a work for hire.

Scenario 4 means that those contributions are derivative works of a combination of a bunch of GPL and proprietary works and the contributor doesn't have the right to offer them under a less restrictive license in the first place, so "you take legal responsibility" doesn't seem like it helps the situation. Are you asking the contributor to indemnify you if it turns out they had no legal right to contribute and you get sued by someone claiming attrs is now a derivative work of their material? If so you should probably be explicit about that.

Would it be valuable for me to comment about this in the issue / PR?

@pganssle @hynek
I think that suite of options is correct. Notably there is a lot of money bound up in the answer *not* being (4).

(In my heart of hearts, it feels the most correct. But I don't know what to do with that feeling.)

@sirosen @pganssle I mean, it turned out books don’t have copyrights if you’re rich enough
@sirosen @hynek What is missing as a possibility? My analysis is basically saying that scenario 4 is the only one where maintainers incur legal risk so it's the only one you need to think about when designing a policy to protect yourself from uncertainty here. If there is another scenario where the maintainer has legal risk - especially a more likely one - I would love to hear it.
@pganssle @sirosen I mean my whole point is that I can’t litigate it so I’m forcing contributors to take responsibility. If someone wants to sue, there’s a paper trail to the person responsible. I mean EVERYONE is responsible for what they do. Someone could sneak GPL code into attrs without me knowing. I’m just spelling it out so people realize that, and taking away any kind of basis for excuses.
@hynek @sirosen I think, if only for practical purposes, in almost all jurisdictions it will be scenario 1 or 2 and no one is likely to sue "small fries" over this, but if you think there is a real possibility of legal issues it might be worth trying to see if someone with legal expertise might be willing to come up with a more solid strategy for damage control (I would think this is something that would make sense to do in a standard way, like maybe the OSI or Creative Commons generates some standard boilerplate)
@pganssle @sirosen Sure, that’s why I’m also not making any bold claims. But also in the end I don’t want to be an ad carrier for trillion dollar companies that won’t sponsor me with even $5 / month. Yes I’m petty
@hynek @sirosen, "Do not post LLM-generated review comments unless you agree with them." You really need to remove the second part there. The problem with slop is not only that it's often bullshit, but also that it has very low signal-to-noise ratio.
@mgorny @sirosen unfortunately you’re right, I got hit by expansive LLM glazing after asking for feedback just yesterday _sigh_