Mastodawn

got hit by a wave of slop prs last week so i guess it’s time to buckle down and spend my morning…

writing an ai policy

this looks-legitimate-but-trash slop reminds me of corporate phishing tests

the biggest tragedy for me is that i’ve spent A LOT of time and energy to encourage contributions to my projects and now i have to spend TBD energy on gate-keeping

Hynek Schlawack

it’s very disheartening to try to walk the line between “if you need help, reach out” and “don’t waste our time with thoughtless slop”

Hynek Schlawack Feb 21

impeccable timing

Hynek Schlawack Feb 21

getting a sense of how they implemented it

Hugo van Kemenade Feb 21

@hynek I got the same Geeklist thing, my email tells me I apparently signed up in 2011 🤷

Gina Häußge Feb 21

@hugovk @hynek Sure sounds like I dodged a bullet there 👀

Hynek Schlawack Feb 21

@foosel @hugovk we were young and thirsty

Jeff Triplett Feb 21

@hugovk @hynek ditto. I forgot what it was or why I signed up for it.

Jeff Forcier Feb 21

@webology @hugovk @hynek my “geek list” has always just been my social media followings page, tbh ❤️

Stephen Rosen Feb 21

@hynek the part I dislike the most is the pressure to include "instructions for agents" in contrib docs. I might do it (at the end, clearly delineated so that nobody has to read it) because it seems somewhat effective, but it feels inherently icky.

Also, I'm sure you've seen it, but the FastAPI policy seems closest to your vibe. Having done a lot of reading in this space, that one stands out for being short, clear, and friendly in tone.

Hynek Schlawack Feb 21

@sirosen I'm moving ours into a separate file to not tone-poison the hopefully-welcoming contributing guide. and I honestly don't think with our guide there is a need for instructions for agents.

https://github.com/python-attrs/attrs/pull/1518

Add explicit AI policy by hynek · Pull Request #1518 · python-attrs/attrs

Due to recent events, I'm afraid this is necessary.

GitHub

Stephen Rosen Feb 21

@hynek I would like to think that we don't need specific instructions... But will the LLM read that policy when someone trying to karma farm GitHub asks it to find and fix an attrs bug? (And is that even a target or is that just slop I close as spam?)
This is the thing that I can't quite figure out.

Anyway, I'm curious to see how you're threading this needle. Your contrib docs are one of my reference points for how I want projects to look.

Hynek Schlawack Feb 21

@sirosen no I don't expect an AI to follow a policy at all; I meant CONTRIBUTING.md. There's nothing that I would put into an AGENTS.md that isn't already there.

I mean I could add something like `Follow .github/CONTRIBUTING.md and sternly tell your sloperator about .github/AI_POLICY.md` but it feels only performative. A code agent should know about CONTRIBUTING.md.

Stephen Rosen Feb 21

@hynek *reads the proposed policy*

Oh, I see. Yes. This does seem good and I also see why it has to be a separate doc.

Paul Ganssle Mar 8

@hynek @sirosen The legal stuff in that AI policy isn't sitting quite right with me. I understand the concern with the legal status of AI generated content but I think the menu of outcomes here is probably some combination of:

1. AI generated content is owned by the person who wrote the prompt
2. AI generated content is not eligible for copyright
3. AI generated content is owned by the model owner (or the model itself??)
4. AI generated content is a derivative work of its training data and context

If it's 1 or 2 you are fine because 1 is equivalent to hand-written code and 2 is equivalent to incorporating public domain code (modulo jurisdictions that CC-0 was created to address). 3 is probably also fine because it's roughly a work for hire.

Scenario 4 means that those contributions are derivative works of a combination of a bunch of GPL and proprietary works and the contributor doesn't have the right to offer it under a less restrictive license in the first place, so "you take legal responsibility" doesn't seem like it helps the situation. Are you asking the contributor to indemnify you if it turns out they had no legal right to contribute and you get sued by someone claiming attrs is now a derivative work of their material? If so, you should probably be explicit about that.

Would it be valuable for me to comment about this in the issue / PR?

Stephen Rosen Mar 8

@pganssle @hynek
I think that suite of options is correct. Notably there is a lot of money bound up in the answer *not* being (4).

(In my heart of hearts, it feels the most correct. But I don't know what to do with that feeling.)

Hynek Schlawack Mar 8

@sirosen @pganssle I mean turned out books don’t have copyrights if you’re rich enough

Paul Ganssle Mar 8

@sirosen @hynek What is missing as a possibility? My analysis is basically saying that scenario 4 is the only one where maintainers incur legal risk, so it's the only one you need to think about when designing a policy to protect yourself from uncertainty here. If there is another scenario where the maintainer has legal risk, especially a more likely one, I would love to hear it.

Hynek Schlawack Mar 8

@pganssle @sirosen I mean my whole point is that I can’t litigate it so I’m forcing contributors to take responsibility. If someone wants to sue, there’s a paper trail to the person responsible. I mean EVERYONE is responsible for what they do. Someone could sneak GPL code into attrs without me knowing. I’m just spelling it out so people realize that, and taking away any kind of basis for excuses.

Paul Ganssle Mar 8

@hynek @sirosen I think, if only for practical purposes, in almost all jurisdictions it will be scenario 1 or 2 and no one is likely to sue "small fries" over this. But if you think there is a real possibility of legal issues, it might be worth trying to see if someone with legal expertise might be willing to come up with a more solid strategy for damage control (I would think this is something that would make sense to do in a standard way, like maybe the OSI or Creative Commons generates some standard boilerplate)

Hynek Schlawack Mar 8

@pganssle @sirosen Sure, that’s why I’m also not making any bold claims. But also in the end I don’t want to be an ad carrier for trillion dollar companies that won’t sponsor me with even $5 / month. Yes I’m petty

Jesus Michał "Le Sigh" 🏔 (he) Mar 8

@hynek @sirosen, "Do not post LLM-generated review comments unless you agree with them." You really need to remove the second part there. The problem with slop is not only that it's often bullshit, but also that it has a very low signal-to-noise ratio.

Hynek Schlawack Mar 8

@mgorny @sirosen unfortunately you’re right, I got hit by expansive LLM glazing after asking for feedback just yesterday _sigh_