Question for people who choose not to use generative AI for ethical reasons: Do you make that choice despite accepting the growing evidence that it works (at least for some tasks, e.g. coding agents working on some kinds of software)? Or do you reject it because of the ethical problems *and* a belief that it doesn't actually work?

I'm thinking that principled rejection of generative AI might have to be the former kind, *despite* evidence that it works.

Thanks to everyone who has responded so far.

To focus on a specific definition of "it works", take this post that I boosted:

https://toot.cafe/@nolan/116185451572229163

He has seen coding agents fix bugs with minimal prompting, and it's effective enough that he finds it terrifying. What should we make of that? He's ambivalent, but clearly feels that we should take seriously the demonstrated abilities of these tools, and as a result, he's using them, but not happily. I'm trying to figure out what to do with that.

Nolan Lawson (@[email protected])

I think what a lot of AI critics are missing is that they're judging an LLM by its first draft. This is *not* what terrifies me about these machines. What terrifies me is that you can ask them "find bugs in this PR." Or "find performance flaws." Or really anything. Then have 3 agents (with different models ideally) vote on the result. Then have another fix it. Repeat until all bugs are clean. If you haven't tried this experiment then you haven't reached the dark night of the soul that I have.

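A minimal sketch of the loop Nolan is describing, just to make the mechanics concrete; the reviewer and fixer callables here are hypothetical stand-ins for calls to different models:

```python
from collections import Counter

def review_fix_loop(diff: str, reviewers, fixer, max_rounds: int = 5) -> str:
    """Hypothetical sketch of the workflow described above: several
    reviewer agents (ideally different models) vote on a change, and a
    separate fixer agent revises it until the reviewers agree it's clean."""
    for _ in range(max_rounds):
        # Each reviewer returns "clean" or "buggy" for the current diff.
        votes = Counter(review(diff) for review in reviewers)
        if votes["buggy"] == 0:
            return diff  # unanimous "clean": stop iterating
        diff = fixer(diff)  # hand the flagged diff to a fixer agent
    return diff  # give up after max_rounds; may still contain bugs
```

Whether that loop converges on correct code, or merely on code the reviewers stop objecting to, is the open question in the replies below.
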
@matt I’d consider myself to be in the second category, as I have not yet seen such evidence.
@jvnknvlgl Yes, I was wondering about this evidence, too.

@matt Apologies for the wall of text, but it's a nuanced topic that I can't do justice to succinctly.

This is hard to give a precise answer to because “work” is not a discrete binary category here. (I suspect this might also be driving some of the disagreements.) Personally, I am specifically worried about code that compiles and seems to do the right thing… yet has security, reliability, or other quality problems that are not caught. Sure, human programmers also make that kind of mistake.

I suppose the short honest answer is that for me it's both, as I don't think it works well enough to my satisfaction. Specifically, to me, one of the great things about software is how it's deterministic (well, mostly). Considering that randomness is core to what makes LLMs work (to the degree they do), I have a hard time accepting that as “working”, I suppose.

I think this is interesting, because I suspect a lot of us are looking at the same data but interpreting it differently (think of two researchers categorizing statistical numbers differently… they have different viewpoints despite looking at the exact same raw data).

I *do* know of devs who are very clear that even if it worked as claimed, it would not change their opinion one bit. I would like to believe I am one of those, but I have not been in the situation where I have been genuinely convinced they work well. A lot of aspects of tech are already highly immoral, yet I still participate in at least some of that tech (but I also reject some, and have made some parts of my tech life significantly less convenient because of it).

Another aspect of why I don't use LLMs is neither about their ethics nor how well they work, but the fact that I'm apprehensive of the cognitive effects they will have on *me*. Whether it's speaking or writing, both are useful ways to *think through problems* and carefully consider ideas and how they connect. I am not confident that my abilities for abstract reasoning will not be negatively affected by LLM usage, so that is also a reason I refrain.

@matt I believe it can produce code that works but I do not believe that it results in an overall software engineering process that works.
@matt I don't think it's as effective as its boosters claim (I suspect we're going to see--or are already seeing--a real decline in software quality as we start seeing its code in production), but even if it worked perfectly I wouldn't use it. I have enough regrets about the externalities of tech and who bears them, I don't need more of that on my conscience.
i don't use cloud/big-tech ai. i will probably try running something locally but haven't gotten to it yet. For me it is similar to not hosting my passwords at apple, not using a youtube account, using a small email host… all of this is more work than using the big-tech solutions, but i still do it.
@inkreas.ing if it helps, a super easy way to get going locally (provided your computer is good enough) is this app https://anythingllm.com/
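
If you'd rather skip a packaged app entirely, here's a minimal sketch of the same idea, assuming an Ollama-style server running on its default local port (the model name is just an example):

```python
import json
import urllib.request

def ask_local(prompt: str, model: str = "llama3") -> str:
    """Send a prompt to a locally running Ollama server. Nothing leaves
    the machine, which is the point of the self-hosting approach above."""
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",  # Ollama's default endpoint
        data=json.dumps({"model": model, "prompt": prompt,
                         "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The trade-off is the one @inkreas.ing describes: more work than the big-tech solutions, but the prompt and the data stay on your own hardware.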

@matt
Question for people who choose to use generative AI: Do you make that choice despite accepting the growing evidence that it incurs frankly terrifying external costs? Or do you use it because you ignore those costs *and* believe it actually works?

I have yet to see an application for whatever you want to call these "AIs" where their performance is worth the cost. Yet for some reason that makes me an ethical purist in an ivory tower who ignores "growing evidence" [citation needed] that "gen AI" works for anything but the most trivial cases. Meanwhile, "AI" boosters are allowed to flout ethics, flout externalities, and flout logic, but you don't accuse them of "belief" in something contrary to evidence. Why is that, I wonder?

If you're genuinely interested in asking the question, you need to rethink your phrasing. If you're not, you might still want to rethink it because it smells funny. Hope that helps! 🙂

@matt many self-reported claims of something working are not the same as evidence of it working.

Research on AI is lagging far behind (unavoidably; the field is evolving fast). Even ignoring external harms, it's unclear whether it causes long-term personal harm, and what it means for the maintainability of projects.

It's entirely possible that it causes *important* skill atrophy (every new tool causes skill atrophy of some kind; often the lost skills are irrelevant). Of course last year's studies on this topic don't apply to the current models, just like studies done on these models won't apply to next year's models. But patterns are appearing.

It's also possible that large projects where everyone uses AI heavily won't have anyone who understands the details anymore, only the broad designs. Of course, an AI can always explain those to you, or you can just regenerate them with a bigger model in the future.

But all that said... If we had a performance-enhancing drug that allowed ppl to be x times more productive, would we really be this careless with it just because its chemical makeup hasn't been restricted yet?

GenAI is basically a nonchemical drug (listen to the LLM maximalists, not me), and I am worried about heavy users, I am worried about companies forcing employees to use them everywhere, I am worried about ppl getting addicted to them and frying their brains (burnout as a service)

And at this point we haven't even talked about the damage all the misuse (be it malicious or ignorant) has caused, or the damage the training does to the world and to ppl in the training mines.

And every time I see some new science about AI, it makes me more and more worried about its heavy users.

https://thingy.social/@malcircuit/116290027307902048

Mallory's Musings & Mischief (@[email protected])

How many studies do researchers need to do before the threat of LLMs is taken seriously? This technology *might* have some useful niche applications, but widespread deployment will be a disaster for humanity. This shit is an existential hazard, and not in the way the AI companies love to talk about. It's not going to take over the world like Skynet, it's a cognitohazard that turns anyone that interacts with it into an idiot. https://www.psychologytoday.com/us/blog/the-algorithmic-mind/202603/adults-lose-skills-to-ai-children-never-build-them

@matt i'd argue that it seems to work very well on the surface. But i'd like to see evidence that it works in the long term. Like if it produces code twice as fast but requires twice as much maintenance time because it breaks every now and then. But i'd never dispute that it's super good at understanding human language and producing something of _immediate_ value.
@matt I think I'm in the first group. Or at least I think the ethics'n'co needs to be answered before questioning if it does or does not work.
@matt I put myself in the first category.

@matt I have staked out the position that I will do my own thinking. The hard work, the blunders, and the late nights will be the point of it all. When I mess up, I will not blame it on "AI", but at the same time it cannot claim credit for my successes.

Since I'm a Colorado guy, I'll give this analogy: it's like people who drive their car on the road up Pikes Peak, and then say, "I climbed Pikes Peak."

@nantucketlit I guess the pragmatic response to that would be, what if your boss just wants you to get to the other side of Pikes Peak in the most efficient way?

@matt I see growing evidence that it does, & I see growing evidence that it doesn't.

I don't really care either way.

@alcinnz @matt I couldn't agree more; for me, discussing whether AI works or doesn't is just a way to obscure its harm, be it social, environmental, artistic…
Em :official_verified: (@[email protected])

I'm not using any LLM or "AI" for anything that I write or draw. Never had, and never will. I'm making this choice because:

• This technology was built by unethically stealing the hard work of millions without any consent or compensation.
• This technology has been, and still is, constantly scraping data, including personal data, from people without their knowledge or consent, in complete disregard of the privacy laws we have to protect us.
• This technology unnecessarily uses vast amounts of energy in a world where using more energy sadly usually means more pollution.
• This technology is devaluing labor in order to further enrich the already rich, aggravating poverty everywhere.
• This technology is misleadingly being sold as a solution to problems it cannot solve.
• This technology is supercharging disinformation and manipulation online, centralizing an incredible power of influence in the hands of a few controlling billionaires.
• This technology is increasingly being used by authoritarian governments to surveil and control the people.
• This technology atrophies our creativity and capability to think, as well as harming our social relationships.
• This technology makes my writing voice feel flat and boring. I'd rather learn to live with my human tipos.
• This technology...

#NoAI


@matt How high is the bar for "works"? Code can compile and be a security nightmare, for example.

Relatedly, how long must it "work" for it to count? Do you have to be able to maintain the software?

Is the code still licensed properly? Did it "work" legally?

Did the chatbot give you mental illness which interferes with your ability to discern reality, and therefore to tell whether anything works?

@skyfaller @matt

Yep. “Works” needs to be nailed down pretty firmly here.

Case in point: LLMs as a tool for answering factual questions. They will sometimes get them right. They *cannot*, however, actually get there from first principles, the way a human can, and so they also get things wrong a lot. Hilariously, brutally incompetently, wrong. For example, I check in with the major LLMs on topics from my primary hobby (amateur astronomy) every so often, and they routinely botch things. ChatGPT 5.2 recently told me that a galaxy I wanted to look at from some place in New Zealand would be in the northern sky, when in fact I needed to look in the *southern* sky to see it. And that’s not even the worst error I have seen it make. The logical conclusion: either LLMs are bizarrely incompetent at this one single topic, despite there being plenty of useful training data for it, or they are similarly incompetent at other topics and people have a hard time seeing it if they aren’t experts in the topic. I know which way Occam’s razor slices here.

It’s like that for most other use cases I’ve tried, so: regardless of the ethical issues, I not only don’t use LLMs, I actively avoid them.

The one “it works well” use case everyone brings up is software development. LLMs are *definitely* better at this than they were a year ago, but only in one narrow area: speed of writing (and, to an extent, testing) code that *nominally* meets specifications. But software development, especially as a profession, isn’t just about cranking out superficially correctly behaving code fast. I work as a software developer for a large company, and only about 1/4 of my time is actually writing code. A lot of it is making sure I know what my business actually *needs* from code, and a code-spewing machine doesn’t help with that at all. Additionally, I have run into a lot of AI-generated bug reports and task tickets where the AI-generated output was simply wrong, causing me to waste hours tracking down reality. LLMs are actually a clear net-negative value for me right now in my day job.

@matt

Firstly, the ethical problems are so vast that whether it 'works' is irrelevant.

Secondly, I have never seen or imagined a use case that made the technology interesting for me to use.

To me, it is horribly unethical AND uninteresting.

So umm… all of the above?

@matt I've used it and I've seen it work - it clearly has value. But it also has a cost - to creators of original works, to the environment, and to society when megacorporations are monopolising them - which I'm not willing to pay.

Local LLMs are better on the cost front and sometimes satisfy the value side of the equation, so I'm more forgiving of them. But overall, to answer your question, I'm the former, not the latter.

@matt ethical issues but also because I *want* to do the work myself. I don’t care if it’s easier or better with AI. The whole point of my life on this planet is to grow and to learn and to accomplish, and I can’t do that if I outsource my brain to a computer.

@matt GenAI is incredibly young. The fact that we're having discussions about whether "it works" shows the software development industry doesn't think beyond the time frame of the proverbial goldfish's memory.

We're celebrating the disappearance of junior devs, the validity of lines of code as a metric, and the viability of reviewing instead of writing code, and we call that: "it works!"

@matt Thinking about the question some more: I reject the premise that "it works" is separate from any ethical considerations.

@matt this decision is ongoing, but personally I think about it with a matrix like this one:

- risk: what harms might I personally face?
- reward: what benefits might I personally accrue from using it?
- externalities: what harm might I be doing to others as a result of using it?
- systems: what harms (or benefits!) might develop as a result of *everyone* using this tech in the way that I am?

@matt right now the evaluation I have of that matrix is:

- risk: AI psychosis, huge amounts of wasted time, skill loss, social credibility loss, gambling addiction, dependence upon technology that will increase rapidly in price very soon
- reward: maybe it could help me write some code a little bit faster? evidence is very weak even if sentiment is strong here
- externalities: water use, power use, plagiarism, spamming others with low-quality work
- systems: model collapse, financial collapse

@glyph That's a reasonable way to look at it. I think it's easier to argue against using these things if one can point to significant, demonstrable personal risks. If the rewards are stacked and the only counter-arguments are externalities and systems, then abstinence is a much harder sell.
@matt the big problem with the personal risks right now is the lack of any credible safety story from the model vendors. as far as I can tell, we don't *know* what causes AI psychosis. there's some vague correlation with "sycophancy" and maybe they've figured out how to turn that down, but maybe not? we don't know how much skill loss is real. we don't have demonstrated best practices in place.
@matt like, I think that people are far too nervous about nuclear technology because we actually know how that stuff works, we know how to measure dosage and harm and risk, and "ooh, spooky nuclear" is a vibe and not a risk calculation. but the opposite is true for AI systems. we're seeing these wildly dangerous outcomes and then people kinda yadda-yadda-yadda over "best practices" without ever saying what those practices are or providing validated evidence to support them
@matt maybe the risk is very low! but if a guy uses a model to help summarize actuarial tables, goes crazy and starts calling himself a Star Child, and the response from the vendor of the product that arguably did this to him is "well, he probably had some family history of schizophrenia or something" when he was over 40 (WAY past the age where a disease like that generally presents) and that family history also doesn't exist, well, it's concerning that they still want me to use it
@glyph Wait, did that actually happen? I mean, the guy calling himself a star child?
@glyph It's easy to get the message, not only from boosters but also from reluctant users like Nolan (whom I boosted and posted about elsewhere on the thread), that the rewards are so great and undeniable that one would have to be a saint to not use the thing just because of the externalities, and, you know, none of us are saints like that when it comes to other problematic things.
@matt that is a vague pastiche of a few different stories since I can’t check sources right now but it’s not too far off the mark. this comes from an outline of a post I am writing and I have like … a thousand citations to keep track of
@matt as far as the benefits … I know that it is making people feel high, and *maybe* the latest Claude models specifically are just so much better at software than any of the previous six times that somebody said that “this is it!!!”, but we still haven’t seen any real hard evidence in the form of monetary ROI. the one company we know is leaning hard into LLMs for everything, Microsoft, seems to be having a historic number of bugs and outages
ChatGPT Is Giving People Extreme Spiritual Delusions

You may be aware of the fact that ChatGPT suffers from hallucinations. Now some of its users are suffering through similar delusions.

@glyph I would love it if Nolan, and no doubt other developers as well, could have the relief of finding out that what seemed to be the "terrifying" effectiveness of current coding agents was in fact an illusion, with the risks (ideally to both the developer and the business/project) outweighing the rewards.
@glyph And then I could continue to not use coding agents without the FOMO.
@matt I wish I could provide that, but my only real insight is that a ratio of benefits to costs *exists*, and that we are structurally disadvantaged in evaluating its denominator, while boosters either totally ignore or, at best, wildly underestimate it. but I don’t know what it is, and it’s very expensive to measure even without those cognitive, economic, and social impediments to getting an accurate value for either number.
@matt @glyph one of the things that convinced me of its usefulness was asking it to tell me how something in a large codebase works. It hasn't just become pretty good at writing pretty good code (when it has clear success criteria); it's also quite good at finding and documenting how the pieces fit together, much faster than I can. I don't want to presume what your workflow is like, but I can't imagine skimming through hundreds of files and thousands of lines is faster for you than for me.
@matt @glyph I don't have any answers on why some folks are driven kinda mad by the thing. I hope just understanding a little about how it works and recognizing its limits, and that it's obviously not intelligent or all-knowing, can prevent it. Also, I turned off memory features in the chatbots. I don't like it bringing up old conversations and assuming it knows what I'm trying to do. ChatGPT is the most sycophantic and I'm not using it anymore due to them enlisting to do war crimes.
@swelljoe @matt this dynamic has multiple orientations, though. you're assuming everybody is doing roughly the same thing, but there are different roles. there is the person who asks the chatbot "hey how does this work", and then there is the person who already knew how it worked, who has to now spend a bunch of time unwinding subtly incorrect interpretations that others have built up by asking the chatbot "how does this work"
@swelljoe @matt I find myself in the latter role more frequently than the former, and thus my impression of 'how good its answers are' is informed by a sort of distilled residue of one of its failure modes
@glyph @matt the folks suggesting non-technical folks can use LLMs effectively to make software (or maintain software) are still wrong, though the level of technical skill required has shifted quite a bit in recent months. I absolutely believe you've seen people misled by LLMs... but I've also seen folks try to use an LLM to solve a problem, fail, and ask me for help, and I used the same LLM to solve the problem (because it's a problem I have no experience with) in a few minutes.
@glyph @matt they're still kinda dumb in a lot of the same ways they've always been kinda dumb, but in an agentic context they can search the web, try different tactics, etc. and find solutions in a process that looks kinda like what a human solving problems looks like. Not infallible, not all-knowing, but if given clear success criteria it often finds a way. If you recognize when it's looping on something outside its abilities, you can intervene.

@matt I reject it for ethical reasons, the same way I avoid shopping with Amazon.com for ethical reasons more than any pragmatic reason.

Is Amazon often the cheapest, fastest way for me to acquire a thing? Yes. Is it also a terrible company? Yes. If I can acquire something in another way, I look elsewhere. (I am not perfect about this.)

Even if you could show me evidence that, somehow, generative AI produced 100% accurate code or text, I'd still be against using it on ethical and social grounds.

Do I use some LLM-driven software? Yes. I use some local models for transcription. Do I double-check the results? Also yes, because I cannot be 100% sure its results are accurate, and I don't want a fabricated quote winding up in an LWN article.
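
For illustration, a minimal sketch of that kind of local transcription pass, using the open-source openai-whisper package (not necessarily the exact setup described here; the file name is a placeholder):

```python
import whisper  # pip install openai-whisper; inference runs locally

# Load a small model, transcribe, and print the text so a human can
# double-check it against the recording before quoting anyone.
model = whisper.load_model("base")
result = model.transcribe("interview.mp3")  # placeholder file name
print(result["text"])
```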

Might I use LLM-driven stuff at some point for grammar checking? I already use LanguageTool, so ... maybe?

But purely generative AI stuff... I have too many ethical, social, etc. qualms against it to make it part of my work even if I was confident it was 100% accurate. (This is not all, strictly speaking, ethical - I also have qualms about its impact on FOSS development from many other angles, such as increasing the velocity of PRs and putting maintainers under even more stress.)

I also, currently, reject it partially out of spite/stubbornness -- there is far too much "peer pressure" and pushing to accept it. I cannot claim this is a logical stance, but when this much money is being spent to push something, I feel like somebody should be pushing the other direction. I'm just dumb enough to be that somebody.