Mastodawn

Thain 1d ago

@tante I would add something like "I accept that more of my company's data leaks to the United States". :)

Jeff Grigg 1d ago

@Thain @tante

Heck; I keep seeing coworkers run Personally Identifiable Information (PII) through public LLMs, which may retain, train on, and publish that information. That is illegal in the United States. I have pointed that out. Nobody Cares. Not my coworkers. Not our bosses. Nobody.

💢

Koantig 1d ago

@tante
Oh *that's* what I should have used for our internal SOP!
Perfect 👌

derptron 1d ago

@tante Bad framing.

There's no such thing as GenAI.

That's some lofty goal they're supposedly going to reach by investing the entire world economy into it.

Lux 1d ago

@crazyeddie @tante GenAI as in Generative AI, not Artificial General Intelligence (AGI).

Michael Downey 1d ago

@orange_lux both are arbitrary marketing terms

@downey @crazyeddie @tante everything is an arbitrary marketing term

Conny Nasch 1d ago

@tante Never used it as I believe in empathy and creativity when communicating ☺️

@tante @ai6yr I wrote about this yesterday, Mastodon's decision to voluntarily downsize in the face of AI. I think it's sad, but I'll take the cue and start unfollowing.

@buckfiftyseven @tante So let me get this right. You are an AI fan, and you don't like people who are not fans of AI (for the reasons in that post), so you are unfollowing people who don't like AI? That's fine, I guess. 🤔

@ai6yr @tante That's a very shallow way to represent it. I would say I understand American copyright law, and I understand the contradiction of people who run ad blockers while claiming they support copyright law and the contradiction of people who run ad blockers saying that AI training is stealing.

Public domain exists. Open source exists. Creative Commons exists. And the body of law on fair use goes back quite a long time.

@buckfiftyseven @tante Ah, so you are saying if you are using an ad blocker, you are as wrong as the AI companies?

@ai6yr @tante it seems pretty similar doesn't it? Taking what you want from a website, regardless of the host's intentions?

Ulrik 1d ago

@buckfiftyseven @ai6yr @tante I think most people running an adblocker are doing so to block data brokers, not the ad itself. Privacy is as much a part of the equation here as the actual ad.

@mrbase @ai6yr @tante we definitely ended up in an unsatisfactory situation with respect to ads, brokers, and blockers. There's no denying that.

It's interesting that no matter what your website license says, the courts say that the blockers are legal, filtering available content under some concept of fair use.

So we are back to what exactly are AIs doing that is stealing? We can give public domain data a clean pass. I think that they honor most open source and Creative Commons licenses 1/2

@mrbase @ai6yr @tante so we are into a muddy legal ground that will probably have to be battled out in the actual courts, about how a fair use doctrine invented in 1741 for copyrighted works applies forward now.

That's just the input side of course. On the output side it seems clear that too closely reproducing an existing work would be a violation as well.

2/2

Clare Hooley 1d ago

@buckfiftyseven what a weird argument - no, looking at things you want to and not things you don’t is def. not equivalent to taking everything someone has worked to do, repackaging it and selling it through a for-profit company without consent. That’s even apart from the fact that it’s not simple static ads most have an issue with, but profiled, non-contextual data broker content served up for manipulative purposes (e.g. targeted political disinformation). 1/2

Clare Hooley 1d ago

more simply, copyright has a clue in the name - if you’re not copying, you’re not doing anything against it.
Viewing bits of something is not copying, so using a tool to block some content is nothing to do with copyright.
Not only this, to run my simple blog, I actually have to pay real costs in bandwidth/time/other people’s effort to set up blocks etc. to stop stealing AI scrapers making it all fall over trying to take my work *wholesale* for their profit. 2/2

Ozzelot 1d ago

@mrbase
I'd allow an ad that's a static image. Ads as they come now are full untrusted bits of code running on my machine without me inviting them. Blocking them is a security measure.
@buckfiftyseven @ai6yr @tante

The Three Fingers 1d ago

@mrbase @buckfiftyseven @ai6yr @tante and considering organizations like ICE are using ads to install spyware on people's phones now, adblockers are absolutely necessary for everyone. i am actually mad at anyone who's not using an adblocker.

gbsills 1d ago

@buckfiftyseven @ai6yr @tante Actually, sites that don't want you to view them with an ad blocker can easily prevent that.

The Three Fingers 1d ago

@buckfiftyseven @ai6yr @tante no. copyright law is bullshit. the problem is power. like in every situation. poor people stealing from the rich: cool. rich people stealing from the poor: fucked.

@3Fingers @ai6yr @tante I definitely got that vibe already, that many on Mastodon, and to a lesser extent Bluesky, approach AI as a class issue.

Seems strange, both because it's what AI has been building towards for the last 70 years. No surprises here that the first command would be "ok, read everything."

But also because Moore's law applies. All of this will be local and distributed over time.

We're actually quite lucky that there are no binding patents or copyrights on AI.

The Three Fingers 1d ago

@buckfiftyseven @ai6yr @tante all technology benefits the rich over the poor. that's just an unavoidable universal fact. AI is currently destroying the world by empowering the U.S. military and various billionaire fascists who are oppressing all of us.

YinYin Falcon 1d ago

@buckfiftyseven why do you think Moore's law applies here ...?

@YinYinFalcon AI runs on memory and operations. Memory and operations have been scaling with Moore's law since it was coined.

There are also now many large public training sets. People download them and run them now on current hardware.

YinYin Falcon 1d ago

@buckfiftyseven

but that "law" cannot apply forever since it's only an empirical observation of the past

we will reach (or already have reached) the physical limits there

Kwaze Kwaze 1d ago

"There is literally no difference between you and a corporate product -- wait why are you booing me"

Tim Allen 1d ago

@buckfiftyseven @ai6yr @tante Sorry, that’s nonsense. Your business model is not my problem (not you in particular, any possible you who runs an ad-supported business). If I like your service enough to look at it, it’s still my choice as to how much bs I’m willing to put up with. If that means your business isn’t profitable and has to shut down, that’s still my choice as to whether I want to support your business model or not. Everyone has a right to decide how valuable your service is to them.

@kalong @ai6yr @tante Don't you think it's a moral issue to support the intent of the author/creator, in any context?

I can see it being a moral decision to never visit ad-supported sites if you have some opposition to them, but to reject the intents of another human being, and to take their hard work?

Are you actually putting this forward as a high moral position?

@kalong @ai6yr @tante to give an example, Microsoft has the right to offer and license anything they wish in whatever manner, and I have the option to not buy any of it.

It would be different if I insisted that I should be able to crack product IDs and use it anyway.

Tim Allen 1d ago

@buckfiftyseven @ai6yr @tante Not sure I would call it a moral position, that would be coming it a bit high. Just that if you choose, for example, to publish some piece of writing and hope people will see ads that pay you while they are reading, that is a choice you have made. Your potential readers did not make that choice, and it’s not for you to make the choice for them. Some may choose to cooperate with your business model, others may not. As another commenter pointed out, we’re talking about reading or otherwise consuming your content, which is very different from copying it and then republishing or reusing it without acknowledgment - in that latter case then I would agree that the moral arguments and invocations of copyright would have merit.

@kalong @ai6yr @tante I think you are proving my point entirely.

You are saying that you, the reader, may make the decision about how someone else's work is used, no matter their original intent.

This is exactly what [the worst] AI companies do.

@kalong @ai6yr @tante but it's important to note that some are trying to do better.

"kl3m.ai - the cleanest LLM in the world"

Tim Allen 22h ago

@buckfiftyseven @ai6yr @tante I don’t think I am proving your point, but looks unlikely you will be persuaded of that, so I wish you joy of whatever it is you do.

El Duvelle 1d ago

It's really not the same. Ads are manipulative, do not reflect the reality, and are designed to force themselves inside your brain, using resources that might otherwise be employed for more useful things, for example remembering your actual life events. Sure, maybe one ad won't change anything but being bombarded with ads every second of your online life has to be very bad for your attention and memory (I am not aware of existing studies on this, but this is my educated guess given what we know about how memory works).

So, protecting your brain from ads is completely legitimate and is similar to, say, using an umbrella when it rains. People should have all rights to use ad blockers if the website they're on chose to disregard their mental health and use ads to fund itself. There are other ways to fund a website and ads are not the way.

John 23h ago

@elduvelle @ai6yr @tante I get what you're saying, but again I observe that you are putting your moral values upon someone else.

You are not accepting the values of the author or creator.

This is again what bad AI companies do when they simply take from websites.

John 23h ago

@elduvelle @ai6yr @tante it goes without saying that when you don't use an ad blocker you can see which sites advertise too much, according to your values, and then simply leave

There are lots of websites where I don't block, but I bail fast.

@ai6yr @tante

Honestly it was the "voluntary downsize" comment that got me.

Nobody wants to follow my hot takes and brilliant insights? Clearly, then, Mastodon must be what's pushing them away.

https://mas.to/@carnage4life/116351978261689682

@jrconlin @ai6yr @tante fwiw

Dare Obasanjo (@carnage4life@mas.to)

Attached: 1 image Three month check-in on follower counts after posting the same content on all 3 apps • Threads (38K →40K): People love the shitposts and leave insightful comments on AI. • Bluesky (47K → 60K): People love the anti-Trump posts. • Mastodon (18K 📉): People hate AI and I’m steadily losing followers.

mas.to

@buckfiftyseven @jrconlin @tante Aww, poor AI influencer!

@ai6yr @jrconlin @tante here's the thing, what if he's not?

What if he's a mainstream technology observer, and what if Mastodon is in the process of divorcing themselves from the mainstream?

That's where I'm putting my bet for what it's worth.

Or, what if, you know, AI isn't the mainstream that folk want to believe it is?

There are plenty of folk here that have strong following who are up-front about AI. (https://fedi.simonwillison.net/@simon comes to mind, but there are lots of others.) The thing is that the fediverse lives by recommendation. If folk aren't following what you're doing, it's because you're not finding your audience, where before someone else essentially did the marketing for you.

What cracks me up is that these folk are presuming that they are so special that folk would seek them out, and are surprised and whiny when that doesn't happen. If anything, I'd say that Mastodon reflects reality probably a fair bit more than the other platforms because you have to work to build your audience rather than have one farmed out to you.

(This probably also explains why Doritos doesn't set up stands at local farmers markets.)

Simon Willison (@simon@fedi.simonwillison.net)

9.26K Posts, 2.07K Following, 27.1K Followers · Open source developer building tools to help journalists, archivists, librarians and others analyze, explore and publish their data. https://datasette.io and many other #projects.

Mastodon

@jrconlin @ai6yr @tante If Mastodon was in any kind of growth phase I think that might be a stronger argument.

But if the platform is shrinking at the same time as people are putting out purity requirements, maybe not.

The top post I replied to is obviously a purity test.

I know, right? How will Mastodon ever be profitable and return positive value for its shareholders?

( Sorry, I promise I'll stop laughing soon. )

@jrconlin @ai6yr @tante I understand that Mastodon is a public effort, and I have supported it in a variety of ways. I've been doing Patreon for 2-3 sites I didn't even use, even as I was hiring my own server separately.

But even if we think it's a public effort, do we want it to be a declining one? Perhaps only for right-thinking people?

Yes. I would support this even if it was a declining effort. Because I would want to, and that's pretty much the same reason I would support any other effort. (there are a bunch of other "low value" projects I support, including a bunch of artists, musicians, programmers, and others. I'm weird that way. I like to make other folks happier.)

I'm not "investing" in the platform. I'm not looking for any down market reward. It's a goofy project that a bunch of folk find either fun or useful.

Look, I'll acknowledge that the people who insist on pushing their opinions onto people who don't want them are sad.
I'll also say that Mastodon has a long, deep HOA inspired history. Both things can be very true.

It's almost as if it's made of people, who have their own opinions and ideas, who are gathered together, but are allowed not to hear things they don't like and more of the things they do. People are like that. They do that sort of thing. It can be both positive and negative.

That said, the fediverse is under no obligation to grow. It can exist just fine as the niche place a bunch of weirdos hang out and share cat pictures because the folk that like to do that are the ones footing the bills.

If I'm not able to find my audience, that's not your problem. I wouldn't expect you to want to sit through a weird mix of cryptography, systems operational discussions and bad puns unless you enjoyed that sort of thing, and even then, I'd worry about you. I post here because it's an outlet. I'd do it even if I had zero followers. I also enjoy the fact that @tante and other folk post really interesting things, so I like discussing those topics with folk, but I also understand those folk have zero obligation to listen to me and always have the option to either mute or block me.

@jrconlin @buckfiftyseven @tante How will the Fediverse possibly grow if you're always posting cryptic things?!? You need to start posting about fashion, and makeup, and how popular you are, like a NORMAL influencer. CMON! CRYPTOGRAPHY FASHION TIPS! 🤪

@ai6yr @buckfiftyseven @tante

Sorry, I don't do things the normies are talking about.

( Ok, seriously, that's actually kind of neat, and hopefully folk are doing testing about it, but I'm guessing she'd win anyway. )

Why Is Anti-Surveillance Makeup Trending?

Artists and activists are increasingly relying on this technology to fight racial bias in surveillance tech.

Nylon

Netraven 1d ago

@tante I don't use GenAI, I just try to find new and creative ways to break it.

None of these are true if you run your own LLMs on your own hardware, using FLOSS models.

But the #MastodonHOA has deemed all AI to be abhorrent as a blanket decision.

And frankly, if you exist in a capitalist society, and you're not an owner, there is 100% chance you are exploited. The capitalist system requires it.

tante 1d ago

@crankylinuxuser FLOSS Models (which are only freeware) fulfill most of those boxes. Trained on stolen data, massaged by people in global majority countries, trained in environmentally harmful data centers, outsourcing skills to the freeware product a company dumped on me, using a tool that is imbued and trained for how big tech wants to see the world, and effort could have gone to something meaningful. So yeah nope.

"Trained on stolen data". It's at best a copyright violation. And I view things like Anna's Archive and Libgen to be internationally renowned Public Libraries.

"Massaged by people in global majority countries" - yes, people work in capitalism. And guess what... You're exploited.

"Trained in environmentally harmful data centers". This assumes that training is always needed, and it's not. You can train once, and run X times. Again, you're stretching to make local LLMs look horrible.

And really, the rest of these are poor excuses. I won't use poop smear (Anthropic), or OpenAI, or other SaaS token companies. I run local, and that does not have the problems you claim.

Except for the copyright issue. But again, I don't have that much respect for current US copyright.

Epic Null 1d ago

@crankylinuxuser @tante

It's at best a copyright violation

This may be true for published and public data... but that's not the only data that goes into these things. Any data that comes from breaches, users' private cameras, and anything else stored with an expectation of privacy is much worse than a copyright violation.

And yes, that is a big issue with the SaaS token vendors. Claude, OpenAI, MS, and the rest do use whatever user data they can get. I am not arguing their horrific behavior.

I'm talking about locally running Qwen, or Deepseek, or other FLOSS models.

That local LLM running on my machine only sees and uses data I provide. And a control-c in the relevant console window kills the LLM.

What folks do not realize is this is #Leibniz's ultimate dream, of being able to do #calculus with words, sentences, and more. He tried to do single word-vectors, but even that had to wait for Word2Vec in 2013.

Grant 1d ago

@Epic_Null @crankylinuxuser @tante “local” models are as reliant on illegal data acquisition, because they depend on the larger mainstream models to reach any level of tolerable performance. Whether it’s for training, fine tuning, distillation, or another method, that dependency means anything that goes into the development of the nonlocal model is also a requirement for the development of the local versions.

Deepseek and Qwen are no exception.

@Epic_Null @crankylinuxuser @tante

komali_2 1d ago

Data wants to be free. This argument simply doesn't work for those of us that have always been open data, anti copyright.

Epic Null 1d ago

@komali_2 @crankylinuxuser @tante Every message between you and your doctor or you and your loved ones is data.