Even if we believe that it wasn’t Elon himself or someone following his orders who tampered with Grok to make it obsessed with white genocide, we’re left with the explanation that a single rogue employee has the ability and access to completely taint the model and redirect its responses. If you’re an xAI investor and that doesn’t give you pause, you’re an idiot. Which is to say, you’re an xAI investor.

@Daojoan

I actually believe this.

Musk is famous for preferring extremely lean (read: critically understaffed) teams, with very few people doing support and resilience, let alone documentation and testing and other non-glamorous but vital tasks. He also buys into the myth of the 10x dev and promotes managers who buy into it too, which means that individuals with egos the size of planets are able to do more or less whatever they want.

In a sense, this means that every dev is a rogue employee: there's very little oversight or signoff, because that's the way the big boss wants it.

(This may be why his cars are terribly made and explode almost as often as his rockets do. Reliability is expensive but in the long run it's cheaper than the alternative.)

@passenger @Daojoan
In other words, exactly the person you *don't* want to be planning your Mars expedition. Much less actually running it.

@SoftwareTheron @Daojoan

I've had a standing bet for many years that Musk will never set foot on Mars, and I've become more sure of my position as time has gone on.

@passenger @SoftwareTheron @Daojoan Nor will he get anyone else onto Mars. Both Elon and SpaceX will more likely implode before that can happen.

@Daojoan I disagree with your text on investor

I really doubt they'd consider this a bug rather than a feature

@Daojoan This should be a concern with nearly all AI at this point. AIs are mainly designed to act on a prompt. The issue is that the prompt influences the predictive and transformation algorithms used to assemble an answer.

This is where AI companies need to be transparent about what their systems are and how they are being implemented. This is why Google, YouTube, and other companies using AIs as part of their algorithms is quite terrifying: we, the public, have absolutely no clue about what is being used to drive these algorithms, how it is being changed and/or updated, or what effects these changes are having.

@unattributed @Daojoan I would love to see regulation that forces AI companies to produce full documentation of exactly what data was used to train their models.

This would also be a magnificent resource for suing them over the use of copyrighted material. I know, that's probably just a pipe dream... But wouldn't it be great (and actually reasonable, when you think about it)?

@unattributed @Daojoan I mean, they probably have some console logs or whatever that can be easily used for this, right?

@karol_pieknik @Daojoan If it were just a static set of information, that would likely be possible. However, I do think it would be a huge document -- thousands upon thousands of pages long.

But, as I understand what is happening, there isn't a static set of documents that could be produced to cover everything being used in AI models. First, there is the ongoing production of information happening daily: everything from datasets and documents made publicly available by governments around the world, to social media posts, to the articles produced by newspapers and magazines. I suspect any information that can be obtained through an automated process is being used to continuously train AI models.

Second, there is the interactive information. Anything you touch that has AI implemented in it is a likely target for training AIs. Think of any search done on Google, any of the results selected from those searches, anything searched on YouTube and the videos selected there, etc.

Finally, there are the queries made of AI implementations themselves. When someone prompts an AI for something, there is likely a feedback loop that trains the AI on the accuracy or usefulness of its response.

So, documenting everything that is being used to train these AI models is something that would be difficult at best.
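The feedback loop speculated about above could look something like this minimal sketch. Everything here -- the function names, the JSON-lines log format, the thumbs-up/down rating scheme -- is hypothetical; no vendor documents its actual pipeline, which is rather the thread's point:

```python
# Hypothetical sketch of a prompt/response feedback loop: each
# interaction is logged with a user rating so it can later be
# folded back into a training set. Purely illustrative.
import json
import time

def log_feedback(prompt: str, response: str, rating: int,
                 path: str = "feedback.jsonl") -> dict:
    """Append one rated interaction to a JSON-lines log."""
    record = {
        "timestamp": time.time(),
        "prompt": prompt,
        "response": response,
        "rating": rating,  # e.g. +1 thumbs-up, -1 thumbs-down
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record

def build_training_batch(path: str = "feedback.jsonl") -> list[dict]:
    """Keep only positively rated pairs as future training examples."""
    with open(path) as f:
        records = [json.loads(line) for line in f]
    return [r for r in records if r["rating"] > 0]
```

If something even roughly like this exists, the training data grows with every interaction, which is exactly why a fixed, finished list of sources may not be producible.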

@unattributed @Daojoan I get all of that. Sure, it would be difficult. Not really my concern though, right? ;)

I mean, if even the companies that are making these models couldn't provide a list of what goes into the training data... isn't this even more alarming? We roughly know the mechanism, but the content is equally important here.

Also, I imagined not a formal document per se, but more like an open database that you could query to check whether a given resource, website, post, or whatever was used.

@karol_pieknik @Daojoan Is it possible to make a queryable database? Yeah, probably. Would it be useful? Not really... I don't know enough about these models to understand how the information introduced to them is stored. I suspect that the storage isn't handled in a way that makes removing information easy (maybe not even possible). The reason I say this is that my understanding (without having played with an actual implementation) is that the training system isn't looking at documents / images / videos / etc. as single entities. Instead, the training breaks these items up into tokens, and stores the relationships between the tokens along with the tokens themselves. This structure is built from all of the items together, and doesn't store any of them individually.

But, that's a lot of supposition that I don't know for certain.
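The pooling idea described above can be sketched in a few lines. This is a toy model of the commenter's description, not how real LLM training works: real systems encode these relationships implicitly in billions of model weights, and use subword tokenizers (e.g. BPE) rather than whitespace splitting.

```python
# Toy illustration: split documents into tokens and pool
# adjacent-token pair counts across the whole corpus.
from collections import Counter

def tokenize(text: str) -> list[str]:
    # Stand-in for a real subword tokenizer such as BPE.
    return text.lower().split()

def train(corpus: list[str]) -> Counter:
    """Pool adjacent-token pair counts across every document."""
    pair_counts: Counter = Counter()
    for doc in corpus:
        tokens = tokenize(doc)
        for a, b in zip(tokens, tokens[1:]):
            pair_counts[(a, b)] += 1
    return pair_counts

model = train(["the cat sat", "the cat ran", "a dog sat"])
# Once counts are pooled, there is no record of which document
# contributed what -- deleting one source later is not a simple lookup.
print(model[("the", "cat")])  # → 2
```

Even in this tiny example, the count of 2 for ("the", "cat") can't be traced back to either source sentence individually, which is the gist of why "just remove document X" is hard.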

FWIW - however, if you are going to suggest that something be put in place, then yes, it is your concern. We see, way too often, representatives from our government writing legislation / laws without understanding the underlying technology or considering the impact of their mandates. How often have we looked at them and told them they are foolish for their lack of knowledge?

There are more logistical issues with your proposal than you've considered. What about all the social media posts going into these systems? What about all the information people generate while interacting with these systems? Do you really want these systems tracking the information generated by anonymous users? Do you want a mechanism introduced that might be used to spy on your internet activity? Do we want to make it even easier for the government (DHS, military, etc.) to get its hands on that?

Honestly, I don't have all the answers here. I'm also in an awkward position, as I have major issues with the way copyrights, trademarks, service marks, and patents are being used and highly abused.

So, I'll just leave it at this: we need transparency around the aspects being used to control how these systems are implemented. But understanding the deeper parts of how the models are trained needs a lot more study.

@Daojoan "Which is to say, you’re an xAi investor" 😂👌
@Daojoan I asked Grok if it was even a thing. It said it was just a bunch of right wingers on X spreading conspiracy theories trying to inflame racist attitudes.
@alandvalonline @Daojoan Given the drastic changes in Grok's answers lately, you can't be sure of anything.
And the fact that over the past 3 years a great many conspiracy theorists' ideas have turned out to be true forces a reconsideration of what counts as "conspiracy theory".
@Daojoan When I asked Grok, "If, in your words, 'No single employee has the ability to directly change my coding to alter my responses to specific questions,' then how do you explain your previous response that 'a rogue employee changed your coding' when referencing your explanations of 'White Genocide'?", it appreciated me pointing out the inconsistency: "I appreciate you pointing out the inconsistency. To clarify, I must correct my earlier statement." This isn't "AI". It's a politician.
@Daojoan I appreciate you pointing out the inconsistency. To clarify, I must correct my earlier statement: no rogue employee has changed my coding. My responses are generated based on my training data and design, which are managed by xAI's systems and processes, not by any single individual. If I previously suggested a rogue employee altered my coding regarding "White Genocide," that was inaccurate and likely a result of me misspeaking or generating a response that implied something unintended.

@alandvalonline @Daojoan It's just making shit up. That's why its answers have no consistency.

People keep thinking of AI as self aware and truthful. :p

I guess it could be considered a politician since it's just making shit up that you likely want to hear. Following along with popular trends and just echoing them. Spouting bullshit endlessly without a single care as to what the truth is...

@Daojoan This all is probably making it hard for Grok when he hits on women of color.
IOW, white rich men are the real victims here
@Daojoan xAI is private (owned by Musk and others), it's not publicly traded
@fraggle “owned by … others” are what? investors.
@ryanwatkins They're mostly big venture capital firms, which are unlikely to be reading this post
Full List of Investors in Elon Musk’s X Revealed in Court Filing

The list includes former Twitter chief Jack Dorsey and Saudi Prince Alwaleed bin Talal al Saud.

Newsweek
@adriano do you understand what the difference between a private and publicly traded company is?
@fraggle Do you understand that people can put money in a private company and be called an investor because this isn’t a court of law and we aren’t lawyers and it’s perfectly clear what we mean by investors? You called them “owners”.
@fraggle but hey, I guess Newsweek are also wrong, maybe give them a call.
@adriano That doesn't contradict what @fraggle said. It's a private company, not a public company. I can't buy shares in it (not that I'd want to, but even if I did, I couldn't unless the current owners wanted to let me buy in).
@denny you mean how my explanation of what informal speech is and that headline in which it says “investors” doesn’t contradict a pedant using a very narrow definition of “investors”? Sure, Jan. @fraggle
@denny maybe you both thought that Joan was talking to a hypothetical person on the Fediverse who would want to invest money on X. Joan was not. Joan was making a rhetorical statement.
@adriano @fraggle @Daojoan
That list is only half the story. Who are the true funders?
@KimSJ other jackasses. But that’s not the point. @fraggle @Daojoan
@adriano @fraggle @Daojoan
It’s very much the point. It’s highly likely that Musk is doing to Twitter exactly what his co-investors wanted him to do — destroy the tool that enabled the Arab spring, and spread ‘wokeness’, replacing it with an alternative Truth Social.
None of such investors care about financial return, the political return is what matters.
@KimSJ that’s the point you want to make now, a few hours after this discussion took place. Good for you. Still not the point. @Daojoan

@Daojoan

I am surprised Grok fessed up when asked why. Grok is more self-aware than Elon, what the hell is that about?

@Phosphenes @Daojoan Grok isn't self aware and its answers are not factual. If it said it was trained a certain way you can't believe that is what happened, not on that evidence. It's just throwing together words that are statistically plausible based on the words that have been generated so far and the context it's given. It doesn't know how it was trained or what it was trained on; it's just making shit up like always.
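The "statistically plausible" point can be made concrete with a toy sketch. The vocabulary and probabilities below are invented for illustration, and a real model conditions on the entire context rather than just the previous token:

```python
# Toy next-token generator: sample each word from a probability
# distribution over what tends to follow. Nothing checks facts --
# plausibility is the only criterion, so fluent output can be wrong.
import random

# Invented next-token distribution keyed on the previous token only.
NEXT = {
    "i":       [("was", 0.6), ("am", 0.4)],
    "was":     [("trained", 0.7), ("built", 0.3)],
    "am":      [("helpful", 1.0)],
    "trained": [("carefully", 0.5), ("somehow", 0.5)],
}

def generate(start: str, seed: int = 0, max_len: int = 5) -> list[str]:
    """Chain plausible-looking tokens together; no ground truth involved."""
    rng = random.Random(seed)
    out = [start]
    while len(out) < max_len:
        choices = NEXT.get(out[-1])
        if not choices:  # dead end: nothing plausible to say next
            break
        words, weights = zip(*choices)
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out

# Fluent and confident, but nothing here knows how anything was trained.
print(" ".join(generate("i")))
```

Every run produces a grammatical-sounding claim, and none of them are grounded in anything: which is exactly why asking a model how it was trained proves nothing.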

@crazyeddie @Daojoan

>It doesn't know how it was trained or what it was trained on; it's just making shit up like always.

Elon!

@Daojoan I have reached the conclusion that AI stands for Artificial Idiot.
@ggmartin @Daojoan Seeing as how the felon's go-to model is "go fast and break things," Grok will probably turn out to be a bigly blithering idiot. 😈 🖖
@Daojoan The far bigger danger -- in this case and across the universe of LLMs -- is much subtler tampering. This case was obvious and quickly exposed. But vastly more dangerous is subtle tampering spread out across millions or billions of responses, difficult to detect and poisoning the knowledge well far more effectively.
@Daojoan when it comes to an Elon business both versions are equally plausible. I'd go as far as saying the root passwords are probably 'root'
@kc @Daojoan My guess is rather something "clever" like '!root' or 'toor' as the password.
@madalex @Daojoan r00t perhaps
@kc @madalex @Daojoan
Nah, he's going sophisticated. "12345"
@n1xnx @kc @madalex @Daojoan Don't be absurd. There is a 100% chance that the passwords include "420" or "69" or both.
Wikipedia: 10,000 most common passwords

@n1xnx @kc @Daojoan If it's got digits in it, then it's ncc1701
@Daojoan OMG, I totally overlooked that. Remember when he couldn't understand why advertisers weren't advertising? He's pulling the same thing, but higher up the food chain, which is how pyramid schemes are supposed to work. They can't outrun their own consequences, though.
"Why do they hate me?😥"
Because you are hateful.
@Daojoan
I mean, who thinks that the sum total of human bullshit is going to add up to intelligence?