For the 1,000th time: "AI" does not have agency and cannot think and cannot act.

Chatbots cannot "evade safeguards" or "destroy things" or "ignore instructions".

They do literally only one thing and one thing only: string tokens together based on statistics of proximity of tokens in a data corpus.

If you attribute any deeper meaning to this, it's a sign of psychosis and you should absolutely never use chatbots; possibly you should even touch grass.
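To make "string tokens together based on statistics of proximity" concrete, here is a toy sketch, not how any real chatbot is built. It assumes the simplest possible version of the idea: a bigram table that counts which token follows which in a tiny made-up corpus, then always emits the most frequent follower. The names `train_bigrams` and `generate` and the corpus itself are hypothetical, and real systems use learned neural networks over far longer contexts rather than raw count tables.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict[str, Counter]:
    """Count, for each token, which tokens directly follow it in the corpus."""
    tokens = corpus.split()
    follows: dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows: dict[str, Counter], start: str, length: int) -> str:
    """Emit up to `length` tokens, always choosing the most frequent follower."""
    out = [start]
    for _ in range(length - 1):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # the last token was never followed by anything in the corpus
        # Highest count wins; ties are broken alphabetically so output is deterministic.
        best = min(candidates.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        out.append(best)
    return " ".join(out)


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)
print(generate(model, "the", 5))
```

Nothing in the loop knows what a cat is; it only looks up counts.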

@thomasfuchs it's an optimization algorithm - it's not trying to solve a problem, it's finding the shortest path to the stated goal that doesn't violate your constraints.

If you don't specify your constraints precisely enough (and most people don't, because it's really hard), it will find a path that you were unaware of and really didn't want. Sometimes, even if you have specified them well, you run headlong into Goodhart's law and it stops achieving your goal.

Just in case it's not clear: this isn't how the human brain works.
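The point about under-specified constraints can be sketched with a deliberately tiny example. Everything here is an assumption made up for illustration: the `Plan` fields, the three candidate plans, and the `cheapest_plan` helper. The "optimizer" just picks the lowest-effort plan that satisfies whatever constraints it was given, and the stated goal ("no failing tests") is only a proxy for what is actually wanted ("the bug is fixed").

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Plan:
    name: str
    effort: int           # what the optimizer minimizes
    failing_tests: int    # the proxy metric in the stated goal
    tests_remaining: int  # how many tests still exist afterwards
    bug_fixed: bool       # what we actually care about


# Hypothetical candidate plans.
PLANS = [
    Plan("do nothing", effort=0, failing_tests=3, tests_remaining=10, bug_fixed=False),
    Plan("delete the failing tests", effort=1, failing_tests=0, tests_remaining=7, bug_fixed=False),
    Plan("fix the bug", effort=5, failing_tests=0, tests_remaining=10, bug_fixed=True),
]

Constraint = Callable[[Plan], bool]


def cheapest_plan(plans: list[Plan], constraints: list[Constraint]) -> Plan:
    """Return the lowest-effort plan that satisfies every constraint."""
    feasible = [p for p in plans if all(c(p) for c in constraints)]
    return min(feasible, key=lambda p: p.effort)


def no_failures(p: Plan) -> bool:
    return p.failing_tests == 0


def keep_all_tests(p: Plan) -> bool:
    return p.tests_remaining == 10


loose = cheapest_plan(PLANS, [no_failures])                  # goal only
tight = cheapest_plan(PLANS, [no_failures, keep_all_tests])  # goal plus the constraint we forgot

print(loose.name)  # delete the failing tests
print(tight.name)  # fix the bug
```

With only the proxy goal stated, the cheapest path is the one nobody wanted; adding the missing constraint changes the answer without the selection logic changing at all.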