Sorry, can't do scarves.

https://sopuli.xyz/post/41832893

I’m sad that the relevant xkcd is kinda obsolete now (because it’s been long enough for that research team to finish doing its thing).
[xkcd: Tasks]
What would be a “nearly impossible” task in this post-AI world? Short of the provably impossible tasks like the busy beaver problem (and even then, you would be able to make an algorithm that covers a subset of the problem space), I really can’t think of anything.
Deterministic answers from AI
Do you have a link explaining what deterministic means in the context of AI? Preferably for noobs

Deterministic means for the same input you always get the same output.

For AI it would be if you ask it a question multiple times using exactly the same words you would get the same answer.

Wouldn’t you just set the temperature to 0?
Still going to be non-deterministic for any commercial AIs offered to us. It’s a weird technology. I had a link to an article explaining why but I can’t find it anymore.
Ah yeah I was wrong. You set top-k to 1 to get a deterministic output.
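
For anyone following along, here is a rough sketch of what those two knobs do. This is a toy decoder written for illustration, not any real library's API: `sample_token`, the `logits` values and the seeds are all made up. It assumes the usual convention that temperature divides the scores before softmax, which is why a temperature of exactly 0 has to be special-cased as "take the top score". It also says nothing about why a hosted service might still vary.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None, rng=None):
    """Pick a token index from raw scores (toy decoder, not a real library API)."""
    rng = rng or random.Random()
    # Rank candidates by score and keep only the top_k best (all if top_k is None).
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    candidates = ranked[:top_k] if top_k else ranked
    if len(candidates) == 1 or temperature == 0:
        # Nothing left to sample from: this is plain greedy (argmax) decoding.
        return candidates[0]
    # Softmax over the surviving candidates, scaled by temperature.
    top = max(logits[i] for i in candidates)
    weights = [math.exp((logits[i] - top) / temperature) for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

logits = [2.0, 1.5, 0.3, -1.0]  # hypothetical scores for four tokens

# top_k=1 leaves a single candidate, so the random generator is never consulted.
greedy = {sample_token(logits, top_k=1) for _ in range(1000)}

# With temperature 1.0 and no top_k, different seeds can give different picks.
sampled = {sample_token(logits, temperature=1.0, rng=random.Random(seed))
           for seed in range(200)}

print("top_k=1 picks:", greedy)
print("sampled picks:", sampled)
```

In this sketch both `top_k=1` and `temperature=0` collapse to the same greedy choice, so the two suggestions above end up equivalent here.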

Most AI is deterministic; only a small subset is non-deterministic, and in those cases it’s often by design. Also, in many cases the AI itself is deterministic, but we choose to use the output in a non-deterministic way. For example, the AI gives a probability output, and will always give the same probabilities for the same input, but instead of always choosing the option with the highest probability, we choose based on the probability weights, leading to a non-deterministic output.

TL;DR: Non-determinism in AI is often not an inherent property of the model, but a choice in how we use the model.
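
A minimal sketch of that distinction, assuming a stand-in "model" that just returns a fixed probability table. `toy_model`, the prompt and the numbers are hypothetical and only there to separate the two steps: computing probabilities (repeatable) and choosing an answer from them (repeatable or not, depending on how we choose).

```python
import random

def toy_model(prompt):
    """Stand-in for a model: the same prompt always maps to the same probabilities."""
    # Hypothetical fixed distribution over three candidate answers.
    return {"yes": 0.6, "no": 0.3, "maybe": 0.1}

def pick_greedy(probs):
    # Always take the most likely answer: same input, same output.
    return max(probs, key=probs.get)

def pick_weighted(probs, rng):
    # Draw an answer in proportion to its probability: can differ between calls.
    answers = list(probs)
    return rng.choices(answers, weights=[probs[a] for a in answers], k=1)[0]

probs_a = toy_model("Is it raining?")
probs_b = toy_model("Is it raining?")
print("same probabilities:", probs_a == probs_b)

rng = random.Random()  # unseeded, so the weighted picks change from run to run
greedy_runs = {pick_greedy(toy_model("Is it raining?")) for _ in range(17)}
weighted_runs = [pick_weighted(probs_a, rng) for _ in range(17)]
print("greedy answers:", greedy_runs)
print("weighted answers:", weighted_runs)
```

The model part never changes here; only `pick_weighted` introduces variation, and swapping it for `pick_greedy` removes it.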

Okay, probably fair. I’ve only been working with LLMs that are extremely non-deterministic in their answers. You can ask the same question 17 times and the answers have some variance.

You can ask an LLM to create OpenTofu scripts for deploying infrastructure based on the same architectural documents 17 times, and you’ll get 17 different answers. Even if some, most or all of them still manage to get the core principles right, and follow the industry best practices in cases that were not specified, you still have large differences in the actual code generated.

I think non-chaotic answers would be more important. It doesn’t matter too much if they’re not identical, as long as the content is roughly the same. But if you can get significantly different answers from trivial changes in prompt wording, that really does break things.

Still doesn’t mean it’s correct though.