I like how we took something computers were masters at doing, and somehow fucked it up.
@oli I get the right answer when I try. Same inputs.
@jesusmargar @oli and this is one of the problems with LLMs—they’re inherently stochastic
@jesusmargar @oli the inability to create reproducible test cases for these systems is an enormous problem for our ability to integrate them into other systems
@kevinriggle @jesusmargar @oli we’ve been told we should create ‘plausibility tests’ that use a (different?) llm to determine whether the test result is fit for purpose. also, fuck that.
@airshipper @kevinriggle @oli perhaps the problem is to expect deterministic behaviour rather than some degree of inexactness. I mean, I wouldn't use it to make final decisions on cancer treatment, for instance, but maybe it's ok to polish a text that isn't too important.
@jesusmargar @kevinriggle @oli i would use it to generate waste heat from the exchange of tokens, after shifting a sizable chunk of our engineering budget from salaries to services sigh