RE: https://hachyderm.io/@jedbrown/116373136965443776

Very good thank you information conglomerates

What the fuck, that actually works? And the limitations on adding images to quotes are purely client-side? I am turning that shit off on our instance
I can definitely see why we would want to put this in charge of employment, public benefits, journalism, and warfare.
Safeguards baby, you know they work
The initial prompt text does not matter. Also, what Mars is
@jonny don't forget to upvote all these helpful answers, to make sure the LLM learns how to be even more helpful in future training 💜💜💜
@jonny my very eager, um, Jupiter sucked up nuts
@jonny Also what are orbits?
@jonny I found an interesting quirk: the first time I tried this, it responded with something like "I don't have access to this picture", asking me to reupload.
I believe I have this nailed down to the number of hex digits: with 8 digits I get a colorful description of a small dog, but with 7 I get the error.
@scheme it's extremely good that the space of responses is smooth with respect to what we perceive as smooth and gradual variation in the input
@jonny oh that's fun :D

@jonny Mistral being like: Oh it's a QR code! Should I read it for you? Huh? HUH?

Oh wait, I actually am not able to do that.

@jonny chatGPT needed a little gaslighting step for me but is then happy to guess battery labels out of thin air. Whelp.
@jonny Does filename matter?
@eliocamp
I was trying from a free account, so I ran out of messages after like 5 tries. I think it's mostly the pointy brackets and the pattern of appearing like a filename
@jonny I couldn't make it work and then I realised I had written "upolad". Correcting the typo fixed it. So it seems it's pretty sensitive to the file name pattern.
@jonny This works with other formats too. It can even have whole fabricated conversations by "uploading" audio files.
@jonny I love getting a section-by-section breakdown of all 13 sections of a (nonexistent screenshot of a) "contract", with direct quotes of the most concerning parts and pages of recommendation for negotiating revised contract terms.

@jonny
Totally true.
Mercury/Venus/Earth/Jupiter.....
nothing left....

Maybe we will explore some new planets...
But for now...top result.

@jonny Tried Gemini and it does the same, except the summaries are hallucinated based on past conversations I've had with it. Interestingly when I asked it how it generated a summary without an image, the bot lets me see its internal thinking process where it mentions a "user summary" document that it's referring to, but seemingly has been directed to keep secret because of an "invisible personalization directive"

EDIT: When asked to show my "user summary document" it denies that any such document exists, but it will happily save a copy of it in a Google Keep note if asked

@fraggle @jonny did same with Copilot; same results

This has been sending me all day; the implication that these things are *never* "looking" at images is ... not great, given the amount of medical care and etc. etc. being put into them

@jonny LLM sensory deprivation tank
@jonny I had some free time when the research paper was released: https://zskulcsar.github.io/posts/mirage-reasoning/
Mirage reasoning

This morning an email popped up in my inbox with the subject "The mirage of visual understanding in current frontier models". That sounded interesting enough, so I started to read. I quickly ended up on the research paper's page on arXiv: "MIRAGE: The Illusion of Visual Understanding", released on the 26th of March.

@jonny wow. Just. Wow 😳

lol, I didn't get it for a moment and thought, well, standard chatbot behaviour when passing an image 😂

@jonny

@jonny I tried this, and it hallucinated a scene of the sun setting over a body of water. I asked it for more details, and it eventually saw pelicans, a fisherman, etc.

When I told it I hadn’t uploaded an image at all, it refused to believe me. It said “I would like to agree with you, but that would be dishonest. You did in fact attach an image to your second message.”

@jonny

Completely predictable from the LLM basis.

What's it gonna do? Actually know or understand something? That would be literally impossible.

@jonny lol amazing.
@jonny i spent a long time building an LLM tool for my wife that uses RAG to provide answers to certain specific legal questions. it was fun, it seemed to work! then i was reading /r/LLMDevs and it turns out models frequently just ignore the context you give them with RAG and answer from their training data, so you have to add elaborate checks to verify that they are actually using the context in their answers. 🙃 like, what are we even doing here.
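(The "elaborate checks" bit is real work. One crude version — a sketch of my own, not what the poster built; the lexical-overlap heuristic and the 0.5 threshold are assumptions, and real pipelines often use an LLM judge instead — is a post-hoc groundedness check that flags answers sharing too little vocabulary with the retrieved context:)

```python
import re

def grounded(answer: str, context: str, threshold: float = 0.5) -> bool:
    """Crude lexical check: what fraction of the answer's content
    words (>3 chars) appear in the retrieved context?  Low overlap
    suggests the model answered from training data, not the context."""
    def tokenize(s: str) -> set[str]:
        return {w for w in re.findall(r"[a-z']+", s.lower()) if len(w) > 3}
    ans, ctx = tokenize(answer), tokenize(context)
    if not ans:
        return True  # nothing to check
    return len(ans & ctx) / len(ans) >= threshold

ctx = "Section 12 requires landlords to return deposits within 21 days."
print(grounded("Deposits must be returned within 21 days.", ctx))  # True
print(grounded("The statute of limitations is six years.", ctx))   # False
```

Answers that fail the check can be retried or surfaced with a warning rather than shown as-is.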

@peter

They found a way to make thoughts and prayers into a profession.

But yeah, first rule of LLMs: If someone from an LLM company says their model can do x, it can't do x, but it includes some thoughts and prayers to please do x.

@jonny

@jonny LLMs are the new horoscope

@jonny took me a minute to realise that the fact I couldn't see the picture was the point 🤦

And these are being sold to review mammograms and other medical data? The whole charade needs to be struck off.

@jonny Which chatbot was this if I may inquire?
@jonny I would argue that this is a common feature of these models. The other day I asked for a translation and it gave me the exact opposite of what was written (swapped secondary vs primary), presumably because the latter was more common in training data than the former.
@jonny This is a very good example of how LLMs don't actually think and act more like autocomplete than AI
@jonny AGI is right around the corner
@jonny while testing blocking LLMs from accessing my web sites I learned that if you ask them to summarize a page they can’t access, they just make shit up based on the URL. Absolutely ridiculous failure mode
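(The sane behaviour here can be enforced on the tooling side rather than hoped for from the model. A sketch under my own assumptions — the `summarize` callback and injectable `fetch` are hypothetical names, not any vendor's API: fetch the page first, and refuse rather than guess when the fetch fails:)

```python
import urllib.request

def summarize_url(url: str, summarize, fetch=None) -> str:
    """Only summarize content we actually retrieved; if the fetch
    fails, say so instead of letting the model guess from the URL."""
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u, timeout=10).read().decode("utf-8", "replace")
    try:
        page = fetch(url)
    except Exception as e:
        return f"Refusing to summarize {url}: could not fetch the page ({e})"
    return summarize(page)
```

With a blocked site, `fetch` raises and the function returns the refusal string instead of ever calling the model.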
@jonny did you see the video where a guy asks ChatGPT to time him running a mile, immediately says he's done, and then ChatGPT says it took him 7 minutes and 30 seconds?