The co-founder of Koko (a non-profit that offers peer mental health support) has a Twitter thread (https://twitter.com/RobertRMorris/status/1611450197707464706) about an experiment in which they fed requests for help to GPT-3 so that help providers could send those AI-generated support messages rather than writing their own. They found that the AI responses were rated higher, but also that "once people learned the messages were co-created by a machine, it didn’t work." There have been some interesting questions about the ethics... 🧵 #gpt3
Rob Morris on Twitter: “We provided mental health support to about 4,000 people — using GPT-3. Here’s what happened 👇”
I'm a little confused by this response about informed consent (https://twitter.com/RobertRMorris/status/1611582827224797185), but I think it illustrates a significant problem among some researchers: conflating "research ethics" with "would an IRB allow me to do it," which is potentially really harmful. I would hope that the reason to seek informed consent isn't because a regulatory body forces you to, but because it is the right and ethical thing to do. (2/n)
Rob Morris on Twitter: “@royperlis This would be exempt. The model was used to suggest responses for help providers, who could opt in to use it or not. We didn’t use any PII, all anonymous data, no plan to publish. But MGH's IRB is formidable... Couldn't even use red ink in our study flyers if i recall...”
But regardless: based on the thread, the help providers were aware of the AI (since they were choosing to use it), but it seems that the people seeking help were not. Though based on the "once people learned" finding, at least some of them must have been debriefed? Were they essentially following the typical protocol for a deception experiment? (If so, I would have expected that as the answer re: consent rather than "we didn't have to.") (3/n)
The Twitter thread emphasizes that they weren't using PII, but prompts from people seeking mental health support are still potentially quite sensitive, and some folks on Twitter were concerned about that data going back to OpenAI. As far as I know, GPT-3 isn't something you can run internally; it's only available through OpenAI's hosted API, so the prompts presumably do leave Koko's systems and go to a third party, a risk beyond what people accepted when they chose to use the service in the first place. (4/n)

@cfiesler I believe the author later clarified that the people were not directly chatting with the model. It was used more as a tool to help peers craft their responses.

While having a human in the loop does mitigate some of the PII issues, the lack of informed consent still stands. (A rough sketch of the kind of flow being described is below.)
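For anyone who hasn't seen the demo: here's a minimal Python sketch of that suggest-then-approve pattern, assuming the 2022-era OpenAI completions client. The function names, prompt wording, and model choice are my own illustrative guesses, not Koko's actual implementation; the point is just that the model drafts and a human decides what gets sent.

```python
# Hypothetical sketch of the human-in-the-loop flow described above.
# Names, prompt wording, and model choice are assumptions, not Koko's code.
import openai  # 2022-era OpenAI Python client (pre-1.0 API)

openai.api_key = "sk-..."  # credential elided

def draft_suggestion(help_request: str) -> str:
    """Ask GPT-3 for a *draft* reply; a human decides whether to use it."""
    completion = openai.Completion.create(
        model="text-davinci-003",  # a GPT-3 model available at the time
        prompt=("Write a brief, supportive peer response to this message:\n"
                f"{help_request}\n\nResponse:"),
        max_tokens=150,
        temperature=0.7,
    )
    return completion.choices[0].text.strip()

def respond(help_request: str) -> str:
    """The help seeker never talks to the model; the provider gatekeeps."""
    draft = draft_suggestion(help_request)
    print(f"AI draft:\n{draft}\n")
    choice = input("[s]end draft / [e]dit draft / [w]rite your own: ").strip()
    if choice == "s":
        return draft
    if choice == "e":
        return input("Edited response: ")
    return input("Your response: ")
```

Note that even in this shape, the help seeker's message still goes to a third-party API, which is exactly what the consent concern is about.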

@rajatsahay @cfiesler
The question I'd have is: did the model see input from the vulnerable person? Or was it preprocessed/summarised in some way, and if so, how and for what purposes? For example, if the model saw said input, an absolute minimum might be to remove identifiers/pseudonymise, though sharing even pseudonymised personal info with a third party service without explicit informed consent is still a serious issue.
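To make that "absolute minimum" concrete, a naive redaction pass might look like the sketch below (the patterns and function name are mine). As the post says, pattern scrubbing like this misses a lot, and even perfectly pseudonymised disclosures shared without consent remain a serious issue.

```python
import re

# Naive, illustrative redaction before text leaves the service.
# Regex scrubbing misses names, locations, and unusual life details,
# so this alone does not resolve the consent problem.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "[PHONE]"),
    (re.compile(r"@\w+"), "[HANDLE]"),
]

def pseudonymise(text: str) -> str:
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text

# pseudonymise("Call me at +1 (555) 123-4567 or mail me@example.com")
# -> "Call me at [PHONE] or mail [EMAIL]"
```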

@emmatonkin @cfiesler I think the demo video showed that the operators had the option of directly forwarding the responses to the model. I'm assuming (hoping) the humans acted as filters for personal stuff.

What's worse is that stuff like this just sets precedent for even more outrageous applications of LLMs.

@rajatsahay @cfiesler
Ack, though. A) I wonder what guidance, training, and evaluation they were given, because that's quite some task to carry out in a hurry. And B) hang on, is the LLM responding with no context other than the last message received? It's more usual to give it conversation context so it can produce a (seemingly) relevant answer (see the sketch after this post).

Totally agreed re precedent. Not only does it need careful regulation, but I suspect this is already in breach of existing regs.
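On point B), for anyone unfamiliar: the usual approach is to fold recent turns of the conversation into the prompt, something like the sketch below. This is my own illustration, not anything from the demo, and it sharpens the privacy tension: every extra turn of context is more sensitive data sent to the third party.

```python
from typing import List, Tuple

def build_prompt(history: List[Tuple[str, str]],
                 new_message: str,
                 max_turns: int = 6) -> str:
    """Fold recent (speaker, text) turns into the prompt so the model
    has context; a crude fixed-size window, purely illustrative."""
    lines = [f"{speaker}: {text}" for speaker, text in history[-max_turns:]]
    lines.append(f"Seeker: {new_message}")
    lines.append("Supporter:")
    return "\n".join(lines)
```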

@rajatsahay @cfiesler tbh I don't think "i gave the thread a quick scan, decided it contained no personal data and sent it w/o explicit consent to a cloud service that famously retains data" is a fair responsibility to give to J Random Employee, at least not without significant work to ensure they were adequately trained, that risks were handled and that the task was realistic. Setting them up for trouble otherwise.

@emmatonkin @cfiesler Your response perfectly highlights a huge problem with AI hype. Most companies cite human moderation to deploy borderline-illegal services, claiming their "AI model" gives unreal performance, all while staying within the letter of the law.

When the model inevitably fails, any blame for the misaligned decisions is put directly on those same moderators, who usually receive little to no training on how to handle these situations.