Manoel Horta Ribeiro

236 Followers
227 Following
89 Posts

CS PhD student @ EPFL

Keywords: Computational Social Science, Platforms, Communities, Moderation

Website: https://manoelhortaribeiro.github.io/
Twitter: https://twitter.com/manoelribeiro

This ACM CSCW paper shows that online groups requiring active participation commitments (i.e., asking that you post regularly) see double the contributions and retain users 4 times better than groups with passive reminders. Prob very useful to OSS and Wiki folks.

https://dl.acm.org/doi/10.1145/3687027

Commit: Online Groups with Participation Commitments | Proceedings of the ACM on Human-Computer Interaction

In spite of efforts to increase participation, many online groups struggle to survive past the initial days, as members leave and activity atrophies. We argue that a main assumption of online group design---that groups ask nothing of their members beyond ...


In a new blog post, I argue that discourse on social media's (potentially harmful) impact is entirely ~vibes-based~.

https://doomscrollingbabel.manoel.xyz/p/discourse-on-social-media-is-vibes

Discourse on social media is vibes-based

In the fall of 2021, The Wall Street Journal dropped a bombshell on Facebook (now Meta): “Facebook Knows Instagram Is Toxic for Teen Girls.” The report drew from “The Facebook Files,” documents leaked by former employee Frances Haugen, which, among other things, discussed internal research on mental health and well-being.

Doomscrolling Babel

In this new blog post, I argue that to improve online platforms, we need to study content curation practices. That is pretty hard to do, but we should do it anyway :)

https://doomscrollingbabel.manoel.xyz/p/content-curation-in-online-platforms

Content Curation in Online Platforms

(This is a big rant on why research on content moderation, algorithms, and monetization strategies is hard and why we desperately need it. It is an interpolation between some of the materials I prepared for my job talk and my PhD thesis) Online platforms like Facebook, Wikipedia, YouTube, Amazon, Uber, DoorDash, Airbnb, and Tinder have changed the world and become embroidered into the social fabric. It is hard to imagine how our lives would be without them: our economies, our relationships, and how we acquire knowledge have become deeply connected to these online platforms. The United Nations Conference on Trade and Development estimated that the global value of e-commerce sales reached almost ...


➡ Key takeaways:

- LLMs (even open-source ones, on some tasks) perform similarly to humans (who perform poorly).

- Supervised approaches perform much better.

- Stacking LLM predictions increases performance substantially.
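The "stacking" takeaway can be sketched in a few lines. This is an illustrative toy, not the paper's actual pipeline: three simulated LLM judges (accuracies and setup entirely hypothetical) emit binary "convincing?" predictions, and a small logistic-regression meta-learner combines them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 300 arguments with a binary "did it convince?" label.
y = rng.integers(0, 2, size=300).astype(float)

def llm_predictions(y, accuracy):
    """Simulate an LLM judge whose binary prediction matches the label `accuracy` of the time."""
    flip = rng.random(y.shape) > accuracy
    return np.where(flip, 1 - y, y)

# Three simulated LLM judges with different (made-up) accuracies.
X = np.column_stack([llm_predictions(y, a) for a in (0.65, 0.70, 0.60)])
X = np.column_stack([X, np.ones(len(y))])  # bias term

# Stacking: fit a logistic-regression meta-learner on the judges' outputs
# (plain gradient descent, so no ML library is needed).
train, test = slice(0, 200), slice(200, 300)
w = np.zeros(X.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X[train] @ w))       # predicted probabilities
    w -= 0.1 * X[train].T @ (p - y[train]) / 200  # gradient step on log-loss

stacked = (1.0 / (1.0 + np.exp(-X[test] @ w)) > 0.5).astype(float)
best_single = max(np.mean(X[test, j] == y[test]) for j in range(3))
print(f"best single judge: {best_single:.2f}, stacked: {np.mean(stacked == y[test]):.2f}")
```

The meta-learner effectively learns a weighted vote over the judges, which is why combining several mediocre predictors can outperform the best one alone.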

➡ This approach can help continuously benchmark LLMs' persuasion capabilities (much simpler than running human-in-the-loop experiments). Congratz to Paula, who led this work (it was her master's thesis) :)

➡ We extend a dataset from a defunct debate website (Durmus & Cardie, 2018) containing debates, votes, & voter traits. We propose tasks to:

1. Judge argument quality.

2. Correlate demographics with stances toward the debate proposition.

3. Determine votes given demographics.

➡ Can large language models (LLMs) recognize convincing arguments? Kind of.

We propose evaluating LLMs' persuasiveness capabilities by measuring their ability to recognize whether an argument would convince a user with specific demographics.

📜 arxiv.org/abs/2404.00750

Audience capture is understudied IMO. I find it fascinating because the term comes from IDW folks close to audience-captured influencers (somehow, they reinvented Rebecca Lewis's report). Plus, it aligns well with findings that the algorithm is not what is radicalizing people.

Shoutout to @fraslv (this is his MSc thesis). Francesco is a fantastic and mature researcher—and is considering doing a Ph.D.! Working on this has been a continuous source of joy and amazement. Grazie mille!

(We are still running some analysis—this is a working paper)

These results have implications for online platforms, as they suggest that misinformation and influence operations benefit from LLM-driven personalization. With elections right around the corner, platforms should consider how to mitigate nefarious LLM-driven persuasion.

W/o personalization, GPT-4 is at least as good as humans: relative to debating with humans, participants debating GPT-4 had 21.3% higher odds of agreeing with their (non-human) opponent (p=0.31, not significant). But with personalization, participants had 81.7% higher odds (p<0.01)!
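To make those odds numbers concrete, here is a back-of-the-envelope conversion assuming a hypothetical 50% baseline agreement rate (not a figure from the paper): an 81.7% increase in odds means multiplying the baseline odds by 1.817.

```python
def shift_probability(p_base: float, odds_multiplier: float) -> float:
    """Convert a baseline probability plus an odds multiplier into a new probability."""
    odds = p_base / (1 - p_base)       # probability -> odds
    new_odds = odds * odds_multiplier  # apply the reported effect
    return new_odds / (1 + new_odds)   # odds -> probability

# Hypothetical 50% baseline agreement rate:
print(round(shift_probability(0.5, 1.817), 3))  # personalized GPT-4 (+81.7% odds) → 0.645
print(round(shift_probability(0.5, 1.213), 3))  # non-personalized (+21.3% odds) → 0.548
```

Under that assumed baseline, personalization would move agreement from a coin flip to roughly two in three, which is why the odds-ratio framing understates how large the effect feels in probability terms.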