Mastodawn

Administrative account Nov 18, 2022

As more academics reach Fedi, please PLEASE consider not doing research on users here without explicit opt-in consent

This isn't a zoo

It's not just condescending for you to treat us that way, it's also against a lot of instances' terms of use

See "Use of Scholar Social for research" at the following link for an example:

https://scholar.social/privacy-policy

Scholar Social

Microblogging for researchers, grad students, librarians, archivists, undergrads, high schoolers, educators, research assistants, profs—anyone involved in learning who engages with others respectfully

Mastodon hosted on scholar.social

Dr. Tim Schatto-Eckrodt Nov 18, 2022

@socrates Thank you for sharing! I’m not quite sure if I agree with that policy. I feel like a post that is set to be publicly accessible constitutes public communication.

I would agree that publishing more than just anonymized aggregate data or IDs (like Twitter’s TOS suggests) is out of the question, but analyzing public communication can’t require a specific opt-in from every user. That does not seem feasible to me.

Josh Braun Nov 18, 2022

@Kudusch @socrates I feel like Kudusch's stance on this is probably more nuanced than is coming across in this thread, but FWIW the Association of Internet Researchers argues, as a scholarly community, that context matters when deciding what should count as "public" for the purpose of research. I recommend reading this post by Hugh Rundle for an idea of how long-time Mastodon users have viewed their contributions and privacy in context. https://www.hughrundle.net/home-invasion/

[Edited for clarity]

Home invasion - Mastodon's Eternal September begins

The fediverse is dealing with a huge wave of Twitter people bringing toxic ideas with them.

Administrative account Nov 18, 2022

@josh @Kudusch Okay the context here is that we told you explicitly not to do it and now you're aware of that expectation

Many of our users came here to *avoid* being included in Cambridge Analytica type situations and we're telling you: No

That's the context

If a person talking in a public place like a coffee shop told you "stop writing down everything I'm saying," what would the right thing to do be?

Dr. Tim Schatto-Eckrodt Nov 18, 2022

@socrates @josh I definitely get what you are saying. An explicit opt-out, like the one for users of scholar.social, must be considered. That's basic research ethics.

My background is in disinformation studies, where there are usually very few good-faith actors. Users actively using the fediverse to spread (potentially harmful) misinformation are tough to identify before somehow accessing public communication at large.

Dr. Tim Schatto-Eckrodt Nov 18, 2022

@socrates @josh

Maybe implementing something like an opt-in for (non-commercial) research on a per-account basis that can be accessed through the API could be a solution in the long run.

Administrative account

@Kudusch @josh Now that would be very interesting

Dr. Tim Schatto-Eckrodt Nov 18, 2022

@socrates @josh That would still leave the issue of bad-faith actors unsolved, but actually building research consent into the protocol itself might be a good start!

Because if I access the API from an instance where research is explicitly allowed, I might still get content from users of instances where it is not. Being able to just filter out content from users who opted out would be very helpful.
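
A minimal sketch of that filtering step, in Python. It assumes a hypothetical per-account `research_consent` attribute. No such field exists in the Mastodon API today, and the dictionaries below only mimic the general shape of a status with a nested `account`. Anything without an explicit "opt-in" is dropped, so a missing or unknown value counts as no consent.

```python
from typing import Any

# Hypothetical attribute: Mastodon accounts do not expose a research-consent
# flag today. The name and its values are assumptions for illustration.
CONSENT_KEY = "research_consent"


def has_opted_in(account: dict[str, Any]) -> bool:
    """Return True only for an explicit opt-in; anything else counts as no."""
    return account.get(CONSENT_KEY) == "opt-in"


def filter_consented(statuses: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep statuses whose author opted in, whichever instance served them."""
    return [s for s in statuses if has_opted_in(s.get("account", {}))]


# Made-up sample data, as it might arrive from a federated timeline.
sample = [
    {"id": "1", "account": {"acct": "alice@example.social", CONSENT_KEY: "opt-in"}},
    {"id": "2", "account": {"acct": "bob@example.town", CONSENT_KEY: "opt-out"}},
    {"id": "3", "account": {"acct": "carol@example.town"}},  # no flag set
]

kept = filter_consented(sample)
print([s["id"] for s in kept])
```

Because the check runs per status rather than per instance, it would still apply when a permissive instance relays posts from accounts hosted elsewhere.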

Mark Dang-Anh Nov 19, 2022

@Kudusch @socrates @josh we had a similar discussion: https://scholar.social/@mdanganh/109304920124242505 i.a. @javier suggested a "do not scrape" meta tag as the default, plus an opt-in option for active consent in the preferences for sharing account contents for research purposes (open science, non-commercial). That info would have to be available via the API.
Tools like {rtoot} should be designed to only scrape accounts that have actively consented (still, *informed* consent is an issue)

Mark Dang-Anh (@mdanganh@scholar.social)

I guess that many fellow linguists and social media researchers who, like me, are new here, are already itching to compile corpora from Mastodon data. I have to admit, so do I. But all I can find on #scraping in toots, server rules and blogs is: "Don't! Scrape! Mastodon!" May this be the right time to rethink ingrained research habits, reconsider ethical standards and adapt to a new community and a different social media culture? Any thoughts? #researchethics @linguistics #corpuslinguistics

Scholar Social

Josh Braun Nov 19, 2022

@mdanganh @Kudusch @socrates @javier Could you set up a bot/relay tied to the crawler that would notify a user via a DM that their instance or account had been scraped, give a link to a study description, and provide info for opting out post-hoc, even if their initial profile metadata had indicated opt-in? And provide the researchers' contact info, obviously.

Dr. Tim Schatto-Eckrodt Nov 19, 2022

@socrates @mdanganh @josh @javier That’s a fantastic idea!

Josh Braun Nov 19, 2022

@Kudusch @socrates @mdanganh @javier Unless it becomes spam. No one would want to get 40 of these a month.
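
One way to keep such notices from turning into spam would be to cap how often any one account can be notified. The Python sketch below sends nothing. `NoticeLog`, `build_notice`, the 30-day window, the URLs, and the message wording are all assumptions for illustration. The log would also have to be shared between research projects to actually prevent 40 notices a month.

```python
from datetime import datetime, timedelta


class NoticeLog:
    """Remembers when each account was last notified (hypothetical helper)."""

    def __init__(self, min_gap: timedelta = timedelta(days=30)) -> None:
        self.min_gap = min_gap
        self._last_sent: dict[str, datetime] = {}

    def should_notify(self, acct: str, now: datetime) -> bool:
        """Allow a notice only if none was recorded within min_gap."""
        last = self._last_sent.get(acct)
        return last is None or now - last >= self.min_gap

    def record(self, acct: str, now: datetime) -> None:
        """Store the time of the most recent notice for this account."""
        self._last_sent[acct] = now


def build_notice(acct: str, study_url: str, contact: str) -> str:
    """Compose the DM text: what happened, where to read more, how to opt out."""
    return (
        f"@{acct} Public posts from your account were collected for a study. "
        f"Description: {study_url} "
        f"To have your posts removed, contact {contact}."
    )


log = NoticeLog()
t0 = datetime(2022, 11, 19)
acct = "alice@example.social"

if log.should_notify(acct, t0):
    print(build_notice(acct, "https://example.org/study", "researcher@example.org"))
    log.record(acct, t0)

# A second crawl ten days later stays silent for the same account.
print(log.should_notify(acct, t0 + timedelta(days=10)))
```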