As more academics reach Fedi, please PLEASE consider not doing research on users here without explicit opt-in consent

This isn't a zoo

It's not just condescending for you to treat us that way, it's also against a lot of instances' terms of use

See "Use of Scholar Social for research" at the following link for an example:

https://scholar.social/privacy-policy

Scholar Social

Microblogging for researchers, grad students, librarians, archivists, undergrads, high schoolers, educators, research assistants, profs—anyone involved in learning who engages with others respectfully

Mastodon hosted on scholar.social
@socrates Consider? This should be a "Do not," not a "please consider not"

@socrates from what I've been able to detect, you know what you're doing over there. I've advised a couple scholars of note, in hopes that they find a home on scholar.social

Me, well, I'm more of a dork. But I appreciate things done well. Kudos!

@socrates Boosting this, and adding: remember this is NOT Twitter. Every instance/server has its own admin; their server, their rules. This is a FEDERATED microblogging platform, not a monolithic corporation.

@socrates Yeah this isn't a zoo.

Is there a zoo nearby?

@stijn see this post by @socrates as we were discussing last night.

@socrates Thank you for sharing! I’m not quite sure if I agree with that policy. I feel like a post that is set to be publicly accessible constitutes public communication.

I would agree that publishing more than just anonymized aggregate data or IDs (like Twitter's TOS suggests) is out of the question, but analyzing public communication can't require a specific opt-in from every user. That does not seem feasible to me.

@Kudusch @socrates I feel like Kudusch's stance on this is probably more nuanced than is coming across in this thread, but FWIW the Association of Internet Researchers argues, as a scholarly community, that context matters when deciding what should count as "public" for the purpose of research. I recommend reading this post by Hugh Rundle for an idea of how long-time Mastodon users have viewed their contributions and privacy in context. https://www.hughrundle.net/home-invasion/

[Edited for clarity]

Home invasion - Mastodon's Eternal September begins

The fediverse is dealing with a huge wave of Twitter people bringing toxic ideas with them.

@josh @Kudusch Okay the context here is that we told you explicitly not to do it and now you're aware of that expectation

Many of our users came here to *avoid* being included in Cambridge Analytica type situations and we're telling you: No

That's the context

If a person talking in a public place like a coffee shop told you "stop writing down everything I'm saying," what would the right thing to do be?

@socrates @Kudusch Edited the above comment for clarity. I think we're actually on the same page here, Socrates.

@socrates @josh I definitely get what you are saying. An explicit opt-out, like the one for the users of scholar.social, must be considered. That's basic research ethics.

My background is in disinformation studies, where there are usually very few good-faith actors. Users actively using the fediverse to spread (potentially harmful) misinformation are tough to identify before somehow accessing public communication at large.

@socrates @josh

Maybe implementing something like an opt-in for (non-commercial) research on an account basis, accessible through the API, could be a solution in the long run.

@Kudusch @josh Now that would be very interesting

@socrates @josh That would still leave the issue of bad-faith actors unsolved, but actually building research consent into the protocol itself might be a good start!

Because, if I accessed the API from an instance where research is explicitly allowed, I might still get content from users of instances where it is not. Being able to filter out content from users who opted out would be very helpful.

@Kudusch @socrates @josh we had a similar discussion https://scholar.social/@mdanganh/109304920124242505 where, among others, @javier suggested a "do not scrape" meta tag as the default, plus an opt-in option for active consent in the preferences for sharing account contents for research purposes (open science, non-commercial). That info would have to be available via the API.
Tools like {rtoot} should be designed to only scrape accounts that have actively consented (still, *informed* consent is an issue)
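To make the suggestion concrete: a scraping tool could filter on the collection side, keeping only posts from accounts that advertise consent. This is a minimal sketch, assuming a hypothetical profile field named "research-consent" with the value "opt-in" — Mastodon does not define such a field today; account profile fields do exist in the API, but the field name and semantics here are invented for illustration.

```python
# Sketch: client-side filtering of statuses by a hypothetical per-account
# research-consent flag. The "research-consent" field name is an assumption,
# not part of the real Mastodon API.

def has_research_consent(account: dict) -> bool:
    """Return True only if the account carries an explicit opt-in field."""
    for field in account.get("fields", []):
        if field.get("name", "").lower() == "research-consent":
            return field.get("value", "").lower() == "opt-in"
    return False  # absence of the flag means no consent


def filter_consenting(statuses: list[dict]) -> list[dict]:
    """Keep only statuses whose authors actively opted in."""
    return [s for s in statuses if has_research_consent(s["account"])]
```

Note the default: an account with no flag at all is excluded, which matches the "opt-in, not opt-out" framing in this thread.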
Mark Dang-Anh (@[email protected])

I guess that many fellow linguists and social media researchers who, like me, are new here, are already itching to compile corpora from Mastodon data. I have to admit, so do I. But all I can find on #scraping in toots, server rules and blogs is: "Don't! Scrape! Mastodon!" May this be the right time to rethink ingrained research habits, reconsider ethical standards and adapt to a new community and a different social media culture? Any thoughts? #researchethics @linguistics #corpuslinguistics

Scholar Social
@mdanganh @Kudusch @socrates @javier Could you set up a bot/relay tied to the crawler that would notify a user via a DM that their instance or account had been scraped, give a link to a study description, and provide info for opting out post-hoc, even if their initial profile metadata had indicated opt-in? And provide the researchers' contact info, obviously.
@socrates @mdanganh @josh @javier That’s a fantastic idea!
@Kudusch @socrates @mdanganh @javier Unless it becomes spam. No one would want to get 40 of these a month.
@Kudusch @socrates I'd also check out writing by @robertwgehl, who's researched communities on Mastodon for quite some time, while navigating the space thoughtfully. Research isn't impossible here, but it requires careful thinking about context and consent, which is a good thing.
@josh Thank you for the reading suggestions!

@josh @Kudusch @socrates And it's helpful to think of it this way, too: I leave my apartment to get groceries, but that doesn't mean it's okay for someone to snap my photo and publish it online.

I leave my curtains open sometimes for natural light, but that doesn't mean a neighbor can film me and upload the video anywhere they like.

@socrates @josh @Gemma That is a good analogy for some circumstances, sure. But invading someone’s privacy by filming them in their home is in a different category than e.g. using a post by someone on social media for some analysis that tries to understand how people share news with their network.
@Kudusch @socrates @josh Even after someone has made it clear that they prefer to be consulted or asked permission before someone else makes money off of their content?

@josh @Gemma @socrates Marketing/for-profit research and publicly funded, academic research should probably be differentiated in this discussion.

On a technical level, users do have the option on Mastodon to be excluded from the public timeline/user directory. I'm not sure there needs to be another level of opt-in/out for access to their content.
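For what it's worth, recent Mastodon versions already expose two related account flags in the REST API: "discoverable" (whether the account asked to be listed in directories) and "noindex" (whether the account asked for its posts not to be indexed). A cautious crawler could treat either signal as a refusal — a minimal sketch, assuming account objects as returned by the API:

```python
# Sketch: honor existing opt-out signals on Mastodon account objects.
# "discoverable" and "noindex" are real fields in recent Mastodon API
# versions; treating them as research-consent signals is this thread's
# proposal, not official semantics.

def may_collect(account: dict) -> bool:
    """Skip accounts that set noindex or did not opt into discovery."""
    if account.get("noindex", False):
        return False
    # "discoverable" may be absent or null; only an explicit True counts.
    return account.get("discoverable", False) is True
```

This doesn't resolve the consent debate, but it means a scraper has no excuse for ignoring the signals users can already set.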

@Kudusch @josh I believe scraping tools can bypass that, if I'm not mistaken, but others who know better than me could give an actual answer.

When I hear that marketing and academic research should be differentiated, what it sounds like is that boundaries don't really matter. When you don't belong to a marginalized and frequently exploited group, it probably seems like it shouldn't matter, or, if it does, it should be an easy fix (which it's not, as Matthew has shown us).

@Kudusch @josh And I've been seeing that a lot since the influx started. "Well just [do XYZ] if you don't like it." We did. We had blocked that journalist. He lifted someone's content anyway.

It's so easy for people with that kind of power to talk to people to obtain written informed consent, and for many projects it's a requirement. That's part of the solution.

@Kudusch @socrates It is NOT necessarily public; just because I post something to the TL does NOT mean I want it scraped.

@Kudusch @socrates The "Use of Scholar Social for research" does not require a specific opt-in for every user. They are requiring compliance with the ethical standards for social science research in North America.

Europeans who find them too onerous should find another field of research because GDPR has much higher penalties.

@socrates I see enough of that on my uni's main Facebook group, I don't really want to see it here too!

@socrates When I was doing my honors thesis research it would have been easier to get IRB approval to just "scrape" public posts without consent, than it was to get approval to talk to neurodivergent folks on here with consent.

I had to jump through so many hoops to get approval for an opt-in survey because they questioned whether neurodivergent folks could consent to participate.

It's a messed up standard that needs to be re-examined

@autistic_anthropologist Okay that just sounds like a reviewer who has no idea how to even think about neurodivergent people
@socrates yeah definitely. But what frustrated me is that the easier options were to do things I consider way more ethically fraught, like scraping posts without consent or *asking the parents of adult neurodivergent people* to give their consent.

@autistic_anthropologist

@socrates

There is a double standard in academic social and ecological research, to be sure. The International Society of Ethnobiologists code of ethics is a really thorough and good example <https://www.ethnobiology.net/what-we-do/core-programs/ise-ethics-program/code-of-ethics/>, but we have found that academic collaborators in the life and social sciences from North American and European universities (1) were angry at us for insisting on these standards and (2) encouraged their students to circumvent our protocols when we were not actually present. It was hugely destructive. Some anthros we know are trustworthy.

The ISE Code of Ethics - International Society of Ethnobiology

The Code of Ethics of the International Society of Ethnobiology has its origins in the Declaration of Belém, agreed upon in 1988 at the founding of the International Society of Ethnobiology (in Belém, Brazil). The Code of Ethics was initiated in 1996 and completed in 2006. The final version, adopted by the ISE membership at... Read more »

International Society of Ethnobiology
@yetiinabox thank you for sharing. It's deeply concerning how a lot of anthropological research really doesn't seem to spend much time considering ethics. We can't rely only on institutions like the IRB and assume that anything they sign off on is ethical, especially when they don't necessarily know the specific situation or community that well
@socrates yeah I really don't care for the way people think they're entitled to other people's info without even asking, it's so concerning. Good to be mindful for sure
@socrates
It's also not very ethical for research to be done on humans without their consent offline, so it should not be done online either
@Jessica

@socrates Conducting research on people is, in my mind, no different than any other social interaction: consent is a prerequisite, can be withdrawn at any point, and should never be assumed.

It bothers me that there are researchers out there that don't consider mining an online forum's posts to be unethical, especially when consent is explicitly not given in the site's terms of service.

@socrates What about researching the inherently evil instances? Just look at some of the Reject reasons from poa.st. https://poa.st/about
@socrates I was part of a community of mad people on the bird site, it was not uncommon for researchers to mine our tweets and blogs without consent. Some of us recognised our words in their articles. I hope people listen to you and don’t try to do that here
@socrates I’m reminded of efforts for machine-readable content (IPR) policies in the late 90s-early 00s. Some of us at #HPLabs wrote about this 20 years ago for the #W3C https://www.w3.org/2000/12/drm-ws/pp/hp-erickson.html
Principles for Standardization and Interoperability in Web-based Digital Rights Management

@socrates

just a tiny reminder from a passer-by:

Germany has recently amended its law, so "gross trivialization" of ALL genocides, crimes against humanity, and war crimes around the world, even those that have not been established by a national or international court, has now been criminalized in Germany

that will relocate certain abhorrent speech from category 2 "Xenophobic and/or violent nationalism" to category 3

tho I am sure users here are much more civil

Source:
https://www.bundestag.de/presse/hib/kurzmeldungen-916934

Billigung, Leugnung und Verharmlosung von Völkermorden (Approval, denial, and trivialization of genocides)

Berlin: (hib/SCR) Criminal liability for the public approval, denial, and gross trivialization of genocides, crimes against humanity, and war crimes is to be...

Deutscher Bundestag