Can researchers detect #AI bots taking paid surveys?
#Prolific tested humans and #LLM agents with various #dataQuality checks.
- The company says they caught 100% of the non-humans.
- My take-away: #reCAPTCHA and #mouseTracking caught 95%
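For intuition, here is a toy sketch of one #mouseTracking heuristic: flagging near-perfectly straight cursor paths, which scripted agents produce far more often than humans. This is my own illustration, not Prolific's method, and the threshold is an arbitrary placeholder.

```python
import math

def path_linearity(points):
    """Ratio of straight-line distance to total path length.
    points: list of (x, y) cursor samples. Values near 1.0 mean
    an almost perfectly straight path, rare for human movement."""
    if len(points) < 2:
        return None  # not enough data to judge
    total = sum(math.dist(points[i], points[i + 1])
                for i in range(len(points) - 1))
    direct = math.dist(points[0], points[-1])
    return direct / total if total > 0 else None

def looks_automated(points, threshold=0.98):
    """Flag a trajectory as bot-like if it is nearly straight.
    0.98 is an arbitrary placeholder, not a published cutoff."""
    lin = path_linearity(points)
    return lin is not None and lin > threshold
```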
Hey #SurveyMethods and #MedEd folks:
In a workshop for #MedSchool faculty about questionnaire design and survey research methods, what
- objectives should be prioritized?
- materials are out there?
- activities are worth it?
- assessments work?
We've found recruiting people for online #research via #onlineAdvertising yielded good results on overt and covert #dataQuality measures (perhaps because participation incentives aren't financial):
Attention checks passed ≅ 2.6 out of 3
ReCAPTCHA (v3) ≅ 0.94 out of 1.0
Sample size > 5000 (from six continents)
https://doi.org/10.1017/S0034412525000198
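For anyone wanting to run similar screens, a minimal sketch of combining the overt and covert measures above. The cutoffs are illustrative assumptions, not the paper's criteria (though Google's reCAPTCHA v3 docs do suggest 0.5 as a default threshold).

```python
def passes_quality_screen(attention_passed, recaptcha_v3_score,
                          min_attention=2, min_recaptcha=0.5):
    """Overt screen: attention checks passed (0-3).
    Covert screen: reCAPTCHA v3 score (0.0-1.0).
    Both cutoffs here are illustrative, not from the paper."""
    return (attention_passed >= min_attention
            and recaptcha_v3_score >= min_recaptcha)

# A respondent passing 3/3 checks with score 0.9 is kept
print(passes_quality_screen(3, 0.9))   # True
print(passes_quality_screen(1, 0.94))  # False: failed attention screen
```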
#surveyMethods #cogSci #psychology #xPhi #QualityControl #econ #marketing
I forgot to share the #mTurk data quality result that got scooped:
“In late 2020…. Participants from the United States were recruited from Amazon Mechanical Turk, #CloudResearch, #Prolific, and a #university. One participant source yielded up to 18 times as many low-quality respondents as the other three.”
https://doi.org/10.1093/analys/anaf015
#psychology #philosophy #surveyMethods #quantMethods #dataScience #qualityControl
RE: https://mastodon.acm.org/@neilernst/115607469843033537
😳 The #AI survey taker "rendered [attention quality] checks [ACQs] effectively obsolete. Across 6,000 total trials..., [it] committed only 10 errors, achieving an overall pass rate of 99.8% and scoring perfectly on 18 of the 20 ACQ types."
#surveyMethods #psychometrics #psychology #tech #philSci #SciComm #dataQuality
Can thinking aloud accurately capture how we decide?
Nisbett & Wilson's 1977 paper famously suggested it can't.
But #NLP and #AI methods may indicate that it can:
- https://escholarship.org/uc/item/4sb936m3
- https://openreview.net/forum?id=1Tny4KgGO2
#cogSci #psychology #philosophy #surveyMethods #thinkAloud #LLM
#ComparativeResearch #EducationMeasurement #SurveyMethods
Measuring education across countries is complex—but crucial for valid, comparable survey data. This study tests 16 coding strategies using ESS data and finds a strong contender for a new international standard.
Read the article:
Schneider, S.L. & Urban, J. (2025). A myriad of options: Validity and comparability of alternative international education variables. Survey Methods: Insights from the Field. https://doi.org/10.13094/SMIF-2025-00008
@GESIS
Thanks for sharing the talk!
Here is the QuestionLink R package and an in-depth tutorial on using it:
https://matroth.github.io/questionlink/index.html
And please note that we also offer consultations regarding harmonization techniques (and other survey method topics).
https://www.gesis.org/en/consulting/survey-methods-consulting
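For the curious: as I understand it, QuestionLink harmonizes different question versions via observed score equating. A minimal Python sketch of the core idea (equipercentile equating), to show the logic only; this is not the package's R API.

```python
import numpy as np

def equipercentile_map(scores_a, scores_b, x):
    """Map score x from instrument A onto instrument B's scale:
    find x's percentile rank among A responses, then return the
    B score at that same percentile. Assumes both samples come
    from comparable populations."""
    scores_a = np.asarray(scores_a, dtype=float)
    p = np.mean(scores_a <= x)       # percentile rank of x in A
    return np.quantile(scores_b, p)  # B score at same percentile

# Toy example: harmonizing a 5-point item onto an 11-point item
a = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]   # responses on 5-point scale
b = [0, 2, 3, 4, 5, 5, 6, 7, 9, 10]  # responses on 11-point scale
print(equipercentile_map(a, b, 4))   # maps 4 -> 7.4 in this toy data
```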
Given that surveys tend to overestimate belief in #conspiracyTheories (https://osf.io/preprints/psyarxiv/zsncr_v1) and support for #politicalViolence (https://doi.org/10.1073/pnas.2116870119), I wonder how much of the correlation between such variables remains after accounting for such measurement error.
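A quick simulation of the worry, under my own toy assumptions: two genuinely uncorrelated attitudes, plus 10% careless respondents who straight-line the top of both scales.

```python
import numpy as np

rng = np.random.default_rng(0)
n, careless_share = 2000, 0.10

# Attentive respondents: two genuinely uncorrelated attitudes
x = rng.normal(size=n)
y = rng.normal(size=n)

# Careless respondents straight-line the top of both scales,
# mimicking inflated "belief" and "support" simultaneously
k = int(n * careless_share)
x[:k] = 3.0
y[:k] = 3.0

# ~0.47 despite zero true correlation among attentive respondents
print(np.corrcoef(x, y)[0, 1])
```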
New #surveyMethods paper replicates and extends differences in #dataQuality, attention, naivety, decision style, etc. by
- online #research recruitment platform (#mTurk, #Prolific, #Qualtrics, #Pollfish)
- device (#mobile v. #desktop)
- person's incentive
Online crowdsourcing platforms such as MTurk and Prolific have revolutionized how researchers recruit human participants. However, because these platforms primarily recruit computer-based respondents, they risk missing people who access the internet exclusively, or mostly, through more widely available mobile devices. There have also been concerns that respondents who rely heavily on such platforms to earn an income provide lower-quality responses.

We therefore conducted two studies, collecting data from the popular MTurk and Prolific platforms, from Pollfish, a self-described mobile-first crowdsourcing platform, and from the Qualtrics audience panel. By distributing the same study across these platforms, we examine data quality and the factors that affect it.

In contrast to MTurk and Prolific, most Pollfish and Qualtrics respondents were mobile-based. Using an attentiveness composite score we constructed, we find mobile-based responses comparable to computer-based responses, demonstrating that mobile devices are suitable for crowdsourcing behavioral research. However, the platforms differ significantly in attentiveness, which is also affected by factors such as the respondent's incentive for completing the survey, their activity before engaging, environmental distractions, and having recently completed a similar study. Further, we find that stronger System 1 thinking is associated with lower attentiveness and mediates the relationship between some of the factors explored (including the device used) and attentiveness. Finally, we raise a concern that most MTurk users can pass frequently used attention checks yet fail less common measures, such as the infrequency scale.
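For readers unfamiliar with such composites, a minimal sketch of how an attentiveness composite is often built: z-score several indicators, then average. The indicator names and data below are hypothetical; the paper's exact construction may differ.

```python
import numpy as np

def attentiveness_composite(indicators):
    """indicators: dict of equal-length arrays, each scored so
    higher = more attentive (e.g. attention checks passed,
    reversed infrequency-scale endorsements, reversed speeding).
    Returns the mean of z-scored indicators per respondent."""
    zs = []
    for vals in indicators.values():
        v = np.asarray(vals, dtype=float)
        sd = v.std()
        zs.append((v - v.mean()) / sd if sd > 0 else np.zeros_like(v))
    return np.mean(zs, axis=0)

# Hypothetical data for three respondents
score = attentiveness_composite({
    "attention_checks_passed": [3, 1, 3],
    "infrequency_items_clean": [5, 2, 5],  # reversed: higher = cleaner
    "not_speeding": [1, 0, 1],
})
print(score)  # respondents 1 and 3 tie; respondent 2 scores lowest
```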