It's #LoveReplicationsWeek!
I'll be posting impressions from the talks all week here.
>> You can still sign up for talks: we are sending out invites to new registrants in the morning.
https://forrt.org/LoveReplicationsWeek/
>> Slides will be shared in our Zenodo Community: https://zenodo.org/communities/lrw/records
Here we go!
@ElenLeFoll and Mara van der Ploeg kick off #LoveReplicationsWeek with their talk about the what, why, and how of reproducibility in the language sciences.
Elen presents examples of reproductions from her stats class. Here is a wonderful example of a perfect reproduction by one of Elen's students, Beatrice. The numbers did not perfectly match, and there was a plot for which she had to recreate the code and find out what the error bars represented.
Here is another interesting case where the authors computed means of means. This was not clear from the report and led to the student getting a larger estimate when computing means from the non-aggregated data.
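To see why the two approaches can disagree, here is a minimal sketch with made-up numbers (not the data from the study discussed in the talk). The participant labels, values, and function names are all assumptions for illustration. When groups contribute unequal numbers of observations, averaging the group means weights every group equally, while averaging the raw observations weights every observation equally.

```python
from statistics import mean

# Hypothetical data: values per participant, with unequal numbers of observations.
observations = {
    "p1": [4.0, 4.0, 4.0, 4.0],  # four observations, all high
    "p2": [1.0],                 # a single low observation
}


def mean_of_means(groups):
    """Average each group first, then average the group means."""
    return mean(mean(values) for values in groups.values())


def pooled_mean(groups):
    """Average all observations at once, ignoring the grouping."""
    return mean(v for values in groups.values() for v in values)


if __name__ == "__main__":
    print("mean of means:", mean_of_means(observations))
    print("pooled mean:  ", pooled_mean(observations))
```

In this toy case the pooled mean comes out larger than the mean of means, because the participant with more observations also has higher values. With a different imbalance the direction could flip, which is why a report needs to say which of the two was computed.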
Some of Elen's students publish their reproductions as case studies in her book. This way, they don't only learn about research methods but also apply their knowledge and make a valuable and visible scientific contribution:
https://elenlefoll.github.io/RstatsTextbook/B_CaseStudies/Poppy/ThrowVerbs.html
Mara is now presenting the Tromso Repository of Language and Linguistics, or TROLLing for short, which is a free and curated data repo.
It is not only free: "curated" means that the team helps you make your data adhere to FAIR principles, visible, and peer-review-friendly, and all of that happens fast.
Next up: Lenka Fiala from the Institute for Replication is taking us on a Reproducibility Study Speedrun.
She is actually showing us a reproduction study of an article published in the last four weeks!
With over 100 reproduction studies every year, she knows all sorts of tips. Here are some of them:
- Data and code can sometimes be found on the journal website and sometimes in the authors' institutional repo
It looks like this is a simple case: The authors have provided a readme, code, and data. She has seen lots of replication packages where researchers shared sensitive data that should not appear in replication packages.
The longest part: Debugging. Sometimes, there are additional requirements that are not part of the readme (folder structure, paths, variable renaming, ...).
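One way to shorten the debugging step is to check the folder structure before running anything. The sketch below is only an illustration, not something shown in the talk: the function name `missing_paths` and the example layout are hypothetical, and the list of expected paths would have to come from reading the package's own scripts.

```python
from pathlib import Path


def missing_paths(root, expected):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Hypothetical layout: replace with the paths the package's scripts actually reference.
    expected = ["README.md", "data/raw", "code/analysis.R", "output"]
    for rel in missing_paths(".", expected):
        print(f"missing: {rel}")
```

Running it from the root of the unpacked replication package lists the files and folders the scripts expect but that are not there yet, so they can be created or the paths adjusted before the first real run.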