I’m at the University of Sheffield today for the Perspectives on Teaching Reproducibility symposium at the Teaching Reproducible Research and Open Science conference.

https://www.sheffield.ac.uk/smi/events/teaching-reproducible-research-and-open-science-conference

I’ll add links and notes about the day in this thread #OpenResearch #reproducibility #psyTeachR

Teaching Reproducible Research and Open Science Conference

Organisers: University of Sheffield (Sheffield Methods Institute, Open Research Working Group and University Library) and Project TIER.


First up, Jennifer Buckley on β€œOpportunities and challenges for teaching reproducibility in the context of UK Higher Education in the Social Sciences – insights from a consultation with teaching staff”

If you aren’t familiar with the UK Data Service, it’s a fantastic resource for managing social science data for research and teaching.

The survey involved 109 social science lecturers, most of whom teach quantitative methods, plus 16 follow-up interviews. Most agree that teaching reproducibility is important and that demonstrations and examples would be useful.

Most still use SPSS (seems to be more polisci than psych in the dataset)

Almost half of the lecturers surveyed prepare data to make it more usable for students. They often find there is no time to teach data preparation (one of the most important skills we emphasise in the #psyTeachR curriculum).

Next up, Jon Reades on Building Foundations: Pythonic (Geo)Data Science from the Ground Up

https://jreades.github.io/talks/reproducible/#/building-foundations-reproducible-geographic-data-science

Presentations – index

I love this list of benefits of reproducible workflows:

* Abstraction
* Employability
* Learning by seeing
* Learning by breaking
* Workload management

The Docker method of making sure all students have the same packages and resources looks fruitful. I’d be curious to see how easy it is to deploy Docker on students’ diverse machines.

I’m also impressed with teaching git/GitHub; I think version control is so important, but teaching it can be tricky and derail the class with esoteric problems.

And yay for more #QuartoPub resources!

Reades makes the excellent point that REF2028 has just been announced and it’s clear that they want to promote wider thinking on research environment β€” now called people and culture β€” and the contribution of more diverse outputs, which should include open teaching materials.

https://www.ukri.org/news/early-decisions-made-for-ref-2028/

Early decisions made for REF 2028

The UK’s higher education funding bodies have made initial decisions on the high-level design of the next Research Excellence Framework (REF).

Next up, Marina Bazhydai on "The good, the bad and the ugly: Teaching first year psychology undergraduates about research integrity and open science" (with Emma Mills, Richard Philpot, Mike Vernon, & @dermotlynott from Lancaster University)

The UG methods course focuses on broad questions of how to do science, in addition to the stats. They are supported by the PROSPR network https://www.lancaster.ac.uk/psychology/research/open-science/

Open Science

It's a really interesting idea to teach undergrads how to use tools like StatCheck and GRIM to detect research errors (or fraud) and then engage with the repliCATS project. Also, this demo is fab!

https://fivethirtyeight.com/features/science-isnt-broken/

Science Isn’t Broken

If you follow the headlines, your confidence in science may have taken a hit lately. Peer review? More like self-review. An investigation in November uncovered a scam in which researchers were rubber-stamping their own work, circumventing peer review at five high-profile publishers.

FiveThirtyEight
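The GRIM check in particular is simple enough to teach from scratch: it just asks whether a reported mean could actually arise from integer-valued responses. A minimal sketch in Python (illustration only; function name is mine, and the published test also handles alternative rounding rules):

```python
import math

def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM test sketch: can a mean reported to `decimals` places arise
    from n integer-valued responses?"""
    # Possible means are total/n for integer totals, so only totals
    # near reported_mean * n need checking.
    approx = reported_mean * n
    target = round(reported_mean, decimals)
    for total in range(math.floor(approx) - 1, math.ceil(approx) + 2):
        if round(total / n, decimals) == target:
            return True
    return False

# A mean of 2.57 is possible with n = 7 (18/7 ≈ 2.571)...
print(grim_consistent(2.57, 7))   # True
# ...but no integer total over n = 10 responses rounds to 3.48
print(grim_consistent(3.48, 10))  # False
```

The arithmetic is the same in R, which is what most of the courses discussed here use.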

The first keynote is by Norm Medeiros and Richard Ball from Project TIER – The New (Aspirational) Normal: Saturating Quantitative Methods Instruction with Reproducibility

https://www.projecttier.org/

This talk focusses on integrating computational reproducibility across all curricula as a precondition for other dimensions of research transparency.

Project TIER | Project TIER | Teaching Integrity in Empirical Research

The TIER Documentation Protocol provides instructions for assembling files documenting the steps of data processing and analysis for a research paper.

**Documentation is the key to Reproducibility**

Essential elements:
- Original data
- Code

Additional elements:
- Output of computational results
- Additional information on data sources
- A read-me file

((I’d argue a README is essential!))

Very cool that the American Economic Association has a dedicated data editor and great online resources!

https://aeadataeditor.github.io/aea-de-guidance/

Step by step guidance

The following steps outline what you should expect after conditional acceptance of your manuscript, in compliance with the AEA Data and Code Availability Policy (which is compatible with the Data and Code Availability Standard v1.0):

- Prepare: assemble your data and code replication package (including data citations and provenance information). You can do this at any time, even before submitting to the AEA journals.
- Upload: provide metadata and upload the replication package. This step simultaneously prepares the materials for the verification process and for subsequent publication.
- Submit: submit the Data and Code Availability Form together with your manuscript native files as instructed, per the guidelines at your journal (for example, the AER guidelines). Verification checks start only once the editorial office has received these materials.

The next steps happen behind the scenes, until you receive the replication report.

Office of the AEA Data Editor

I like the "reproducibility trifecta":

1. Fixed folder structure
2. Explicit management of the working directory
3. Use of relative directory paths in scripts
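The trifecta maps directly onto code. A minimal Python illustration (folder names are hypothetical; in R the `here` package plays a similar role):

```python
from pathlib import Path

# 1. Fixed folder structure: data/raw, data/processed, output
# 2. Explicit working directory: assume scripts always run from the project root
# 3. Relative paths only -- never an absolute path into someone's home directory
PROJECT_ROOT = Path.cwd()

def raw_path(filename: str) -> Path:
    """Path to a raw data file, built relative to the project root."""
    return PROJECT_ROOT / "data" / "raw" / filename

def output_path(filename: str) -> Path:
    """Path to an output file, built relative to the project root."""
    return PROJECT_ROOT / "output" / filename

# prints data/raw/survey.csv on POSIX systems
print(raw_path("survey.csv").relative_to(PROJECT_ROOT))
```

Because nothing is hard-coded to one machine, the same script runs unchanged on a classmate's laptop or a marker's computer.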

And the "key dimensions of reproducibility":

1. Soup-to-nuts reproducibility
2. (Almost) automated reproducibility
3. Portability

Higher order educational goals served by teaching reproducibility

β€’ Instructors can understand what students produce.
β€’ Students can understand what they produce.
β€’ Students can believe in what they produce.
β€’ Dramatic enhancement of instructor's ability to advise and evaluate student projects (especially with use of a file sharing platform).
β€’ Reinforces core lessons about intellectual integrity that are central to undergraduate education.

Project TIER has been focussing their workshops on individual researchers/instructors, and will be expanding their focus to making more department-wide changes, in collaboration with the UKRN (thanks for the lovely shout-out to Glasgow #PsyTeachR as a pioneer in this!)

Librarians are key for facilitating #OpenResearch (seriously, go make friends with your uni librarians!)

Data librarians can:

- provide assistance with documentation and metadata
- advise on file naming conventions and format consistency
- recommend strategies for organising and backing up files

(It's very cool that they do basic code review to make sure data prep code runs on another computer)

Terrific point from Norm Medeiros: reproducibility is difficult to retrofit; you need to integrate reproducibility practices at every point in the lifespan of a project.

Next up, Carlos Utrilla Guerrero (https://carlosug.github.io) from TU Delft Library on "What can an open science educator do on teaching and building digital competences in reproducibility? Our lessons learned implementing the Research Data and Software management training"

https://www.tudelft.nl/en/library/research-data-management/r/training-events/training-for-researchers


Carlos Utrilla-Guerrero's personal website, with his resume.

TU Delft Library's vision for Research Data and Software Management training as part of the education and skills development of students and researchers.

https://zenodo.org/record/3516874

Vision for Research Data & Software management training at TU Delft

This is TU Delft Library's vision for Research Data and Software Management training as part of the education and skills development of students and researchers. The courses (some already available and some in preparation) are organised in four different modules, which build upon each other (Fig. 1). The different levels (from bottom to top) increase the specificity of the content, from considering data in a general context (e.g. open science) to skills that apply to a specific data type or research discipline. The realization of this vision will be a collaborative effort between TU Delft Library, relevant stakeholders within the university (e.g. Data Stewards, researchers, other support services offices), and external organizations that have already developed training material and/or courses. This collaborative effort aims to ensure the sustainability of the training.

Zenodo

The data flow map exercise from this course looks really interesting! It's adapted from https://dataflowtoolkit.dk/

- Create a comprehensive list of datasets (incl. code) used in the project
- Annotate with the actions required for each dataset (e.g. collect, reuse, annotate, anonymise, etc)
- Flag datasets with special characteristics (e.g. personal data, commercial data)

DataFlowToolkit
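The exercise above is easy to prototype as a simple inventory structure. A Python sketch (field names and example datasets are my own, not the toolkit's format):

```python
from dataclasses import dataclass, field

@dataclass
class DatasetEntry:
    """One entry in a data flow map: the dataset, the actions it needs,
    and any special-handling flags."""
    name: str
    actions: list[str]                              # e.g. collect, reuse, anonymise
    flags: list[str] = field(default_factory=list)  # e.g. personal or commercial data

# Hypothetical project inventory (code counts as a dataset too)
inventory = [
    DatasetEntry("interview_recordings", ["collect", "transcribe", "anonymise"],
                 flags=["personal data"]),
    DatasetEntry("census_extract", ["reuse", "annotate"]),
    DatasetEntry("analysis_code", ["write", "document", "share"]),
]

# Flag datasets that need special handling before sharing
needs_review = [d.name for d in inventory if "personal data" in d.flags]
print(needs_review)  # ['interview_recordings']
```

Even this toy version makes the point of the exercise: once the inventory is explicit, the special cases stop hiding.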

The TU Delft course for PhD candidates, Research Data Management 101 (RDM 101), is openly available as a self-learning course. It has 5 modules:

1: The importance of RDM
2: Essentials for Research Data
3: FAIR data principles and their main elements
4: Realizing FAIR data
5: How to plan for RDM

https://tu-delft-library.github.io/rdm101-book/intro.html

Welcome to RDM101 β€” RDM101 Course

Next, Julia Kasmire @JKasmireComplex from the UK Data Service on Teaching reproducibility to social scientists. This talk will describe a 5-week bootcamp course from the National Centre for Research Methods that covered:

1 – Intro, generals and specifics of reproducibility
2 – Collaboration, communication and tools thereof
3 – Documenting mind, workflow, processes
4 – Data basics and advanced topics
5 – Publication and AOB

It's nice to hear someone admit that these skills can be challenging to learn and that not everyone needs to learn them to expert level, but we should all know enough to communicate with our teams (go #TeamScience!)

It makes me think there should be a resource aimed specifically at people who don't want to learn the technical end of open research, but just the concepts and jargon needed to communicate with their team members who do.

Next, Andrew MacLachlan on Reproducible geographical information systems and science. (another lovely #QuartoPub presentation!)

https://andrewmaclachlan.github.io/perspectives-on-teaching-reproducibility/Sheffield_conference.html

Sheffield_conference - Reproducible geographical information Systems and Science

Ooh, the GitHub Classroom method for distributing assessments looks really nice. I do wish I could integrate Git and GitHub in my teaching more (but I find git installation too tricky for the amount of time we have, and it's not on our lab computers)

https://andrewmaclachlan.github.io/CASA0023/

CASA0023 Remotely Sensing Cities and Environments

Time for the second keynote by #PsyTeachR's own @HelenaPaterson on Teaching Reproducibility: reflections on redeveloping a curriculum for teaching reproducible methods

Why do we teach like this at UofG PsychNeuro? We think students need the conceptual and technical skills to be able to complete a research project.

"What is something observable that you think students in your field ought to be able to do when they graduate, and are you adequately preparing them to do this?" (Nolan & Temple Lang, 2010; Peck & Chance, 2007)

We often only see the end result of data processing and the rest of the pipeline is hidden. If you only give students clean "final" data, they don't learn the skills needed to deal with real raw data and are set up to fail at independent research.

In this paper, the #PsyTeachR team argues that training in data processing and transformation should be embedded in field-specific research methods curricula. Promoting reproducibility and open science requires not only teaching relevant values and practices, but also providing the skills needed for reproducible data analysis.

https://psyarxiv.com/hq68s/

Tools in our toolbox:

- R and RStudio
- tidyverse
- dirty data
- #PsyTeachR open resources
- Open research focus in assessment
- Community support

One way to make time for teaching these skills:

You do not have to teach every statistical test if you teach the foundations well. It's more important to teach students how to learn in a self-directed manner than to teach the individual tests.

Building expertise:

- Use one language and a novice friendly syntax (we use tidyverse in R through RStudio)
- Progression: Increase complexity and reduce hints over time
- Include formative self-assessment with solutions at the end of each section so students can check themselves and move on (this also reduces reliance on staff knowing everything – the solutions are there)
- Use real data wherever possible to reinforce the worth of the approach and to draw on student interests
- Integrate theory and research methods alongside data skills so students see the connections between the theory, the research, and the analysis.
- It's worth looking into RStudio Server and academic pricing, as hosting can reduce installation issues in early years, removing a barrier to getting started.

Assessment is core: it communicates to students what we value most. Some examples:

Using registered reports as assessments teaches the open research skills we value. Starting with methods, rather than data, means students don't focus on getting the "right answer".

Secondary data research reports: design a novel research question for a complex dataset, conduct the analysis and write a full report

Formative peer review of a pre-registered analysis plan: make an analysis script to analyse your data and share it in class for peer review

Building a community:

- Group work so students work with peers and learn about team science
- Support creates trust so people become more open to talking about mistakes and errors, and asking for help
- Seminar and workshop series based around methods and metascience that is open to both students and staff to present ideas and questions: https://psyteachr.github.io/mms/
- Students appreciate seeing staff ask questions as well

Methods & MetaScience Seminar

Schedule and material for the Methods & MetaScience seminar series for the University of Glasgow's Institute of Neuroscience and Psychology

Changes to your methods curriculum need to be sustainable.

Staff training
- Current staff: potentially yearly training for a while, until you have a sustainable base
- New staff: it can be difficult to find applicants who can do everything you do

Document Everything
- Particularly the rationale and principles behind your approach
- These are not set in stone, but they will help focus discussions on the future direction of the course as key people leave and new people join – and on whether you are still adhering to the original principles or it is time to update them

Take home messages:

- Don’t reinvent the wheel
- Slow it down: gradually build expertise
- Use what the community provides
- Assessment communicates our values
- What one thing can you start with?

(Please use and adapt our CC-BY-SA open resources at https://psyteachr.github.io!)

PsyTeachR

Psychologists from @UofGPsychNeuro advocating open, reproducible methods teaching.

@debruine this is great: reminding myself to come back to this, as we develop our curriculum and our materials @johnntowse and @tombeesley