Mastodawn

Daniel Oberski Mar 31, 2023

Erik-Jan Mar 30, 2023

Today in our Data Science reading group @utrechtuniversity we talked about this paper by Jessica Hullman et al.: https://arxiv.org/abs/2203.06498v6

It's really great & well-written; as @daob mentioned in the meeting, an incredible amount of work went into Table 1, which compares pitfalls of the scientific process in the social psychology and machine learning fields:

The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning

Recent arguments that machine learning (ML) is facing a reproducibility and replication crisis suggest that some published claims in ML research cannot be taken at face value. These concerns inspire analogies to the replication crisis affecting the social and medical sciences. They also inspire calls for greater integration of statistical approaches to causal inference and predictive modeling. A deeper understanding of what reproducibility concerns in research in supervised ML have in common with the replication crisis in experimental science can put the new concerns in perspective, and help researchers avoid "the worst of both worlds," where ML researchers begin borrowing methodologies from explanatory modeling without understanding their limitations and vice versa. We contribute a comparative analysis of concerns about inductive learning that arise in causal attribution as exemplified in psychology versus predictive modeling as exemplified in ML. We identify themes that re-occur in reform discussions, like overreliance on asymptotic theory and non-credible beliefs about real-world data generating processes. We argue that in both fields, claims from learning are implied to generalize outside the specific environment studied (e.g., the input dataset or subject sample, modeling implementation, etc.) but are often impossible to refute due to forms of underspecification. In particular, many errors being acknowledged in ML expose cracks in long-held beliefs that optimizing predictive accuracy using huge datasets absolves one from having to make assumptions about the underlying data generating process. We discuss risks and opportunities that arise as both fields attempt to resolve concerns about methods.

arXiv.org

Daniel Oberski Feb 20, 2023

Dgar Feb 20, 2023

Roses are red.
Roses are blue.
Depending on their velocity
relative to you.


Daniel Oberski Feb 15, 2023

@wviechtb I would say it clearly is, but the smallest randomization p-value is 0.5, which is not very helpful..

Daniel Oberski Feb 15, 2023

Wladimir Mufty Feb 14, 2023

Each day new students, researchers and employees within the Dutch educational and research community can join Mastodon with their existing institutional account!

Not only lowering the threshold to explore Mastodon but also supporting “group” accounts besides providing personal accounts!

Join social.edu.nl! See how to register an account, or check whether your institution is already connected!

https://surf.nl/mastodon-pilot

#mastodon #publicvalues #research #education #surfconext

Mastodon pilot for research and education

SURF and Universiteiten van Nederland are jointly exploring Mastodon as an open source platform for education and research in the Netherlands.

SURF.nl

Daniel Oberski Jan 18, 2023

Casper Albers ✅ Jan 17, 2023

Wikipedia has named a prize after me 🙏

I am very honoured that the largest public knowledge-sharing project is doing this. That it is also a prize for the best collaboration makes it even better.

https://www.wikimedia.nl/actueel/blog/wikiuil-vernoemd-naar-casper-albers/

WikiUil named after Casper Albers - Wikimedia Nederland

Who is Casper Albers, after whom the ‘SamenwerkingsUil’ is named?

Wikimedia Nederland

Daniel Oberski Dec 24, 2022

Kit Yates Dec 24, 2022

Q. Why do mathematicians confuse Halloween and Christmas?

A. Because 31 Oct = 25 Dec.

Happy Christmas.

Daniel Oberski Dec 22, 2022

Calling all ECRs who would like to learn more social data science by doing!

The ODISSEI Social Data Science (SoDa) team offers SoDa traineeships for early career social scientists. Successful SoDa trainees will spend between 3 and 8 months working full-time on a social science research project they propose. During this time, they are members of the SoDa team at the Methodology & Statistics department of Utrecht University and are mentored by one of the senior team members.

https://odissei-data.nl/wp-content/uploads/2022/10/soda_traineeship_call-2022.docx-Google-Docs.pdf

Daniel Oberski Dec 20, 2022

@[email protected] Codex, the system behind copilot, does have an API, but it does not know about R (yet). Then there are the familiar ethical issues with using people's code without asking..

Probably won't be part of Rstudio in the immediate future, but it could happen later.

Daniel Oberski Dec 20, 2022

@[email protected]

You can already use copilot for R in vscode!

If you want it in Rstudio, you might like to bump this issue: https://github.com/rstudio/rstudio/issues/10148

Github Copilot integration with RStudio · Issue #10148 · rstudio/rstudio

Hi! Are there any plans to make Github Copilot available in RStudio? RStudio is definitely a great development environment. It's just a pity that Copilot is not available. I've been using Copilot w...

GitHub

Daniel Oberski Dec 20, 2022

@sandorspruit @RenseC Indeed...

Group website @UU	https://hds.sites.uu.nl
Publications	https://daob.nl/publications