Enoiu et al 2026: "Folklore in Software Engineering: A Definition and Conceptual Foundations" https://arxiv.org/abs/2601.21814

Sultana et al 2026: "The Role of Social Identity in Shaping Biases Against Minorities in Software Organizations" https://arxiv.org/abs/2601.21259

#nwit

Folklore in Software Engineering: A Definition and Conceptual Foundations

We explore the concept of folklore within software engineering, drawing from folklore studies to define and characterize narratives, myths, rituals, humor, and informal knowledge that circulate within software development communities. Using a literature review and thematic analysis, we curated exemplar folklore items (e.g., beliefs about where defects occur, the 10x developer legend, and technical debt). We analyzed their narrative form, symbolic meaning, occupational relevance, and links to knowledge areas in software engineering. To ground these concepts in practice, we conducted semi-structured interviews with 12 industrial practitioners in Sweden to explore how such narratives are recognized or transmitted within their daily work and how they affect it. Synthesizing these results, we propose a working definition of software engineering folklore as informally transmitted, traditional, and emergent narratives and heuristics enacted within occupational folk groups that shape identity, values, and collective knowledge. We argue that making the concept of software engineering folklore explicit provides a foundation for subsequent ethnography and folklore studies and for reflective practice that can preserve context-effective heuristics while challenging unhelpful folklore.

arXiv.org
Saavedra et al 2025: "Environmental Impact of CI/CD Pipelines" https://arxiv.org/abs/2510.26413 It isn't good news… #nwit
Environmental Impact of CI/CD Pipelines

CI/CD pipelines are widely used in software development, yet their environmental impact, particularly carbon and water footprints (CWF), remains largely unknown to developers, as CI service providers typically do not disclose such information. With the growing environmental impact of cloud computing, understanding the CWF of CI/CD services has become increasingly important. This work investigates the CWF of using GitHub Actions, focusing on open-source repositories where usage is free and unlimited for standard runners. We build upon a methodology from the Cloud Carbon Footprint framework and we use the largest dataset of workflow runs reported in the literature to date, comprising over 2.2 million workflow runs from more than 18,000 repositories. Our analysis reveals that the GitHub Actions ecosystem results in a substantial CWF. Our estimates for the carbon footprint in 2024 range from 150.5 MTCO2e in the most optimistic scenario to 994.9 MTCO2e in the most pessimistic scenario, while the water footprint ranges from 1,989.6 to 37,664.5 kiloliters. The most likely scenario estimates are 456.9 MTCO2e for carbon footprint and 5,738.2 kiloliters for water footprint. To provide perspective, the carbon footprint in the most likely scenario is equivalent to the carbon captured by 7,615 urban trees in a year, and the water footprint is comparable to the water consumed by an average American family over 5,053 years. We explore strategies to mitigate this impact, primarily by reducing wasted computational resources. Key recommendations include deploying runners in regions whose energy production has a low environmental impact such as France and the United Kingdom, implementing stricter deactivation policies for scheduled runs and aligning their execution with periods when the regional energy mix is more environmentally favorable, and reducing the size of repositories.
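A quick arithmetic check on the equivalences quoted above, as a minimal sketch. It uses only the figures in the abstract and assumes "MTCO2e" means metric tonnes of CO2-equivalent. The variable names are illustrative and not taken from the paper.

```python
# Figures quoted in the abstract (2024 estimates).
carbon_likely_t = 456.9       # most likely scenario; assumed to be metric tonnes CO2e
water_likely_kl = 5_738.2     # most likely scenario, kilolitres
trees = 7_615                 # urban trees, per the abstract
family_periods = 5_053        # the abstract's family "years"

# Implied per-unit figures behind the two comparisons.
kg_per_tree = carbon_likely_t * 1000 / trees
litres_per_family_period = water_likely_kl * 1000 / family_periods

# Ratio of the pessimistic to the optimistic scenario for each footprint.
carbon_spread = 994.9 / 150.5
water_spread = 37_664.5 / 1_989.6

print(f"{kg_per_tree:.1f} kg CO2e per tree")
print(f"{litres_per_family_period:.1f} L per family per period")
print(f"carbon spread x{carbon_spread:.1f}, water spread x{water_spread:.1f}")
```

The implied water figure is roughly 1,136 litres per family per period, which is about 300 US gallons, so the time unit in that comparison is worth checking against the paper itself.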

Bridgeford et al 2025: "Ten Simple Rules for AI-Assisted Coding in Science" https://arxiv.org/abs/2510.22254 #nwit
Ten Simple Rules for AI-Assisted Coding in Science

While AI coding tools have demonstrated potential to accelerate software development, their use in scientific computing raises critical questions about code quality and scientific validity. In this paper, we provide ten practical rules for AI-assisted coding that balance leveraging capabilities of AI with maintaining scientific and methodological rigor. We address how AI can be leveraged strategically throughout the development cycle with four key themes: problem preparation and understanding, managing context and interaction, testing and validation, and code quality assurance and iterative improvement. These principles serve to emphasize maintaining human agency in coding decisions, establishing robust validation procedures, and preserving the domain expertise essential for methodologically sound research. These rules are intended to help researchers harness AI's transformative potential for faster software development while ensuring that their code meets the standards of reliability, reproducibility, and scientific validity that research integrity demands.

Dodor & Lopes 2025: "Checkstyle+: Reducing Technical Debt Through The Use of Linters with LLMs" https://arxiv.org/abs/2510.23068 #nwit
Checkstyle+: Reducing Technical Debt Through The Use of Linters with LLMs

Good code style improves program readability, maintainability, and collaboration, and is an integral component of software quality. Developers, however, often cut corners when following style rules, leading to the wide adoption of tools such as linters in professional software development projects. Traditional linters like Checkstyle operate using rigid, rule-based mechanisms that effectively detect many surface-level violations. However, in most programming languages, there is a subset of style rules that require a more nuanced understanding of code, and fall outside the scope of such static analysis. In this paper, we propose Checkstyle+, a hybrid approach that augments Checkstyle with large language model (LLM) capabilities, to identify style violations that elude the conventional rule-based analysis. Checkstyle+ is evaluated on a sample of 380 Java code files, drawn from a broader dataset of 30,800 real-world Java programs sourced from accepted Codeforces submissions. The results show that Checkstyle+ achieves superior performance over standard Checkstyle in detecting violations of the semantically nuanced rules.

Rahman et al 2025: "Which Is Better For Reducing Outdated and Vulnerable Dependencies: Pinning or Floating?" https://arxiv.org/abs/2510.08609 #nwit
Which Is Better For Reducing Outdated and Vulnerable Dependencies: Pinning or Floating?

Developers consistently use version constraints to specify acceptable versions of the dependencies for their project. Pinning dependencies can reduce the likelihood of breaking changes, but comes with a cost of manually managing the replacement of outdated and vulnerable dependencies. On the other hand, floating can be used to automatically get bug fixes and security fixes, but comes with the risk of breaking changes. Security practitioners advocate pinning dependencies to prevent against software supply chain attacks, e.g., malicious package updates. However, since pinning is the tightest version constraint, pinning is the most likely to result in outdated dependencies. Nevertheless, how the likelihood of becoming outdated or vulnerable dependencies changes across version constraint types is unknown. The goal of this study is to aid developers in making an informed dependency version constraint choice by empirically evaluating the likelihood of dependencies becoming outdated or vulnerable across version constraint types at scale. In this study, we first identify the trends in dependency version constraint usage and the patterns of version constraint type changes made by developers in the npm, PyPI, and Cargo ecosystems. We then modeled the dependency state transitions using survival analysis and estimated how the likelihood of becoming outdated or vulnerable changes when using pinning as opposed to the rest of the version constraint types. We observe that among outdated and vulnerable dependencies, the most commonly used version constraint type is floating-minor, with pinning being the next most common. We also find that floating-major is the least likely to result in outdated and floating-minor is the least likely to result in vulnerable dependencies.
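To make the constraint types concrete, here is a rough sketch that buckets npm-style version ranges by how far they are allowed to float. The function name, the "floating-patch" and "other" buckets, and the exact mapping are assumptions for illustration; the paper's own taxonomy and tooling may differ.

```python
import re

def classify_npm_constraint(spec: str) -> str:
    """Bucket an npm-style range by the largest kind of update it admits.

    Hypothetical helper: only a few common range forms are recognised,
    and everything else falls into "other".
    """
    spec = spec.strip()
    # Empty string and "*" accept any version, including new majors.
    if spec in ("", "*"):
        return "floating-major"
    # A bare full version (optionally prefixed with "=") is an exact pin.
    if re.fullmatch(r"=?\d+\.\d+\.\d+", spec):
        return "pinning"
    # "~1.2.3" allows patch-level updates only.
    if re.fullmatch(r"~\d+\.\d+\.\d+", spec):
        return "floating-patch"
    # "^1.2.3" allows minor and patch updates when the major is nonzero.
    if re.fullmatch(r"\^[1-9]\d*\.\d+\.\d+", spec):
        return "floating-minor"
    # "^0.2.3" only allows patch updates, because the major version is 0.
    if re.fullmatch(r"\^0\.[1-9]\d*\.\d+", spec):
        return "floating-patch"
    # "^0.0.3" matches that single version only.
    if re.fullmatch(r"\^0\.0\.\d+", spec):
        return "pinning"
    # ">=1.2.3" or ">1.2.3" has no upper bound, so new majors are allowed.
    if re.fullmatch(r">=?\d+\.\d+\.\d+", spec):
        return "floating-major"
    return "other"

for example in ["1.2.3", "~1.2.3", "^1.2.3", "^0.2.3", ">=1.2.3", "*"]:
    print(example, "->", classify_npm_constraint(example))
```

Compound ranges, partial versions such as `1.x`, and PyPI or Cargo syntax are deliberately left in "other" here; each ecosystem has its own range grammar.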

Welter et al 2025: "An Empirical Study of Knowledge Transfer in AI Pair Programming" https://www.se.cs.uni-saarland.de/publications/docs/WSD+.pdf #nwit
Anderson et al 2025: "Who Do You Think You Are? Creating RSE Personas from GitHub Interactions"
Classifies contributors to research software repositories into seven personas based on GitHub data: Ephemeral Contributors, Occasional Contributors, Project Organisers, Moderate Contributors, Low-Process Closers, Low-Coding Closers, and Active Contributors.
https://arxiv.org/abs/2510.05390
#nwit
Who Do You Think You Are? Creating RSE Personas from GitHub Interactions

We describe data-driven RSE personas: an approach combining software repository mining and data-driven personas applied to research software (RS), an attempt to describe and identify common and rare patterns of Research Software Engineering (RSE) development. This allows individuals and RS project teams to understand their contributions, impact and repository dynamics - an important foundation for improving RSE. We evaluate the method on different patterns of collaborative interaction behaviours by contributors to mid-sized public RS repositories (those with 10-300 committers) on GitHub. We demonstrate how the RSE personas method successfully characterises a sample of 115,174 repository contributors across 1,284 RS repositories on GitHub, sampled from 42,284 candidate software repository records queried from Zenodo. We identify, name and summarise seven distinct personas from low to high interactivity: Ephemeral Contributor; Occasional Contributor; Project Organiser; Moderate Contributor; Low-Process Closer; Low-Coding Closer; and Active Contributor. This demonstrates that large datasets can be analysed despite difficulties of comparing software projects with different project management factors, research domains and contributor backgrounds.

Smite et al 2025: "A Wave of Resignations in the Aftermath of Remote Onboarding"
"HR data from 2016-2025 in Ericsson Sweden [showed that] employees onboarded remotely during the pandemic were significantly more likely to resign within their first three years, even after returning to the office."
https://arxiv.org/abs/2510.05878
#nwit
A Wave of Resignations in the Aftermath of Remote Onboarding

The COVID-19 pandemic has permanently altered workplace structures, normalizing remote work. However, critical evidence highlights challenges with fully remote arrangements, particularly for software teams. This study investigates employee resignation patterns at Ericsson, a global developer of software-intensive systems, before, during, and after the pandemic. Using HR data from 2016-2025 in Ericsson Sweden, we analyze how different work modalities (onsite, remote, and hybrid) influence employee retention. Our findings show a marked increase in resignations from summer 2021 to summer 2023, especially among employees with less than five years of tenure. Employees onboarded remotely during the pandemic were significantly more likely to resign within their first three years, even after returning to the office. Exit surveys suggest that remote onboarding may fail to establish the necessary organizational attachment, the feeling of belonging and long-term retention. By contrast, the company's eventual successful return to pre-pandemic retention rates illustrates the value of differentiated work policies and supports reconsidering selective return-to-office (RTO) mandates. Our study demonstrates the importance of employee integration practices in hybrid environments where the requirement for in-office presence for recent hires shall be accompanied by in-office presence from their team members and more senior staff whose mentoring and social interactions contribute to integration into the corporate work environment. We hope these actionable insights will inform HR leaders and policymakers in shaping post-pandemic work practices, demonstrating that carefully crafted hybrid models anchored in organizational attachment and mentorship can sustain retention in knowledge-intensive companies.

Bharadwaj et al 2025: "The Shifting Sands of Toxicity: The Evolving Nature of Interpersonal Challenges in Open Source" https://biancatrink.github.io/files/papers/ESEM_technical_2025.pdf "…a significant increase in reported interpersonal challenges in 2024 [vs] 2017 [including] threats of violence, impersonation, sustained harassment, stalking, and doxxing. …experiencing rudeness, stalking, and name-calling became strongly linked to stopping contributions, adopting pseudonyms, working privately, and avoiding offline events." #nwit
Pan et al 2025: "Hamster: A Large-Scale Study and Characterization of Developer-Written Tests" https://arxiv.org/abs/2509.26204 "Our results highlight that a vast majority of developer-written tests exhibit characteristics that are beyond the capabilities of current ATG [automated test generation] tools." #nwit
Hamster: A Large-Scale Study and Characterization of Developer-Written Tests

Automated test generation (ATG), which aims to reduce the cost of manual test suite development, has been investigated for decades and has produced countless techniques based on a variety of approaches: symbolic analysis, search-based, random and adaptive-random, learning-based, and, most recently, large-language-model-based approaches. However, despite this large body of research, there is still a gap in our understanding of the characteristics of developer-written tests and, consequently, in our assessment of how well ATG techniques and tools can generate realistic and representative tests. To bridge this gap, we conducted an extensive empirical study of developer-written tests for Java applications, covering 1.7 million test cases from open-source repositories. Our study is the first of its kind in studying aspects of developer-written tests that are mostly neglected in the existing literature, such as test scope, test fixtures and assertions, types of inputs, and use of mocking. Based on the characterization, we then compare existing tests with those generated by two state-of-the-art ATG tools. Our results highlight that a vast majority of developer-written tests exhibit characteristics that are beyond the capabilities of current ATG tools. Finally, based on the insights gained from the study, we identify promising research directions that can help bridge the gap between current tool capabilities and more effective tool support for developer testing practices. We hope that this work can set the stage for new advances in the field and bring ATG tools closer to generating the types of tests developers write.
