a war about the definition of open source AI? corporations merrily open-washing their models? looks like things are playing out exactly as we predicted in our #facct2024 paper https://dl.acm.org/doi/10.1145/3630106.3659005

as we noted: co-optation by corporate lobbies occurs primarily on the terrain of standards, in the form of weakening or dilution

#foss #opensourceAI #openGenAI #opensource

/edit: fix link to doi

Rethinking open source generative AI: open-washing and the EU AI Act | Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency

Next was an incredible talk by Sanmi Koyejo on predictability and surprise in language model benchmarks at @FAccT. Koyejo brings the 🔥, starting with one of the lines of the year: "The new NLP is language models trained by mad libs" 😂, then thoroughly dismantles the notion of "emergence" in these models through rigorous empirical and theoretical work. Highly recommend https://www.youtube.com/watch?v=27J904Y2JGk (5/9) #AI #FAccT2024
FAccT '24 Keynote: Sanmi Koyejo, Stanford University, "The measure and mismeasure of AI"

Next was a thought-provoking talk by Virgilio Almeida on governing algorithmic systems through a sociotechnical lens at @FAccT. Almeida introduces the concept of treating algorithms as institutions in their own right, which I think is compelling and deserves to be explored more https://youtu.be/aNuTOeohfVU?si=iHcqhbs0qwYwhQzV&t=998 (6/9) #FAccT2024 #AIEthics
FAccT '24 Keynote: Virgilio Almeida


If you are at #FAccT2024, please join the trust and reliance session tomorrow (Wednesday) at 11:35am in the Sao Conrado room to hear from me, Naina Balepur, Arianna Manzini, @ruotongw, and Ziyang Guo

Program 📃: https://programs.sigchi.org/facct/2024/my-schedule/session/164930

I'll talk about LLMs' uncertainty expression and its impact on (over)reliance/trust

Paper 🧵: https://hci.social/@sunniesuhyoung/112374090391640712


This talk touched on 3 of my recent papers/manuscripts.

I talked about the Consequences-Mechanisms-Risks (CMR) framework introduced in our #FAccT2024 paper.
https://arxiv.org/abs/2312.10076

A Framework for Exploring the Consequences of AI-Mediated Enterprise Knowledge Access and Identifying Risks to Workers

Organisations generate vast amounts of information, which has resulted in a long-term research effort into knowledge access systems for enterprise settings. Recent developments in artificial intelligence, in relation to large language models, are poised to have significant impact on knowledge access. This has the potential to shape the workplace and knowledge in new and unanticipated ways. Many risks can arise from the deployment of these types of AI systems, due to interactions between the technical system and organisational power dynamics. This paper presents the Consequence-Mechanism-Risk framework to identify risks to workers from AI-mediated enterprise knowledge access systems. We have drawn on wide-ranging literature detailing risks to workers, and categorised risks as being to worker value, power, and wellbeing. The contribution of our framework is to additionally consider (i) the consequences of these systems that are of moral import: commodification, appropriation, concentration of power, and marginalisation, and (ii) the mechanisms, which represent how these consequences may take effect in the system. The mechanisms are a means of contextualising risk within specific system processes, which is critical for mitigation. This framework is aimed at helping practitioners involved in the design and deployment of AI-mediated knowledge access systems to consider the risks introduced to workers, identify the precise system mechanisms that introduce those risks and begin to approach mitigation. Future work could apply this framework to other technological systems to promote the protection of workers and other groups.

I’m not at #FAccT2024 (instead on the tail end of a long-planned vacation) but wanted to share two papers being presented! One paper is about fairness for recsys providers (e.g. YouTube creators or people on dating apps) and one is about intro AI courses on YouTube. 🧵

Our lead author Eshta recorded the talk for "Machine Learning Data Practices through a Data Curation Lens: An Evaluation Framework" as a 12-minute video: https://youtu.be/C5VwJBE31JY

She will present it tomorrow, June 4, in the ‘Towards better data practices’ session which looks fantastic (3:45-5pm local, Bossa room). https://facctconference.org/2024/schedule #FAccT2024

2/2


New paper! 📢 #FAccT2024
For "Machine Learning Data Practices through a Data Curation Lens: An Evaluation Framework", we evaluated data practices in machine learning *as* data curation practices. We developed an evaluation rubric and applied it to a sample of NeurIPS papers to see what we would find. Preprint at https://arxiv.org/abs/2405.02703

We found that researchers in #MachineLearning, which often emphasizes model development, struggle to apply standard data curation principles. 1/2


Studies of dataset development in machine learning call for greater attention to the data practices that make model development possible and shape its outcomes. Many argue that the adoption of theory and practices from archives and data curation fields can support greater fairness, accountability, transparency, and more ethical machine learning. In response, this paper examines data practices in machine learning dataset development through the lens of data curation. We evaluate data practices in machine learning as data curation practices. To do so, we develop a framework for evaluating machine learning datasets using data curation concepts and principles through a rubric. Through a mixed-methods analysis of evaluation results for 25 ML datasets, we study the feasibility of data curation principles to be adopted for machine learning data work in practice and explore how data curation is currently performed. We find that researchers in machine learning, which often emphasizes model development, struggle to apply standard data curation principles. Our findings illustrate difficulties at the intersection of these fields, such as evaluating dimensions that have shared terms in both fields but non-shared meanings, a high degree of interpretative flexibility in adapting concepts without prescriptive restrictions, obstacles in limiting the depth of data curation expertise needed to apply the rubric, and challenges in scoping the extent of documentation dataset creators are responsible for. We propose ways to address these challenges and develop an overall framework for evaluation that outlines how data curation concepts and methods can inform machine learning data practices.


Derechos Digitales is present at FAccTConference.
👉Our co-director Jamila Venturini will moderate the session "Grounded Frameworks for Auditing public sector uses of AI in Latin America"
👉June 3, starting at 15h30 Brazil time
👉Full schedule: https://facctconference.org/2024/schedule

#FAccT2024


Hey, this is cool! I'm going to be on a panel today at #FAccT2024 to share some thoughts related to this project, a zine created by the DAIR Institute, Collective Action School and Collective Action in Tech about labor and generative AI. If you can't make the talk, check out the zine: https://collectiveaction.tech/2024/bits-in-the-machine/