I have a new essay in print today. “Objective Vision: Confusing the Subject of Computer Vision” appears in the September issue of Social Text. The essay takes up the critical genealogy I offered in my recent book, The Birth of Computer Vision, to examine contemporary convolutional neural networks.
It argues that the ontology of computer vision developed in the 1950s persists into the present, and that the assumptions of that ontology enable many of the biases we see in computer vision today.
Understanding the construction of computer vision’s “field of vision” is important because these technologies are EVERYWHERE now and, as I argue, they may “[render] individuals guilty by association [through] the displacement and transfer of background object features to subjects in the foreground.”
Much attention in #CriticalAI has been given to training datasets, and ImageNet-trained networks are especially problematic, but theoretical work is also needed to examine the models themselves and how they instantiate ideology and its biases and preferences.
The historicization of machine learning is key because many of the problems we see as residing in the present (OpenAI, I’m looking at you) are the accretions of decisions made in the past.
This essay (the first versions of which were written back in 2018!) argues, importantly, that the problem is not just training data but the methods themselves: it provides examples of racial bias in the ontology (the amplification of fragmented representation) and in feature preference (textures).
You can read the essay at Duke UP. If you don’t have access, please email me and I’ll be happy to send you a copy! https://read.dukeupress.edu/social-text/article/41/3%20(156)/35/382378/Objective-VisionConfusing-the-Subject-of-Computer
Objective Vision: Confusing the Subject of Computer Vision

Abstract. Convolutional neural networks (CNNs) are a key technology powering the automated technologies of seeing known as computer vision. CNNs have been especially successful in systems that perform object recognition from visual data. This article examines the persistence of a mid-twentieth-century ontology of the digital image in these contemporary technologies. While CNNs are multidimensional, their ontology flattens distinctions between background and foreground, between subjects and objects, and even the relations established among the categories of information used to organize and train these models. This ontology enables the introduction and amplification of bias and troubling correlations, and the transfer or slippage of learned associations between humans and objects found in the training image archives. Inspecting and interpreting what CNNs learn and index through their complex architectures can be difficult, if not impossible, because of how they encode and obfuscate both quite human ways of seeing the world and the image repertoires used to train these algorithms, repertoires rife with the residues of prior representations.
