To what ethical standards should we hold AI machines, if at all? In this essay, I am not so much concerned with whether AI-generated images can be considered art, original or unoriginal, or whether AI can truly “learn” to make art, though I will touch upon its processes to provide some background. Rather, I am primarily interested in AI as a tool that can generate harmful images or images of harm, and whether or not these products generate violence.
AI-generated image, using the keywords “victims, Khmer Rouge, photograph”. March 14th, 2023. Made with Stable Diffusion. https://stablediffusionweb.com/#demo.
My main image of interest is an AI-generated image I prompted on Stable Diffusion, with the keywords “Khmer Rouge, victims, photograph”. I wanted to see if AI could show me what my father had seen when his sister was taken to a re-education camp in Vietnam, or when he had to live on a boat for two years, or when they finally fled to Canada as refugees. However, the AI only produced melancholic walls of “people’s” faces. I will argue that while AI-generated images can be a powerful tool for “return engagements” as outlined in Viet Le’s monograph on contemporary art’s traumas in Saigon and Phnom Penh, the very nature of AI’s generation process produces a grotesque facsimile of the victims it’s prompted to depict, ultimately enacting representational violence on the subjects—even though these subjects are not “real people”.
First, I will explain how AI images are generated, and how, even from pre-conception, these kinds of images can be harmful due to underrepresentation in media culture. Second, I will examine the visual characteristics of my AI-generated image, ultimately concluding that it is an artefact of representational violence through its monstrous rendering of the Cambodian victims. Last, I will compare the AI image to Dinh Q. Le’s Hill of Poisonous Trees (2008) series, and explain how, although the creation process is similar in both pieces, Dinh Q. Le’s work engages directly with memory and identity while the AI image obscures them. Therefore, by looking at the temporal context of AI, the visual characteristics of an AI-generated image of Khmer Rouge victims, and a human-made artwork of the same subject, I will examine the ethics of AI-generated images of pain and violence and conclude that such artefacts are harmful representations that, because they continue genocidal violence, are unethical to produce.
To start, I will establish the scope of AI in the current day and argue that the process of turning word prompts into images rests on a database of biased information; therefore, it fails to address underrepresentation, altered photographs, and harmful imagery. The first AI-powered image-generating software emerged in the 1960s. In basic terms, generative AI employs a “generator” to create images and a “discriminator” to determine which images are successful. The discriminator’s rubric is trained on large datasets sourced from the internet and follows basic probability rules. For example, a machine can discriminate between authentic and generated images with the rule “what is the probability of y given x?” For an AI-generated image of a dog, two labels exist for the image: “dog” or “not dog”; thus, the problem would be expressed as “what is the probability that this AI-generated image is a dog, given the characteristics it displays?” There are two main issues with this system. First, the labels for subjects are binary, either “dog” or “not dog”, prohibiting nuanced understandings of any kind of subject. Second, training models fail to capture and accurately represent minority groups, as they are based on sets of images that are sourced without random sampling. A paper on bias in internet datasets states, “systemic biases may also be manifested in the form of availability bias when datasets that are readily available but not fully representative of the target population… Disadvantaged groups including indigenous populations, women, and disabled people are consistently underrepresented.” Evidently, the availability of varied representations, or representation at all, is systemically suppressed by generalised datasets. Considering our dog example once more, the term is vague and lacks subtlety. This can be corrected with precise wording, perhaps inputting “golden retriever” rather than “dog”, but the product remains a generalisation. Similarly, the language models used in generative systems are subject to bias; indeed, to input “human” into a GAN requires a consensus on what a human is. It begs the question: who defines these terms? The hegemony over information held by the power structures that define mainstream understandings of the English language leads to systematic gaps in performance.

Given this, there are concerns about the availability of historically undocumented violence, the definitions of violence by those in positions of power, and the ethics of training AI on deceased victims of genocide when they cannot consent to their image being used. Andrea D. Fitzpatrick, in “Reconsidering the Dead in Andres Serrano’s The Morgue”, states that the dead are subject to manipulation by the living: their identities, which are subjective performances, are in the hands of the artist, whose aesthetic choices can perform the identity of the deceased without their consent. The Khmer Rouge kept extensive records of their regime in Cambodia; however, victim identifiers, such as names, were hardly documented. Thus, each victim’s likeness is the last semblance of their identity; when an AI machine obscures their representation beyond recognition, they become pure victims, an archetype of the pain caused unto them, defined by criminality and violent death. There have been moves to design AI systems with empathy, such as the “Resistance AI” workshop held at NeurIPS in 2020, which invited multimedia works highlighting power disparities in the AI pipeline.
However, these systems remain niche and unsuccessful.
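To make the binary classification described above concrete, the following is a minimal, purely illustrative sketch in Python; it is not the code of Stable Diffusion or of any real discriminator, and the feature names and weights are hypothetical. It shows how such a system reduces a subject to a single conditional probability, “the probability of the label given the features”, and then forces a binary verdict.

```python
import math

def discriminator(features: dict[str, float]) -> float:
    """Return a toy estimate of P(label = "dog" | features)."""
    # Hypothetical hand-picked weights; a real model would learn these from a
    # large scraped dataset, inheriting whatever biases that dataset carries.
    weights = {"fur_texture": 2.0, "ear_shape": 1.5, "snout_length": 1.0}
    bias = -2.5
    score = bias + sum(weights[name] * features.get(name, 0.0) for name in weights)
    return 1.0 / (1.0 + math.exp(-score))  # logistic squash to a probability

# The judgement is forced into a binary label: "dog" or "not dog".
p = discriminator({"fur_texture": 0.9, "ear_shape": 0.8, "snout_length": 0.7})
print("dog" if p >= 0.5 else "not dog", round(p, 2))
```

Even in this toy form, whatever nuance the subject carries is flattened to whichever side of the 0.5 threshold the score falls on, which is precisely the binary limitation described above.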
Next, I will argue that the AI image I generated with the keywords “Khmer Rouge, victims, photograph” is an artefact of representational violence, as evidenced by the grotesque rendering, the de-individualisation, and the ethnographic composition of the subjects. From afar, the image presents a grid of faces, realistic to varying degrees of accuracy. The words above the photographs are mostly legible, and the image appears well rendered and detailed. Upon close inspection, however, the faces of the subjects reveal distortion, increasingly so as our gaze moves from right to left. Indeed, the subject on the far left of the third row is a skeletal mass, with dark abscesses for eyes and a mouth parted as if to shriek. To her immediate right, a man’s face is pure white, like a spectre, his facial features sloping off the right side of his face. All of the generated faces contain similar traces of monstrosity, no doubt due to the limits of AI in approximating a subject as complex as human likeness. What Stable Diffusion has produced is a mass of distorted faces, made by selecting the characteristics of real, deceased persons that communicate victimhood and Cambodian-ness. The individuality of the subjects in the AI’s source material is cut up and haphazardly reassembled; the result is a dehumanising spectacle of pain. Furthermore, the image features photographs of subjects taken with ethnographic conventions: the subjects face the camera frontally against plain backgrounds and are rendered in black and white. The latter is an interesting “artistic” choice by the AI, as it can communicate that the documented humiliation of the genocide victims is timeless.
Additionally, Saskia Sassen, in “Black and White Photography as Theorizing”, writes about how monochrome photography creates distance and unsettles meaning. As such, we are pulled further away from the inner world of the generated subjects. Those with some knowledge of the Cambodian genocide would recognise this image as an approximation of the portraits taken by the Khmer Rouge, wherein prisoners were photographed by the Tuol Sleng guards before they were tortured and killed. The AI-generated image therefore reproduces the Khmer Rouge’s visual symbol of successful genocide; I would argue that it inserts new faces into the corpus, multiplying its reach. Thus, this image continues the violence done unto the Cambodian people.
So far, we have considered the generated image as a photograph: we have established the history of AI-generated images and conducted a visual analysis. I will now compare it to a “real” image of violence from the Khmer Rouge Genocide of Cambodian citizens.
Dinh Q. Lê. Untitled (from the Hill of Poisonous Tree Series). 2008. Multimedia Photo Weaving. Kadist Gallery, San Francisco.
The work is an untitled piece from Dinh Q. Le’s Hill of Poisonous Trees (2008) series, which features three photographs of anonymous Cambodian men, linked to the Khmer Rouge Genocide through their placement against prison backgrounds, interwoven with yellow-hued shots of the Tuol Sleng museum. The series’ name comes from the museum, the “Poisonous Hill” being a Khmer phrase for a place to trap those who bear or supply guilt. For this work, Le photographs the museum’s long halls, its small warren of cells, and its torture chambers, though in other pieces he weaves in stills from films, postcards, and journalistic photographs. I will examine my AI-generated image and Hill of Poisonous Trees (2008) in light of Viet Le’s Return Engagements: Contemporary Art's Traumas of Modernity and History in Sài Gòn and Phnom Penh. This monograph examines the purpose of returning to past times, whether in person or through visual representation, and how these “return engagements” reconcile trauma and memories with the future.
To start, the creation processes of the two images are similar: in a way, both create an image by collaging multiple photographs together. AI machines take similar characteristics and merge them into a novel visual, while Le cuts photographs by hand and weaves them together to form a tapestry-like composition. However, I argue that Le’s work effectively engages in discourse around visual representations of violence while the AI-generated image does not. This is because of Le’s position as a Vietnamese artist and because of the formal quality of the work, which allows the viewer to engage with the subject matter while also containing it. On the latter point, Le’s artistic choice to cut up the images of genocide victims can be read as violent fragmentation; yet the act of weaving these “timeless” black and white photographs can also be read as transporting the subjects into the present. As a result, the Khmer Rouge regime is represented transgressively: these are the people who could be standing here today, if not for their tragic murder. The work reminds the viewer of loss while resurrecting each figure. Thus, I argue this work is an act of rescue rather than a continuation of harm. The AI-generated photo merely reproduces pain, while Le’s appropriation of historically violent imagery expands cultural history. There is a question, though, about who can be an “authentic” voice to speak on highly controversial stories, and whether they should be represented at all. Who has the right to reproduce images of violence, and in what ways? Proximity to such vexed issues is an important factor in answering this question, but there is no definitive answer. Vision can be a violent act, but it is critical to recall the concept of “return engagements”: I argue that to revisit the past is to reinforce memory and community, while the inverse works towards its destruction. Viet Le writes in his preface that “the afterlife of trauma leaves invisible traces”; when these traces are removed from their temporal context, or distorted in such a way that they are no longer recognisable, we bury these narratives and the potential they hold for cultural recovery. The erasure of an ethnic other’s collective memory renders refugees as “liminal specter” and centralises US hegemony. Thus, with no home to return to, all the roads out lead to the West.
In conclusion, I have argued that AI-generated photographs fail to generate discursive power because they can only duplicate representations of violence, being trained on historically and institutionally biased sets of images. That is, they cannot invent “new” perspectives, as learning machines are limited to an over-simplified understanding of nuanced discussions around historically undocumented events and marginalised peoples. As it stands, while AI can be useful in regurgitating information, these representations are often monstrous and dehumanising due to the limits of current AI technology, and they only harm victims further.
Bibliography