Here’s what worries about AI surpassing human musical creativity get wrong
This was originally published on February 26, 2023 on my Substack newsletter. I’m republishing it here to get this post on my platform. (Never trust platforms you don’t own.) If you would like to support the work I do (I don’t get research funds now that I’m not faculty), I’m running a sale on newsletter subscriptions: a year for $24. Offer is good until 23 February 2024.
Singer-songwriter Colin Meloy recently wrote a Substack post about his experience asking ChatGPT to write a song in the style of the indie rock band The Decemberists. He prompts it with a title, “Sailor’s Song,” and asks the bot to write lyrics and chords. After figuring out the best melody those chords could support, Meloy then records a version of the song he and ChatGPT have generated. “For the record,” he writes, “this is a remarkably mediocre song…there’s just something missing.” For Meloy, the thing that’s missing is the bot’s ability to imagine beyond what’s already coded into “the western pop canon”:
ChatGPT lacks intuition. That’s one thing an AI can’t have, intuition. It has data, it has information, but it has no intuition. One thing I learned from this exercise: so much of songwriting, of writing writing, of creating, comes down to the creator’s intuition, the subtle changes that aren’t written as a rule anywhere — you just know it to be right, to be true. That’s one thing an AI can’t glean from the internet.
Here, “intuition” stands in for the irreducibly ‘human’ (heavy on the scare quotes) ability to break the rules, to improvise, to play. ChatGPT just doesn’t know when or how to take risks and color outside the lines.
As much as this perspective claims to prize an ineffable human capacity over hard numbers to crunch, it nevertheless adopts music scholars’ equivalent of the “hard numbers” approach, i.e., the “just the notes” perspective common to very traditional music scholarship. This perspective began to be recognized and criticized as such in the late 80s and early 90s by “new musicologists” who brought the resources of critical, feminist, and queer theory to bear on music scholarship (it was still very white and critical theories of race didn’t get much uptake in the field till the late 2010s). Susan McClary describes the “just the notes” approach as the field’s tendency to “restric[t] the questions it acknowledges to matters of formal process as they appear in musical scores” (20). From this perspective, music is reducible to the objective properties of sounds, like harmony, melody, rhythm, etc. Since the 80s, this “just the notes” approach has tended to be held in explicit contrast to “newer” approaches that center on so-called “extramusical” social, cultural, and political factors. Because many practitioners of the “just the notes” approach tended to disavow the salience of methods that highlighted music’s existence in a world structured by white supremacist capitalist patriarchy, this approach has come to be understood as a fundamentally conservative one that naturalizes existing inequalities while claiming to be apolitical.
Though Meloy doesn’t seem to be aware of, invested in, or calling on the political connotations of academic musicology’s “just the notes” approach, his perspective on AI’s capacity for songwriting reflects the same underlying musical ontology in which music is, at bottom, just the notes. In his analysis, songwriting is reducible to the way a song’s elements–melody, lyrics, harmony, rhythm, etc.–are put together. It’s these elements that evince the lack of “intuition” Meloy diagnoses ChatGPT with. From this perspective, songwriting is a skill individuals can master to greater or lesser degrees. In Meloy’s view, AI has yet to master human songwriters’ ability to know when and how to creatively break the rules.
This idea of creativity as individual mastery is related to the labor theory of value. A tool of colonialism and capitalism, the labor theory of value holds that one adds ‘value’ to a thing by taking it from its supposedly ‘natural’ or un-labored-upon state and laboring on it, thus adding the value of one’s labor to the thing. For example, 20 yards of linen becomes, through the labor of a tailor, a coat, just as 20 acres of forested land becomes, through the labor of the property developer, real estate. However, as these examples suggest, the labor theory of value is also implicated in capitalism’s ability to extract surplus value from laborers: especially these days, it’s not the property developer who does all the manual labor themself, but subcontracted construction workers whose wages don’t reflect the full value their labor creates for the person who owns the land they are improving with their labor. Philosophers like John Locke explicitly connected the labor theory of value to private property ownership–by laboring on things in the state of nature, they become mine, etc. Patriarchal racial capitalism, however, distributes the private property our labor creates unevenly. Framing artistry as individual mastery likewise treats creativity as a form of private property: by working on my craft and on myself, I take myself out of my ‘natural’ state and become master of my capacities in the same way the property developer becomes owner of their real estate.
Because it frames creativity as a form of private property, the individual mastery framework cannot provide an effective alternative to Silicon Valley business models that use AI as a way to extract value from art and artists. For all the fretting one might want to do about whether and when AI will “surpass” humans’ skill levels, one still treats art and creativity as things that are owned (e.g., as human capital). The only alternative the individual mastery framework can propose to Silicon Valley is the claim that artists, not startups and tech giants, should own their work and the value it creates. In other words, the individual mastery framework can only understand art and creativity as private property.
Thankfully, however, art and creativity are so much more than private property or private property relations. They are practices that foster pleasure and sociality. The reason why people pay to get drunk and paint pictures with friends isn’t to create a commodity or an asset–it’s to have fun and enjoy spending time with other people. Similarly, there are millions of amateur musicians who practice and gig because they enjoy musicking and the socializing that goes with it. As Angela Davis argues, the reason why art can be liberatory has less to do with its content and everything to do with the social relations it enables. As K-Pop fans’ recent successes in online activism make clear, music is a medium that brings people together in collectivities that can make a political impact beyond their fandom. The experience of making and listening to music (what musicologists call “musicking”) is physical and social, and so much more than “just the notes.”
These physical and social pleasures are what motivate amateurs and fans to keep on musicking. AI’s de-skilling of musical labor will change the ways people go about musicking, and there will need to be some new rules about etiquette and ethics to go along with those changes (e.g., YouTuber Bambor Leany trained a vocal AI on Ariana Grande a cappella tracks and made a recording of it singing Rihanna/Sia’s “Diamonds,” and there are some questions regarding the ethics of vocal deepfakes like this one). However, such de-skilling won’t stop people from enjoying making, listening to, and sharing music and musicking. GarageBand and SoundCloud changed how people make and share music, but they didn’t change the fact that people like to do it. If you understand music less as a skill for individuals to master and more as a shared experience, the question of AI’s impact on music and musicking becomes much less dire.
One thing to worry about in this music-as-shared-experience framework is the fact of communicative capitalism, or the fact that social media has transformed communication into an opportunity for companies to extract unpaid labor from customers/users. TikTok and YouTube offer many examples of the monetization of people sharing their musical experiences. Perhaps most famously there’s YouTube’s Williams Brothers, whose reaction video recording their first listen to the drum solo in Phil Collins’s “In The Air Tonight” went viral in 2021 and currently has over 10 million views. On TikTok, Nathan Apodaca went viral with a video of him skateboarding while vibing out to some Fleetwood Mac. And those are just examples of the most viral instances of platformed shared musicking: TikTok and YouTube are filled with millions of less viral videos of people’s shared musicking. On social media, the shared experience of music and musicking is a site of capitalist extraction.
By de-skilling music, AI opens up more kinds of musical experiences for platformed extraction, as people with little musical practice or training will be able to make and share more kinds of music and musical experiences. This puts amateur musicians and fans in the same position that the industry has put working musicians in, and is a significant point of solidarity among professionals, amateurs, and fans. (However, AI itself extracts from musicians in a different way than it extracts from platformed content creators/users; by training on musicians’ IP, AI extracts the value of their assets, not of their labor. I’ll probably write more about that later.) The question of AI’s ability to master musical skill is at best a red herring designed to distract us from the intensified forms of extraction it enables.