Some philosophical implications of the “loudness war” and its criticisms

Recent shifts in the aesthetic value of audio loudness are a symptom of broader shifts in attitudes about social harmony and techniques for managing social “noise.” Put simply, the shift is from maximalism to responsive variability. (“Responsive variability” is the ability to express a spectrum of features or levels of intensity, whatever is called for by constantly changing conditions. You could call it something like dynamism, but, given this article’s focus on musical dynamics (loudness and softness), I thought that term would be too confusing.) This shift tracks different phases in “creative destruction” or deregulation–that is, in neoliberal techniques for managing society. In the maximalist approach, generating noise is itself profitable–there has to be destruction for there to be creation, “shocks” for capitalism to transform into surplus value; the more shocks, the more opportunities to profit. However, what happens when you max out maximalism? What do you do next? That’s what responsive variability is: a way to get more surplus aesthetic, economic, and political value from maxed-out noise. (To Jeffrey Nealon’s expansion → intensification model of capitalism, I’d add → responsive variability. He argues that expansion has been maxed out as a way to generate profits–that’s the result of, among other things, globalization. Intensification is how capitalism adapts–instead of conquering new raw materials and markets, it invests more fully in what already exists. But once investment is maxed out, then, I think, comes responsive variability: responsiveness and adaptation are optimized.)

Maximal audio loudness was really fashionable in the late 1990s and the first decade of the 21st century. Due both to advances in recording and transmission technology (CDs, mp3s) and to an increasingly competitive audio landscape, especially on the broadcast radio dial, “loud” mixes were thought to accomplish things that more dynamic mixes couldn’t.

Loud mixes apply dynamic range compression to a recording so that its amplitude is (more or less) uniform from moment to moment–i.e., uniformly maxed-out. Or, as Sreedhar puts it, compression “reduc[es] the dynamic range of a song so that the entire song could be amplified to a greater extent before it pushed the physical limits of the medium…Peak levels were brought down…[and] the entire waveform was amplified.” This way, a song, album, or playlist sounds like it has a consistent level of maximum sonic intensity throughout. This helps a song cut through an otherwise noisy environment; just as a loud mix on a store’s Muzak can pierce through the din of the crowd, a loud mix on the radio can help one station stand out from its competitors on the dial. For much of its history, the recording industry thought that loudness correlated with sales and popularity.
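
To make the two steps in Sreedhar’s description concrete (peaks brought down, then the whole waveform amplified), here is a minimal sketch in Python. It is a toy, not how any real mastering tool works: the function names, the threshold of 0.5, the 4:1 ratio, and the eight-sample “song” are all assumptions made up for illustration, and the compressor is static (it has no attack or release behavior).

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Static hard-knee compressor: shrink only the part of each sample above the threshold."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            # Everything over the threshold is divided by the ratio ("peak levels were brought down").
            level = threshold + (level - threshold) / ratio
        out.append(level if s >= 0 else -level)
    return out


def normalize_peak(samples, ceiling=1.0):
    """Apply one gain to the whole waveform so its loudest sample sits at the ceiling."""
    peak = max(abs(s) for s in samples)
    gain = ceiling / peak
    return [s * gain for s in samples]


def dynamic_range(samples):
    """Crude stand-in for dynamic range: loudest sample level divided by quietest nonzero one."""
    levels = [abs(s) for s in samples if s != 0]
    return max(levels) / min(levels)


# Hypothetical "song": a quiet verse (first four samples) followed by a loud chorus.
original = [0.1, -0.2, 0.2, -0.1, 0.9, -1.0, 1.0, -0.9]
mastered = normalize_peak(compress(original))

print("range before:", dynamic_range(original))
print("range after: ", dynamic_range(mastered))
```

In this toy case the peak stays at the medium’s limit while the quiet verse is pushed up, so the ratio between loudest and quietest moments narrows (from about 10:1 to about 6.25:1). Heavier settings would flatten it further.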

But many now consider loudness to be passé and even regressive. Framing it as a matter of “tearing down the wall of noise,” Sreedhar’s article treats loudness as the audio equivalent of the Berlin Wall–a remnant of an obsolete way of doing things, something that must be (creatively) destroyed so that something healthier, more dynamic, and more resilient can rise from its dust. Similarly, the organizers of Dynamic Range Day argue that the loudness war is a “sonic arms race” that “makes no sense in the 21st century.” (What’s with the Cold War metaphors?) Maximal loudness, in their view, offers no advantages–according to the research they cite, it neither sells better nor sounds better to average listeners. In fact, critics often claim overcompression damages both our hearing (maybe not our ears, but our discernment) and the music (making it less robust and expressive). Loudness is, in other words, unhealthy, both for us and for music.

As Sreedhar puts it,

many listeners have subconsciously felt the effects of overcompressed songs in the form of auditory fatigue, where it actually becomes tiring to continue listening to the music. ‘You want music that breathes. If the music has stopped breathing, and it’s a continuous wall of sound, that will be fatiguing’ says Katz. ‘If you listen to it loudly as well, it will potentially damage your ears before the older music did because the older music had room to breathe.’

At the end of 2014, we are well aware that breathing room is a completely politicized space: Eric Garner didn’t get it, cops do. “Room to breathe” is the benefit the most privileged members of society get by hoarding all the breathing room, that is, by violently restricting the movement, flexibility, dynamism, and health of oppressed groups. For example, in the era of hyperemployment, the ability to sit down and take a breather, or even to take the time to get a full night’s sleep, to exercise, to care for your body and not run it into the ground–that is what privilege looks like (privilege bought on the backs of people who will now have even less space to breathe–like upper-middle-class white women who can Lean In because they rely on domestic/service labor, often performed by women of color). “Room to breathe” is one way of expressing the dynamic range that neoliberalism’s ideally healthy, flexible subjects ought to have. So, it makes sense that this ideal gets applied to music aesthetics, too. Just as we ought to be flexible and have range (and restricting dynamism is one way to reproduce relations of domination), music ought to be flexible and have range.

By now it is well known that women, especially women of color, who express feminist and anti-racist views on social media are commonly represented as lacking actual dynamic range, as having voices that are always too loud. As Goldie Taylor writes, unlike a white woman pictured shouting in a cop’s face as an act of protest, “even if I were inclined, I couldn’t shout at a police officer—not in his face, not from across the street,” because, as a black woman, her shouting would be read not as legitimate protest but as excessively violent and criminal behavior. White supremacy grants white people the ability to be understood as expressing a dynamic range; whites can legitimately shout because we hear them/ourselves as mainly normalized. At the same time, white supremacy paints black people as always-already too loud: as Taylor notes, Eric Garner wasn’t doing anything illegal when he was killed–other than, well, existing as a black body in public space. White supremacy made it seem that, because Garner’s voice emanated from a black body, it was already shouting, already taking up too much “breathing room,” and thus needed to be muted to restore the proper “dynamic range” of a white supremacist public space.

Taylor continues, “merely mention the word privilege, specifically white privilege, anywhere in the public square—including on social media—and one is likely to be mocked.” These voices feel too loud because, from the perspective of their critics, they are supposedly both (a) lacking in range–they stay fixated on one supposedly overblown issue (social justice)–and (b) overrepresented in the overall mix of voices. Feminists on social media are charged with the same flaws attributed to overcompressed music (here by Sreedhar): “When the dynamic range of a song is heavily reduced for the sake of achieving loudness, the sound becomes analogous to someone constantly shouting everything he or she says. Not only is all impact lost, but the constant level of the sound is fatiguing to the ear.” Compression feels like someone “shouting” at you in all caps; this both diminishes the effectiveness of the speech and, above all, is unhealthy and “fatiguing” for those subjected to it. Similarly, liberal critics of women of color activists often characterize them as hostile, uncivil, or overly aggressive in tone, which supposedly diminishes the impact of their work, upsets the proper and healthy process of social change, and fatigues the public. Just as overcompressed music is thought to “sacrifice…the natural ebb and flow of music” (Sreedhar), feminist activists are thought to “sacrifice…the natural ebb and flow” of social harmony. But that’s the point. They’re sacrificing what white supremacist patriarchy has naturalized as the “ebb and flow” of everyday life.

But this “ebb and flow” is totally artificial. It just feels “natural” because we’ve grown accustomed to it as a kind of second nature. This ebb and flow is also what algorithmic technical and cultural practices are designed to manage and reproduce. That is, they (re)produce whatever “ebb and flow” optimizes a specific outcome–like user interaction, which optimizes data production, which ultimately optimizes surplus value extraction.

It’s not too hard to see how an unfiltered social media feed–like OG Twitter–might seem like overcompressed music. Linear-temporal, unfiltered Twitter TLs work like compression: each frequency/user’s stream of tweets is brought up to the same “level” of loudness or visibility–at its specific moment of expression, each rises all the way to the top. But just as overcompressed songs kill dynamic range and upset the balance between what “ought” to be quiet and what “ought” to be loud, unfiltered social media feeds supposedly upset the balance between what “ought” to be quiet and what “ought” to be loud, what “ought” to remain buried in the rest of the noise and what “ought” to cut through as clear signal. (Though what this norm “ought” to be is, of course, the underlying power issue here.) So in an era where all individuals can be egregiously loud, we need technologies and practices to moderate the inappropriately, fatiguingly loud voices, and amplify the ones whose voices contribute to the so-called health of that population.

Many digital music players and streaming services have algorithms that cut overly loud tracks down to size. There’s ReplayGain, which is pretty popular, and Apple’s Sound Check; neither makes any individual track more dynamic. Instead, they tame overly loud tracks and bring the overall level of the mix/library/stream to a consistent average. These are sonic analogues to social media’s feed algorithms–they restore the “proper” balance of signal and noise by moderating overly loud voices, voices that generate user/listener responses that don’t contribute to the “health” of whatever institution or outcome they’re supposed to be contributing to. In a way, ReplayGain and Sound Check seem to work a lot like compression–instead of bringing everything in a single track to the same overall level of loudness, they bring everything in a playlist, stream, or library to the same overall level of loudness. Is the difference between dynamic compression for loudness and algorithmic loudness normalization simply the level at which loudness normalization is applied? The individual track versus the overall mix, the individual subject versus the population?
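
As a rough illustration of library-level normalization, here is a minimal Python sketch. It is not ReplayGain’s or Sound Check’s actual algorithm: real normalizers estimate perceived loudness with more elaborate models, whereas this sketch assumes plain RMS as a stand-in. The function names, the target of 0.25, and the two four-sample “tracks” are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level, used here as a crude proxy for how loud a track is."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def dynamic_range(samples):
    """Crude stand-in for dynamic range: loudest sample level divided by quietest nonzero one."""
    levels = [abs(s) for s in samples if s != 0]
    return max(levels) / min(levels)


def normalize_library(library, target_rms=0.25):
    """Scale each track by ONE gain value so that every track lands on the same average level."""
    normalized = {}
    for name, samples in library.items():
        gain = target_rms / rms(samples)  # a single number per track
        normalized[name] = [s * gain for s in samples]
    return normalized


# Hypothetical two-track library: one quiet and dynamic, one loud and flat.
library = {
    "quiet_dynamic": [0.05, -0.1, 0.4, -0.5],
    "loud_flat": [0.9, -1.0, 1.0, -0.9],
}
leveled = normalize_library(library)

for name in library:
    print(name, "average level:", round(rms(leveled[name]), 3),
          "internal range:", round(dynamic_range(leveled[name]), 3))
```

In this toy version, at least, the two techniques differ in more than scale. The normalizer multiplies a whole track by one number, so the loud track is turned down, the quiet one is turned up, and each keeps its internal range. A compressor changes the ratio between loud and quiet moments inside the track itself.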

Dynamic compression and range aren’t just about music, or hearing, or audio engineering. The aesthetic and technical issues in the compression-vs-range debate are local manifestations of broader values, ideals, and norms. The era of YOLO is over. Dynamic range, or the ability to responsively attune oneself to variable conditions and express a spectrum of intensity, is generally thought to be more “healthy” than full-throttle maximalization–this is why there are things like “digital detox” practices and rhetoric about “work/life balance” and so on. At the same time, range is only granted to those with specific kinds of intersecting privilege. Though the discourse of precarity might encourage us to understand it as an experience of deficit, perhaps it is better understood, at least for now, as an experience of maximal loudness, of always being all the way on, of never getting a rest, never having the luxury of expressing or experiencing a range of intensities.