On Rousseau’s Essay on the Origin of Languages: unless you realize this is about the overtone series, you probably misread Rousseau’s argument

So, this post is really rough and sorely in need of editing, but I need to get it up and running for my Phil of Music course, which starts Monday. In some ways it’s a tl;dr of one of the chapters of The Conjectural Body…


“Note how everything constantly brings us back to the moral effects about which I have spoken, and how far the musicians who account for the impact of sounds solely in terms of the action of air and the excitation of nerve fibers are from understanding wherein the force of this art consists” (Rousseau, EOL 293).


Philosophers frequently misinterpret Jean-Jacques Rousseau’s Essay on the Origin of Languages. He is NOT arguing for immediate bodily/affective presence; in fact, affect is, for Rousseau (or at least the Rousseau of the Essay), fundamentally mediated…as are bodies. What he’s mainly interested in critiquing is the idea and ideal of physical reductionism and biological determinism–the idea that our social and cultural structures ought to reflect the structure of the natural world. In this view, as Rousseau puts it, “the only precise relations to be found in nature” are treated as “the rule for all relations” (283). Social relations, artworks, everything ought to reflect the organization of the natural world. Thus, according to this view, “if we are to philosophize properly we must go back to the physical causes” (283). Rousseau pretty much scoffs at the very idea: cultural discourses–like language or science–shape the very physical phenomena they purport to merely observe. But before I get into his critique, let me lay out the move he’s challenging.

This idea that social and cultural hierarchies are justified because they follow/reflect natural hierarchies is a common foundation for both (a) Rameauan tonal harmony and (b) classically liberal approaches to social identity. Jean-Philippe Rameau was an eighteenth-century French music theorist and composer; his Treatise on Harmony is one of the main documents that codified tonal harmony–the system of keys and chord functions/progressions that we still use today in most pop music. Tonality abstracts a system of musical organization from a particular interpretation of acoustics, namely, the overtone series. In tonality, chords built on each scale-degree (do, re, mi, fa, sol, or I-ii-iii-IV-V, etc.) each have a ‘function’–the more consonant ones (I, V, IV) produce a sense of resolution, and the more dissonant ones (ii, vii) produce a sense of tension. But how is relative consonance or dissonance determined? What is the standard or measuring device? Enlightenment music theorists like Rameau used the overtone series–that is, the acoustic properties of a fundamental pitch and its harmonics–as this gauge. Each tone is composed of a fundamental tone, the main pitch, and a series of decreasingly intense (loud) harmonics. These harmonics sound at an octave above the original tone, a fifth above that, a fourth above that, and so on. This series of intervals (I, V, IV…) is the hierarchy theorists like Rameau used to determine the relationships among chords and chord functions. In this way, the organization of musical works reflected the hierarchies present in the physical structures of sound.

Similarly, Enlightenment concepts of race and gender also grounded social hierarchies in (supposedly) natural ones. Philosophers–like, for example, Kant–thought that the organization of society ought to reflect the naturally-given organization of bodies. In Kant’s race theory, the Earth’s varying geographic conditions give rise to different types of human bodies. Human bodily difference is naturally given–it is caused by the physical features of the earth. These differences are hierarchically related: European geography is most conducive to life and civilization, so European/white bodies are the most advanced ones, the ones most capable of fully flourishing/realizing their ‘human’ nature. Next come Asian ones, and African ones are last. It’s the same type of thinking in both Rameau’s and Kant’s case: there are naturally given hierarchies, and socio-cultural systems ought to reflect these hierarchies.
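If the acoustics behind the overtone series feel abstract, here is a rough sketch of the arithmetic in Python. It assumes an idealized harmonic series (each harmonic is a whole-number multiple of the fundamental, which real instruments only approximate), and the 110 Hz fundamental, the function names, and the interval labels are my own choices for illustration, not anything from Rameau or Rousseau.

```python
from fractions import Fraction

# Labels for a few simple frequency ratios (illustrative, just-intonation names).
INTERVAL_NAMES = {
    Fraction(2, 1): "octave",
    Fraction(3, 2): "perfect fifth",
    Fraction(4, 3): "perfect fourth",
    Fraction(5, 4): "major third",
    Fraction(6, 5): "minor third",
}


def overtone_series(fundamental_hz, count):
    """Return the first `count` harmonics, starting with the fundamental itself."""
    return [fundamental_hz * n for n in range(1, count + 1)]


def successive_intervals(count):
    """Return (lower, upper, ratio, name) for each pair of adjacent harmonics."""
    steps = []
    for n in range(1, count):
        ratio = Fraction(n + 1, n)  # harmonic n+1 relative to harmonic n
        steps.append((n, n + 1, ratio, INTERVAL_NAMES.get(ratio, "unnamed")))
    return steps


if __name__ == "__main__":
    # 110 Hz is an arbitrary example fundamental (assumption).
    for number, freq in enumerate(overtone_series(110.0, 6), start=1):
        print(f"harmonic {number}: {freq} Hz")
    for lower, upper, ratio, name in successive_intervals(6):
        print(f"harmonic {lower} -> {upper}: ratio {ratio} ({name})")
```

The first three steps come out as octave, fifth, fourth, which is the ordering Rameau-style theory reads as a natural hierarchy. Notice that the code only produces ratios; calling 3:2 a "fifth" already takes a lookup table of names, which is a small version of the point Rousseau goes on to make.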

From this perspective, differences in linguistic structures and social customs “can be explained by the difference in organs. I should be curious,” Rousseau challenges, “to see this explanation” (252). He critiques this style of reasoning from “physical causes”–what O. Oyewumi calls “body-reasoning”–with a snarky argument by analogy. Comparing sound to light, Rousseau uses a prism as a metaphor for the overtone series:

“Here you have the resolution of light, the primary colors, their relations, their proportions, the true principles of the pleasure you derive from painting. All this mysterious talk about drawing, representation, shape is pure imposture on the part of French painters who think that with their imitations they can arouse I know not what movements in the soul, then it is well known that there are only sensations. You hear wonderful reports about their painting, but look at my hues” (283).

Rameau’s physicalism treats sound merely as sensation (“hues”), and not as cultural discourse (“painting”). People understand and are affected by paintings not just, and not even primarily, because of the way artworks refract specific frequencies of light at our eyeballs. We understand a painting as a painting because it is situated in complex cultural practices–this is why Warhol’s Brillo Box is an artwork, not a misplaced box of cleaning supplies. “Melody” is Rousseau’s term for the musical equivalent of “painting”–sound as a cultural discourse, not a physiological given. “Harmony” is his term for “hues.” Rameau’s theory of harmony attempts to ground music in the nature of sound, and it ignores the role of “melody”–that is, affective cultural discourses or implicit knowledges–in producing the very body that is supposedly immediately ‘given’ by nature.

Rousseau isn’t objecting to the lack of immediacy in Enlightenment political and musical theory–rather, he’s objecting to the attempt to claim as unmediated something that is highly mediated. Enlightenment-style body-reasoning disavows the cultural mediation done by affect/implicit understanding. THAT is what Rousseau’s objecting to. For Rousseau, the body only appears as or feels immediate and “natural” because it has been habituated to implicit knowledges.


Voice and Affect:
Sonic language–the ‘voix’–is absolutely social, and not at all physiological or naturally determined. Unlike gestural language, which is immediate and naturally determined (‘dictated’ by ‘needs’), sonic language is “wrung” (252) from the passions, which are themselves socially mediated. Passions–social and affective needs arising from intersubjective relations–are the origin of spoken language; they are catalyzed by social pressures. “If we had never had any but physical needs, we might very well never have spoken and yet have understood one another perfectly by means of the language of gesture alone” (251). However, because we have complicated social needs and desires, we speak, we communicate with our voices. “Voice,” for Rousseau, means sonic affect.

Sounds are unmediated sonic gestures; they are ‘natural’ noises. Voices, however, have to be trained. They are physical and physiological phenomena, but the practice of speaking orients the body, shapes its organs (mouth, vocal cords, ears), so that even ‘the body’ is not a natural given, but something that must be “made sensitive…by dint of habit.” To go back to his prism analogy, we only “see” the hues we’ve been habituated to see (e.g., different cultural notions of color, or of whole/half/quarter tones).

So, for Rousseau, there is no voice of nature, no ‘natural’ human voice emerging directly from geographical and/or physiological materiality. To demonstrate this point I’m going to quote at length, because it’s really, really important to get the nuance here. In the Essay, there are

“three states of man considered in relation to society. The savage is a hunter, the barbarian a herdsman, civil man a tiller of the soil. So that regardless of whether one inquires into the origin of the arts or studies the earliest morals everything is seen to be related in its principle to the means by which men provide for their subsistence, and as for those among these means that unite men, they are a function of the climate and the nature of the soil. Hence the diversity of languages and their opposite characteristics must also be explained by the same causes” (272).

Geographic conditions create a situation to which social systems must respond–but these responses are not directly caused or determined by the physical properties of the environment. They are contingent, manufactured methods of addressing problems. For example, when “forced to make provisions for winter, people have to help one another and are thus compelled to establish some kind of convention amongst themselves” (274, emphasis mine). Convention is a human response to environmental conditions. Language is one such convention–or rather, it’s a second-order convention. Geography determines social system (savage, barbarian, civil man), which then determines linguistic features/structures (gestural or vocal language, for example). “Aidez-moi” (help me) and “aimez-moi” (love me) (279)–the first utterances in Northern and Southern languages, respectively–are socially and culturally shaped responses to environmental conditions. The PHYSICAL causes of the differences in languages (e.g., the physiology of vocalization and sound production) come from affective and social responses to geography and climate–they have NOTHING to do with the immediate structures of either the earth or the human body–they’re about how human beings respond to nature. Passion–the need for help, the need for love–is social responsiveness, not bodily immediacy (it is bodily, just mediated).

In a way, “passion” is what contemporary philosophers would call implicit understanding; that’s what Rousseau is really trying to talk about. “Passionate” music engages not just cognitive understanding, but haptic, corporeal knowledges (like Sara Ahmed’s ‘orientations,’ or Linda Martín Alcoff’s ‘interpretive horizon’). The ancient Greeks thought of music in this way: this is why Plato thought the moderation and self-mastery required of the guardians of the Republic could be learned by practicing music and athletic training–musical training was corporeal/affective training. Sounds whose “effect is entirely physical” and “is due to the interaction of the different particles of air set in motion by the sounding body and by all of its constituent parts” may be pleasant, but “unless this pleasure is enlivened by familiar melodic inflections it will not be totally delightful” (286). So, I can find random sounds interesting, but only those sounds situated in interpretive discourses and contexts are really meaningful. For Rousseau, purely physical, culturally unmediated sounds are just nice; they’re not affectively moving. This is because he thinks affect is not immediate, but culturally habituated/learned. For example, he argues: “rude ears perceive our consonances as mere noise. It is not surprising that when the natural proportions are altered, natural pleasure disappears” (286). Music is pleasurable and meaningful because it engages our affective implicit knowledges–our internalized and shared cultural discourses.
“Sounds act on us not only as sounds but as signs of our affections, of our sentiments [which are social]; this is how they arouse in us the emotions which they express and the image of which we recognize in them” (288). The Tarantella example shows that music affects bodies only to the extent that these bodies are acculturated to it: “each is affected only by accents with which he is familiar; his nerves respond to them only insofar as his mind inclines to them: he has to understand the language in which he is being addressed if he is to be set in motion by what he is told” (289). Rousseau isn’t arguing for affective immediacy, but for culturally specific implicit understanding.

Rousseau is warning that the Enlightenment is a cultural formation that distances us from precisely what civilization gives us–passion, convention (which is what animals lack). Affect is not immediate or natural; Rousseau thinks there are no affects in the state of Nature. So, when he says that “man did not begin by reasoning but by feeling” (253), he’s not arguing that feeling is more primitive and natural than reason; rather, he’s arguing that affect/feeling is what brings us out of the state of Nature. Affective relations are what make “us” civilized peoples, not savages (who communicate with gestures) or animals (who likewise communicate representationally rather than affectively). With its deadening of affect/passion and increasingly precise accuracy, European rationality is more like primitive gestural languages than the “songlike and passionate” (253) ones Rousseau champions. In a way Rousseau is arguing that Enlightenment rationality returns Europe to a more primitive, ancient episteme–a visual episteme (this time analytical rather than gestural). It is more accurate, but less affective: “accent, which was not forthcoming from the heart, was replaced by clear articulations…to arouse not sentiment but understanding” (280). (“Accent,” in the Essay, means PITCH. “Articulations” are consonants.) Rameauan tonality tries to cut the middleman of passion and convention out, and return directly to the clarity of vision/gesture/physiology. Rameau is trying to be ‘objective,’ and so cuts out this inter/subjective element (which, as Rousseau points out, ultimately amounts to universalizing his own subjective/cultural milieu). This objectivity misses the point:

Whoever wishes to philosophize about the force of sensations must therefore begin by setting the purely sensory impressions apart from the intellectual and moral impressions we receive by way of the senses, but of which the senses are only the occasional causes: let him avoid the error of attributing to sensible objects a power which they either lack or derive from the affections of the soul which they represent to us (287).

In other words, Rousseau thinks Rameau attributes to nature–to the relationships among sound frequencies–a power that is actually cultural. The problem with Enlightenment rationality is that it feigns objectivity–it claims to cut out the middleman of “interpretive horizon” or “implicit understanding,” and get straight to the universal, ‘objective’ science. But it doesn’t.

So why should anyone other than Rousseau scholars care about this?

Rousseau’s Essay does a couple of really interesting things:
  1. It shows the connection between tonal harmony and classically liberal concepts of social identity: they both use the “body-reasoning” or reasoning from physical causes that Rousseau critiques. We’re more familiar with this body-reasoning in the case of racialized and gendered social identities: there is a natural hierarchy of bodies (men and women, everybody else and white people, etc.), and a just society ought to respect and reflect that hierarchy, etc. But what Rousseau does is show that this type of reasoning, this epistemic framework, also grounds Enlightenment tonality. The same general epistemic framework is present in both classical tonality and classically liberal political theory. So, Rousseau helps us understand the relationship between ways of thinking about music, musical practices, and ways of thinking about politics and political practices.
  2. As I have argued elsewhere, Rousseau’s early work might actually be the very sort of non-ideal theorizing that Charles Mills suggests or hints it may be. Instead of theorizing from actual politics, Rousseau is theorizing from actual musical practices. So, Rousseau’s musical writings might be better, more productive resources for political philosophy than his actual political writings.
  3. Rousseau’s geography is really different than Kant’s. Kant also uses geographic differences as the basis of what, in his work, is racial difference. As I mentioned earlier, Kant engages in precisely the sort of reasoning-from-nature that Rousseau is critiquing here. But that’s not their only difference. Rousseau is not using geography to ground a hierarchy of bodies–geography compels people to respond to it, and it’s these responses that are different. Now, Rousseau certainly thinks there’s a teleological progress from savage to civilized ‘man’, but savagery or civility is not determined by one’s physiology–either your climate or your body. So you might say that the Essay is not using the sort of reasoning we find in Enlightenment race theory (and Kant is). But another thing that’s interesting about Rousseau’s geography is the absence of Africa. He mentions Egypt and North Africa a few times, and “a Carib” once, but sub-Saharan “black” Africa is totally absent from the Essay. In the Essay, ASIA is the ultimate “other” to Europe. So the Essay doesn’t use the black/white binary that posits Africa as Europe’s polar opposite. There is definitely something going on about cultural difference, but it’s not “race” in the way we philosophers conventionally understand it.