Opinion: Modes, Mix Voice, Head Voice... What?

The world of singing is fraught with overlapping and misleading terms thrown about, typically with little or no explication. This can create a lot of confusion for beginning students and even for professional teachers. I am fortunate to have weekly chats with a mentor and colleague of mine to discuss our teaching and various pedagogical queries we might have. Most of the time, questions come up because an article or video has carelessly thrown around terminology or used extremely antiquated terms that leave significant room for interpretation. These interactions are a large part of why I produce the Singer’s Term of the Day posts and write articles like this. So today, let’s look at descriptors for the state of the vocal folds during phonation.

These words are: Mode 0, Mode 1, Mode 2, Chest Voice, Head Voice, Mix Voice, Chest Register, Head Register, Mix Register, Fry, Whistle, Cricothyroid Dominant, Thyroarytenoid Dominant, Breathy, Stiff, Thick, and Thin. Exhausting, right? Don’t worry, I didn’t forget about Speech, Cry, Sob, Neutral, Overdrive, Edge, Yelling, Calling, Belting, Twang, etc. Those just describe parts of the vocal technique that extend beyond the true vocal folds. I’m confident you have heard at least a few of these words thrown around in relation to singing. But what are they really?


Mode 1, Chest Voice, Fry, Mix, etc., all talk about the configuration of the true vocal folds during phonation. To phonate, air is expelled through your adducted (closed) vocal folds, causing them to blow apart and come back together. Pitch, intensity, and part of harmonic intensity are all determined by how adducted the folds are, how stretched they are, and how thick they are. What makes this complicated is that unless you are being scoped, you cannot see your vocal folds, and you have very little sensory feedback for their tiny movements. This lack of information caused teachers and singers to talk about the voice in terms of how it feels and how it sounds, which got us terms like Head & Chest Voice.

We’ve learned a lot about how the vocal instrument works since terms like Head, Chest, & Mix Voice first came into use. This has led to a push by many teachers to be more accurate, and so anatomically correct terms like cricothyroid dominant and thyroarytenoid dominant started showing up, as well as terms like thick, thin, stiff, and slack in the Estill Voice model. There has also been some pushback against these more accurate terms on the grounds that they overcomplicate things. “Was that a thick sound or a thin sound?” “I think it was a thinner-thick fold production.” Enter the Modes, a way to classify the productions into categories that are not nearly as inaccurate as Head, Chest, or Mix voice but leave more vagueness than cricothyroid dominant production. Which method is best? Oh, we will get to that soon, but not yet. First, let’s look at what all of these actually mean.


Mode 1, Chest Voice, Chest Register, Thyroarytenoid Dominant, Thick Folds

As you may have guessed after reading above, all of these refer to a Thyroarytenoid Dominant sound production. That is to say that the Thyroarytenoid muscles, which are the muscle part of your vocal folds, are engaged, causing the vocal folds to become thicker, increasing the amount of surface area that is coming together during phonation. This will increase the intensity of the produced harmonics, and more harmonics will sound richer and fuller. For some real voice nerdery, this also will increase the closed quotient (how long your vocal folds are together each cycle), which increases the amplitude (intensity). All of this increased harmonic activity can cause the larger cavities of the chest to vibrate, making it feel like the sound is originating in the chest.


Mode 2, Head Voice, Head Register, Cricothyroid Dominant, Thin Folds (and possibly Falsetto, it’s complicated)

All of these refer to a Cricothyroid Dominant sound production in which the cricothyroid muscles pull the thyroid cartilage forward and down, stretching the vocal folds. That stretching thins the vocal folds, causing less surface area to come in contact during phonation. This stretching of the vocal folds is important for raising the pitch level. These higher pitches, with their higher harmonics, resonate more strongly in the sinuses of the head, making it feel like they are coming from the singer’s face.

Falsetto is complicated. Depending on who you are talking to, it might mean different things. It could be Mode 2 or thin folds. It could also be a breathy phonation, such as the Estill Voice stiff folds, where there is a gap in the closure of the vocal folds called a posterior glottal chink. Or it could be some hybrid. Either way, they are probably talking about a higher and lighter style of phonation for male singers.

Whistle is another modification in this category. It typically refers to the highest of high notes for female singers, named for the whistle-like characteristic that the tone takes on somewhere above C6.


Mode 0, Fry, or Slack Folds

On the opposite end of the spectrum, these refer to an asymmetric vibration in the vocal folds when the folds are, well, slack. This can be used for color effects or to sing notes that are too low to be sung in Mode 1. The lowest note ever sung was 0.189 Hz, and it was produced with this technique.


Mix Voice or Mix Register

This one doesn’t have a lot of names but is particularly problematic. The general idea is that mixed voice is some hybrid of chest and head voice. Biologically, either there really isn’t much here or there is a lot here. We have two sets of muscles, the Cricothyroid and the Thyroarytenoid, and they are constantly at odds with each other while you are singing. From that anatomical view there is Mode 1 or Mode 2, and no Mode 1 1/2. From another viewpoint, the only register that really exists is Mixed, because all phonation is some combination of the two.


So What?

Okay, it’s opinion time. The question really is: does it matter what terms we use? Yes, and no. On the surface, whatever term helps a student is the right term to use, and none of the terms I’ve been talking about are “wrong”. Where it starts to matter is that I have had the pleasure of working with a lot of teenage and young adult students who, when they first come to my studio, don’t understand that pitch is created at the vocal folds and not in the chest or somewhere in their head. I’ve met singing teachers who don’t know what the cricothyroid and thyroarytenoid muscles are or how they work; to me that is a problem.

Here’s the breakdown.

Chest & head voice: I’m solidly not a fan of these terms being used as the primary terminology that a singer hears. I think it is a great concept for bringing a student’s attention to what they are feeling, but these terms get us too far from the reality that all pitch is created at the vocal folds, which is a fundamental building block for understanding the vocal instrument.

Cricothyroid Dominant & Thyroarytenoid Dominant Vocal Production: A+ for accuracy and a D for usability. On the simplest level, try saying either of those five times fast. One of them is like fourteen syllables; it takes longer to say the term than to sing most phrases in contemporary music. I mentioned this earlier, but these terms can also get incredibly complicated when we get into the realm of everything between a purely Cricothyroid or Thyroarytenoid Dominant production. Estill Voice’s Thick & Thin have the same challenge for me. I have had quite a few conversations about thicker thin folds and thinner thick folds, which feels needlessly complicated. And yes, we good evidence-based teachers could pull out a spectrogram and decide which it is, but is that the best use of time?

Falsetto as a term should probably just go away.

Whistle, I don’t really have an opinion on. I think more research needs to go into what whistle register is, whether it is its own thing, and how it happens. I have significant reservations about it due to its use with such a small subset of singers. As with falsetto, I am concerned that the term is potentially inaccurate and only applied to a specific gender.


Mode 0, 1, & 2: I think this might be the best that we have right now. These terms refer to conditions that are specific but vague enough to not cause too much trouble, they are easy enough for a younger student to say, they require explanations that allow for the sharing of vocal science where appropriate, and they are not style or gender specific (looking at you, falsetto). If you need a way to describe Mode 1 to a student, try part or all of this: Mode 1 has a speech-like quality and is rich in harmonics; it is created by a thickening of the vocal folds caused by the thyroarytenoid muscles; you might feel a vibrating sensation in your chest; it can be hard to take Mode 1 into a higher pitch range, and doing so will require a reduction in intensity; and the closed quotient is close to 50%, creating significant amplitude (volume).


So there you have it: a quick primer on a bunch of terms and my two cents about what’s best.

Josh Manuel

Josh Manuel, a voice instructor and founder of VoiceScience, is dedicated to empowering singers by providing evidence-based techniques and knowledge for enhanced performance and vocal health. His expertise and passion in the field of vocal science have made him a trusted resource for singers seeking to improve their skills and achieve their full potential.
