
Technique · April 15, 2026 · 12 min read

Estill Voice Training: What It Is, How It Works, and Why It Matters

TL;DR

Estill Voice Training (EVT) is a voice training system developed by Jo Estill that identifies and isolates the individual structures of the vocal mechanism — from vocal fold mass to larynx height to aryepiglottic sphincter width. Instead of teaching 'styles,' it teaches control of individual structures, allowing singers to build any vocal quality on demand. The system is based on 13 compulsory figures (isolated controls) that combine into 6 voice qualities (Speech, Falsetto, Sob/Cry, Twang, Opera, Belt).

The System That Changed How We Understand the Voice

In the early 1980s, American voice researcher Jo Estill asked a question that no one had asked systematically before: *What are the individual, independently controllable parts of the vocal mechanism?*

The prevailing approach to voice training was aesthetic: learn to sing in a particular style (classical, musical theater, pop) by copying the sound. If the student couldn't reproduce it, they were told to "feel" it differently, "support" more, or "place" the sound somewhere else.

Jo Estill wanted something more precise. She spent decades with laryngoscopes, EMG sensors, and acoustic analysis tools, mapping which structures in the vocal tract could be independently moved and what each movement did to the sound.

The result: **Estill Voice Training** — a systematic, anatomy-based approach to voice production that identifies 13 independently controllable structures (called "figures") that can be combined into any vocal quality.

The 13 Compulsory Figures

Each figure targets a specific structure of the larynx, the vocal tract, or the supporting musculature that can be consciously controlled. Learning to isolate each one is the foundation of the Estill system:

Laryngeal Figures:

1. **Vocal Fold Mass**: Thick (TA-dominant, chest quality) vs. Thin (CT-dominant, head quality) vs. Stiff (falsetto) vs. Slack (fry)
2. **Vocal Fold Closure**: Loose (breathy) vs. Balanced (efficient) vs. Pressed (tight, compressed)
3. **True Vocal Fold Body-Cover**: Independent control of the fold body (muscle) vs. cover (mucosa)
4. **Cricoid Tilt**: The tilt of the cricoid cartilage relative to the thyroid, which stretches the folds for pitch change
5. **Thyroid Tilt**: Forward tilt of the thyroid cartilage, creating the "cry" or "sob" quality
6. **Larynx Height**: High (bright, twangy), Mid (neutral), Low (dark, operatic)
7. **Aryepiglottic Sphincter (AES)**: Narrowed (twang, brilliance, carrying power) vs. Wide (softer, darker)

Vocal Tract Figures:

8. **Tongue**: Forward, neutral, or retracted position affecting formant frequencies
9. **Velum (Soft Palate)**: High (oral resonance), Mid (balanced), Low (nasal resonance)
10. **Jaw**: Open, mid, or closed affecting F1 and oral space
11. **Lips**: Protruded (darker, rounded) vs. Spread (brighter)

Power Figures:

12. **Head and Neck Anchoring**: Engagement of the sternocleidomastoid and other neck muscles for laryngeal stability
13. **Torso Anchoring**: Engagement of the torso muscles (lats, intercostals, abdominals) for breath support and vocal power

The Genius of the System

What makes Estill revolutionary is the *independence* principle. Each figure is learned in isolation before being combined. This means:

  • You can learn to raise your larynx *without* changing your fold mass
  • You can narrow your AES *without* tensing your jaw
  • You can tilt your thyroid *without* losing fold closure

This precision is hard to reach with "style-based" teaching, where everything changes at once and the student can't tell which variable produced which result.

Estill doesn't teach you how to sound. It teaches you *how to control the parts*. The sound is your choice.

The Six Voice Qualities

By combining specific figure positions, Estill identifies six foundational voice qualities that cover the vast majority of commercial and classical singing:

1. Speech Quality

- **Folds**: Thick, balanced closure
- **Larynx**: Mid
- **AES**: Mid/wide
- **Thyroid**: Neutral (no tilt)
- **Sound**: Natural, conversational, "talking on pitch"
- **Used in**: Singer-songwriter, folk, conversational pop

2. Falsetto Quality

- **Folds**: Stiff (edges only vibrating)
- **Larynx**: Mid to high
- **AES**: Wide
- **Thyroid**: Tilted (CT engaged)
- **Sound**: Airy, light, disconnected
- **Used in**: R&B falsetto passages, Jeff Buckley-style singing

3. Sob/Cry Quality

- **Folds**: Thin
- **Larynx**: Mid to slightly lowered
- **AES**: Mid
- **Thyroid**: Tilted forward (the key element)
- **Sound**: Emotional, vulnerable, warm; the "cry" in the voice
- **Used in**: Ballads, emotional passages, Adele's softer moments

4. Twang Quality

- **Folds**: Any mass (thick or thin)
- **Larynx**: Mid to high
- **AES**: Narrowed (the defining feature)
- **Thyroid**: Variable
- **Sound**: Bright, piercing, carrying, "witchy"
- **Used in**: Country, musical theater, pop belting (as a component), cutting through a band mix

5. Opera Quality

- **Folds**: Thick to mid
- **Larynx**: Low
- **AES**: Narrowed (singer's formant)
- **Thyroid**: Tilted
- **Velum**: High
- **Anchoring**: Head, neck, and torso fully engaged
- **Sound**: Powerful, round, dark-but-brilliant, carrying without amplification
- **Used in**: Classical opera, art song, choral soloism

6. Belt Quality

- **Folds**: Thick (TA-dominant)
- **Larynx**: High
- **AES**: Narrowed (essential for power and safety)
- **Thyroid**: Tilted (prevents damage)
- **Anchoring**: Full head/neck and torso anchoring
- **Sound**: Powerful, chest-like quality at high pitches, "Broadway" sound
- **Used in**: Musical theater, pop, rock, gospel
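One way to see the "recipe" idea is to treat the six qualities as a small lookup table and ask which figures change between two of them. The Python sketch below is illustrative only and is not part of any Estill material: the names `RECIPES` and `differences` are hypothetical, and the settings are simply transcribed from the lists in this article (velum and anchoring are left out because they are only specified for some qualities).

```python
# Illustrative sketch: settings are transcribed from the recipes in this article.
# RECIPES and differences() are hypothetical names, not Estill terminology.

RECIPES = {
    "speech":   {"folds": "thick", "larynx": "mid", "aes": "mid/wide", "thyroid": "neutral"},
    "falsetto": {"folds": "stiff", "larynx": "mid to high", "aes": "wide", "thyroid": "tilted"},
    "sob":      {"folds": "thin", "larynx": "mid to slightly lowered", "aes": "mid", "thyroid": "tilted"},
    "twang":    {"folds": "any", "larynx": "mid to high", "aes": "narrowed", "thyroid": "variable"},
    "opera":    {"folds": "thick to mid", "larynx": "low", "aes": "narrowed", "thyroid": "tilted"},
    "belt":     {"folds": "thick", "larynx": "high", "aes": "narrowed", "thyroid": "tilted"},
}


def differences(quality_a, quality_b):
    """Return {figure: (setting_a, setting_b)} for each figure whose setting differs."""
    a, b = RECIPES[quality_a], RECIPES[quality_b]
    return {fig: (a[fig], b[fig]) for fig in a if a[fig] != b[fig]}


if __name__ == "__main__":
    # Which figures have to move to get from speech quality to belt quality?
    for figure, (speech, belt) in differences("speech", "belt").items():
        print(f"{figure}: {speech} -> {belt}")
```

Read this way, moving from speech to belt leaves fold mass alone and changes larynx height, AES, and thyroid tilt, which is the kind of one-variable-at-a-time comparison the independence principle is meant to support.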

Why Every Singer Should Know Estill

1. Genre Freedom

With Estill training, you're not locked into one style. You can belt a Broadway number, then switch to a classical art song, then sing a breathy R&B verse, because you understand the *structural recipe* for each quality. Singers trained in a single style often have reliable access to only one or two of the six qualities.

2. Vocal Health

Because Estill names what each structure is doing, harmful patterns become easier to spot and correct. Pressing? That points to a closure issue. Straining on high notes? Often a missing thyroid tilt. Losing your voice after performances? A likely cause is too little AES narrowing in belt, which leaves you over-relying on fold mass.

3. Efficient Learning

Instead of vague instructions like "make it warmer" (which could mean lower larynx, darker vowel, more thyroid tilt, or thicker folds), an Estill-trained teacher can identify the *specific structure* that needs adjustment. Student and teacher then change one named variable at a time instead of guessing, which shortens the trial-and-error loop.

4. Scientific Foundation

Estill is grounded in decades of laryngoscopic, EMG, and acoustic research. Claims about what a structure does can be checked against what the camera shows during phonation, so the model's statements are observable and repeatable rather than a matter of taste or metaphor.

How to Get Started

Self-Study Route:

1. Learn the anatomy basics (this article + "How Your Vocal Folds Work")
2. Practice isolating each figure one at a time: tongue forward/back, larynx high/low, jaw open/closed
3. Explore the voice qualities by following the "recipes" above
4. Record yourself and compare: can you hear the difference between your twang and your sob quality?
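For the record-and-compare step, a rough numeric check can sit alongside your ears. The sketch below is an assumption on our part and not an Estill tool: it uses the spectral centroid (the magnitude-weighted average frequency of a signal) as a crude proxy for "brightness", and it runs on two synthetic tones as stand-ins for recordings. The function names `spectral_centroid` and `tone` are hypothetical.

```python
import cmath
import math


def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of a mono signal, via a naive DFT."""
    n = len(samples)
    total_mag = 0.0
    weighted = 0.0
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        mag = abs(coeff)
        total_mag += mag
        weighted += mag * (k * sample_rate / n)
    return weighted / total_mag if total_mag else 0.0


def tone(f0, harmonic_amps, sample_rate=8000, n=400):
    """Synthetic stand-in for a sung note: harmonics of f0 with the given amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * f0 * (h + 1) * i / sample_rate)
            for h, a in enumerate(harmonic_amps))
        for i in range(n)
    ]


if __name__ == "__main__":
    darker = tone(200, [1.0, 0.3, 0.1, 0.05])   # energy concentrated in low harmonics
    brighter = tone(200, [1.0, 0.8, 0.8, 0.9])  # strong upper harmonics
    print("darker centroid (Hz):  ", round(spectral_centroid(darker, 8000)))
    print("brighter centroid (Hz):", round(spectral_centroid(brighter, 8000)))
```

With real recordings you would load the samples from a file first (for example with Python's standard `wave` module) and use a proper FFT, since the naive DFT here is only practical for very short clips. A higher centroid suggests more high-frequency energy, but it does not tell you which structure produced it.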

Formal Training Route:

1. Attend an Estill workshop (Level 1: Figures for Voice)
2. Work with an Estill-certified instructor for personalized guidance
3. Practice the 13 figures daily until each becomes automatic
4. Progress to Level 2: Figure Combinations for Voice Qualities

The Takeaway

Estill Voice Training is not a style. It's a *language* for talking about voice production. It gives you the vocabulary to describe what you're doing, the anatomy to understand why it works, and the methodology to reproduce it reliably.

In a world of vague instructions and recycled metaphors, Estill offers precision. And precision is what separates a singer who sounds good sometimes from a vocal athlete who sounds good on demand.

Frequently Asked Questions

What is Estill Voice Training?

Estill Voice Training (EVT) is a voice training system created by American voice researcher Jo Estill. It breaks vocal production into individual, independently controllable components (called 'figures') — such as vocal fold mass, larynx height, tongue position, and velum position. By learning to control each structure independently, singers can build any vocal quality on demand. It's used by singers, actors, speech therapists, and voice coaches worldwide.

What are the Estill Voice Qualities?

Estill Voice Training identifies six foundational voice qualities, each defined by a specific recipe of structural positions: (1) Speech — neutral, everyday quality; (2) Falsetto — light, airy, CT-dominant; (3) Sob/Cry — tilted larynx, thin folds, emotional quality; (4) Twang — narrowed aryepiglottic sphincter, bright and carrying; (5) Opera — combination of twang, lowered larynx, and high velum; (6) Belt — thick folds, high larynx, narrowed AES, powerful contemporary quality.

How is Estill different from other vocal methods?

Most vocal methods teach styles, genres, or aesthetics ('sing classically,' 'belt like this'). Estill teaches individual structural control — the building blocks that underlie ALL styles. This means an Estill-trained singer can produce any genre's characteristic sound by combining the right structural positions, rather than being locked into one aesthetic. It's also uniquely grounded in laryngeal anatomy and stroboscopic research.

How long does it take to learn Estill Voice Training?

The foundational course (Estill Figures for Voice) is typically taught as an intensive workshop over 3-5 days. However, internalizing the 13 figures to the point of automatic control takes 6-12 months of regular practice. The system is structured in levels: Level 1 (Figures for Voice — individual controls), Level 2 (Figure Combinations for Voice Qualities), and advanced work. Certification as an Estill Mentor or Master Trainer requires additional study.


Isarah Dawson

Founder, Vox Method