Your voice is produced by two small folds of tissue in your throat vibrating as air from your lungs passes between them. That vibration creates a raw buzzing sound, which then gets shaped into your recognizable voice by the spaces in your throat, mouth, and nose. The whole process involves three systems working together: your breathing, your vocal folds, and your vocal tract.
Air Pressure Starts Everything
Voice production begins in your lungs. When you exhale to speak, your diaphragm and rib muscles push air upward through your windpipe. This creates a column of pressurized air beneath your vocal folds, called subglottal pressure. Without enough air pressure, your vocal folds won’t vibrate at all. This is why running out of breath makes your voice weak or cuts it off entirely, and why taking a deep breath before speaking or singing gives you more power and control.
The pressure buildup happens quickly. Muscles in the larynx begin activating 40 to 90 milliseconds before any sound comes out, coordinating with the respiratory system to time everything precisely. Speaking and singing are, at their foundation, controlled acts of exhalation.
How Your Vocal Folds Create Sound
Your vocal folds (sometimes called vocal cords) sit inside your larynx, the structure you can feel at the front of your throat. They’re not simple strings. Each fold has five distinct layers: a thin outer lining, a gel-like layer that ripples easily, two deeper layers that form a supportive ligament, and a muscle that makes up the bulk of the fold. This layered design is what allows the folds to vibrate in complex wave-like patterns rather than just snapping open and shut.
When you breathe normally, your vocal folds stay open to let air pass silently. When you speak, small muscles pull the folds together so they nearly touch. Air pressure from below pushes them apart, and as air rushes through the narrow gap, a drop in pressure (the Bernoulli effect, the same principle that contributes to lift on an airplane wing) combines with the folds’ own elastic recoil to pull them back together. This cycle repeats rapidly, chopping the airstream into tiny pulses of sound. In a man’s conversational voice, this happens about 115 times per second. In a woman’s, about 200 times per second.
The range is far wider than those averages suggest. Men’s vocal folds can typically vibrate from about 90 to 500 times per second, while women’s can range from 150 to 1,000. Children and the highest sopranos can push close to 2,000 vibrations per second. At the extreme low end, around 60 vibrations per second, are the deepest bass tones humans can produce.
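To get a feel for how fast these cycles are, the vibration rates quoted above can be converted into the time each open-and-close cycle takes. The short Python sketch below does only that arithmetic; the function name is made up for illustration.

```python
def cycle_period_ms(vibrations_per_second: float) -> float:
    """Duration of one open-and-close cycle, in milliseconds."""
    return 1000.0 / vibrations_per_second


# Rates taken from the text above (vibrations per second).
rates = [
    ("deepest bass tones", 60),
    ("typical male speech", 115),
    ("typical female speech", 200),
    ("highest sopranos", 2000),
]

for label, rate in rates:
    print(f"{label}: {rate} per second -> {cycle_period_ms(rate):.1f} ms per cycle")
```

At conversational pitch, each cycle lasts only about 5 to 9 milliseconds, and at the top of the soprano range it is down to about half a millisecond.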
What Controls Pitch
Two muscles do most of the work when you change pitch. One stretches the vocal folds longer and thinner, increasing their tension and stiffness, which makes them vibrate faster and produces a higher pitch. Think of tightening a guitar string. The other muscle, which forms the body of the vocal fold itself, shortens and thickens the folds, loosening the outer layers and generally lowering pitch.
The relationship between these two muscles is surprisingly complex. Stretching the folds makes them longer, which by itself would lower pitch (longer strings vibrate more slowly). But the increase in stiffness and tension more than compensates, so the net effect is a higher note. Your brain coordinates these opposing forces automatically every time you raise your voice to ask a question or drop it to sound serious.
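The tug-of-war between length and tension can be sketched with the textbook formula for an ideal stretched string, where frequency equals the square root of (stress divided by density), divided by twice the length. Real vocal folds are layered tissue, not ideal strings, so this is only a rough illustration. The length, stress, and density values below are assumed round numbers chosen for the example, not measurements.

```python
import math


def ideal_string_frequency(length_m, stress_pa, density_kg_m3):
    """Fundamental frequency of an ideal string: sqrt(stress / density) / (2 * length)."""
    return math.sqrt(stress_pa / density_kg_m3) / (2.0 * length_m)


# Illustrative values only (assumptions, not data from the text).
DENSITY = 1040.0        # kg/m^3, roughly that of soft tissue
BASE_LENGTH = 0.016     # m
BASE_STRESS = 20_000.0  # Pa

base = ideal_string_frequency(BASE_LENGTH, BASE_STRESS, DENSITY)

# Stretch the fold 20% longer but (hypothetically) keep the stress unchanged.
longer_only = ideal_string_frequency(BASE_LENGTH * 1.2, BASE_STRESS, DENSITY)

# Same stretch, but with the stress doubled, as stretching tissue also tightens it.
longer_and_tenser = ideal_string_frequency(BASE_LENGTH * 1.2, BASE_STRESS * 2.0, DENSITY)

print(f"baseline:             {base:.0f} Hz")
print(f"longer, same stress:  {longer_only:.0f} Hz")
print(f"longer, double stress: {longer_and_tenser:.0f} Hz")
```

In this simplified model, extra length on its own lowers the frequency, while the added stress more than makes up for it. That matches the net rise in pitch described above.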
How Your Vocal Tract Shapes the Sound
The raw sound your vocal folds produce is a rough buzz, not much different from one person to the next. What transforms that buzz into a recognizable voice is everything above the folds: your throat, mouth, tongue, palate, lips, and nasal cavity. These spaces act as a filter, amplifying some frequencies and dampening others depending on their shape and size.
This is known as source-filter theory. The vocal folds are the source, and the vocal tract is the filter. Different configurations of your tongue, jaw, and lips create different filter patterns, which is how you form vowels and consonants. Say “ee” and then “oo” slowly, and you can feel your tongue and lips completely rearrange the shape of your mouth, changing which frequencies get boosted.
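The source-filter idea can be sketched in a few lines of Python: one fixed set of source harmonics is multiplied by two different filter shapes. Everything specific here is an assumption for illustration. The source harmonics are taken to fall off as 1/n², each resonance is a simple second-order peak with a 100 Hz bandwidth, and the two resonance frequencies per vowel are rough textbook-style values for an adult male "ee" and "oo".

```python
import math


def resonance_gain(freq_hz, center_hz, bandwidth_hz=100.0):
    """Magnitude response of one second-order resonance (a single vocal tract peak)."""
    numerator = center_hz ** 2
    denominator = math.sqrt(
        (center_hz ** 2 - freq_hz ** 2) ** 2 + (bandwidth_hz * freq_hz) ** 2
    )
    return numerator / denominator


def tract_gain(freq_hz, resonances_hz):
    """The filter: the product of the gains of every resonance at this frequency."""
    gain = 1.0
    for center_hz in resonances_hz:
        gain *= resonance_gain(freq_hz, center_hz)
    return gain


def source_harmonics(f0_hz, count):
    """The source: (frequency, amplitude) pairs, with amplitude falling as 1/n^2."""
    return [(n * f0_hz, 1.0 / n ** 2) for n in range(1, count + 1)]


def voiced_spectrum(f0_hz, resonances_hz, count=30):
    """Source harmonics after shaping by the vocal tract filter."""
    return [
        (freq, amp * tract_gain(freq, resonances_hz))
        for freq, amp in source_harmonics(f0_hz, count)
    ]


# Assumed, approximate resonance frequencies in Hz (illustrative only).
EE = (270.0, 2290.0)
OO = (300.0, 870.0)

ee = voiced_spectrum(115.0, EE)  # same 115 Hz source for both vowels
oo = voiced_spectrum(115.0, OO)

freq, ee_amp = ee[19]            # 20th harmonic, at 2300 Hz
_, oo_amp = oo[19]
print(f"At {freq:.0f} Hz, 'ee' is about {ee_amp / oo_amp:.0f} times stronger than 'oo'")
```

The source is identical in both cases. Only the filter changes, and that alone decides which harmonics come through strongly.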
Certain parts of the vocal tract have specific effects. Narrowing the space just above the vocal folds (near structures called the aryepiglottic folds) boosts frequencies in the 2,000 to 4,000 Hz range, giving the voice a bright, carrying quality. This is the “ring” that trained singers develop to project over an orchestra. Constrictions in the mouth or at the lips tend to weaken those same frequencies, producing a duller or more muffled sound.
Why Every Voice Sounds Different
Your voice is unique because no one else has your exact combination of vocal fold size, throat length, mouth cavity dimensions, and nasal passages. Larger throat and mouth cavities produce a darker, richer tone. Smaller cavities produce a brighter, higher sound. This is why body size loosely correlates with voice quality: a very tall person tends to have a longer vocal tract and larger resonating spaces, while a smaller person has a shorter tract that emphasizes higher frequencies.
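The link between tract length and tone can be approximated by treating the vocal tract as a uniform tube, closed at the vocal folds and open at the lips. Such a tube resonates at odd multiples of the speed of sound divided by four times its length. This is a deliberate simplification, since a real vocal tract is neither uniform nor fixed in shape. The two lengths and the speed of sound below are assumed round values, not figures from the text.

```python
SPEED_OF_SOUND = 350.0  # m/s, approximate value for warm, humid air (assumption)


def tube_resonances(length_m, count=3):
    """First few resonances (Hz) of a uniform tube closed at one end, open at the other."""
    return [(2 * n - 1) * SPEED_OF_SOUND / (4.0 * length_m) for n in range(1, count + 1)]


# Illustrative tract lengths (assumptions): a longer and a shorter adult vocal tract.
for label, length_m in [("longer tract (17.5 cm)", 0.175), ("shorter tract (14 cm)", 0.14)]:
    formatted = ", ".join(f"{freq:.0f} Hz" for freq in tube_resonances(length_m))
    print(f"{label}: {formatted}")
```

Every resonance of the shorter tube sits higher than the matching resonance of the longer one, which is the pattern behind a smaller speaker's brighter-sounding voice.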
Vocal fold thickness and length also play a role. Longer, thicker folds vibrate more slowly, producing a lower fundamental pitch. But the shape of your resonating spaces matters just as much as pitch in determining how you sound. Two people can speak at the same pitch and still sound completely different because their vocal tracts filter the sound differently. It’s the same reason a trumpet and a clarinet can play the same note but sound nothing alike.
How Puberty Changes the Voice
During puberty, hormones (primarily testosterone) cause the larynx to grow and the vocal folds to lengthen. In boys, this change is dramatic. Research tracking boys through puberty found a gradual increase in vocal fold length across developmental stages, with a corresponding drop in pitch. But the most noticeable voice change, the “cracking” period, doesn’t line up neatly with fold length. The sharpest drop in pitch happens between mid and late puberty, and it appears to be driven more by changes in the mass and internal structure of the folds than by length alone.
Girls experience a more modest version of the same process. Their vocal folds lengthen less, and the pitch drop is smaller, which is why it often goes unnoticed. By adulthood, men’s vocal folds are typically longer and thicker than women’s, which is the primary reason for the average pitch difference between male and female voices.
How Hydration Affects Your Voice
The gel-like surface layer of your vocal folds needs moisture to vibrate efficiently. When you’re dehydrated, your vocal folds become stiffer, and it takes more air pressure to get them moving. Severe dehydration can increase the effort needed to start vibrating by about 23%, while also raising pitch and reducing vocal efficiency. The effect is most noticeable at higher pitches, where the folds need to vibrate fastest.
Surface-level hydration (breathing in steam or using nebulized saltwater) can reduce the effort needed for phonation by 9% to 22% within 30 minutes. Systemic dehydration, from not drinking enough water, takes much longer to affect the voice (5 to 12 hours) and also takes longer to reverse. For everyday purposes, staying generally well-hydrated keeps the vocal folds supple and reduces strain, especially if you use your voice heavily for work or singing.
Volume and Projection
Getting louder involves two things: building more air pressure beneath the vocal folds, and closing the folds more firmly so they stay together longer during each vibration cycle. The muscles that press the folds together (called adductors) are key here. Stronger closure means the folds resist the air pressure longer before popping open, creating a more forceful burst of sound with each cycle. Whispering, by contrast, involves keeping the folds only loosely together so air leaks through continuously without clean vibration.
Projection, the ability to be heard at a distance without shouting, comes largely from vocal tract tuning. Adjusting the spaces in your throat to boost frequencies in the 2,000 to 4,000 Hz range, where human hearing is most sensitive, lets a voice cut through background noise. This is a technique, not just anatomy, which is why trained speakers and singers can fill a room without straining.

