How Do Humans Make Sound? From Air to Speech

Humans produce sound by pushing air from the lungs through two small folds of tissue in the throat, causing them to vibrate rapidly. These vibrations create a raw buzzing tone that gets shaped into recognizable speech by the throat, mouth, nose, lips, and tongue. The whole process involves dozens of muscles coordinated by multiple brain regions, all working together in fractions of a second.

Air Pressure: The Engine Behind Your Voice

Every sound you make starts with a breath. Your diaphragm, a dome-shaped muscle beneath your lungs, contracts to draw air in. As it relaxes, the elastic recoil of the lungs, helped by the chest and abdominal muscles, pushes air upward through the windpipe toward the larynx (your voice box). This airstream is the fuel for your voice. Without sufficient air pressure building up below the vocal folds, no voiced sound can be produced.

During normal speech, the air pressure below the vocal folds typically ranges from 600 to 1,200 pascals. That’s a surprisingly small amount of pressure, roughly comparable to the pressure you’d feel blowing up a balloon. But it’s enough to set the vocal folds into motion and sustain vibration for as long as you keep exhaling. Louder speech requires more air pressure, which is why shouting makes you run out of breath faster.
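To get a feel for how small 600 to 1,200 pascals is, it can help to convert the figures into other units. The sketch below is only an illustration: the helper names are made up for this example, and the two conversion constants (the conventional pascal value of one centimeter of water and of one standard atmosphere) are not from the article.

```python
# Conversion constants (standard reference values, not from the article).
PA_PER_CM_H2O = 98.0665   # pascals in one centimeter of water
PA_PER_ATM = 101_325.0    # pascals in one standard atmosphere


def pa_to_cm_h2o(pressure_pa):
    """Convert a pressure in pascals to centimeters of water."""
    return pressure_pa / PA_PER_CM_H2O


def fraction_of_atmosphere(pressure_pa):
    """Express a pressure in pascals as a fraction of one atmosphere."""
    return pressure_pa / PA_PER_ATM


# The speech range quoted above: 600 to 1,200 pascals.
for pressure in (600, 1200):
    print(
        f"{pressure} Pa = {pa_to_cm_h2o(pressure):.1f} cmH2O "
        f"= {fraction_of_atmosphere(pressure):.2%} of 1 atm"
    )
# 600 Pa = 6.1 cmH2O = 0.59% of 1 atm
# 1200 Pa = 12.2 cmH2O = 1.18% of 1 atm
```

In other words, the pressure that drives ordinary speech is only around 1% of the pressure of the air around you.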

How the Vocal Folds Create Sound

The vocal folds (often called vocal cords) are two bands of tissue stretched across the inside of your larynx. When you’re breathing quietly, they stay open to let air pass freely. When you speak, muscles pull them together until they touch or nearly touch, closing off most or all of the airway.

As air pressure builds beneath the closed folds, it eventually becomes strong enough to push them apart. Air rushes through the narrow gap, and something interesting happens: the fast-moving air creates a drop in pressure between the folds (the Bernoulli effect, the same physics principle that helps airplane wings generate lift). This suction, combined with the natural springiness of the tissue, snaps the folds back together. Pressure builds again, the folds blow apart again, and the cycle repeats. This open-close-open-close pattern happens incredibly fast: in an average male voice, the folds vibrate about 140 times per second, and in an average female voice, roughly 230 times per second. Children’s vocal folds vibrate even faster.
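The vibration rates above translate directly into how long one open-close cycle lasts: the period is simply one divided by the frequency. A minimal sketch of that arithmetic, using the article's 140 and 230 cycles per second (the function name is just for illustration):

```python
def period_ms(frequency_hz):
    """Duration of one open-close cycle, in milliseconds."""
    return 1000.0 / frequency_hz


# Average vibration rates quoted in the text.
for label, frequency in (("average male voice", 140), ("average female voice", 230)):
    print(f"{label}: {frequency} Hz, one cycle every {period_ms(frequency):.2f} ms")
# average male voice: 140 Hz, one cycle every 7.14 ms
# average female voice: 230 Hz, one cycle every 4.35 ms
```

So each complete open-and-close cycle takes only a few thousandths of a second.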

The vibration isn’t as simple as two flaps swinging open and shut. Different parts of each fold move at slightly different times, creating a rippling wave that travels across the surface. This ripple effect is what keeps the vibration going. Without it, the folds would just blow open and stay open. The wave-like motion ensures that the folds are shaped differently when opening than when closing, which allows energy from the airstream to keep feeding into the vibration cycle after cycle.

How Your Throat, Mouth, and Nose Shape the Sound

The buzzing produced by the vocal folds alone sounds nothing like speech. It’s a raw, flat tone, similar to the noise you’d get from pressing your lips together and blowing. That tone needs to be filtered and amplified before it becomes your recognizable voice.

This is where your resonating chambers come in. After leaving the vocal folds, sound travels upward through the throat (pharynx), into the mouth (oral cavity), and potentially into the nasal passages. Each of these spaces acts like the body of a guitar, amplifying certain frequencies and damping others. The hard palate and teeth reflect acoustic energy, boosting the volume of sound. The size and shape of your throat and mouth determine which frequencies get emphasized, giving your voice its unique tone and character.
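One common textbook simplification makes the idea of "emphasized frequencies" concrete: treat the throat and mouth as a single straight tube, closed at the vocal folds and open at the lips. Such a tube resonates at odd multiples of the speed of sound divided by four times its length. The sketch below rests entirely on assumptions that are not in the article: the uniform-tube model itself, a tube length of 17.5 cm, and a speed of sound of 350 m/s in warm, humid air. A real vocal tract changes shape constantly, so treat the numbers as a rough illustration only.

```python
def tube_resonances(length_m, speed_of_sound=350.0, count=3):
    """Resonant frequencies (Hz) of a uniform tube closed at one end.

    Quarter-wavelength model: F_n = (2n - 1) * c / (4 * L).
    """
    return [
        (2 * n - 1) * speed_of_sound / (4 * length_m)
        for n in range(1, count + 1)
    ]


# Assumed tube length: 17.5 cm (an illustrative figure, not from the article).
for number, frequency in enumerate(tube_resonances(0.175), start=1):
    print(f"resonance {number}: about {frequency:.0f} Hz")
# resonance 1: about 500 Hz
# resonance 2: about 1500 Hz
# resonance 3: about 2500 Hz

# A shorter tube resonates at higher frequencies.
print([round(f) for f in tube_resonances(0.14)])
```

Changing the tube length shifts every resonance at once, which gives a rough sense of why differently sized throats and mouths emphasize different frequencies.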

Your soft palate, the fleshy area at the back of the roof of your mouth, acts as a gate between the oral and nasal cavities. When it lifts, it seals off the nasal passage, directing all sound through your mouth. When it lowers, some sound energy flows through your nose, which is what gives sounds like “m,” “n,” and “ng” their distinctive humming quality. If you pinch your nose while saying “mama,” you’ll immediately feel the difference.

Turning Sound Into Speech

Resonance gives your voice its tone, but articulation gives it meaning. Your tongue, lips, teeth, jaw, and palate work together to carve the continuous stream of sound into distinct vowels and consonants.

Vowels are the simplest to produce. They require a relatively open vocal tract with no major obstructions. What distinguishes one vowel from another is the position of your tongue (how high or low, how far forward or back) and whether your lips are rounded or spread. Say “ee” and then “oo” slowly, and you’ll feel your tongue shift position while your lips change shape.

Consonants involve creating some kind of obstruction in the airflow:

  • Complete closure: Sounds like “p,” “b,” and “m” are made by pressing both lips together, briefly blocking the airflow through the mouth before releasing it (for “m,” air keeps flowing out through the nose while the lips are closed). “T” and “d” work the same way as “p” and “b,” but with the tongue pressed against the ridge behind your upper teeth.
  • Narrow gap: Sounds like “f” and “v” are produced by pushing air through a tight space between your lower lip and upper teeth, creating turbulence. The “th” sounds in English use the tongue tip against the upper teeth.
  • Near contact: Sounds like “w,” “r,” “l,” and “y” are made by bringing two surfaces close together without creating friction. The air flows through smoothly, but the shape of the passage changes the sound.

Nearly every consonant involves an active part (usually the tongue or lower lip) moving toward a stationary part (usually the teeth, palate, or upper lip). The location and degree of that contact are what make a “k” sound different from a “t” sound, even though both involve a complete momentary blockage of air.
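The active-part, stationary-part, and degree-of-contact description above can be written down as a small lookup table. This is only a sketch: the class and field names are invented for illustration, only a handful of the consonants mentioned are included, and the placement of “k” (back of the tongue against the soft palate) is a standard description that the article itself does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consonant:
    symbol: str
    active: str    # the part that moves
    passive: str   # the stationary part it moves toward
    degree: str    # "complete closure" or "narrow gap"


CONSONANTS = [
    Consonant("p", "lower lip", "upper lip", "complete closure"),
    Consonant("b", "lower lip", "upper lip", "complete closure"),
    Consonant("t", "tongue", "ridge behind upper teeth", "complete closure"),
    Consonant("d", "tongue", "ridge behind upper teeth", "complete closure"),
    # Assumption: "k" placement is not given in the article.
    Consonant("k", "tongue", "soft palate", "complete closure"),
    Consonant("f", "lower lip", "upper teeth", "narrow gap"),
    Consonant("v", "lower lip", "upper teeth", "narrow gap"),
]

BY_SYMBOL = {c.symbol: c for c in CONSONANTS}


def differs_by(first, second):
    """List the features on which two consonants differ."""
    a, b = BY_SYMBOL[first], BY_SYMBOL[second]
    return [
        name
        for name in ("active", "passive", "degree")
        if getattr(a, name) != getattr(b, name)
    ]


print(differs_by("k", "t"))  # ['passive']
print(differs_by("p", "f"))  # ['passive', 'degree']
```

In this simplified table, “k” and “t” share the same degree of contact and differ only in where the contact happens, while “p” and “f” differ in both location and degree.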

How You Control Pitch and Volume

Pitch and volume are controlled by two different mechanisms, though they often change together in natural speech.

Pitch is primarily determined by how tense your vocal folds are. A small muscle in the larynx (the cricothyroid) tilts one cartilage against another, stretching the vocal folds tighter. Greater tension causes faster vibration, which produces a higher pitch, much the way tightening a guitar string raises its note. This muscle works constantly during speech to create the rising and falling intonation patterns that convey meaning. It’s also the muscle singers rely on to hit specific notes, and they need to set the correct tension before they even start exhaling.
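The guitar-string comparison can be made concrete with the formula for an ideal stretched string, whose frequency is the square root of tension divided by mass per unit length, all divided by twice the string's length. The sketch below assumes that idealized string model, which real vocal fold tissue follows only loosely, and every number in it (length, tension, mass per unit length) is a hypothetical placeholder rather than a measured value.

```python
import math


def string_frequency(length_m, tension_n, mass_per_length_kg_m):
    """Fundamental frequency (Hz) of an ideal stretched string."""
    return math.sqrt(tension_n / mass_per_length_kg_m) / (2 * length_m)


# Hypothetical placeholder values, chosen only to show the trend.
LENGTH = 0.016           # meters
MASS_PER_LENGTH = 0.001  # kilograms per meter
BASE_TENSION = 0.05      # newtons

base = string_frequency(LENGTH, BASE_TENSION, MASS_PER_LENGTH)
for factor in (1, 2, 4):
    freq = string_frequency(LENGTH, BASE_TENSION * factor, MASS_PER_LENGTH)
    print(f"tension x{factor}: frequency x{freq / base:.2f}")
# tension x1: frequency x1.00
# tension x2: frequency x1.41
# tension x4: frequency x2.00
```

Under this model, frequency grows with the square root of tension, so quadrupling the tension is what it takes to raise the pitch by a full octave.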

Volume is controlled mainly by air pressure. Pushing more air through the vocal folds with greater force makes them vibrate with larger movements, producing a louder sound. The vocal folds also press together more firmly during loud speech, which changes the quality of the vibration and contributes to the perception of increased intensity.

The Brain’s Role in Coordinating It All

Speaking requires your brain to coordinate the muscles of your diaphragm, larynx, throat, tongue, lips, and jaw simultaneously, adjusting them in real time based on what you hear coming out. This coordination involves several brain regions working as a connected system.

The process starts in areas of the frontal lobe, particularly a region known as Broca’s area and the neighboring premotor cortex, which together form what researchers call the speech sound map. This map stores the motor plans for producing individual speech sounds. When you decide to say a word, these regions activate the correct sequence of muscle commands, which flow to the primary motor cortex and then down to the muscles themselves.

But the system doesn’t just fire commands blindly. Your brain constantly monitors the sound you’re producing through your ears and the physical sensations in your mouth and throat. The auditory cortex in the temporal lobe and the sensory cortex in the parietal lobe feed information back to the motor system, allowing rapid corrections. This feedback loop is why hearing loss can gradually change a person’s speech patterns, and why speaking with a numb mouth after a dental visit feels so clumsy. The cerebellum and other structures beneath the cortex also contribute, helping to fine-tune timing and smooth out movements.

Why Voices Change During Puberty

The dramatic voice changes of puberty, especially in males, come down to physical growth of the larynx. Hormones during puberty cause the laryngeal cartilages to enlarge, and in males this growth is disproportionately greater in the front-to-back direction. This elongates the vocal folds, and longer folds vibrate more slowly, producing a deeper voice. Adult male and female larynxes differ in size by 10 to 50% depending on the specific measurement, which accounts for the typical pitch difference between male and female voices. The visible “Adam’s apple” in many men is simply the front edge of the enlarged thyroid cartilage pressing against the skin of the neck.
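To put the typical pitch difference in musical terms, the average vibration rates given earlier in the article (about 140 times per second for male voices and 230 for female voices) can be converted into an interval in semitones, using the standard equal-temperament relation of 12 semitones per doubling of frequency. The function name is illustrative, and the result describes only those two averages, not any individual voice.

```python
import math


def semitones_between(lower_hz, higher_hz):
    """Musical interval between two frequencies, in equal-tempered semitones."""
    return 12 * math.log2(higher_hz / lower_hz)


# Average rates quoted earlier in the article.
interval = semitones_between(140, 230)
print(f"{interval:.1f} semitones")
# 8.6 semitones
```

That is a little over two thirds of an octave, since a full octave is 12 semitones.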