How Do Babies Talk? The Science of Speech Development

Babies learn to talk through a gradual process that starts long before their first word. It begins with cries and coos in the first weeks of life, progresses through months of babbling, and typically reaches recognizable words around the first birthday. The journey involves physical changes in the throat and tongue, rapid brain development, and constant social interaction with caregivers.

What Happens in the Body

A newborn’s vocal equipment looks nothing like an adult’s. The larynx (voice box) sits high in the throat, tucked between the first and fourth neck vertebrae, and the vocal folds are only about 7 millimeters long. Over the first six months, the larynx begins descending, creating more space in the throat for shaping different sounds. This descent continues for years, reaching an adult position only around age 12.

The internal structure of the vocal folds changes too. At birth, they have a single uniform layer of tissue. By about two months, a second layer forms. By five months, that two-layer structure is more developed. A full three-layer architecture, the kind adults use to produce rich, controlled speech, doesn’t appear until around age seven.

The tongue undergoes its own transformation. A newborn’s tongue mostly moves forward and back, like a piston. It has less fat and soft tissue than an adult tongue, with relatively large muscles connecting it to surrounding structures. As the baby grows, the tongue gains the ability to move up and down and to curl its tip independently. These finer movements are essential for producing consonants like “t,” “d,” and “l.” The front of the tongue develops faster-contracting muscle fibers suited for quick, precise movements, while the back of the tongue maintains slower fibers that provide stability.

The Timeline From Coos to Sentences

In the first two months, babies are limited to crying and vegetative sounds like burps and hiccups. Between two and four months, cooing begins. These soft vowel-like sounds (“ooh,” “aah”) are a baby’s first experiments with using their voice on purpose.

Around six to seven months, most babies start canonical babbling, the repetitive consonant-vowel chains like “bababa” or “mamama.” This is a major shift because it means the baby is coordinating their lips, tongue, and voice in a rhythmic pattern that mirrors the structure of real speech. Between seven months and one year, babbling becomes more varied, mixing different syllable types together (“bagido”), and babies begin to understand words for common objects like “cup,” “shoe,” or “juice” well before they can say them.

Most children produce one or two recognizable words by their first birthday, typically “hi,” “dog,” “dada,” or “mama.” Between 12 and 18 months, new words accumulate steadily. By 18 months, most children can say 50 to 100 words. Between one and two years, toddlers begin combining words into simple phrases (“more cookie,” “where kitty?”) and start asking two-word questions (“go bye-bye?”).

Understanding Comes Before Speaking

One of the most important things to know is that babies understand far more than they can say. A one-year-old who only says two words may already recognize dozens. This gap between comprehension and production is completely normal and persists throughout early childhood. Before they can form words, babies communicate through gestures like pointing, reaching, and waving. These gestures are not just cute; they’re an active part of language development and a sign that a child is building the cognitive framework for speech.

How the Brain Wires Itself for Language

Brain imaging of toddlers as young as 19 months shows that dedicated language-processing areas are already active in the left side of the brain. Both frontal regions (involved in producing language) and temporal regions (involved in understanding it) respond more strongly to comprehensible speech than to scrambled or backward audio. This was somewhat surprising to researchers, who had previously hypothesized that the temporal (understanding) areas would develop first and frontal (production) areas would follow later. Instead, the frontal component appears to be in place early, suggesting the brain’s language network is organized in a broadly adult-like pattern well before a child can speak in full sentences.

Why Baby Talk From Parents Actually Works

That high-pitched, sing-songy way adults naturally speak to babies, sometimes called parentese, is more than instinct. It has specific acoustic features that help infants learn language. Compared to normal adult conversation, parentese uses higher overall pitch, a wider pitch range, slower speech rate, exaggerated intonation, shorter sentences, and longer pauses between phrases. Studies across cultures (including the U.S., Sweden, and Russia) show that mothers speaking parentese produce more extreme vowel sounds, essentially stretching out the differences between vowels so babies can better distinguish the building blocks of words.

These exaggerated features serve concrete purposes. The slower pace and pauses help babies identify where one word ends and another begins. Exaggerated pitch peaks at the ends of sentences highlight important words. The simpler vocabulary and shorter sentences give infants a manageable amount of information to process. Research has shown that parentese helps babies with word segmentation, word recognition, and learning new word meanings. A longitudinal study published in the Journal of Child Language found that the amount of parentese infants heard predicted their language complexity and conversational ability at age five.

Bilingual Babies Stay on Track

Parents raising children with two languages sometimes worry that hearing both will cause delays. Research consistently shows this isn’t the case. Bilingual children reach their first words around 12 months, just like monolingual children. By 18 months, they typically produce 50 to 100 words, though that total may include words from both languages. They fall within the normal developmental range in at least one of their languages. A bilingual toddler who says “water” in one language and “leche” in the other isn’t behind; they’re building two vocabularies simultaneously.

Signs of a Speech Delay

Because the timeline for speech varies from child to child, it helps to know which specific signs are genuine red flags rather than normal variation:

  • By 12 months: not using gestures like pointing or waving bye-bye
  • By 18 months: preferring gestures over sounds to communicate, or having trouble imitating sounds
  • By 2 years: only imitating speech rather than producing words spontaneously, repeating only a few sounds or words, being unable to follow simple directions, or having an unusual voice quality (raspy or nasal)

A baby who doesn’t respond to sound or vocalize at all should be evaluated promptly. As a general benchmark for clarity, parents and regular caregivers should be able to understand about 50% of a child’s speech at age two and 75% at age three. By age four, most of what a child says should be understandable, even to people who don’t know the child well.