What Is Connectionism? Neural Networks and the Brain

Connectionism is a theory in cognitive science that explains how the mind works using networks of simple, interconnected units modeled loosely on neurons in the brain. Rather than thinking of the brain as a computer that follows step-by-step rules, connectionism proposes that intelligence emerges from millions of small processing units working simultaneously, with the strength of connections between them determining what the network “knows.”

How Connectionist Networks Work

A connectionist network is built from three basic ingredients: units (which stand in for neurons), connections between those units, and numerical weights that control how strongly one unit influences another. Information flows through the network in layers. Input units receive raw data, such as the pixels of an image or the sounds in a word. Those activation values pass forward to a middle layer of “hidden” units, which combine and transform the signals. Finally, the result reaches output units that produce the network’s response.
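The forward flow described above can be sketched in a few lines. This is a toy illustration, not any particular published model: the layer sizes and hand-picked weights are arbitrary, and the sigmoid squashing function is one common choice for a unit's activation.

```python
import math

def forward(inputs, w_hidden, w_output):
    """One forward pass: input units -> hidden units -> output units.

    w_hidden[j][i] is the weight from input unit i to hidden unit j;
    w_output[k][j] is the weight from hidden unit j to output unit k.
    """
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    # Each hidden unit sums its weighted inputs, then squashes the result.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
              for row in w_hidden]
    # Output units combine the hidden activations the same way.
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
            for row in w_output]

# A tiny 2-input, 2-hidden, 1-output network with hand-picked weights.
out = forward([1.0, 0.0],
              w_hidden=[[2.0, -1.0], [-1.0, 2.0]],
              w_output=[[1.5, 1.5]])
```

The "knowledge" here is entirely in the weight values; change them and the same architecture computes something different.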

The key insight is that knowledge isn’t stored in any single unit. Instead, it lives in the pattern of connection weights spread across the entire network. This is called a distributed representation. When you recognize your mother’s face, no single neuron fires to say “that’s Mom.” A whole constellation of units activates in a specific pattern, and that pattern is the representation. Distributed representations are actually the natural product of how connectionist networks learn, not something engineers have to design in deliberately.

This architecture makes connectionist networks especially good at problems that require juggling many competing constraints at the same time. Reading a handwritten word, for example, means simultaneously weighing letter shapes, spacing, context, and expectations about what word makes sense in a sentence. All of those constraints get resolved in parallel rather than checked one by one.

How the Network Learns

Connectionist networks learn by adjusting connection weights, not by being programmed with explicit rules. The most influential learning method is called backpropagation. It works like this: the network makes a guess, compares that guess to the correct answer, and then sends an error signal backward through the layers. Each connection weight gets nudged slightly in whichever direction reduces the error. Repeat this thousands or millions of times with different examples, and the network gradually becomes accurate.
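The guess-compare-nudge loop can be shown end to end on XOR, a classic problem that a network cannot solve without a hidden layer. This is a minimal from-scratch sketch with assumed sizes (4 hidden units) and an assumed learning rate, not a production implementation:

```python
import math, random

random.seed(42)
sig = lambda x: 1.0 / (1.0 + math.exp(-x))

# A 2-input, 4-hidden, 1-output network; the extra weight per unit is a bias.
HIDDEN = 4
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(HIDDEN)]
w_o = [random.uniform(-1, 1) for _ in range(HIDDEN + 1)]

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR
lr = 1.0

def predict(x):
    xb = x + [1]  # append the constant bias input
    hidden = [sig(sum(w * v for w, v in zip(row, xb))) for row in w_h]
    return sig(sum(w * v for w, v in zip(w_o, hidden + [1]))), hidden

for _ in range(30000):
    for x, target in data:
        out, hidden = predict(x)
        # 1. Compare the guess to the correct answer: the output error signal.
        delta_o = (target - out) * out * (1 - out)
        # 2. Send the error backward: each hidden unit's share of the blame.
        delta_h = [delta_o * w_o[j] * hidden[j] * (1 - hidden[j])
                   for j in range(HIDDEN)]
        # 3. Nudge every weight in the direction that reduces the error.
        hb = hidden + [1]
        for j in range(HIDDEN + 1):
            w_o[j] += lr * delta_o * hb[j]
        xb = x + [1]
        for j in range(HIDDEN):
            for i in range(3):
                w_h[j][i] += lr * delta_h[j] * xb[i]
```

After training, the network's outputs round to the correct XOR answers, even though no rule for XOR was ever programmed in.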

The underlying mathematics dates to the early 1960s in control theory, but the technique didn’t become widely known until 1986, when David Rumelhart, Geoffrey Hinton, and Ronald Williams showed it could train networks with hidden layers. Before that, researchers had no reliable way to adjust the internal connections of a multi-layer network, which severely limited what neural networks could do. Backpropagation solved what’s known as the credit-assignment problem: figuring out which specific connection, buried deep in the network, deserves blame for an error at the output.

This learning process has a rough biological parallel. In the brain, the strength of a synapse between two neurons changes based on the activity of the sending neuron, the receiving neuron, and a chemical error signal like dopamine. Connectionist learning rules mirror this three-way interaction: the update to a connection weight depends on the activity of the unit before the connection, the unit after it, and the error signal flowing back through the network.
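The three-way interaction can be written as a single update rule. The function below is a schematic sketch of that idea for a sigmoid unit (where the receiving unit's contribution appears as its sensitivity, post × (1 − post)); the learning rate and activity values are illustrative, not from any specific model:

```python
def three_factor_update(weight, pre_activity, post_activity, error_signal, lr=0.1):
    """Weight change depends on three factors: the sending unit's activity,
    the receiving unit's activity (via its sigmoid sensitivity), and the
    error signal arriving from later layers."""
    return weight + lr * error_signal * post_activity * (1 - post_activity) * pre_activity

# A strongly active sender paired with a large error produces a clear nudge...
big = three_factor_update(0.0, pre_activity=1.0, post_activity=0.5, error_signal=1.0)
# ...while a silent sender leaves its connection untouched, however big the error.
none = three_factor_update(0.0, pre_activity=0.0, post_activity=0.5, error_signal=1.0)
```

The second case captures the biological intuition: a synapse whose sending neuron never fired contributed nothing to the mistake, so it has nothing to learn from it.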

The 1986 Breakthrough

Connectionism as a movement crystallized around the 1986 publication of “Parallel Distributed Processing” by David Rumelhart, James McClelland, and the PDP Research Group, published by MIT Press. These two volumes argued that the massively parallel architecture of the human brain, not step-by-step symbol manipulation, holds the key to understanding intelligence. The work was influential enough that Rumelhart and McClelland received the 2002 Grawemeyer Award for Psychology for their contributions.

The PDP volumes covered everything from how networks could learn past tenses of English verbs to how they could store and retrieve memories. Geoffrey Hinton, a co-author on several chapters, went on to become one of the central figures in deep learning, the direct descendant of connectionist research that now powers modern AI applications such as image recognition, language translation, and large language models.

Connectionism vs. Symbolic AI

Before connectionism gained traction, the dominant view in cognitive science and artificial intelligence was symbolic, sometimes called “Good Old-Fashioned AI.” Symbolic systems represent knowledge as explicit rules and structured symbols, then manipulate those symbols using logical operations. Think of a chess program that stores the rules of chess and searches through possible moves, or an expert system that chains together if-then statements to diagnose a disease.

Connectionist systems take the opposite approach. Instead of programming rules from the top down, they learn patterns from the bottom up by training on data. A symbolic system is transparent: you can inspect its rules and trace exactly why it made a decision. A connectionist system is opaque: the knowledge is smeared across thousands of connection weights with no easy way to extract a simple explanation. This tradeoff between power and interpretability remains one of the central tensions in AI. Modern large language models, which are direct descendants of connectionist architecture, are remarkably capable but notoriously difficult to explain.

The two approaches also fail in very different ways. Damage a single component in a traditional computer program, and the whole thing can crash. Connectionist networks degrade gracefully: remove some units or feed in noisy input, and the network’s output becomes slightly less accurate rather than completely wrong. This mirrors what happens in the brain, where small amounts of damage typically cause subtle deficits rather than total system failure.

Modeling Brain Damage

One of the most practical uses of connectionist models in cognitive science is simulating what happens when the brain is injured. Researchers build a network that performs some cognitive task (reading aloud, naming objects, recognizing faces), train it until it works well, and then deliberately damage it by removing units or weakening connections. This “artificial lesioning” lets them test whether the resulting errors match what clinicians see in real patients with brain damage.
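The lesioning procedure can be illustrated with a toy network whose answer is deliberately spread across many redundant hidden units. Everything here is invented for illustration (the unit count, the weight values, the input); the point is only the method: silence units, re-test, and watch the output degrade gradually rather than collapse.

```python
import math

sig = lambda x: 1.0 / (1.0 + math.exp(-x))

def network_output(x, hidden_w, output_w):
    """Tiny one-input network whose answer is distributed over many hidden units."""
    hidden = [sig(w * x) for w in hidden_w]
    return sig(sum(wo * h for wo, h in zip(output_w, hidden)))

# 20 redundant hidden units, each carrying a small share of the "knowledge".
n = 20
hidden_w = [4.0] * n
output_w = [0.4] * n

intact = network_output(1.0, hidden_w, output_w)

# Artificial lesioning: silence progressively more hidden units and re-test.
damaged = []
for k in (2, 5, 10):
    lesioned = [0.0] * k + output_w[k:]
    damaged.append(network_output(1.0, hidden_w, lesioned))
# The output weakens a little with each lesion instead of failing outright.
```

Because no single unit carries the answer, knocking out a few of them blurs the response rather than deleting it, which is the graceful-degradation pattern the text describes.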

These simulations have been used to study conditions like aphasia (language impairment after stroke) and reading disorders. They’ve also helped researchers evaluate some long-standing assumptions in neuropsychology, particularly around double dissociations, where two patients show opposite patterns of impairment. Connectionist lesion studies have shown that these patterns don’t always require separate brain modules, as traditionally assumed. Sometimes a single network damaged in different ways can produce both patterns.

Why Connectionism Still Matters

Connectionism reshaped how scientists think about the mind. Before the PDP revolution, the brain was widely treated as a biological computer running something like software. Connectionism offered a different metaphor: intelligence as the emergent behavior of vast numbers of simple units learning from experience. You don’t program a connectionist network to recognize a cat; you show it thousands of cats and let the connection weights self-organize until the network figures it out.

The deep learning systems that dominate AI today are scaled-up versions of the networks connectionist researchers were building in the 1980s. They use the same principles: layers of units, weighted connections, backpropagation, and distributed representations. The difference is scale. Where early connectionist models had dozens or hundreds of units, modern systems have billions. The core ideas have proven remarkably durable: that knowledge lives in connection strengths, that learning means adjusting those strengths, and that intelligence can emerge from simple components working in parallel.