Simpson's Paradox I See Around Me

By
Compress 20260627 201325 5812

The ceiling fan in my study has three speeds, and I have spent the better part of a humid June afternoon trying to determine whether the slowest setting actually moves more air than the fastest, or whether the illusion of stillness at high speed is itself a kind of statistical trick played by the eye upon the mind. The blades blur into a grey disc, and the room feels no cooler. I switch to the lowest setting and the individual blades become visible again, each one carving a deliberate arc through the thick Calcutta air, and somehow—impossibly—the breeze seems more present, more real, more there. This is not physics. This is Simpson’s paradox made domestic.

I have worked as a statistician but really I am a software engineer of middling ambition who lives in a shabby flat in South Calcutta, where the monsoon has been tardy this year and the municipal water supply runs for exactly forty-seven minutes each morning, a duration I have timed with the obsessive precision of a man who has too much time and too little faith. But Simpson’s paradox has lodged itself in my consciousness like a splinter I cannot extract, and I see it now in everything: in the aggregate data of my electricity bill, in the demographic shifts of my neighborhood, in the way the city itself seems to improve and deteriorate simultaneously depending on which lens one applies.

The paradox, for those fortunate enough to have avoided it, is deceptively simple. A trend appears in several different groups of data but disappears or reverses when these groups are combined. The classic example, dredged up from every undergraduate statistics course, involves a university’s admission rates: Department A admits sixty percent of male applicants and fifty percent of female applicants; Department B admits twenty percent of males and twenty-five percent of females. In each department, women fare better or comparably. Yet when the numbers are aggregated, men appear to have higher overall admission rates because they apply disproportionately to the less selective Department A. The whole tells a lie that the parts refuse to corroborate. It is not a trick of mathematics so much as a revelation of how aggregation can obscure the very truths it purports to summarize.

I see this everywhere now. I see it in the newspaper I read this morning, which reported that India’s GDP growth remains robust while simultaneously noting that youth unemployment has reached levels not seen in decades. Both statements are true. Neither contradicts the other. But placed side by side, they produce a cognitive dissonance that no amount of economic jargon can resolve. The aggregate number glows; the disaggregated reality festers. I see it in the tech news that OpenAI has deferred the public rollout of GPT-5.6 because the United States government seeks early access to frontier AI models—a phrase that sounds like bureaucratic poetry until you realize it means that the most powerful language models are being held in abeyance not by technical limitation but by geopolitical anxiety, and that the same models which promise to transform productivity at work are the very ones causing Chinese office workers to wonder whether their jobs can be done by artificial intelligence, whether their companies will make the switch, whether the bar is being raised or the floor is being removed. The aggregate narrative is one of technological triumph. The disaggregated experience is one of quiet dread.

This morning I walked to the market to buy vegetables, and the vendor—a man named Rafiq who has occupied the same corner of Lansdowne Market for twenty-three years—told me that his business has never been better. His aggregate revenue is up. His aggregate customer count is stable. But when I pressed him, he admitted that the kind of customer has changed. The middle-class families who once bought two kilograms of tomatoes have been replaced by delivery app aggregators who purchase in bulk at negotiated rates, leaving him with thinner margins and the hollow satisfaction of higher transaction volume. His business looks healthier in the spreadsheet than it feels in his hands. Simpson’s paradox, again: the subgroups tell a story of decline that the total obscures.

The city itself is a living demonstration. Calcutta—Kolkata, if you prefer the official designation, though I find the old name more honest in its colonial exhaustion—has been declared a “smart city” by successive governments. The aggregate infrastructure spending is impressive. The aggregate number of Wi-Fi hotspots has multiplied. Yet walk through any of the older neighborhoods, through the lanes of Bhawanipore or the narrow streets of Kalighat, and the disaggregated picture is one of crumbling plaster, of open drains, of electrical wires tangled in configurations that would give an insurance adjuster a stroke. The smart city exists in the aggregate report. The lived city exists in the particular street, the particular building, the particular moment when the power fails during a heatwave and you remember that the French are currently experiencing record temperatures of forty point nine degrees Celsius and booking hotel rooms for pool access, and you wonder whether the global conversation about climate adaptation is itself an aggregate fiction that dissolves when you try to locate yourself within it.

I think about this while reading the news of Venezuela, where earthquakes have killed nine hundred and twenty people and tens of thousands remain missing, and where humanitarian aid is being organized in Bogotá by volunteers who unload donated goods from cars marked “For Venezuela.” The aggregate number—nine hundred and twenty dead—is already being absorbed into the flow of information, already becoming a data point in the larger narrative of natural disasters in 2026. But the disaggregated reality is a mother searching through rubble in La Guaira, a rescue worker carrying a body recovered from a collapsed building, a passerby trying to photograph the damage on a tower in Beijing while the world moves on to the next headline. The aggregate allows us to feel informed. The particular forbids us that comfort.

And then there is the war, which has become background noise in the way that all distant catastrophes eventually do. The United States struck Iran on Friday in response to a drone attack on a cargo ship in the Strait of Hormuz, and Vice-President J.D. Vance has declared that “violence will be met with violence,” which has the ring of a tautology dressed up as policy. The aggregate narrative is one of ceasefire violations and retaliatory strikes, of geopolitical chess played with human pieces. The disaggregated reality is four million Iranians plunged into poverty, two hundred and seventy billion dollars in damage, a nation trying to survive the aftermath of three months of war while the rest of the world debates whether the ceasefire was ever real to begin with. Bahrain claims Iran attacked a navy base. The United States claims it adhered to the ceasefire. Both claims can be true within their respective subgroups. The aggregate truth is that people are dying and the mathematics of justification continue to add up in ways that favor the powerful.

I am aware, even as I write this, that I am performing a kind of aggregation myself. I am taking the particular miseries of my city, my neighborhood, my ceiling fan, and weaving them into a pattern that pretends to coherence. This is the essayist’s temptation and the statistician’s trap. The particular resists pattern. The pattern betrays the particular. And somewhere in the space between them lies whatever truth we are capable of apprehending.

I think about this when I consider my own work. I build software for a living, or rather I orchestrate AI agents to build software, which is the current euphemism for a profession that has become increasingly abstract. The aggregate productivity metrics look excellent. The disaggregated experience is one of watching language models generate code that I then review, modify, and occasionally despair over, wondering whether I am becoming a curator of machine output rather than a creator of anything genuinely new. The researcher’s prize of one hundred thousand dollars for decoding zebra finch calls using AI seems simultaneously admirable and faintly absurd—a triumph of pattern recognition applied to creatures whose calls were never meant to be decoded, only heard. We are very good at finding patterns. We are less good at knowing what to do with the loneliness that remains when the pattern is found.

The monsoon clouds have been gathering over the Bay of Bengal for days now, heavy and grey and full of withheld promise. The meteorological department issues aggregate forecasts: probability of precipitation, expected rainfall in millimeters, district-wise warnings. But the rain falls on particular rooftops, particular streets, particular people waiting at particular bus stops. The aggregate forecast cannot tell you whether the water will seep through your ceiling at three in the morning, whether the drain outside your building will clog, whether the forty-seven minutes of municipal water supply will be extended or further reduced. These are the subgroups that the weather report cannot address, the disaggregated experiences that make up a life.

I have been reading about the Yarlung Tsangpo hydropower project, which China expects to generate roughly three times as much power as the Three Gorges Dam. The aggregate numbers are staggering: megawatts, cubic meters, economic impact. The disaggregated reality is a river being re-engineered, ecosystems being reconfigured, downstream nations watching with the particular anxiety of those who know that water does not respect borders. China also begins building a one-billion-dollar hydropower station in Cambodia, and I wonder whether the aggregate benefit to Cambodia’s energy grid will outweigh the particular costs to the communities displaced by the reservoir. The spreadsheet says yes. The lived experience may say otherwise. Simpson’s paradox does not require malice. It requires only the innocent act of adding columns together.

My neighbor, an elderly woman who has lived in this building since before I was born, told me yesterday that the city was better in her youth. I asked her what she meant, and she said that people were kinder, that the streets were cleaner, that there was a sense of community that has since evaporated. I do not know whether this is true in the aggregate. The crime statistics suggest otherwise; the literacy rates suggest otherwise; the life expectancy figures suggest otherwise. But in the particular—the particular of her memory, the particular of her daily walk to the temple, the particular of her interactions with the vegetable vendor and the milkman and the electrician who overcharges her—her experience is unassailable. The aggregate improvement does not invalidate her particular loss. The particular loss does not invalidate the aggregate improvement. They coexist, irreconcilable, like two versions of the same city occupying the same space without ever quite touching.

This is what Simpson’s paradox teaches, if it teaches anything: that truth is not a single number but a constellation of numbers, that the act of summation is also an act of erasure, that the whole is not merely different from the sum of its parts but often actively misleading about the nature of those parts. It is not a counsel of despair. It is a warning against the easy comfort of averages, the seductive clarity of headlines, the assumption that what is true in general must therefore be true in particular.

The ceiling fan continues its rotation. I have given up trying to determine which speed is objectively superior. The question itself is malformed, a category error born of the assumption that comfort can be quantified. Outside, the Calcutta evening settles into its familiar rhythms: the call of the vegetable seller, the distant honking of buses, the particular sound of a city that refuses to be reduced to any single narrative. Somewhere, a language model is being trained on text that includes this sentence. Somewhere, a war continues because the aggregate logic of retaliation outpaced the particular logic of stopping. Somewhere, a man in La Guaira searches through rubble while the world moves on.

I do not know how to end this. There is no neat lesson, no statistical moral, no comforting resolution. The fan turns. The clouds gather. The particular persists, stubborn and irreducible, while the aggregate hums its smooth, misleading song. I will go downstairs now to buy tea from the stall on the corner, where the vendor knows my order without asking and where the price has not changed in three years, a small particularity that no spreadsheet could value and no paradox could dissolve.

P.S. The etymology of “paradox” traces to the Greek paradoxon, meaning “contrary to expectation.” Edward H. Simpson described the statistical phenomenon in 1951, though Karl Pearson had noted similar effects half a century earlier. The paradox is not a flaw in mathematics but a feature of reality: our categories are always provisional, our aggregations always incomplete, and the truth—if there is such a thing—lives in the spaces between the numbers we are so eager to sum.