Hollerith punch card

Episode Summary

The podcast episode is titled "Hollerith Punch Card". It tells the story of Herman Hollerith, a young German American inventor who, in the 1880s, designed a machine to process census data more quickly than humans could. The US government conducted a census every 10 years. In 1880, it asked 215 questions, far more than in previous censuses. It soon became clear that tabulating all the answers would take years, so the count would barely be finished before the next census was due. Hollerith realized the problem could be solved using punched cards, which had already been used to control machines like the Jacquard loom. He designed a machine that used spring-loaded pins to detect holes punched in census cards. Where a pin found a hole, it completed an electrical circuit and a dial ticked up by one. This allowed census data to be tabulated far faster than by hand. The government rented Hollerith's machines for the 1890 census, saving millions of dollars and years of time. The machines also made it easier to interrogate the census data. Punched cards were soon adopted by businesses like insurance companies and railways for bookkeeping and logistics. Hollerith's company, which through mergers eventually became IBM, became very successful selling tabulating machines. However, it took another century for the true data economy to emerge, fueled by the vast trails of personal data left behind whenever we use our phones or other connected devices. This allows companies like Google and Amazon to approach the bureaucrats' old dream of near omniscience.

Episode Show Notes

Data is a hugely profitable commodity - if you know how to process it. Tim Harford tells the story of Herman Hollerith, and how his 19th-century machine for processing census data laid the foundations for some of the world's most valuable companies.

Episode Transcript

SPEAKER_00: Amazing, fascinating stories of inventions, ideas and innovations. Yes, this is the podcast about the things that have helped to shape our lives.
SPEAKER_02: 50 Things That Made the Modern Economy, with Tim Harford.
SPEAKER_03: Amazon, Alphabet, Alibaba, Facebook, Tencent. Five of the world's ten most valuable companies by the summer of 2019. All under 25 years old. And all got rich, in their own ways, on data. No wonder it's become common to call data the new oil. As recently as 2011, five of the top ten were oil companies. Now only ExxonMobil clings on. The analogy isn't perfect. Data can be used many times. Oil only once. But data is like oil, in that the crude, unrefined stuff isn't much use to anyone. You have to process it to get something valuable. Diesel to put in an engine. Insights to inform a decision. Decisions such as which advert to insert in a social media timeline. Which search result to put at the top of the page. Imagine you were asked to make just one of those decisions. Someone is watching a video on YouTube, which is run by Google, which is owned by Alphabet. What to suggest she watches next? Pique her interest and YouTube gets to serve her another advert. Lose her attention and she'll click away. You have all the data you need. Look at all the other YouTube videos she's ever watched. What is she interested in? Now look at what other users have gone on to watch after this video. Weigh up the options. Calculate probabilities. If you choose wisely and she views another ad, well done.
You've earned Alphabet all of, maybe, 20 cents? Clearly, relying on humans to process data would be impossibly inefficient. These business models need machines. In the data economy, power comes not from data alone, but from the interplay of data and algorithm. In the 1880s, a young German American inventor tried to interest his family in a machine to process data more quickly than humans could manage. He had designed it, but now he needed money to test it. Picture something that looks a bit like an upright piano, but instead of keys, it has a slot for cards about the size of a dollar bill, with holes punched in them. Facing you are 40 dials, which may or may not tick upwards after you insert each new card. Herman Hollerith's family didn't get it. Far from rushing to invest, they laughed at him. Hollerith evidently did not forgive. He cut them off. His children were to grow up with no idea they had relatives on their father's side. Hollerith's invention responded to a very specific problem. Every 10 years, the US government conducted a census. That was nothing new. Governments through the ages have wanted to know who lives where and who owns what, to help raise taxes and find conscripts. But if you're going to send a small army of enumerators around the country, it must be tempting to ask about an ever wider range of things. What jobs do people do? Any illnesses or disabilities? What languages do they speak? Knowledge is power, as 19th-century bureaucrats understood just as well as 21st-century platform companies. Yet, with the 1880 census, the bureaucrats had swallowed more data than they could digest. In 1870, they'd asked just five questions. In 1880, they asked 215. It soon became clear that adding up the answers would take years. They'd barely have finished this census when it would be time to start the next one. A lucrative government contract surely awaited anyone who could speed the process up. Young Herman had worked on the 1880 census, so he understood the problem.
Herman had decided to seek his fortune by inventing a new kind of brake for trains. As it happened, a train journey helped him solve the census problem instead. Rail tickets were often stolen, so railway companies found an ingenious way to link them to the person who'd bought them. A punch photograph. Conductors used a hole punch to select from a range of physical descriptors. As Hollerith recalled, "light hair, dark eyes, large nose, etc." If a dark-haired, small-nosed scoundrel stole your ticket, he wouldn't get far. But after observing this system, Hollerith realized that people's answers to census questions could also be represented as holes in cards. That could solve the problem, because punched cards had been used to control machines since the early 1800s. The Jacquard loom wove patterned fabric based on them. All Hollerith needed to do was make a tabulating machine to add up the census punch cards he envisaged. In that piano-like contraption, a set of spring-loaded pins descended on the card. Where they found a hole, they completed an electrical circuit, which moved the appropriate dial up by one. Happily for Hollerith, the bureaucrats were more impressed than his family. They rented his machines to count the 1890 census, to which they'd added yet another 20 questions. Compared to the old system, Hollerith's machines proved years quicker and millions of dollars cheaper. More importantly, they made it easier to interrogate the data. Suppose you wanted to find people aged 40 to 45, married and working as a carpenter. No need to sift through 200 tonnes of paperwork. Just set up the machine and run the cards through it. Governments soon saw uses far beyond the census. Across the world, says historian Adam Tooze, bureaucrats were inspired to dream of omniscience. America's first social security benefits were disbursed through punched cards in the 1930s. The following decade, punched cards notoriously helped organise the Holocaust.
Businesses, too, were quick to see the potential. Insurers used punched cards for actuarial calculations, utilities for billing, railways for shipping, manufacturers to keep track of sales and costs. Hollerith's Tabulating Machine Company did a roaring trade. You may have heard of the firm that, through mergers, it eventually became: IBM. It remained a market leader as punched cards gave way to magnetic storage and tabulating machines to programmable computers. It was still on the list of the world's 10 biggest companies a few years ago. But if the power of data was apparent to Hollerith's customers, why did the data economy take another century to arrive? Because there's something new about the kind of data that's now being compared to oil. The likes of Google and Amazon don't need an army of enumerators to collect it. We trail it behind us every time we use our phones or ask Alexa to turn the light on. This kind of data is not as neatly structured as the predefined answers to census questions, precision-punched into Hollerith's cards. That makes it harder to make sense of, but there's unimaginably more of it. And as algorithms improve and more of our lives are lived online, the bureaucratic dream is fast becoming corporate reality.