Just after Barack Obama announced Joe Biden as his vice-presidential pick, an announcement he had told supporters they would learn of in a text message from him, the outrage began to spread on blogs and Web sites such as Twitter. Some of his supporters felt cheated that he hadn’t delivered on his promise to tell them first: the Biden selection had been reported earlier that morning by CNN’s star political correspondent, John King.
Liberal blogger Jane Hamsher knew exactly what had happened: “U thought u would know before CNN? Ha! U got datamined.”
Obama’s promise of announcing his veep by text was far from a lark; he wasn’t just exploiting a chance to use the latest and greatest shiny tool that “the kids” are into. It was a carefully planned data-collection technique: You give me your cell-phone number, and I’ll give you the vice president.
At a certain level, all campaigns come down to numbers—fundraising figures, “points” of television advertising, and vote totals on Election Day. No campaign, though, has ever been as numbers-driven as this fall’s presidential race.
As computer power and databases have grown in strength, so has the role of data. In past years, campaigns couldn’t sort the electorate and use finely grained outreach and mobilization techniques. Today, thanks to expensive and powerful databases like the GOP’s Voter Vault and the progressive startup Catalist, targeting voting blocs is as simple as checking boxes on a computer screen.
In the November election, Catalist hopes its data crunching will help deliver Barack Obama the White House and help elect other candidates such as Virginia’s Senate candidate, Mark Warner. The work being done in Catalist’s McPherson Square offices—which, with its multiscreen computer terminals, resembles a Silicon Valley start-up—is helping revolutionize the fields known as data mining and microtargeting.
Gathering good data—home addresses, voter-registration records, and phone numbers—has always bedeviled political campaigns. After the 2000 election, when Terry McAuliffe took over as chair of the Democratic Party, he was horrified to discover that the party had no voter files to help contact potential supporters. When he dove into the state party lists, he found there were 27 million wrong addresses and phone numbers—meaning that in the 2000 election, won by some 500 votes in Florida and one vote on the Supreme Court, the Democrats had failed to contact 27 million potential supporters. In the Florida voting, the Democratic Party had more than 1.1 million incorrect files. McAuliffe fumed, “Don’t you think we could have found 537 votes if we had corrected that information earlier and contacted 1.1 million more people?”
Eight years later, Catalist is the front line of campaign offense, and with some $11 million in venture-capital funding, the backers of the for-profit enterprise hope for returns greater than just victories at the ballot box. Harold Ickes, the longtime Clinton deputy White House chief of staff, founded Catalist in August 2005, in the wake of the 2004 campaign, to address the Democrats’ inability to harness data.
One of the first hires was a young engineer, Vijay Ravindran. Before arriving at Catalist as chief technology officer, Ravindran was director of the ordering-services group at Amazon.com, leading a team of about 130 engineers who built and maintained the site’s “shopping cart.”
“With my hiring, he made a decision that this was going to be a real company,” Ravindran says. As the chief data-architecture guy at Catalist, he’s part of a new trend in political technology: As data become more important in campaigning, candidates are increasingly turning to the tech industry for business-level expertise. The eCampaign director at the Republican National Committee, Cyrus Krohn, worked at Microsoft.
Though Ickes is closely identified with the Clintons, the company he founded is providing data and resources to all Democrats willing to pay. Today the client list reads like a who’s who of liberal Washington—Hillary Clinton and Barack Obama; the AFL-CIO and the breakaway union coalition Change to Win; the NAACP; and MoveOn.org.
Catalist, which competes with other political-data firms such as the Capitol Hill firm Aristotle, is pricey—subscriptions and data manipulation can run in the hundreds of thousands of dollars. It has spent three years compiling state, county, and local voter lists, cleaning up and standardizing the dozens of different data fields each locality collects, and removing duplicates from the resulting files.
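The cleanup described above, standardizing fields and then removing duplicates, can be sketched roughly as follows. This is a minimal illustration and not Catalist’s actual pipeline: the field names (`name`, `address`), the abbreviation rules, and the sample rows are all assumptions made for the example.

```python
import re

def normalize(record):
    """Build a canonical (name, address) key from one raw voter-list row."""
    # Lowercase, drop punctuation, and collapse whitespace in the name.
    name = re.sub(r"[^a-z ]", "", record["name"].lower())
    name = " ".join(name.split())
    # Same treatment for the address, plus two illustrative abbreviations.
    addr = re.sub(r"[^a-z0-9 ]", "", record["address"].lower())
    addr = re.sub(r"\bstreet\b", "st", addr)
    addr = re.sub(r"\bavenue\b", "ave", addr)
    addr = " ".join(addr.split())
    return (name, addr)

def deduplicate(records):
    """Keep the first row seen for each normalized key."""
    seen = {}
    for record in records:
        seen.setdefault(normalize(record), record)
    return list(seen.values())

# Hypothetical rows, as two localities might report the same person.
rows = [
    {"name": "Jane Q. Doe", "address": "12 Main Street"},
    {"name": "jane q doe", "address": "12 Main St."},
    {"name": "John Roe", "address": "400 Oak Avenue"},
]
unique = deduplicate(rows)
```

Real voter files would need far more forgiving matching (nicknames, apartment numbers, typos), which is part of why the work takes years rather than an afternoon.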
The Catalist database is more than 30 terabytes—about 100 times the size of an average desktop computer’s hard drive—and contains some 280 million individual records. “We’re trying to build a complete record of every American over the age of 18,” says Catalist CEO Laura Quinn. “It’s not quite there, but we’re close.”
Catalist’s database builds on the voter lists with more than 450 commercially available data layers. Republicans and Democrats alike can purchase data from frequent-buyer cards at supermarkets and pharmacies, hunting- and fishing-license registries, catalog- and magazine-subscription lists, and membership rolls from unions, professional associations, and advocacy groups such as the ACLU and the NRA.
Need to know how many single Asian men under 35 live in a given congressional district? Or how many college-educated women with children at home are in a specific precinct in Canton, Ohio? Databases such as the RNC’s Voter Vault and Catalist’s Q Tool now know.
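In spirit, a question like the ones above is a filter over individual records. The sketch below is only an illustration of that idea: the `Voter` fields, the district labels, and the sample people are invented for the example and say nothing about how Voter Vault or Q Tool are actually built.

```python
from dataclasses import dataclass

@dataclass
class Voter:
    age: int
    sex: str
    ethnicity: str
    marital_status: str
    district: str

def count_matching(voters, predicate):
    """Count the records for which the predicate holds."""
    return sum(1 for v in voters if predicate(v))

# Hypothetical records; a real file would hold millions of rows.
voters = [
    Voter(29, "M", "Asian", "single", "District 7"),
    Voter(41, "M", "Asian", "single", "District 7"),
    Voter(33, "M", "Asian", "married", "District 7"),
    Voter(24, "M", "Asian", "single", "District 9"),
    Voter(31, "M", "Asian", "single", "District 7"),
]

# Single Asian men under 35 in one (made-up) district.
count = count_matching(
    voters,
    lambda v: v.sex == "M"
    and v.ethnicity == "Asian"
    and v.marital_status == "single"
    and v.age < 35
    and v.district == "District 7",
)
```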
All of the data collection has led to some interesting discoveries. Jaguar, Land Rover, and Porsche owners tend to be more Republican, while Subaru, Hyundai, and Volvo drivers lean Democratic. NASCAR and History Channel viewers are solidly GOP, whereas Bravo, Lifetime, and TNT watchers are more Democratic. Bourbon drinkers? Often Republican. Gin lovers? Democrats.
Day-to-day practitioners such as Catalist’s Laura Quinn and the GOP’s Cyrus Krohn caution that it’s impossible to extrapolate political beliefs from single data points. Instead, what their research and databases tend to show is that politics increasingly is part of an overall lifestyle—some of the strongest predictors of political ideology are things like education, homeownership, income level, and household size. Religion and gun ownership are the two most powerful predictors of partisan ID.
Through the aggregation of data and sample surveys of effective messages and preferences, Catalist and its competitors can build models to predict voter choices. As Quinn explains, “Based on how alike they are, you can assign a probability to them. You’re able to put a likelihood of support on each person based on how many character traits a person shares with your known supporters.”
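One simple way to picture what Quinn describes is to score each person by how much his or her traits overlap with those of known supporters. The sketch below uses average Jaccard similarity, which is a stand-in chosen for illustration and not the model Catalist uses; the trait labels and sample people are hypothetical, and the result is a similarity score between 0 and 1 rather than a calibrated probability.

```python
def jaccard(a, b):
    """Share of traits two people have in common, out of all traits either has."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def support_score(person_traits, known_supporters):
    """Average trait overlap between one person and every known supporter."""
    if not known_supporters:
        return 0.0
    total = sum(jaccard(person_traits, s) for s in known_supporters)
    return total / len(known_supporters)

# Hypothetical trait sets for two known supporters.
supporters = [
    {"homeowner", "college", "subaru"},
    {"college", "renter", "gin"},
]

likely = support_score({"college", "subaru", "renter"}, supporters)
unlikely = support_score({"nascar", "bourbon"}, supporters)
```

A production model would weight traits by how predictive they are and would be fitted against survey responses, as the paragraph above suggests, rather than counting every shared trait equally.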
The ability to draw patterns and predict leanings is critical to campaign outreach in areas like Loudoun County, where shifting demographics mean that preferences are increasingly mixed. Traditional voter-outreach efforts usually target areas where a party or candidate won by 65 percent of the vote, but such models are stymied by counties like Loudoun, where neither George W. Bush nor John Kerry won a single precinct with more than 65 percent of the vote in 2004. With the refined outreach capability of data-driven models to target potential voters, campaigns don’t have to waste as much effort reaching out to unlikely supporters and can more easily target critical swing voters—as well as find likely supporters in precincts that formerly would have been considered too solidly “red” or “blue” to target.
Says Quinn: “It completely changes the blue/red map.”
While the Republican Party began to perfect microtargeting during the 2004 election—its data-mining efforts are widely credited with helping deliver Ohio and other swing states for President Bush’s reelection—the Republicans are being very quiet about what they have up their sleeves for November.
In past elections, the GOP approached its microtargeting efforts with awe-inspiring methodicalness. During the 2002 cycle in the Texas Senate race and in a Colorado congressional race, it even studied the roads Republicans drove as they commuted to work, which allowed the party to put its billboards where they would do the most good.
One hint of what’s in store for this election comes from a trial the RNC ran last year in the Louisiana governor’s race on behalf of Bobby Jindal. Krohn and his team spent months tracking and targeting voters in Louisiana and delivered carefully calibrated online and offline messages to them. On Election Day last year, while statewide turnout was 46 percent, 76 percent of Krohn’s targeted group turned out to vote. He won’t say how big a group it was, but Jindal was able to avoid a runoff by winning 54 percent of the vote.
A major breakthrough in the databases since 2004 is the ability to assign unique IDs to every person in America, a move that for the first time allows political organizers to track people across state lines. Under previous systems, it was impossible to track the voting history of someone who had voted in Maryland in one election and moved to Virginia for the next. Given that 20 percent of Americans move in a given year, such shortcomings had a major impact.
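The idea of a single ID that follows a person from one state file to the next can be sketched as below. How the real databases assign their IDs is not described here, so the matching rule (normalized name plus birth date), the hashing step, and the sample rows are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

def person_id(name, birth_date):
    """Derive a stable ID from a normalized name and a birth date (illustrative rule)."""
    key = f"{name.strip().lower()}|{birth_date}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]

def merge_histories(state_rows):
    """Group voting records from different state files under one person ID."""
    history = defaultdict(list)
    for row in state_rows:
        pid = person_id(row["name"], row["birth_date"])
        history[pid].append((row["year"], row["state"]))
    return {pid: sorted(votes) for pid, votes in history.items()}

# Hypothetical rows: the same voter appears in Maryland, then Virginia.
rows = [
    {"name": "Pat Smith", "birth_date": "1970-03-14", "state": "MD", "year": 2004},
    {"name": "pat smith ", "birth_date": "1970-03-14", "state": "VA", "year": 2006},
    {"name": "Lee Jones", "birth_date": "1982-11-02", "state": "VA", "year": 2006},
]
merged = merge_histories(rows)
```

With the two Pat Smith rows filed under one ID, the Maryland vote and the Virginia vote read as a single history instead of two unrelated strangers.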
Perhaps most revolutionary is Catalist’s “open source” approach to its core information. The company is asking each client to contribute back information either in real time or at the end of the campaign, meaning that the cell-phone number supporters gave to learn Barack Obama’s veep choice could live on in Catalist’s database for years. Clients can contribute survey results showing which voters care about which issues, and they can mark who has signed an online petition on a particular issue, donated money to a particular cause, given an e-mail address to a particular organization, or put up a lawn sign or bumper sticker during a particular race. As a result, the database gets stronger the more people use it. Its predictions and ability to microtarget will improve with every voter contact.
Catalist’s database, which the company’s founders hope will be the living record of nearly every political action ever undertaken by an American, could become a formidable resource over the course of an election cycle or two as its many clients refine and correct it. Says Ravindran: “We aspire to be much more than just a database provider. We’re looking to build an ecosystem.”
But at the end of the campaign, there’s only so much that computer models and databases can tell you before the ultimate test. The question on Election Day comes down to a slice of market share so daunting that it would terrify any consumer packaged-goods company: Will 51 percent of the 120 million voters buy what John McCain is offering? Or will Barack Obama seem like a better deal?
This article first appeared in the October 2008 issue of The Washingtonian.