What We Know About the NSA Metadata Program

The spy agency has been receiving Americans' phone records for years.

Multiple officials are now confirming that the National Security Agency’s practice of collecting all telephone metadata from Verizon, as first reported by the Guardian, is part of a program that has been active for years. A US intelligence official tells me that orders of the kind delivered to Verizon in April are routine. Sen. Dianne Feinstein said today that the collection of metadata from phone companies is a seven-year-old practice. And an unnamed source told the Washington Post that the order appears to be similar to one first issued by the Foreign Intelligence Surveillance Court in 2006, and that it is “reissued routinely every 90 days” and not related to any particular government investigation. 

Here’s what else we know so far about this massive intelligence collection program, a few things we might infer, and some big unanswered questions. 

What is the government doing with all this phone metadata? 

According to a senior administration official, “Information of the sort described in the Guardian article has been a critical tool in protecting the nation from terrorist threats to the United States, as it allows counterterrorism personnel to discover whether known or suspected terrorists have been in contact with other persons who may be engaged in terrorist activities, particularly people located inside the United States.” 

This is a description of standard link analysis. Say the government obtains the phone number for a suspected terrorist. It then runs that number against the huge metadatabase. If there’s a match, presumably the government then obtains some other authority to find out who the number in the metadatabase belongs to; according to the court order, and the administration official, the metadata does not contain the names of phone subscribers. It’s just phone numbers, lengths of calls, and other associated data that’s not considered “content.” 
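To make the idea concrete, here is a minimal sketch of that kind of lookup. It is an illustration only, not a description of any actual NSA system: the `CallRecord` fields, the `direct_contacts` function, and the sample phone numbers are all hypothetical, chosen to mirror the kinds of fields described in the order (numbers and call lengths, but no names).

```python
from collections import namedtuple

# Hypothetical record layout: two phone numbers, a start time (in seconds),
# and a call length. No subscriber names, as in the court order.
CallRecord = namedtuple("CallRecord", ["caller", "callee", "start", "duration_s"])

def direct_contacts(records, seed_number):
    """Return every number that called, or was called by, seed_number."""
    contacts = set()
    for rec in records:
        if rec.caller == seed_number:
            contacts.add(rec.callee)
        elif rec.callee == seed_number:
            contacts.add(rec.caller)
    return contacts

# Made-up sample data for illustration.
records = [
    CallRecord("555-0100", "555-0111", 1000, 60),
    CallRecord("555-0122", "555-0100", 2000, 30),
    CallRecord("555-0133", "555-0144", 3000, 45),
]

suspect_contacts = direct_contacts(records, "555-0100")
```

In this sketch the result is only a set of phone numbers. Attaching a name to any of them would be a separate step, which matches the point above about needing some other authority.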

What can you learn with metadata but no content? 

A lot. In fact, telephone metadata can be more useful than the words spoken on the phone call. Starting with just one target’s phone number, analysts construct a social network. They can see who the target talks to most often. They can discern if he’s trying to obscure who he knows in the way he makes a call; the target calls one number, say, hangs up, and then within seconds someone calls the target from a different number. With metadata, you can also determine someone’s location, either through physical landlines or, more often, by collecting cell phone tower data to locate and track him. Metadata is also useful for trying to track suspects who use multiple phones or disposable phones. For more on how instructive metadata can be, read this. 
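Two of those techniques, counting who a target talks to most and spotting the "call, hang up, get called back from another number" pattern, can be sketched in a few lines. This is a rough illustration under stated assumptions, not anyone's actual method: the tuple layout `(caller, callee, start_seconds, duration_seconds)`, the function names, the 10-second window, and the sample calls are all hypothetical.

```python
from collections import Counter

def most_frequent_contacts(calls, target, top_n=3):
    """Count how often each number appears on the other end of the target's calls."""
    counts = Counter()
    for caller, callee, start, duration in calls:
        if caller == target:
            counts[callee] += 1
        elif callee == target:
            counts[caller] += 1
    return counts.most_common(top_n)

def callback_pairs(calls, target, window_s=10):
    """Find cases where the target calls one number and, within window_s
    seconds of hanging up, is called by a different number."""
    ordered = sorted(calls, key=lambda c: c[2])  # sort by start time
    pairs = []
    for i, (caller, callee, start, duration) in enumerate(ordered):
        if caller != target:
            continue
        end = start + duration
        for caller2, callee2, start2, _ in ordered[i + 1:]:
            if start2 - end > window_s:
                break  # later calls start even further outside the window
            if callee2 == target and caller2 != callee and start2 >= end:
                pairs.append((callee, caller2))
    return pairs

# Made-up sample data: "T" is the target.
calls = [
    ("T", "A", 0, 3),      # T calls A briefly...
    ("B", "T", 8, 120),    # ...and B calls T five seconds after the hang-up
    ("T", "C", 500, 60),
    ("T", "A", 1000, 2),   # same pattern again
    ("B", "T", 1005, 90),
]

top_contacts = most_frequent_contacts(calls, "T", top_n=2)
suspicious = callback_pairs(calls, "T")
```

Here the repeated pairing of A and B suggests the two numbers may be linked, even though A and B never call each other and no call content is involved.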

Where is all that metadata being stored? 

According to the court order, at the National Security Agency. The electronic spying agency is headquartered in Ft. Meade, Md. But it has been running out of digital storage space there, as well as electricity to keep all its systems up and running. The NSA has built a new facility in the Utah desert, called, appropriately, the Utah Data Center. And it recently broke ground on another facility at Ft. Meade. 

How does that data get from the phone companies to the NSA?  

We still know little about the physical infrastructure that transmits the metadata. But we do know, from the order, that Verizon is sending the information to the NSA “on an ongoing daily basis.” That’s an extraordinary amount of information considering it covers millions of customers making multiple calls a day. In simple terms, we’re talking about a lot of pipes and cables leading from Verizon locations—like switching stations—to NSA facilities. We know from a whistleblower at AT&T that surveillance equipment was set up at the company’s offices in San Francisco as part of the NSA’s efforts to monitor terrorists after the 9/11 attacks. 

What else might the NSA or other government agencies be doing with this metadata? 

As I wrote in my book, The Watchers, the NSA has long been interested in trying to find unknown threats in very big data sets. You’ll hear this called “data mining” or “pattern analysis.” This is a fundamentally different kind of analysis from the one I described above, in which the government takes a known suspect’s phone number and looks for connections in the big metadatabase. 

In pattern analysis, the NSA doesn’t know who the bad guy is. Analysts look at that huge body of information and try to establish patterns of activity that are associated with terrorist plotting. Or that they think are associated with terrorist plotting. 

The NSA spent years developing very complicated software to do this, and met with decidedly mixed results. One such invention was a graphing program that plotted thousands upon thousands of pieces of information and looked for relationships among them. Critics called the system the BAG, which stood for “the big ass graph.” For data geeks, this was cutting-edge stuff. But for investigators, or for intelligence officials who were trying to target terrorists overseas, it wasn’t very useful. It produced lots of potentially interesting connections, but no definitive answers as to who the bad guys were. As one former high-level CIA officer involved in the agency’s drone program told me, “I don’t need [a big graph]. I just need to know whose ass to put a Hellfire missile on.” 

How big a database do you need to store all this metadata?

A very, very big one. And lots of them. That facility in Utah covers about 1 million square feet. 

But just storing the data isn’t enough. The NSA wants a way to manipulate it and analyze it in close to real time. Back in 2004, the agency began building “in-memory” databases, which were different from traditional databases that stored information on disks. The in-memory systems were built entirely on RAM, which lets a computer hold data in working memory and have it ready for use in an instant. With disks, the computer has to physically go find the data, retrieve it, and then bring it into a program. If you’re trying to analyze entire telephone networks at once, and that is precisely what the NSA wanted to do, a disk-based system will be too slow. But the NSA’s in-memory databases could perform analytical tasks on huge data sets in just a few seconds. 
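The general idea behind keeping data in memory can be shown at toy scale. The sketch below is an assumption-laden illustration, not a model of the NSA’s systems: `build_index`, `lookup`, and the sample records are hypothetical. It loads every record into RAM once and indexes it by phone number, so that a later query is a single dictionary lookup rather than a fresh scan of storage.

```python
from collections import defaultdict

def build_index(records):
    """Load all records into memory once, keyed by each phone number involved."""
    index = defaultdict(list)
    for rec in records:
        caller, callee = rec[0], rec[1]
        index[caller].append(rec)
        if callee != caller:
            index[callee].append(rec)
    return index

def lookup(index, number):
    """Return every record involving a number, without rescanning the data."""
    return index.get(number, [])

# Made-up sample data: (caller, callee, start_seconds, duration_seconds).
call_log = [
    ("555-0100", "555-0111", 1000, 60),
    ("555-0122", "555-0100", 2000, 30),
    ("555-0133", "555-0144", 3000, 45),
]

index = build_index(call_log)
hits = lookup(index, "555-0100")
```

The trade-off in this sketch is the same one the paragraph describes: queries become fast, but everything has to fit in memory, which at the scale of entire telephone networks is expensive.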

The NSA poured oceans of telephone metadata into the in-memory systems in the hopes of building a real-time terrorist tracker. It was an unprecedented move for an organization of the NSA’s size, and it was extremely expensive. 

That was 2004. The court orders issued to Verizon, we’re told, go back as far as 2006. It appears that the NSA has had an uninterrupted stream of metadata for at least seven years. But the agency began getting access to phone records almost immediately after 9/11. That could mean there’s more than a decade’s worth of phone records stored at the NSA’s facilities.