WAt the end of a long cul-de-sac at the bottom of a steep hill, our house sat near a storm sewer opening that in my memory is a couple of yards or so wide. If you poked your head down the opening and looked in the right direction, you could see daylight where the culvert emptied into a sort of open catch basin.
None of us ever had the nerve to slip into the drain and walk through the dark pipe to come out in the catch basin, maybe 300 yards away. But that is where, at the age of maybe 5 or 6, I realized a vast network of drainage pipes existed under our street, beneath our houses. That culvert fascinated me and my friends endlessly. We’d try to peer at one another from each end, or shout to see if our voices would carry. Or, after a rain, we’d drop a paper boat down the opening and see how long it would take to flow into the catch basin.
That sewer is like the Internet. Underneath the manifest “streets” that are thoroughly used and mapped lies a vast subterranean zone with its own stores of data. Some experts say the surface or easily accessed Internet holds only 4 percent of what’s out there. Much of the out-of-view, deep Internet consists of intellectual property that people — like academics or scientists — want to keep to themselves or share only with people they choose. But other areas lie within the deep Internet where criminal and terrorist elements gather and communicate. That’s called the dark Internet. It’s also where dissidents who might be targeted by their own country communicate with one another. To people using regular browsers and search engines, this vast online zone is like a broadcast occurring at a frequency you need a special antenna to detect.
At the recent GEOINT conference, held for the first time in Washington, I heard a theme from several companies: Agencies will need to exploit the deep Web and its subset dark Web to keep up with these unsavory elements. The trend in geographical intelligence is mashing up multiple, non-geographic data sources with geographic data. In this and a subsequent post, I’ll describe some of the work going on. In this post, I’ll describe work at two companies, one large and one small. They have in common some serious chops in GEOINT.
Mashup is the idea behind a Lockheed Martin service called Halogen. Clients are intelligence community and Defense agencies, but it’s easy to see how many civilian agencies could benefit from it. Matt Nieland, a former Marine Corps intelligence officer and the program manager for the product, said the Halogen team, from its operations center somewhere in Lockheed, responds to requests from clients for unconventional intel. This requires data from the deep Internet. It may be inaccessible to ordinary tools, but it still falls into publicly available data. Neiland draws a crude sketch in my notebook like a potato standing on end. The upper thin slice is the ordinary Internet. A tiny slice on the bottom is the dark element. The bulk of the potato represents the deep.
Halogen uses the surface, searchable Internet in the unclassified realm. Analysts ingest material like news feeds, social media, Twitter. They mix in material that is inaccessible to standard browsers and search engines, but are neither secret nor requiring hacking. It does take skill with the anonymizing Tor browser and knowledge of how to find the lookup tables giving URLs that otherwise look like gibberish. Beyond that, Nieland said Lockheed has contacts with people around the world who can verify what it finds online. Halogen’s secret sauce is the proprietary algorithms and trade craft its analysts use to create intel products.
At the opposite end of the size spectrum from Lockheed Martin, OGSystems assembles teams of non-traditional, mostly West Coast companies to help federal agencies solve unusual problems in cybersecurity and intel, or problems they can’t find solutions for in the standard federal contractors. CEO Omar Balkissoon said the company specializes in getting non-traditional people to think about traditional questions. A typical project is the Jivango community, where agencies can source answers to GEOINT questions.
OGSystems calls its R&D section VIPER Labs, crafts services, techniques and data products for national security. At GEOINT, I walked to a big monitor by Jessica Thomas, a data analyst and team leader at VIPER Labs. She’s working on an OSINT (open source intelligence) product for finding and stopping human traffickers and people who exploit minors. It’s a good example of mashing up non-GEO data with GEO. The product uses an ontology used by law enforcement and national security types of words found on shady websites and postings to them that may be markers for this type of activity. Thomas pulled two weeks worth of posting traffic and used a geo-coding algorithm to map it to the rough locations of the IP addresses. Posters tend to be careless about how easy it is to reverse-lookup IP addresses to get a general area from where it originated. In many cases, posts included phone numbers. It wasn’t long before clusters of locations emerged indicating a possible network of human trafficking.
An enthusiastic Tor user, Thomas wants to add dark Internet material to her trafficking data mashup. She also hopes to incorporate photo recognition, and sentiment analysis that can detect emotion within language found on a website. She said OGSystems has applied for a grant to develop its trafficking detection technology into a tool useful for wildlife trafficking — a major source of funding for terror-criminal groups like El Shabab.
Next week, some amazing things text documents can add to GEOINT.
Tom Temin is host of The Federal Drive, which airs 6-9 a.m., on Federal News Radio (1500AM). This post was originally written for his personal blog, Temin on Tech.