Improving IP-based geo-location through Internet Topology and Geospatial Datasets
Accurate IP geo-location is crucial to the effectiveness of a wide array of Internet-based services ranging from targeted advertising and website localization to content delivery, security logging and authentication. The most widely used technique for remote IP geo-location is to passively query a pre-built database mapping IP blocks to physical locations. Recent analysis of commercially available databases has revealed limited global coverage and limited accuracy below the country level.
In this work, we first present a new geo-location technique that cross-references Regional Internet Registry (RIR) entries with topology information derived from Border Gateway Protocol (BGP) routing data. Second, we present a Hadoop-integrated PATRICIA tree designed to store the resulting dataset. Finally, we present a system for accurately and efficiently mapping location strings to representative alpha-shape polygons.
Our experiments show that cross-referencing RIR entries with topology information improves location accuracy below the country level compared with traditional databases. Furthermore, we show that a PATRICIA tree provides maximum storage efficiency with minimal performance impact. Finally, we show that representing locations as alpha shapes provides a high level of accuracy with minimal performance overhead.
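
To illustrate the kind of lookup a prefix tree supports, the following is a minimal sketch of longest-prefix matching from IP blocks to location labels. It is not the thesis implementation: it uses a plain, uncompressed binary trie rather than a path-compressed PATRICIA tree, handles IPv4 only, has no Hadoop integration, and the class name `PrefixTrie` and the labels in the usage note are hypothetical.

```python
import ipaddress


class PrefixTrie:
    """Uncompressed binary trie keyed on IPv4 prefix bits (illustrative only)."""

    def __init__(self):
        # Each node is a list: [child for bit 0, child for bit 1, stored value].
        self.root = [None, None, None]

    def insert(self, cidr, value):
        """Store `value` for an IPv4 block given in CIDR notation."""
        net = ipaddress.ip_network(cidr)
        bits = int(net.network_address)
        node = self.root
        # Walk (and create) one node per prefix bit, most significant first.
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = value

    def lookup(self, address):
        """Return the value of the most specific block containing `address`."""
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node[2]
        for i in range(32):
            node = node[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node[2] is not None:
                # A deeper match is a longer, more specific prefix.
                best = node[2]
        return best
```

For example, after inserting `198.51.100.0/24` with the label `"region-A"` and `198.51.100.128/25` with the label `"city-B"`, a lookup of `198.51.100.200` returns the more specific `"city-B"`, while `198.51.100.5` falls back to `"region-A"`. A PATRICIA tree answers the same queries but collapses chains of single-child nodes, which is where its storage savings come from.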

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Masters Theses
Works are deposited here by their authors, and represent their research and opinions, not that of Duke University.