Geoparser.io is a RESTful web API that identifies place names mentioned in text, disambiguates those names, and returns GeoJSON for the places found in the text. Here's how you use it.
According to Wikipedia, geoparsing is the process of converting free-text descriptions of places (such as "Springfield") into unambiguous geographic identifiers (such as lat-lon coordinates). A geoparser is a tool that helps in this process. Geoparsing goes beyond geocoding in that, rather than analyzing structured location references like mailing addresses and numerical coordinates, geoparsing handles ambiguous place names in unstructured text.
Geoparser.io works best on complete sentences in English, like those shown in the example inputs below. If you have a very short text, such as a partial address like "Auckland New Zealand," you probably want to use a geocoder tool instead of a geoparser.
API Endpoint URI: https://geoparser.io/api/geoparser
Request Protocol: HTTPS POST
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Response Format: GeoJSON
Authentication: authorization token (the apiKey must be passed in the HTTP request header; see examples below)
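As a rough illustration of the request parameters listed above, here is a minimal Python sketch that uses only the standard library (urllib) rather than the requests package. The endpoint, content type, and the inputText field name come from this page; the exact header form ("Authorization: apiKey <key>") and the helper names build_request and geoparse are assumptions made for the example, not a confirmed client implementation.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://geoparser.io/api/geoparser"


def build_request(api_key, input_text):
    """Build (but do not send) a POST request for the geoparser endpoint."""
    # Form-encode the text; "inputText" is the field name used on this page.
    body = urllib.parse.urlencode({"inputText": input_text}).encode("utf-8")
    headers = {
        # ASSUMPTION: the token is sent as "Authorization: apiKey <key>".
        "Authorization": "apiKey " + api_key,
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def geoparse(api_key, input_text, timeout=30):
    """Send the request and return the decoded GeoJSON response as a dict."""
    request = build_request(api_key, input_text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Given a valid key, geoparse("YOUR_API_KEY", "I was born in Springfield and grew up in Chicago.") should return the parsed FeatureCollection as a Python dict. Building the request separately from sending it makes the headers and body easy to inspect without touching the network.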
Piping the output through python -mjson.tool pretty-prints the JSON response, but is certainly not required.
The requests library can be installed with: $ pip install requests
The response contains GeoJSON geometry, Feature, and FeatureCollection objects.
FeatureCollection.properties.id is a unique-ish identifier generated for each API request submitted.
Each Feature object contains an id element with the corresponding geonameid unique identifier from GeoNames.org.
Each Feature also contains a properties object with the following elements:
Feature.properties.name is the best name for the specified location, with a preference for official/short name forms (e.g., "New York" over "NYC," and "California" over "State of California"), which may be different from exactly what appears in the text.
Feature.properties.country is the ISO-3166 2-letter country code for the country in which this place is located, or NULL for features outside any sovereign territory.
Feature.properties.admin1 is a code representing the state/province-level administrative division containing this place. (From GeoNames.org: "Most adm1 are FIPS codes. ISO codes are used for US, CH, BE and ME. UK and Greece are using an additional level between country and fips code. The code '00' stands for general features where no specific adm1 code is defined.")
Feature.properties.type is a text description of the geographic feature type; see GeoNames.org for a complete list. Subject to change.
Feature.properties.confidence is a confidence score produced by the place name disambiguation algorithm. Currently returns a placeholder value; subject to change.
Feature.properties.references is an array of arrays, each element of which consists of two integers indicating the start (index of the first character in the place reference) and end (index of the first character after the place reference) of each reference to this place name found in the input text.
Try changing the input text to "I was born in Springfield and grew up in Chicago."
and notice how the response output changes, particularly for Springfield!
A different inputText this time, adapted from Wikipedia (used & redistributed under a CC BY-SA license).
Some places are mentioned more than once in the inputText, so the corresponding Feature.properties.references array contains multiple entries.
A place name in the inputText was matched to a longer "official" name in the response output: "Province of Davao Oriental." The original text can always be retrieved from the inputText by using the index positions provided in the Feature.properties.references elements.
The output is not quite right regarding Feature objects for Aliwagwag, Mindanao, and Compostela (separately from Compostela Valley). Hey, nobody's perfect!
Geoparser.io uses the GeoNames.org geographical database as its primary source of gazetteer data. GeoNames.org was founded by Marc Wick, is maintained by the GeoNames Team, and is a project of Unxos GmbH, Switzerland. The GeoNames.org gazetteer contains data from dozens of sources and is licensed under a Creative Commons Attribution 3.0 License. For use in Geoparser.io, several modifications were made to address data quality issues and missing records in the GeoNames.org material.
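Since each Feature.properties.references entry is a start index paired with an end index that points just past the reference, the pairs map directly onto string slices. The sketch below shows one way to recover the original wording from the inputText. The helper names referenced_text and summarize are made up for this example, and the sample dictionary is a hand-built stand-in that follows the field descriptions on this page; it is not real API output, and most fields (geometry, country, admin1, and so on) are left out to keep it short.

```python
def referenced_text(input_text, feature):
    """Return the substrings of input_text that a Feature's references point at."""
    references = feature["properties"]["references"]
    # Each reference is [start, end), so it maps directly onto a Python slice.
    return [input_text[start:end] for start, end in references]


def summarize(input_text, collection):
    """Map each Feature's preferred name to the text spans that mention it."""
    return {
        feature["properties"]["name"]: referenced_text(input_text, feature)
        for feature in collection["features"]
    }


text = "I was born in Springfield and grew up in Chicago."

# Hand-built stand-in for a response: NOT real API output.
# The ids are placeholders, and only the fields used here are included.
sample = {
    "type": "FeatureCollection",
    "properties": {"id": "example-request-id"},
    "features": [
        {
            "type": "Feature",
            "id": "placeholder-1",
            "properties": {"name": "Springfield", "references": [[14, 25]]},
        },
        {
            "type": "Feature",
            "id": "placeholder-2",
            "properties": {"name": "Chicago", "references": [[41, 48]]},
        },
    ],
}

print(summarize(text, sample))
```

Because the end index is exclusive, no off-by-one adjustment is needed, and a place mentioned several times simply yields several slices in the list.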