OpenStreetMap

The library offers some facilities to request and parse results from the OpenStreetMap Nominatim (search engine) and Overpass (map data database) APIs.

Nominatim API

You may search for a name or address (forward search) or look up data by its geographic coordinate (reverse search) or by its OpenStreetMap identifier.

  • forward search (from natural language name):

    from cartes.osm import Nominatim
    
    Nominatim.search("Isola di Capri")
    

    Nominatim

    osm_type relation
    osm_id 2675353
    place Capri
    village Anacapri
    county Napoli
    state Campania
    country Italia
    country_code it
    category place
    type_ island
    importance 0.619843

    You may access the underlying JSON information:

    Nominatim.search("Isola di Capri").json
    
    {
      "place_id": 257200499,
      "licence": "Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright",
      "osm_type": "relation",
      "osm_id": 2675353,
      "boundingbox": ["40.5358786", "40.5618457", "14.1957565", "14.2663337"],
      "lat": "40.54877475",
      "lon": "14.22808744705355",
      "display_name": "Capri, Anacapri, Napoli, Campania, Italia",
      "place_rank": 17,
      "category": "place",
      "type": "island",
      "importance": 0.6198426309338769,
      "address": {
        "place": "Capri",
        "village": "Anacapri"
        // truncated
       }
    }
    
  • reverse search (from lat/lon coordinates):

    Nominatim.reverse(51.5033, -0.1277)
    

    Nominatim

    osm_type relation
    osm_id 1879842
    tourism Prime Minister’s Office
    house_number 10
    road Downing Street
    quarter Westminster
    suburb Covent Garden
    city City of Westminster
    state_district Greater London
    state England
    postcode SW1A 2AA
    country United Kingdom
    country_code gb
    category tourism
    type_ attraction
    importance 0.498706

  • lookup search, if you know the identifier:

    Nominatim.lookup("R9946787")
    

    Nominatim

    osm_type relation
    osm_id 9946787
    natural Eyjafjallajökull
    county Rangárþing eystra
    state_district Suðurland
    country Ísland
    country_code is
    category natural
    type_ glacier
    importance 0.456343
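
The JSON record shown for the forward search stores its bounding box as four strings ordered south, north, west, east, and the lookup identifier is the first letter of the OSM type followed by the OSM id. The following is a minimal sketch built only on those two observations. The helpers `bbox_from_nominatim` and `lookup_id` are hypothetical names used for illustration and are not part of cartes.

```python
import json

# Excerpt of the JSON record shown above for "Isola di Capri"
record = json.loads("""
{
  "osm_type": "relation",
  "osm_id": 2675353,
  "boundingbox": ["40.5358786", "40.5618457", "14.1957565", "14.2663337"],
  "lat": "40.54877475",
  "lon": "14.22808744705355"
}
""")


def bbox_from_nominatim(record):
    """Hypothetical helper: return (west, south, east, north) as floats."""
    south, north, west, east = (float(x) for x in record["boundingbox"])
    return (west, south, east, north)


def lookup_id(record):
    """Hypothetical helper: build an identifier such as "R2675353"."""
    return f"{record['osm_type'][0].upper()}{record['osm_id']}"


bbox = bbox_from_nominatim(record)
identifier = lookup_id(record)
print(bbox, identifier)
```

The reordered tuple follows the (west, south, east, north) convention that the bounds argument of the Overpass requests expects.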

Overpass API

The Overpass API is a read-only API to selected parts of the OpenStreetMap data. It acts as a database where the end user can send queries using a dedicated query language (Overpass QL) in order to collect nodes, ways and relations referenced in OpenStreetMap.

The cartes library offers direct access to it, with some helpers to generate the queries from a more natural Python call. Not all of Overpass QL is covered, but most simple use cases are.

Overpass QL “as is”

If you know how to write your query in Overpass QL, you can still benefit from the caching and parsing possibilities of the library with the query argument:

from cartes.osm import Overpass

parks = Overpass.request(query="""[out:json];
area[name="Helsinki"];
way(area)["leisure"="park"];
map_to_area->.a;
(
 node(area.a)[leisure=playground];
 way(area.a)[leisure=playground];
);
foreach(
  (._;>;);
  is_in;
  way(pivot)["leisure"="park"];
  out geom;
);""")

parks
id_ type_ geometry leisure name name:fi ...
0 30090776 way POLYGON ((24.88541 60.15269, 24.88632 60.15277... park Veneentekijänpuisto Veneentekijänpuisto ...
1 28188307 way POLYGON ((24.91803 60.17663, 24.91801 60.17670... park Hesperian esplanadi NaN ...
2 123811631 way POLYGON ((24.95604 60.16129, 24.95598 60.16115... park Tähtitornin vuori Tähtitornin vuori ...
3 123352934 way POLYGON ((24.95547 60.17155, 24.95554 60.17101... park Säätytalon puisto Säätytalon puisto ...
4 138661349 way POLYGON ((24.92256 60.19817, 24.92253 60.19810... park Susannanpuisto Susannanpuisto ...
... ... ... ... ... ... ... ...
189 6934764 way POLYGON ((25.08494 60.20598, 25.08514 60.20603... park Ystävyyden puisto NaN ...
190 218334351 way POLYGON ((25.13840 60.22072, 25.13755 60.22076... park Ilveskorvenpuisto Ilveskorvenpuisto ...
191 47218366 way POLYGON ((25.16009 60.21089, 25.16055 60.21074... park Nordsjön kartanonpuisto Nordsjön kartanonpuisto ...
192 44644822 way POLYGON ((25.14486 60.22378, 25.14471 60.22359... park Sudenkuoppa Sudenkuoppa ...
193 31776767 way POLYGON ((25.14933 60.21769, 25.14896 60.21842... park Marielundinpuisto Marielundinpuisto ...

194 rows × 40 columns

Warning

The representation calls the underlying Geopandas DataFrame generated upon parsing. You have access to several attributes:

  • parks.data returns the Geopandas DataFrame;

  • parks.json returns the raw JSON data received.

A simple request for Query Language (QL) illiterate people

The request above selects all parks within the Helsinki area where playgrounds for kids are available.
With Cartes, it is possible to generate a simpler request such as “select all parks within the Helsinki area”:

# That's all folks
parks = Overpass.request(area="Helsinki", leisure="park")

# You may check/debug the generated request as follows.
Overpass.build_query(area="Helsinki", leisure="park")

#   >> returns:
# [out:json][timeout:180];rel(id:34914);map_to_area;nwr(area)[leisure=park];out geom;

Note that, unlike in the first (more complicated) query above, the name "Helsinki" does not appear in the request, because a Nominatim call first identifies the area (which can prove helpful when the name field does not use a familiar alphabet). To get a query closer to the original one, you can write:

Overpass.build_query(area={"name": "Helsinki"}, leisure="park")

#   >> returns:
# [out:json][timeout:180];area[name=Helsinki];nwr(area)[leisure=park];out geom;
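
To make the difference between the two generated queries concrete, here is a small sketch that reproduces both strings quoted above. It is not the library's implementation: `sketch_build_query` is a hypothetical function, and the relation id is passed in directly instead of being fetched from Nominatim.

```python
def sketch_build_query(area, timeout=180, **tags):
    """Illustrative only: mimic the two query strings quoted above.

    `area` is either an OpenStreetMap relation id (int), as a prior
    Nominatim call would provide, or a dictionary of tags for the area.
    """
    if isinstance(area, int):
        # Select the relation, then turn it into an area
        area_part = f"rel(id:{area});map_to_area;"
    else:
        # Select the area directly from its tags
        filters = "".join(f"[{key}={value}]" for key, value in area.items())
        area_part = f"area{filters};"
    tag_part = "".join(f"[{key}={value}]" for key, value in tags.items())
    return f"[out:json][timeout:{timeout}];{area_part}nwr(area){tag_part};out geom;"


by_id = sketch_build_query(34914, leisure="park")
by_name = sketch_build_query({"name": "Helsinki"}, leisure="park")
print(by_id)
print(by_name)
```

Only the area selection differs between the two strings; the `nwr(area)` statement and the output clause are the same.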

Writing your own queries

Warning

The API may change in the coming weeks. Different use cases may lead to different specifications.

The arguments to the request function are:

  • the optional query argument, for raw Overpass QL queries. If this argument is not None, then all other arguments are ignored;

  • the optional bounds argument can be a tuple of four floats (west, south, east, north), a Nominatim object or any other object following the __geo_interface__ protocol. The bounds apply to the whole query;

    Danger

    The coordinate order to input here is (west, south, east, north). It will be converted to (south, west, north, east) for the Overpass QL.

    bounds = (24.8, 60.15, 25.16, 60.28)
    bounds = Nominatim("Helsinki")
    bounds = parks
    bounds = Polygon(...)  # from shapely.geometry import Polygon
    
  • the optional area argument can be a string, a Nominatim object or a dictionary of elements used to identify the area. The most commonly used tag is probably name.

    area = "Helsinki"
    area = Nominatim("Helsinki")
    area = {"name": "Helsinki", "admin_level": 8}
    area = {"name:ru": "Хельсинки"}
    

    It is possible to specify the as_ argument in order to name (and reuse) the given area:

    area = {"name": "Helsinki", "as_": "a"}

  • the node, way, rel (relation) and nwr (node-way-relation) keywords. They accept a dictionary specifying the request or a list of dictionaries:

    • the keys of the dictionary are the tags to match, and the values are the expected tag values. To match any value for a tag, set it to True:

      nwr = dict(leisure="park")
      
    • if the node (or way, or …) must be within a named area, specify the area keyword;

      area = {"name": "Helsinki", "as_": "a"}
      nwr = dict(leisure="park", area="a")
      
    • if the match is not exact, but refers to a regular expression, you may nest a dictionary with the regex key:

      # name must end with park or Park
      nwr = dict(leisure="park", name=dict(regex="[Pp]ark$"))
      # name must be empty
      nwr = dict(leisure="park", name=dict(regex="^$"))
      
    • use a list if you want several elements:

      # get both parks and railway stations
      nwr = [dict(leisure="park", area="a"), dict(railway="station", area="a")]
      
  • any other keyword arguments are collected and passed as a dictionary to the nwr keyword:

# All those notations are equivalent:
Overpass.request("Helsinki", leisure="park")
Overpass.request("Helsinki", nwr=dict(leisure="park"))
Overpass.request("Helsinki", nwr=[dict(leisure="park")])
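
The rules listed above map naturally onto Overpass QL syntax. The sketch below shows one plausible translation, assuming standard Overpass QL tag filters (`["key"]`, `["key"="value"]`, `["key"~"regex"]`) and the (south, west, north, east) bounding box order. The functions `overpass_bbox` and `tag_filter` are hypothetical and the strings cartes actually generates may differ in quoting.

```python
def overpass_bbox(west, south, east, north):
    """Reorder (west, south, east, north) into Overpass QL's (south, west, north, east)."""
    return f"({south},{west},{north},{east})"


def tag_filter(key, value):
    """Translate one key/value pair into an Overpass QL tag filter."""
    if value is True:
        return f'["{key}"]'  # the tag exists, whatever its value
    if isinstance(value, dict) and "regex" in value:
        return f'["{key}"~"{value["regex"]}"]'  # regular expression match
    return f'["{key}"="{value}"]'  # exact match


# Parks whose name ends with "park" or "Park", within the bounds used above
spec = dict(leisure="park", name=dict(regex="[Pp]ark$"))
filters = "".join(tag_filter(key, value) for key, value in spec.items())
statement = f"nwr{filters}{overpass_bbox(24.8, 60.15, 25.16, 60.28)};"
print(statement)
```

Keeping the coordinate reordering in a single function makes it harder to mix up the two conventions when writing queries by hand.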

Post-processing

Geometry simplification

The simplify() method available on GeoPandas dataframes comes from Shapely. Its major drawback is that two neighbouring geometries may be simplified differently along the borders they share.

from cartes.osm import Overpass
import altair as alt

toulouse = Overpass.request(
    area={"name": "Toulouse", "admin_level": 8},
    rel={"boundary": "postal_code"}
)

base = alt.Chart(
    toulouse.assign(
        geometry=toulouse.data.set_crs(epsg=4326)
        # Switch to Lambert93 to simplify (resolution 500m)
        .to_crs(epsg=2154).simplify(5e2)
        # Switch back to WGS84 (lat/lon)
        .to_crs(epsg=4326),
    )
)

alt.layer(
    base.mark_geoshape().encode(alt.Color("postal_code:N")),
    base.mark_text(
        color="black", font="Ubuntu", fontSize=14
    ).encode(
        alt.Latitude("latitude:Q"),
        alt.Longitude("longitude:Q"),
        alt.Text("postal_code:N"),
    )
)

The Cartes library provides a different .simplify() method on Overpass structures:

alt.concat(
    *list(
        alt.Chart(toulouse.simplify(resolution=value))
        .mark_geoshape()
        .encode(color="postal_code:N")
        .properties(width=200, height=200, title=f"simplify({value:.0f})")
        for value in [1e2, 5e2, 1e3]
    ),
    columns=2
)

Graph colouring

The four-colour theorem states that any map can be coloured with no more than four colours so that no two neighbouring regions share a colour. If you pick colours based only on the names of the regions, you will need many colours, and once the palette loops you risk assigning the same (or similar) colours to two neighbouring regions.

The Overpass object offers a .coloring() method which builds a NetworkX graph and runs a greedy colouring algorithm on it.

The following example maps the administrative states of Austria and colours them with both methods. The resulting graph is also accessible via the graph attribute.

from cartes.osm import Overpass
import altair as alt
import pandas as pd

austria = Overpass.request(
    area=dict(name="Österreich"), rel=dict(admin_level=4)
).simplify(5e2).coloring()

base = alt.Chart(austria)

labels = base.encode(
    alt.Longitude("longitude:Q"), alt.Latitude("latitude:Q"), alt.Text("name:N"),
)
edges = pd.DataFrame.from_records(
    list(
        {
            "lat1": austria[e1].latitude, "lon1": austria[e1].longitude,
            "lat2": austria[e2].latitude, "lon2": austria[e2].longitude,
        }
        for e1, e2 in austria.graph.edges
    )
)

alt.vconcat(
    base.mark_geoshape().encode(alt.Color("name:N", scale=alt.Scale(scheme="set2"))),
    alt.layer(
        base.mark_geoshape().encode(
            alt.Color("coloring:N", scale=alt.Scale(scheme="set2"))
        ),
        labels.mark_text(fontSize=13, font="Ubuntu"),
    ),
    alt.layer(
        alt.Chart(edges).mark_line()
        .encode(
            alt.Latitude("lat1"), alt.Longitude("lon1"),
            alt.Latitude2("lat2"), alt.Longitude2("lon2"),
        ),
        base.mark_point(filled=True, size=100).encode(
            alt.Latitude("latitude:Q"), alt.Longitude("longitude:Q"),
            alt.Color("coloring:N", scale=alt.Scale(scheme="set2")),
        ),
        labels.mark_text(fontSize=13, font="Ubuntu", dy=-10),
    ),
).resolve_scale(color="independent")
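
To illustrate what a greedy colouring does, here is a dependency-free sketch. It is not the code behind .coloring(): the adjacency list of the nine Austrian states is written by hand for this example (an assumption), and the visiting order (most connected states first) is only one common heuristic.

```python
# Hand-written adjacency of the nine Austrian states (an assumption made for
# this sketch; the library builds its own graph).
neighbours = {
    "Vorarlberg": ["Tirol"],
    "Tirol": ["Vorarlberg", "Salzburg", "Kärnten"],
    "Salzburg": ["Tirol", "Kärnten", "Steiermark", "Oberösterreich"],
    "Kärnten": ["Tirol", "Salzburg", "Steiermark"],
    "Steiermark": [
        "Salzburg", "Kärnten", "Oberösterreich", "Niederösterreich", "Burgenland",
    ],
    "Oberösterreich": ["Salzburg", "Steiermark", "Niederösterreich"],
    "Niederösterreich": ["Oberösterreich", "Steiermark", "Burgenland", "Wien"],
    "Burgenland": ["Steiermark", "Niederösterreich"],
    "Wien": ["Niederösterreich"],
}


def greedy_coloring(neighbours):
    """Give each node the smallest colour index unused by its coloured neighbours."""
    coloring = {}
    # Visit the most connected nodes first
    for node in sorted(neighbours, key=lambda n: len(neighbours[n]), reverse=True):
        used = {coloring[n] for n in neighbours[node] if n in coloring}
        color = 0
        while color in used:
            color += 1
        coloring[node] = color
    return coloring


coloring = greedy_coloring(neighbours)
for state, color in coloring.items():
    print(state, color)
```

A greedy algorithm does not guarantee the minimum number of colours in general, but it always produces a valid colouring in which neighbours differ.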

Distances and areas

The methods .area() and .length() compute the area (resp. the length) of each geometry in square meters (resp. meters). They can be useful to select, sort or visualise geometries based on this criterion.

In the following example, we sort Helsinki public parks by size:

parks.area().sort_values("area", ascending=False)
id_ type_ geometry leisure name name:fi ... area
149 208985860 way POLYGON ((24.92436 60.19148, 24.92434 60.19147... park Helsingin keskuspuisto Helsingin keskuspuisto ... 1.093153e+07
80 208985860 way POLYGON ((24.92436 60.19148, 24.92434 60.19147... park Helsingin keskuspuisto Helsingin keskuspuisto ... 1.093153e+07
142 208985860 way POLYGON ((24.92436 60.19148, 24.92434 60.19147... park Helsingin keskuspuisto Helsingin keskuspuisto ... 1.093153e+07
... ... ... ... ... ... ... ... ...
100 47429588 way POLYGON ((24.95763 60.19529, 24.95736 60.19543... park Someronpuistikko NaN ... 1.253333e+03
101 228246499 way POLYGON ((24.96169 60.19272, 24.96184 60.19279... park Vallilantien puisto Vallilantien puisto ... 1.233400e+03
112 553732423 way POLYGON ((25.03069 60.18548, 25.03091 60.18549... park Neitojenpuisto Neitojenpuisto ... 9.617457e+02

194 rows × 41 columns

With the .length() method, we can imagine the following use case to filter rivers by their length:

riviera = Overpass.request(
    area={"name": "Alpes-Maritimes", "as_": "a"},
    rel=[
        dict(area="a"),  # the administrative region
        dict(waterway="river", area="a")  # the rivers
    ],
).simplify(5e2)

alt.layer(
    # The administrative region
    alt.Chart(riviera.query('boundary=="administrative"'))
    .mark_geoshape(fill="lightgray"),
    # The rivers
    alt.Chart(
        riviera.query('waterway=="river"').length()
        # at least 20 km long, and remove one going to a different drainage basin
        .query("length > 20_000 and id_ != 7203495")
    )
    .mark_geoshape(filled=False)
    .encode(alt.Tooltip("name:N")),
).properties(
    width=400, height=400, title="Main rivers of French Riviera"
).configure_title(
    font="Fira Sans", fontSize=16, anchor="start"
)
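
Lengths in meters cannot be read directly from longitude/latitude coordinates, which are expressed in degrees. How cartes performs the conversion is not described here. As a stand-in, the sketch below uses the haversine formula on a spherical Earth (an assumption), with hypothetical helper names and a made-up polyline.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, in meters


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def line_length(coords):
    """Sum the segment lengths of a list of (lon, lat) pairs."""
    return sum(haversine(*p, *q) for p, q in zip(coords, coords[1:]))


# Hypothetical polyline: two segments of 0.1 degree of latitude along a meridian
river = [(7.0, 43.7), (7.0, 43.8), (7.0, 43.9)]
length = line_length(river)
print(f"{length:.0f} m", length > 20_000)
```

The same threshold test as in the query above (`length > 20_000`) can then be applied to the computed value.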