Step 1: Extract Road Network from OpenStreetMap (OSM)#
This notebook will show you how to go from an .osm.pbf file to a GOSTnets-compatible network format. In addition, this notebook will show you how to:
Clip the roads to an input polygon (here we demonstrate with an ESRI Shapefile, rek2.shp)
Filter the roads DataFrame by a list of accepted road types, as tagged in OSM (see the list
accepted_road_types
that we define in the code)
Import Libraries#
First, we import the necessary libraries. In this notebook, we use the following libraries:
geopandas
to manipulate geospatial data (e.g. clipping the roads to a polygon)os
to define the path to the .osm.pbf fileGOSTnets
to import the OSM data and to save the network in a GOSTnets-compatible format (.gpickle in this case)
import geopandas as gpd
import os
import GOSTnets as gn
GDAL is not installed - OGR functionality not available
The load_osm module is an optional GOSTNets submodule that is used to import OSM data. Make sure gdal, geopy, and boltons are installed first as it has additional dependencies (look in GOSTnets repo main README file if you need additional help with this). Then import the submodule.
from GOSTnets.load_osm import OSM_to_network
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[2], line 1
----> 1 from GOSTnets.load_osm import OSM_to_network
File /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/GOSTnets/load_osm.py:17
14 import networkx as nx
15 # import matplotlib.pyplot as plt
---> 17 from osgeo import ogr
18 from rtree import index
19 from shapely.geometry import LineString, MultiPoint
ModuleNotFoundError: No module named 'osgeo'
Download data and set filepaths#
We will use the osm.pbf
file for the entire country of Iceland. You can download the file from Geofabrik, if you didn’t download it from the GOSTnets repository “tutorial_data” folder.
Once you have downloaded the file, set the filepath to the .osm.pbf file, this can be a relative or absolute path.
We demonstrate construction of a relative file path using the os.path
module to define a relative path to the “tutorial_data” folder.
pth = "./" # change this path to your working folder
fil = r"iceland-latest.osm.pbf" # download this file from geofabrik: http://download.geofabrik.de/europe/iceland.html.
# be sure to place the .osm.pbf file in the 'tutorial data' folder.
f = os.path.join(pth, "tutorial_data", fil)
# check if the file exists, raise an exception if it does not
if os.path.isfile(f):
print("File found.")
else:
raise Exception("File not found.")
# print the file path
print(f)
File found.
./tutorial_data/iceland-latest.osm.pbf
Load the OSM data as a GOSTnets object#
GOSTNets can load OSM data as a special ‘OSM_to_network’ object. This object gets initialized with both a copy of the OSM file itself and the roads extracted from the OSM file in a GeoPandas DataFrame. This DataFrame is a property of the object called ‘roads_raw’ and is the starting point for our network.
# convert the .osm.pbf file to a GOSTnets object
iceland = OSM_to_network(f)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[5], line 2
1 # convert the .osm.pbf file to a GOSTnets object
----> 2 iceland = OSM_to_network(f)
NameError: name 'OSM_to_network' is not defined
# show the doc-string for the OSM_to_network object
?iceland
Object `iceland` not found.
# display the first 5 rows of the extracted dataframe of the roads network
iceland.roads_raw.head()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[7], line 2
1 # display the first 5 rows of the extracted dataframe of the roads network
----> 2 iceland.roads_raw.head()
NameError: name 'iceland' is not defined
# show the different road types and counts
iceland.roads_raw.infra_type.value_counts()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[8], line 2
1 # show the different road types and counts
----> 2 iceland.roads_raw.infra_type.value_counts()
NameError: name 'iceland' is not defined
Define the accepted road types list#
We define a list of the types of roads from the above that we consider acceptable for our road network. This list is used to filter the roads DataFrame to only include the types specified.
accepted_road_types = [
"residential",
"unclassified",
"track",
"service",
"tertiary",
"road",
"secondary",
"primary",
"trunk",
"primary_link",
"trunk_link",
"tertiary_link",
"secondary_link",
]
Filter the roads DataFrame#
We call the filterRoads
method and pass it a list of acceptable road types.
Then we can look at the infrastructure types in the filtered roads DataFrame (similar to what was done above), except now we have filtered the roads to only include the types we want.
iceland.filterRoads(acceptedRoads=accepted_road_types)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[10], line 1
----> 1 iceland.filterRoads(acceptedRoads=accepted_road_types)
NameError: name 'iceland' is not defined
iceland.roads_raw.infra_type.value_counts()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[11], line 1
----> 1 iceland.roads_raw.infra_type.value_counts()
NameError: name 'iceland' is not defined
Clip the roads to a polygon#
Here, we import a shapefile for our area of interest. We want to make a map of Reykjavik, Iceland. We clip the extent of the country file to just the area around the capital (rek2.shp) below using standard GeoPandas functions:
# read the shapefile for the clip area
clip_shp = gpd.read_file(os.path.join(pth, "tutorial_data", "rek2.shp"))
clip_shp = clip_shp.to_crs("epsg:4326")
clip_shp_obj = clip_shp.geometry.iloc[0]
# visualize the clip area
clip_shp.plot()
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
File /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/geopandas/plotting.py:673, in plot_dataframe(df, column, cmap, color, ax, cax, categorical, legend, scheme, k, vmin, vmax, markersize, figsize, legend_kwds, categories, classification_kwds, missing_kwds, aspect, **style_kwds)
672 try:
--> 673 import matplotlib.pyplot as plt
674 except ImportError:
ModuleNotFoundError: No module named 'matplotlib'
During handling of the above exception, another exception occurred:
ImportError Traceback (most recent call last)
Cell In[13], line 2
1 # visualize the clip area
----> 2 clip_shp.plot()
File /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/geopandas/plotting.py:979, in GeoplotAccessor.__call__(self, *args, **kwargs)
977 kind = kwargs.pop("kind", "geo")
978 if kind == "geo":
--> 979 return plot_dataframe(data, *args, **kwargs)
980 if kind in self._pandas_kinds:
981 # Access pandas plots
982 return PlotAccessor(data)(kind=kind, **kwargs)
File /opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/geopandas/plotting.py:675, in plot_dataframe(df, column, cmap, color, ax, cax, categorical, legend, scheme, k, vmin, vmax, markersize, figsize, legend_kwds, categories, classification_kwds, missing_kwds, aspect, **style_kwds)
673 import matplotlib.pyplot as plt
674 except ImportError:
--> 675 raise ImportError(
676 "The matplotlib package is required for plotting in geopandas. "
677 "You can install it using 'conda install -c conda-forge matplotlib' or "
678 "'pip install matplotlib'."
679 )
681 if ax is None:
682 if cax is not None:
ImportError: The matplotlib package is required for plotting in geopandas. You can install it using 'conda install -c conda-forge matplotlib' or 'pip install matplotlib'.
# this is the geometry object (shapely) from clip_shp
clip_shp_obj
Check to ensure some roads intersect the AOI#
We check to see everything lines up by running intersect and counting the True / False returns. The count of the True values are the number of roads that intersect the AOI
# intersects is a Shapely function that returns True if the boundary or interior of the object intersect in any way with those of the other
iceland.roads_raw.geometry.intersects(clip_shp_obj).value_counts()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[15], line 2
1 # intersects is a Shapely function that returns True if the boundary or interior of the object intersect in any way with those of the other
----> 2 iceland.roads_raw.geometry.intersects(clip_shp_obj).value_counts()
NameError: name 'iceland' is not defined
Next we remove any roads that don’t intersect the capital area, by modifying the roads_raw
DataFrame contained within the OSM_to_network
object
iceland.roads_raw = iceland.roads_raw.loc[
iceland.roads_raw.geometry.intersects(clip_shp_obj) == True # noqa: E712
]
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[16], line 1
----> 1 iceland.roads_raw = iceland.roads_raw.loc[
2 iceland.roads_raw.geometry.intersects(clip_shp_obj) == True # noqa: E712
3 ]
NameError: name 'iceland' is not defined
# reprint the intersects value counts (should only be True now)
iceland.roads_raw.geometry.intersects(clip_shp_obj).value_counts()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[17], line 2
1 # reprint the intersects value counts (should only be True now)
----> 2 iceland.roads_raw.geometry.intersects(clip_shp_obj).value_counts()
NameError: name 'iceland' is not defined
# view the first 5 rows of the newly created roads GeoDataFrame
iceland.roadsGPD.head()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[18], line 2
1 # view the first 5 rows of the newly created roads GeoDataFrame
----> 2 iceland.roadsGPD.head()
NameError: name 'iceland' is not defined
Generate the RoadsGPD object#
Having narrowed the DataFrame to the roads we are interested in, we generate the RoadsGPD
object, which is stored as a property of the OSM_to_network
object.
The RoadsGPD
object is a GeoDataFrame that further processes the roads.
This includes splitting the edges where intersections occur, adding unique edge IDs, and adding to/from columns to the GeoDataFrame.
# generate the roads GeoDataFrame, may take a few minutes
iceland.generateRoadsGDF(verbose=False)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[19], line 2
1 # generate the roads GeoDataFrame, may take a few minutes
----> 2 iceland.generateRoadsGDF(verbose=False)
NameError: name 'iceland' is not defined
# view the first 5 rows of the newly created roads GeoDataFrame
iceland.roadsGPD.head()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[20], line 2
1 # view the first 5 rows of the newly created roads GeoDataFrame
----> 2 iceland.roadsGPD.head()
NameError: name 'iceland' is not defined
Generate the GOSTnets graph object#
With the initialReadIn()
method, we transform the roads GeoDataFrame into a graph object
iceland.initialReadIn()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[21], line 1
----> 1 iceland.initialReadIn()
NameError: name 'iceland' is not defined
Save the graph object to file#
We save this graph object down to file using gn.save()
. The save function produces three outputs: a node GeoDataFrame as a CSV, an edge GeoDataFrame as a CSV, and a graph object saved as a pickle.
gn.save?
gn.save(iceland.network, "iceland_unclean", os.path.join(pth, "tutorial_outputs"))
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[23], line 1
----> 1 gn.save(iceland.network, "iceland_unclean", os.path.join(pth, "tutorial_outputs"))
NameError: name 'iceland' is not defined
Move on to Step 2 to see how we clean up this rough network.