Mapping is an excellent way of disseminating knowledge about data, even to audiences unfamiliar with statistics. This chapter looks at the challenge of mapping and how you can use Python to build maps.
Lino Galiana
2025-03-19
Cartography is one of the oldest forms of graphical representation of information. Historically confined to military and administrative uses or navigation-related information synthesis, cartography has, at least since the 19th century, become one of the preferred ways to represent information. It was during this period that the color-shaded map, known as the choropleth map, began to emerge as a standard way to visualize geographic data.
According to Chen et al. (2008), the first representation of this type was proposed by Charles Dupin in 1826 (Figure 1.1) to illustrate literacy levels across France. The rise of choropleth maps is closely linked to the organization of power through unitary political entities. For instance, world maps often use color shades to represent nations, while national maps use administrative divisions (regions, departments, municipalities, as well as states or Länder).
The emergence of choropleth maps during the 19th century marks an important shift in cartography, transitioning from military use to political application. No longer limited to depicting physical terrain, maps began to represent socioeconomic realities within well-defined administrative boundaries.
With the proliferation of geolocated data and the increasing use of data-driven decision-making, it has become crucial for data scientists to quickly create maps. This chapter, complementing the one on spatial data, offers exercises to explore the key principles of data visualization through cartography using Python.
Creating high-quality maps requires time but also thoughtful decision-making. Like any graphical representation, it is essential to consider the message being conveyed and the most appropriate means of representation.
Cartographic semiology, a scientific discipline focusing on the messages conveyed by maps, provides guidelines to prevent misleading representations—whether intentional or accidental.
Some of these principles are outlined in this cartographic semiology guide from Insee. They are also summarized in this guide.
This presentation by Nicolas Lambert, using numerous examples, explores key principles of cartographic dataviz.
This chapter will first introduce some basic functionalities of GeoPandas for creating static maps. To provide context to the presented information, we will use official geographic boundaries produced by IGN. We will then explore maps with enhanced contextualization and multiple levels of information, illustrating the benefits of using interactive libraries based on JavaScript, such as Folium.
Throughout this chapter, we will use several datasets to illustrate different types of maps:
Before getting started, a few packages need to be installed:
# On Colab
# These libraries are useful for geospatial analysis (see the dedicated chapter)
!pip install pandas fiona shapely pyproj rtree geopandas
We will primarily need Pandas and GeoPandas for this chapter.
We will use cartiflette, which simplifies the retrieval of administrative basemaps from IGN. This package is an interministerial project designed to provide a simple Python interface for obtaining official IGN boundaries.
First, we will retrieve the departmental boundaries:
These data bring the DROM closer to mainland France, as explained in one of the cartiflette tutorials and as Exercise 1 will allow us to verify.
Exercise 1 aims to ensure that we have correctly retrieved the desired boundaries by simply visualizing them. This should be the first reflex of any geodata scientist.
1. Use the plot method on the departements dataset to check the spatial extent. What projection do the displayed coordinates suggest? Verify using the crs attribute.
2. Using matplotlib options, create a map with black boundaries, a white background, and no axes.
The map of the departments, without modifying any options, looks like this:
The displayed coordinates suggest WGS84, which can be verified using the crs attribute:
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
If we convert to Lambert 93 (the official system for mainland France), we obtain a different extent, which is supposed to be more accurate for the mainland (but not for the relocated DROM, since, for example, French Guiana is actually much larger).
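This reprojection can be sketched as follows, assuming GeoPandas and Shapely are installed. The toy rectangle is a hypothetical stand-in for the departements dataset, so that nothing needs to be downloaded; EPSG:2154 is the code for Lambert 93.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the departements GeoDataFrame:
# one rectangle roughly covering mainland France, in WGS 84 (EPSG:4326).
toy = gpd.GeoDataFrame(
    {"INSEE_DEP": ["toy"]},
    geometry=[box(-5.0, 42.0, 8.0, 51.0)],
    crs="EPSG:4326",
)

# Before reprojection, coordinates are expressed in degrees.
print(toy.crs.to_epsg(), toy.total_bounds)

# Reproject to Lambert 93 (EPSG:2154): coordinates are now in meters.
toy_lambert = toy.to_crs(2154)
print(toy_lambert.crs.to_epsg(), toy_lambert.total_bounds)
```

With the real data, the same call would be applied to departements instead of the toy object.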
And of course, we can easily reproduce the failed maps from the chapter on GeoPandas, for example, if we apply a transformation designed for North America:
If we create a slightly more aesthetically pleasing map, we get:
And the same for Finistère:
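One way to obtain this kind of styling is sketched below, assuming GeoPandas, Shapely and matplotlib are installed. The two toy squares and their codes "A" and "B" are hypothetical stand-ins for the departements dataset; with the real data, the single-department filter would use the Finistère code "29".

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for departements: two adjacent squares.
toy = gpd.GeoDataFrame(
    {"INSEE_DEP": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# White fill, black boundaries, then hide the axes.
ax = toy.plot(color="white", edgecolor="black")
ax.set_axis_off()

# Same styling restricted to one unit (for real data: INSEE_DEP == "29").
single = toy.loc[toy["INSEE_DEP"] == "A"]
ax_single = single.plot(color="white", edgecolor="black")
ax_single.set_axis_off()
```

The plot method returns a matplotlib Axes object, which is why the usual matplotlib options, such as set_axis_off, can be chained afterwards.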
These maps are simple, yet they already rely on implicit knowledge. They require familiarity with the territory. When we start coloring certain departments, recognizing which ones have extreme values will require a good understanding of French geography. Likewise, while it may seem obvious, nothing in our map of Finistère explicitly states that the department is bordered by the ocean. A French reader would see this as self-evident, but a foreign reader, who may not be familiar with the details of our geography, would not necessarily know this.
To address this, we can use interactive maps that allow:
For this, we will retain only the data corresponding to an actual spatial extent, excluding our zoom on Île-de-France and the DROM.
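# Keep one row per department code, which removes the duplicated Île-de-France zoom
departements_no_duplicates = (
    departements
    .drop_duplicates(subset="INSEE_DEP")
)
# Exclude the DROM, whose department codes start with "97"
departements_hexagone = (
    departements_no_duplicates
    .loc[lambda df: ~df["INSEE_DEP"].str.startswith("97")]
)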
We successfully obtain the hexagon:
For the next exercise, we will need a few additional variables. First, the geometric center of France, which will help us position the center of our map.
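A possible way to obtain such a center is sketched here under assumptions: it takes the midpoint of the bounding box, in the (minx, miny, maxx, maxy) order returned by the total_bounds attribute of a GeoDataFrame. The helper name center_from_bounds and the approximate bounds are illustrative and are not values taken from the dataset.

```python
def center_from_bounds(bounds):
    """Return [latitude, longitude] for the midpoint of a bounding box.

    `bounds` follows the (minx, miny, maxx, maxy) order of
    GeoDataFrame.total_bounds, i.e. (min lon, min lat, max lon, max lat)
    when the data are in WGS 84.
    """
    minx, miny, maxx, maxy = bounds
    # Folium expects locations as [latitude, longitude], hence the swap.
    return [(miny + maxy) / 2, (minx + maxx) / 2]


# Rough, illustrative bounds for mainland France (assumption, not dataset values).
approx_bounds = (-5.0, 42.0, 8.0, 51.0)
center = center_from_bounds(approx_bounds)
print(center)  # [46.5, 1.5]
```

With the real data, the call would be center_from_bounds(departements_hexagone.total_bounds), provided the GeoDataFrame is in WGS 84.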
We will also need a dictionary to provide Folium with information about our map parameters.
style_function = lambda x: {
    'fillColor': 'white',
    'color': 'black',
    'weight': 1.5,
    'fillOpacity': 0.0
}
Here, the fillOpacity parameter is set to 0%, which makes the fill fully transparent.
style_function is an anonymous function that will be used in the exercise.
Information that appears when hovering over an element is called a tooltip in web development terminology.
For the next exercise, the GeoDataFrame must use WGS 84 longitude/latitude coordinates (EPSG:4326). Folium draws these coordinates on navigation basemaps, which are designed for the Mercator projection. Typically, Folium is used for local visualizations where the surface distortion caused by the Mercator projection is not problematic.
For the next exercise, where we will represent France as a whole, we are slightly repurposing the library. However, since France is still relatively far from the North Pole, the distortion remains a small trade-off compared to the benefits of interactivity.
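To give an order of magnitude for this distortion, here is a small worked computation. On a Mercator map, lengths are stretched by a factor of 1/cos(latitude) relative to the equator, so areas are stretched by the square of that factor. The latitudes below are approximate and chosen only for illustration.

```python
import math


def mercator_area_scale(latitude_deg):
    """Area inflation factor of the Mercator projection at a given latitude,
    relative to the equator: (1 / cos(latitude)) ** 2."""
    return (1 / math.cos(math.radians(latitude_deg))) ** 2


# Approximate latitudes, for illustration only.
examples = [
    ("Southern France", 42.0),
    ("Northern France", 51.0),
    ("Northern Norway", 70.0),
]
for name, lat in examples:
    factor = mercator_area_scale(lat)
    print(f"{name} ({lat}°N): areas inflated x{factor:.2f}")
```

Within France the factor goes from about 1.8 in the south to about 2.5 in the north, so the north is inflated roughly 1.4 times more than the south, compared with a factor above 8 at 70°N.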
1. Create the base layer of the map using the center object and set zoom_start to 5.
2. Update the map from question 1 using the departements_hexagone dataset and the parameters style_function and tooltip.
Here is the base layer from question 1:
And once formatted, this gives us the map: