1 Introduction: What is an API?
In the previous chapters, we saw how to consume data from a file (the simplest access mode) or how to retrieve data through web scraping, a method that allows Python to mimic the behavior of a web browser and extract information by harvesting the HTML that a website serves.
Web scraping is a makeshift approach to accessing data. Fortunately, there are other ways to access data: data APIs. In computing, an API is a set of protocols that enables two software systems to communicate with each other. For example, the term “Pandas API” is sometimes used to indicate that Pandas serves as an interface between your Python code and a more efficient compiled language (C) that performs the calculations you request at the Python level. The goal of an API is to provide a simple access point to a functionality while hiding the implementation details.
In this chapter, we focus mainly on data APIs. They are simply a way to make data available: rather than allowing the user direct access to databases (often large and complex), the API invites them to formulate a query which is processed by the server hosting the database, and then returns data in response to that query.
The increased use of APIs in the context of open data strategies is one of the pillars of the 15 French ministerial roadmaps regarding the opening, circulation, and valorization of public data.
Note
In recent years, an official geocoding service has been established for French territory. It is free and efficiently allows addresses to be geocoded via an API. This API, known as the National Address Database (BAN), has benefited from the pooling of data from various stakeholders (local authorities, postal services, IGN) as well as the expertise of contributors like Etalab. Its documentation is available at https://api.gouv.fr/les-api/base-adresse-nationale.
A common example used to illustrate APIs is that of a restaurant. The documentation is like your menu: it lists the dishes (databases) that you can order and any optional ingredients you can choose (the parameters of your query): chicken, beef, or a vegetarian option? When you order, you don’t get to see the recipe used in the kitchen to prepare your dish – you simply receive the finished product. Naturally, the more refined the dish you request (i.e. involving complex calculations on the server side), the longer it will take to arrive.
Illustration with the BAN API
To illustrate this, let’s imagine what happens when, later in the chapter, we make requests to the BAN API.
Using Python, we send our order to the API: addresses that are more or less complete, along with additional instructions such as the municipality code. These extra details are akin to information provided to a restaurant’s server—like dietary restrictions—which personalize the recipe.
Based on these instructions, the dish is prepared. Specifically, a routine is executed on Etalab’s servers that searches an address repository for the one most similar to the address requested, possibly adapting based on the additional details provided. Once the kitchen has completed this preparation, the dish is sent back to the client. In this case, the “dish” consists of geographic coordinates corresponding to the best matching address.
Thus, the client only needs to focus on submitting a proper query and enjoying the dish delivered. The complexity of the process is handled by the specialists who designed the API. Perhaps other specialists, such as those at Google Maps, implement a different recipe for the same dish (geographic coordinates), but they will likely offer a very similar menu. This greatly simplifies your work: you only need to change a few lines of API call code rather than overhauling a long and complex set of address identification methods.
Pedagogical Approach
After an initial presentation of the general principle of APIs, this chapter illustrates their use in Python via a fairly standard use case: we have a dataset that we first want to geolocate. To do this, we will ask an API to return geographic coordinates based on addresses. Later, we will retrieve somewhat more complex information through other APIs.
2 First Use of APIs
An API is intended to serve as an intermediary between a client and a server. This client can be of two types: a web interface or programming software. The API makes no assumptions about the tool sending it a command; it simply requires adherence to a standard (usually an HTTP request), a query structure (the arguments), and then awaits the result.
2.1 Understanding the Principle with an Interactive Example
The first mode (access via a browser) is primarily used when a web interface allows a user to make choices in order to return results corresponding to those selections. Let’s revisit the example of the geolocation API that we will use in this chapter. Imagine a web interface that offers the user two choices: a postal code and an address. These inputs will be injected into the query, and the server will respond with the appropriate geolocation.
Here are our two widgets that allow the client (the web page user) to choose their address.
Definition 2.1
A little formatting of the values provided by this widget allows one to obtain the desired query:
This gives us an output in JSON format, the most common output format for APIs.
If a beautiful display is desired, like the map above, the web browser will need to reprocess this output, which is typically done using Javascript, the programming language embedded in web browsers.
2.2 How to Do It with Python?
The principle is the same, although we lose the interactive aspect. With Python, the idea is to construct the desired URL and fetch the result through an HTTP request.
We have already seen in the web scraping chapter how Python communicates with the internet via the requests package. This package follows the HTTP protocol, where two main types of requests can be found: GET and POST:
- The GET request is used to retrieve data from a web server. It is the simplest and most common method for accessing the resources of a web page. We will start by describing this one.
- The POST request is used to send data to the server, often with the goal of creating or updating a resource. On web pages, it is commonly used for submitting forms that need to update information in a database (passwords, customer data, etc.). We will see its usefulness later, when we begin to deal with authenticated requests, where additional information must be submitted with our query.
Let’s conduct a first test with Python as if we were already familiar with this API.

import requests

adresse = "88 avenue verdier"
url_ban_example = f"https://api-adresse.data.gouv.fr/search/?q={adresse.replace(' ', '+')}&postcode=92120"
requests.get(url_ban_example)
<Response [200]>
What do we get? An HTTP status code. The code 200 corresponds to successful requests, meaning that the server was able to respond. If this is not the case, for one reason or another, you will receive a different code.
HTTP Status Codes
HTTP status codes are standard responses sent by web servers to indicate the result of a request made by a client (such as a web browser or a Python script). They are categorized based on the first digit of the code:
- 1xx: Informational
- 2xx: Success
- 3xx: Redirection
- 4xx: Client-side Errors
- 5xx: Server-side Errors
The key codes to remember are: 200 (success), 400 (bad request), 401 (authentication failed), 403 (forbidden), 404 (resource not found), 503 (the server is unable to respond)
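To make these categories concrete, here is a small helper (not part of requests, purely an illustration) that maps a status code to the family described above:

```python
# Hypothetical helper: classify an HTTP status code by its first digit,
# mirroring the 1xx-5xx families listed above.
def status_category(code: int) -> str:
    categories = {
        1: "Informational",
        2: "Success",
        3: "Redirection",
        4: "Client-side error",
        5: "Server-side error",
    }
    return categories.get(code // 100, "Unknown")

print(status_category(200))  # Success
print(status_category(404))  # Client-side error
```

In practice, response.raise_for_status() from requests raises an exception for any 4xx or 5xx code, which avoids checking codes by hand.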
To retrieve the content returned by requests, there are several methods available. When the JSON is well-formatted, the simplest approach is to use the json method, which converts it into a dictionary:

req = requests.get(url_ban_example)
localisation_insee = req.json()
localisation_insee
{'type': 'FeatureCollection',
'version': 'draft',
'features': [{'type': 'Feature',
'geometry': {'type': 'Point', 'coordinates': [2.309144, 48.81622]},
'properties': {'label': '88 Avenue Verdier 92120 Montrouge',
'score': 0.9735636363636364,
'housenumber': '88',
'id': '92049_9625_00088',
'banId': '92dd3c4a-6703-423d-bf09-fc0412fb4f89',
'name': '88 Avenue Verdier',
'postcode': '92120',
'citycode': '92049',
'x': 649270.67,
'y': 6857572.24,
'city': 'Montrouge',
'context': '92, Hauts-de-Seine, Île-de-France',
'type': 'housenumber',
'importance': 0.7092,
'street': 'Avenue Verdier'}}],
'attribution': 'BAN',
'licence': 'ETALAB-2.0',
'query': '88 avenue verdier',
'filters': {'postcode': '92120'},
'limit': 5}
In this case, we can see that the data is nested within a JSON. Therefore, a bit of code needs to be written to extract the desired information from it:
localisation_insee.get('features')[0].get('properties')
{'label': '88 Avenue Verdier 92120 Montrouge',
'score': 0.9735636363636364,
'housenumber': '88',
'id': '92049_9625_00088',
'banId': '92dd3c4a-6703-423d-bf09-fc0412fb4f89',
'name': '88 Avenue Verdier',
'postcode': '92120',
'citycode': '92049',
'x': 649270.67,
'y': 6857572.24,
'city': 'Montrouge',
'context': '92, Hauts-de-Seine, Île-de-France',
'type': 'housenumber',
'importance': 0.7092,
'street': 'Avenue Verdier'}
This is the main disadvantage of using APIs: the post-processing of the returned data. The necessary code is specific to each API, since the structure of the JSON depends on the API.
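For tabular use cases, pandas can handle part of this post-processing for you. A minimal sketch, using a trimmed copy of the JSON shown above:

```python
import pandas as pd

# Trimmed copy of the BAN response displayed earlier
localisation_insee = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [2.309144, 48.81622]},
            "properties": {
                "label": "88 Avenue Verdier 92120 Montrouge",
                "city": "Montrouge",
            },
        }
    ],
}

# json_normalize flattens each feature's nested keys into dotted columns
df = pd.json_normalize(localisation_insee["features"])
print(df[["properties.label", "properties.city", "geometry.coordinates"]])
```

The column names (properties.label, geometry.coordinates, …) follow directly from the nesting, so the flattening code still depends on each API's JSON structure.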
2.3 How to Know the Inputs and Outputs of APIs?
Here, we took the BAN API as a magical tool whose main inputs (the endpoint, parameters, and their formatting…) were known. But how does one actually get there in practice? Simply by reading the documentation when it exists and testing it with examples.
Good APIs provide an interactive tool called a swagger. It is an interactive website where the API’s main features are described and where the user can interactively test examples. These documentations are often automatically created during the construction of an API and made available via an entry point /docs. They often allow you to edit certain parameters in the browser, view the obtained JSON (or the generated error), and retrieve the formatted query that produced it. These interactive browser consoles replicate the experimentation that can otherwise be done using specialized tools like postman.
Regarding the BAN API, the documentation can be found at https://adresse.data.gouv.fr/api-doc/adresse. Unfortunately, it is not interactive. However, it provides many examples that can be directly tested from the browser. You simply need to use the URLs provided as examples. These are presented using curl (a command-line equivalent of requests in Linux):
"https://api-adresse.data.gouv.fr/search/?q=8+bd+du+port&limit=15" curl
Just copy the URL (https://api-adresse.data.gouv.fr/search/?q=8+bd+du+port&limit=15), open a new tab, and verify that it produces a result. Then change a parameter and check again until you find the structure that fits. After that, you can move on to Python as suggested in the following exercise.
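Note that requests can also assemble the query string for you from a dictionary of parameters, which avoids hand-encoding spaces and separators. A sketch (the request is only prepared here, not sent):

```python
import requests

# Build the same search query as the documentation example, letting
# requests handle the URL encoding of the parameters.
req = requests.Request(
    "GET",
    "https://api-adresse.data.gouv.fr/search/",
    params={"q": "8 bd du port", "limit": 15},
).prepare()

print(req.url)
```

In everyday use you would simply call requests.get(url, params=...), which encodes the parameters the same way and sends the request in one step.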
2.4 Application
To start this exercise, you will need the following variable:
adresse = "88 Avenue Verdier"
Exercise 1: Structure an API Call from Python
- Test the API without any additional parameters, and convert the result into a DataFrame.
- Limit the search to Montrouge using the appropriate parameter and find the corresponding INSEE code or postal code via Google.
- (Optional): Display the found address on a map.

The first two rows of the DataFrame obtained in question 1 should be
label | score | housenumber | id | banId | name | postcode | citycode | x | y | city | context | type | importance | street | _type | locality | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 88 Avenue Verdier 92120 Montrouge | 0.973564 | 88 | 92049_9625_00088 | 92dd3c4a-6703-423d-bf09-fc0412fb4f89 | 88 Avenue Verdier | 92120 | 92049 | 649270.67 | 6857572.24 | Montrouge | 92, Hauts-de-Seine, Île-de-France | housenumber | 0.7092 | Avenue Verdier | address | NaN |
1 | Avenue Verdier 44500 La Baule-Escoublac | 0.719373 | NaN | 44055_3690 | NaN | Avenue Verdier | 44500 | 44055 | 291884.83 | 6701220.48 | La Baule-Escoublac | 44, Loire-Atlantique, Pays de la Loire | street | 0.6006 | Avenue Verdier | address | NaN |
For question 2, this time we get back only one observation, which could be further processed with GeoPandas to verify that the point has been correctly placed on a map.
label | score | housenumber | id | banId | name | postcode | citycode | x | y | city | context | type | importance | street | _type | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 88 Avenue Verdier 92120 Montrouge | 0.973564 | 88 | 92049_9625_00088 | 92dd3c4a-6703-423d-bf09-fc0412fb4f89 | 88 Avenue Verdier | 92120 | 92049 | 649270.67 | 6857572.24 | Montrouge | 92, Hauts-de-Seine, Île-de-France | housenumber | 0.7092 | Avenue Verdier | address | POINT (2.30914 48.81622) |
Finally, for question 3, we obtain a map that is more or less the same as the previous one.
Some APIs to Know
The main providers of official data offer APIs. This is notably the case for Insee, Eurostat, the ECB, FED, and the World Bank…
However, data production by state institutions is far from limited to public statistics producers. The API gouv portal serves as the main reference point for APIs produced by the French central administration or territorial authorities. Many cities also publish data about their infrastructures via APIs, for example the City of Paris.
Private data providers also offer APIs. For instance, SNCF or RATP provide APIs for various purposes. Major digital players, such as Spotify, generally offer APIs to integrate some of their services into external applications.
That said, it is important to be aware of the limitations of certain APIs. First, the data shared may not be very detailed so as not to compromise the confidentiality of the users’ information or the market share of the provider, which may have little incentive to share high-value data. Additionally, an API can disappear or change its structure overnight. Since data restructuring code is often closely tied to an API’s structure, you might end up having to modify a significant amount of code if a critical API undergoes a substantial change.
3 More Examples of GET Requests
3.1 Main Source
We will use as the main basis for this tutorial the permanent equipment database, a directory of public facilities open to the public.
We will begin by retrieving the data that interest us. Rather than fetching every variable in the file, we only retrieve the ones we need: some variables concerning the facility, its address, and its local municipality.
We will restrict our scope to primary, secondary, and higher education institutions in the department of Haute-Garonne (department 31). These facilities are identified by a specific code, ranging from C1 to C5.
import duckdb

query = """
FROM read_parquet('https://minio.lab.sspcloud.fr/lgaliana/diffusion/BPE23.parquet')
SELECT NOMRS, NUMVOIE, INDREP, TYPVOIE, LIBVOIE,
    CADR, CODPOS, DEPCOM, DEP, TYPEQU,
    concat_ws(' ', NUMVOIE, INDREP, TYPVOIE, LIBVOIE) AS adresse, SIRET
WHERE DEP = '31'
AND starts_with(TYPEQU, 'C')
AND NOT (starts_with(TYPEQU, 'C6') OR starts_with(TYPEQU, 'C7'))
"""
bpe = duckdb.sql(query)
bpe = bpe.to_df()
3.2 Retrieving Custom Data via APIs
We previously covered the general principle of an API request. To further illustrate how to retrieve data on a larger scale using an API, let’s try to fetch supplementary data to our main source. We will use the education directory, which provides extensive information on educational institutions. We will use the SIRET number to cross-reference the two data sources.
The following exercise will demonstrate the advantage of using an API to obtain custom data and the ease of fetching it via Python. However, this exercise will also highlight one of the limitations of certain APIs, namely the volume of data that needs to be retrieved.
Exercise 2
- Visit the swagger of the National Education Directory API on api.gouv.fr/documentation and test an initial data retrieval using the records endpoint without any parameters.
- Since we have retained only data from Haute-Garonne in our main database, we want to retrieve only the institutions from that department using our API. Make a query with the appropriate parameter, without adding any extras.
- Increase the limit on the number of rows returned. Do you see the problem?
- We will attempt to retrieve these data via the data.gouv Tabular API. Its documentation is here and the resource identifier is b22f04bf-64a8-495d-b8bb-d84dbc4c7983 (source). With the help of the documentation, try to retrieve data via this API using the parameter Code_departement__exact=031 to select only the department of interest.
- Do you see the problem, and how could we automate data retrieval?
The first question allows us to retrieve an initial dataset.
identifiant_de_l_etablissement | nom_etablissement | type_etablissement | statut_public_prive | adresse_1 | adresse_2 | adresse_3 | code_postal | code_commune | nom_commune | ... | libelle_nature | code_type_contrat_prive | pial | etablissement_mere | type_rattachement_etablissement_mere | code_circonscription | code_zone_animation_pedagogique | libelle_zone_animation_pedagogique | code_bassin_formation | libelle_bassin_formation | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0720661C | Ecole primaire publique Jean de la Fontaine | Ecole | Public | 6 rue de l'Ecole | None | 72220 ST BIEZ EN BELIN | 72220 | 72268 | Saint-Biez-en-Belin | ... | ECOLE DE NIVEAU ELEMENTAIRE | 99 | 0721304B | None | None | 0721383M | None | None | 17012 | LE MANS |
1 | 0720677V | Ecole primaire publique | Ecole | Public | Rue de l'Eglise | None | 72700 ROUILLON | 72700 | 72257 | Rouillon | ... | ECOLE DE NIVEAU ELEMENTAIRE | 99 | 0720798B | None | None | 0721404K | None | None | 17012 | LE MANS |
2 rows × 72 columns
However, there are two issues: the number of rows and the department of interest. Let’s first address the latter with question 2.
identifiant_de_l_etablissement | nom_etablissement | type_etablissement | statut_public_prive | adresse_1 | adresse_2 | adresse_3 | code_postal | code_commune | nom_commune | ... | libelle_nature | code_type_contrat_prive | pial | etablissement_mere | type_rattachement_etablissement_mere | code_circonscription | code_zone_animation_pedagogique | libelle_zone_animation_pedagogique | code_bassin_formation | libelle_bassin_formation | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0313236Z | Ecole élémentaire publique Marie Marvingt | Ecole | Public | Esplanade Pierre Garrigues | None | 31400 TOULOUSE | 31400 | 31555 | Toulouse | ... | ECOLE DE NIVEAU ELEMENTAIRE | 99 | None | None | None | 0312793T | None | None | None | None |
1 | 0310709C | Ecole primaire publique Ox | Ecole | Public | 15 rue de Gascogne | OX | 31600 MURET | 31600 | 31395 | Muret | ... | ECOLE DE NIVEAU ELEMENTAIRE | 99 | 0311319R | None | None | 0311339M | None | None | 16107 | MURET |
2 | 0310837S | Ecole primaire publique Yvette Campo | Ecole | Public | 10 avenue Tolosane | None | 31410 ST HILAIRE | 31410 | 31486 | Saint-Hilaire | ... | ECOLE DE NIVEAU ELEMENTAIRE | 99 | 0311319R | None | None | 0311339M | None | None | 16107 | MURET |
3 | 0311697B | Ecole maternelle publique Pech-David | Ecole | Public | 163 bis chemin de la Salade Ponsan | None | 31400 TOULOUSE | 31400 | 31555 | Toulouse | ... | ECOLE MATERNELLE | 99 | 0310092G | None | None | 0312793T | None | None | 16109 | TOULOUSE CENTRE |
4 | 0311785X | Ecole élémentaire publique Jean de la Fontaine | Ecole | Public | Avenue François Mitterrand | None | 31220 MARTRES TOLOSANE | 31220 | 31324 | Martres-Tolosane | ... | ECOLE DE NIVEAU ELEMENTAIRE | 99 | 0310012V | None | None | 0311109M | None | None | 16106 | COMMINGES |
5 rows × 72 columns
This is better, but we still only have 10 observations. If we try to adjust the number of rows (question 3), we get the following response from the API:
b'{\n "error_code": "InvalidRESTParameterError",\n "message": "Invalid value for limit API parameter: 200 was found but -1 <= limit <= 100 is expected."\n}'
Let’s try using more comprehensive data: the raw file on data.gouv. As seen in the metadata, we know there are over 1,000 schools for which data can be retrieved, but only 20 have been extracted here. The next field directly provides the URL of the next page of 20 records: this is how we can ensure we retrieve all our data of interest.
The key part for automating the retrieval of our data is the links key in the JSON:
{'profile': 'https://tabular-api.data.gouv.fr/api/resources/b22f04bf-64a8-495d-b8bb-d84dbc4c7983/profile/',
'swagger': 'https://tabular-api.data.gouv.fr/api/resources/b22f04bf-64a8-495d-b8bb-d84dbc4c7983/swagger/',
'next': 'https://tabular-api.data.gouv.fr/api/resources/b22f04bf-64a8-495d-b8bb-d84dbc4c7983/data/?Code_departement__exact=031&page=2&page_size=20',
'prev': None}
By looping over it to traverse the list of accessible URLs, we can retrieve the data. Since the automation code is rather tedious to write, here it is:
import requests
import pandas as pd

# Initialize the initial API URL
url_api_datagouv = "https://tabular-api.data.gouv.fr/api/resources/b22f04bf-64a8-495d-b8bb-d84dbc4c7983/data/?Code_departement__exact=031&page_size=50"

# Initialize an empty list to store all data entries
all_data = []

# Initialize the URL for pagination
current_url = url_api_datagouv

# Loop until there is no next page
while current_url:
    try:
        # Make a GET request to the current URL
        response = requests.get(current_url)
        # Raise an exception for HTTP errors
        response.raise_for_status()

        # Parse the JSON response
        json_response = response.json()

        # Extract data and append to the all_data list
        page_data = json_response.get('data', [])
        all_data.extend(page_data)
        print(f"Fetched {len(page_data)} records from {current_url}")

        # Get the next page URL
        links = json_response.get('links', {})
        current_url = links.get('next')  # This will be None if there's no next page

    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")
        break
The resulting DataFrame is as follows:
schools_dep31 = pd.DataFrame(all_data)
schools_dep31.head()
__id | Identifiant_de_l_etablissement | Nom_etablissement | Type_etablissement | Statut_public_prive | Adresse_1 | Adresse_2 | Adresse_3 | Code_postal | Code_commune | ... | libelle_nature | Code_type_contrat_prive | PIAL | etablissement_mere | type_rattachement_etablissement_mere | code_circonscription | code_zone_animation_pedagogique | libelle_zone_animation_pedagogique | code_bassin_formation | libelle_bassin_formation | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 454 | 0310164K | Ecole maternelle publique Marie Laurencin | Ecole | Public | Route de Lasbordes | None | 31130 BALMA | 31130 | 31044 | ... | ECOLE MATERNELLE | 99 | 0312698P | None | None | 0311101D | None | None | 16110 | TOULOUSE NORD |
1 | 455 | 0310172U | Ecole maternelle publique Jean Jaurès | Ecole | Public | Impasse des Ecoles | None | 31270 CUGNAUX | 31270 | 31157 | ... | ECOLE MATERNELLE | 99 | 0311093V | None | None | 0311105H | None | None | 16108 | TOULOUSE SUD-OUEST |
2 | 456 | 0310174W | Ecole maternelle publique | Ecole | Public | Rue des Ecoles | None | 31230 L ISLE EN DODON | 31230 | 31239 | ... | ECOLE MATERNELLE | 99 | 0310003K | None | None | 0311108L | None | None | 16106 | COMMINGES |
3 | 457 | 0310183F | Ecole maternelle publique le pilat | Ecole | Public | 1 rue du Dr Ferrand | None | 31800 ST GAUDENS | 31800 | 31483 | ... | ECOLE MATERNELLE | 99 | 0310083X | None | None | 0311108L | None | None | 16106 | COMMINGES |
4 | 458 | 0310200Z | Ecole maternelle publique Léo Lagrange | Ecole | Public | 35 allée Henri Sellier | None | 31400 TOULOUSE | 31400 | 31555 | ... | ECOLE MATERNELLE | 99 | 0311338L | None | None | 0312793T | None | None | 16109 | TOULOUSE CENTRE |
5 rows × 73 columns
We can merge this new data with our previous dataset to enrich it. For reliable production, care should be taken with schools that do not match, but this is not critical for this series of exercises.
bpe_enriched = bpe.merge(
    schools_dep31,
    left_on="SIRET",
    right_on="SIREN_SIRET"
)
bpe_enriched.head(2)
NOMRS | NUMVOIE | INDREP | TYPVOIE | LIBVOIE | CADR | CODPOS | DEPCOM | DEP | TYPEQU | ... | libelle_nature | Code_type_contrat_prive | PIAL | etablissement_mere | type_rattachement_etablissement_mere | code_circonscription | code_zone_animation_pedagogique | libelle_zone_animation_pedagogique | code_bassin_formation | libelle_bassin_formation | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ECOLE PRIMAIRE PUBLIQUE DENIS LATAPIE | LD | LA BOURDETTE | 31230 | 31001 | 31 | C108 | ... | ECOLE DE NIVEAU ELEMENTAIRE | 99 | 0310003K | None | None | 0311108L | None | None | 16106 | COMMINGES | |||
1 | ECOLE MATERNELLE PUBLIQUE | 21 | CHE | DE L AUTAN | 31280 | 31003 | 31 | C107 | ... | ECOLE MATERNELLE | 99 | 0311335H | None | None | 0311102E | None | None | 16128 | TOULOUSE EST |
2 rows × 85 columns
This provides us with data enriched with new characteristics about the institutions. Although there are geographic coordinates in the dataset, we will pretend there aren’t to reuse our geolocation API.
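As noted above, a production workflow should check how many facilities actually find a match in the directory. A minimal sketch using pandas’ indicator option, with toy frames standing in for bpe and schools_dep31 (the column names come from the chapter, the data are invented):

```python
import pandas as pd

# Toy stand-ins: three facilities, only two with a directory counterpart
bpe_toy = pd.DataFrame({"SIRET": ["111", "222", "333"], "NOMRS": ["A", "B", "C"]})
schools_toy = pd.DataFrame({"SIREN_SIRET": ["111", "222"], "PIAL": ["x", "y"]})

merged = bpe_toy.merge(
    schools_toy,
    left_on="SIRET",
    right_on="SIREN_SIRET",
    how="left",
    indicator=True,  # adds a _merge column flagging each row's match status
)

# Count how many facilities found no counterpart in the directory
print(merged["_merge"].value_counts())
```

Rows flagged left_only are the facilities that would silently disappear with an inner join; auditing this count is the "care" mentioned above.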
4 Discovering POST Requests
4.1 Logic
So far, we have discussed GET requests. Now, we will introduce POST requests, which allow for more complex interactions with API servers.
To explore this, we will revisit the previous geolocation API but use a different endpoint that requires a POST request.
POST requests are typically used when specific data needs to be sent to trigger an action. For instance, in the web world, if authentication is required, a POST request can send a token to the server, which will respond by accepting your authentication.
In our case, we will send data to the server, which will process it for geolocation and then send us a response. To continue the culinary metaphor, it’s like handing over your own container (tupperware) to the kitchen to collect your takeaway meal.
4.2 Principle
Let’s look at this request provided on the geolocation API’s documentation site:
curl -X POST -F data=@path/to/file.csv -F columns=voie -F columns=ville -F citycode=ma_colonne_code_insee https://api-adresse.data.gouv.fr/search/csv/
As mentioned earlier, curl is a command-line tool for making API requests. The -X POST option clearly indicates that we want to make a POST request.
Other arguments are passed using the -F options. In this case, we are sending a file and adding parameters to help the server locate the data inside it. The @ symbol indicates that file.csv should be read from the disk and sent in the request body as form data.
4.3 Application with Python
We have requests.get, so naturally, we also have requests.post. This time, parameters must be passed to our request as a dictionary, where the keys are argument names and the values are Python objects.
The main challenge, illustrated in the next exercise, lies in passing the data argument: the file must be sent as a Python object using the open function.
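Putting these pieces together, the curl command above maps to requests.post arguments roughly as follows. This sketch only prepares the request without sending it, so we can inspect what would go over the wire; the file name and column values are illustrative:

```python
import io
import requests

# A tiny in-memory CSV standing in for the file we would geocode
csv_content = "adresse,citycode\n88 avenue verdier,92049\n"

# files= plays the role of curl's -F data=@file.csv;
# data= carries the other form fields (-F columns=..., -F citycode=...)
req = requests.Request(
    "POST",
    "https://api-adresse.data.gouv.fr/search/csv/",
    files={"data": ("addresses.csv", io.StringIO(csv_content))},
    data={"columns": "adresse", "citycode": "citycode"},
).prepare()

print(req.headers["Content-Type"])  # multipart/form-data with a boundary
```

In a real call you would replace the in-memory buffer with open("file.csv", "rb") and invoke requests.post(url, files=..., data=...) directly, which builds the same multipart body and sends it.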
Exercise 3: A POST request to geolocate our data in bulk
- Save the adresse, DEPCOM, and Nom_commune columns of the equipment database merged with our previous directory (object bpe_enriched) in CSV format. Before writing to CSV, it may be helpful to replace commas in the adresse column with spaces.
- Create the response object using requests.post with the correct arguments to geocode your CSV.
- Transform your output into a geopandas object using the following command:

bpe_loc = pd.read_csv(io.StringIO(response.text))
The obtained geolocations take this form
index | adresse | DEPCOM | Nom_commune | result_score | latitude | longitude | |
---|---|---|---|---|---|---|---|
0 | 0 | LD LA BOURDETTE | 31001 | Agassac | 0.404609 | 43.374288 | 0.880679 |
1 | 1 | 21 CHE DE L AUTAN | 31003 | Aigrefeuille | 0.730293 | 43.567530 | 1.585745 |
By enriching the previous data, this gives:
NOMRS | NUMVOIE | INDREP | TYPVOIE | LIBVOIE | CADR | CODPOS | DEPCOM | DEP | TYPEQU | ... | etablissement_mere | type_rattachement_etablissement_mere | code_circonscription | code_zone_animation_pedagogique | libelle_zone_animation_pedagogique | code_bassin_formation | libelle_bassin_formation | result_score | latitude_ban | longitude_ban | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ECOLE PRIMAIRE PUBLIQUE DENIS LATAPIE | LD | LA BOURDETTE | 31230 | 31001 | 31 | C108 | ... | None | None | 0311108L | None | None | 16106 | COMMINGES | 0.404609 | 43.374288 | 0.880679 | |||
1 | ECOLE MATERNELLE PUBLIQUE | 21 | CHE | DE L AUTAN | 31280 | 31003 | 31 | C107 | ... | None | None | 0311102E | None | None | 16128 | TOULOUSE EST | 0.730293 | 43.567530 | 1.585745 |
2 rows × 88 columns
We can check that the geolocation is not too off by comparing it with the longitudes and latitudes of the education directory added earlier:
NOMRS | Nom_commune | longitude_annuaire | longitude_ban | latitude_annuaire | latitude_ban | |
---|---|---|---|---|---|---|
476 | ECOLE ELEMENTAIRE PUBLIQUE | Montpitol | 1.650650 | 1.641803 | 43.704645 | 43.708265 |
258 | ECOLE PRIMAIRE PUBLIQUE | Estancarbon | 0.785720 | 0.778757 | 43.105282 | 43.111991 |
782 | ECOLE MATERNELLE PUBLIQUE LE BEARNAIS | Toulouse | 1.444210 | 1.426256 | 43.604637 | 43.613454 |
150 | ECOLE MATERNELLE PUBLIQUE LES 4 COLLINES | Castelmaurou | 1.534418 | 1.534124 | 43.680288 | 43.680452 |
508 | ECOLE ELEMENTAIRE PUBLIQUE JEAN ROSTAND | Nailloux | 1.621001 | 1.620835 | 43.358126 | 43.357786 |
Without going into detail, the positions seem very similar, with only minor inaccuracies.
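To quantify "very similar", we can compute the great-circle distance between the two sets of coordinates. A sketch with the haversine formula, using the Montpitol row from the comparison table above:

```python
import math

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    r = 6_371_000  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Directory coordinates vs BAN coordinates for the Montpitol school
d = haversine_m(1.650650, 43.704645, 1.641803, 43.708265)
print(f"{d:.0f} m")  # well under a kilometre
```

Applied row by row, this kind of check gives an order of magnitude for the geocoding error rather than an eyeball comparison.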
To make use of our enriched data, we can create a map. To add some context, we can place a base map of the municipalities behind it. This can be retrieved using cartiflette:
from cartiflette import carti_download

shp_communes = carti_download(
    crs=4326,
    values=["31"],
    borders="COMMUNE",
    vectorfile_format="topojson",
    filter_by="DEPARTEMENT",
    source="EXPRESS-COG-CARTO-TERRITOIRE",
    year=2022
)
shp_communes.crs = 4326
This is an experimental version of cartiflette published on PyPi.
To use the latest stable version, you can install it directly from GitHub with the following command:
pip install git+https://github.com/inseeFrLab/cartiflette.git
Represented on a map, this gives a result similar to the earlier interactive map.
5 Managing Secrets and Exceptions
We have already used several APIs. However, these APIs were all without authentication and had few restrictions, except for the number of calls. This is not the case for all APIs. It is common for APIs that allow access to more data or confidential information to require authentication to track data users.
This is usually done through a token. A token is a kind of password often used in modern authentication systems to certify a user’s identity (see Git chapter).
To illustrate the use of tokens, we will use an API from the INPI (National Institute of Intellectual Property). The APIs developed by this organization require authentication. We will use this API to retrieve PDF documents from corporate financial statements.
Before diving into this, we will take a detour to discuss token confidentiality and how to avoid exposing tokens in your code.
5.1 Using a Token in Code Without Revealing It
Tokens are personal information that should not be shared. They are not meant to be present in the code. As mentioned multiple times in the production deployment course taught by Romain Avouac and myself in the third year, it is crucial to separate the code from configuration elements.
The idea is to find a way to include configuration elements with the code without exposing them directly in the code. The general approach is to store the token value in a variable without revealing it in the code. How can we declare the token value without making it visible in the code?
- For interactive code (e.g., in a notebook), it is possible to open a dialog box that injects the provided value into a variable. This can be done using the getpass package.
- For non-interactive code, such as command-line scripts, the environment-variable approach is the most reliable, provided you are careful not to commit the file holding the secret to Git.
The following exercise will demonstrate these two methods. These methods will help us confidentially add a payload to authentication requests, i.e., confidential identifying information as part of a request.
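The two approaches can be sketched as follows. The variable name API_TOKEN and the dummy value are illustrative; in real use, the environment variable would be set outside the code (for instance in a .env file kept out of Git), not injected by the script itself:

```python
import os
from getpass import getpass

interactive = False  # set to True in a notebook session

if interactive:
    # A dialog box prompts for the value: nothing appears in the code
    token = getpass("API token: ")
else:
    # The value is defined outside the code, e.g. `export API_TOKEN=...`
    # (a dummy value is injected here only so this sketch runs standalone)
    os.environ.setdefault("API_TOKEN", "dummy-token")
    token = os.environ["API_TOKEN"]
```

In both cases, the secret ends up in the `token` variable without ever being written in the source file.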
5.2 Application
For this application, starting from question 4, we will need a special class that tells requests to attach an authentication token to our requests. Since it is not trivial to write without prior knowledge, here it is:
class BearerAuth(requests.auth.AuthBase):
    def __init__(self, token):
        self.token = token

    def __call__(self, r):
        r.headers["authorization"] = "Bearer " + self.token
        return r
We will also need this variable, which corresponds to Decathlon's Siren number.

siren = "500569405"
Exercise 4: Adding a payload to a request
1. Create an account for the INPI API (National Institute of Intellectual Property), which we will use to retrieve financial statements of companies in PDF format.
2. Create the username and password variables using getpass, ensuring the values are not hardcoded.
3. Using the API documentation and the json argument of requests.post, retrieve an authentication token and store it in a variable token.
4. Retrieve the data using the f-string f'https://registre-national-entreprises.inpi.fr/api/companies/{siren}/attachments' and provide requests with the argument auth=BearerAuth(token).
5. Create identifier = documents.get('bilans')[0]['id'] and use requests with the URL f'https://registre-national-entreprises.inpi.fr/api/bilans/{identifier}/download', without arguments, to retrieve a PDF. Did it work? Check the status code. What does it mean? How can this be avoided?
6. Assuming the requests.get object created is named r, write the API output to a PDF as follows:

binary_file_path = 'decathlon.pdf'
with open(binary_file_path, 'wb') as f:
    f.write(r.content)

7. Replace the use of getpass with the environment variable approach using dotenv.
For question 5, without authentication we get status code 401, which corresponds to "Unauthorized": the request is denied. However, if we add the token as before, everything works, and we retrieve Decathlon's financial statement.
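In requests, this check-and-fail pattern is usually written with raise_for_status, which turns 4xx/5xx status codes into an HTTPError exception. Here is a sketch using a synthetic Response object, so that no network call or real token is needed:

```python
import requests

# Synthetic response simulating the 401 obtained in question 5
resp = requests.models.Response()
resp.status_code = 401

try:
    resp.raise_for_status()  # raises requests.HTTPError for 4xx/5xx codes
except requests.HTTPError:
    outcome = "denied: authenticate with auth=BearerAuth(token) and retry"
else:
    outcome = "ok"
print(outcome)
```

Catching the exception (rather than silently writing `r.content` to disk) avoids saving an error page where you expected a PDF.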
The retrieved PDF
Important
The environment variable approach is the most general and flexible. However, it is crucial to ensure that the .env file storing the credentials is not added to Git. Otherwise, you risk exposing identifying information, which negates any benefit of the good practices implemented with dotenv.
The solution is simple: add a .env line to .gitignore and, for extra safety, a *.env line in case the file is not at the root of the repository. To learn more about the .gitignore file, refer to the Git chapters.
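For intuition, dotenv's load_dotenv essentially reads KEY=VALUE lines from the .env file into environment variables. The simplified stand-in below uses only the standard library; load_env and the INPI_TOKEN value are illustrative names, and a throwaway temporary file plays the role of .env:

```python
import os
import tempfile
from pathlib import Path


def load_env(path):
    """Minimal stand-in for dotenv.load_dotenv: parse KEY=VALUE lines."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())


# Demo with a throwaway file; in real use the file is .env, listed in .gitignore
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("# credentials, never committed\nINPI_TOKEN=dummy-token\n")
load_env(f.name)
print(os.environ["INPI_TOKEN"])  # dummy-token
```

In practice, prefer the real python-dotenv package: this sketch only shows why keeping the file out of Git is the whole point of the approach.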
6 Opening Up to Model APIs
So far, we have explored data APIs, which allow us to retrieve data. However, this is not the only interesting use case for APIs among Python users.
There are many other types of APIs, and model APIs are particularly noteworthy. They allow access to pre-trained models and can even perform inference on specialized servers with more resources than a local computer (more details in the machine learning and NLP sections). The most well-known library in this field is the transformers library developed by HuggingFace.
One of the objectives of the 3rd-year production deployment course is to demonstrate how this type of software architecture works and how it can be implemented for models you have created yourself.
7 Additional Exercises: What If We Added Information on the Value Added of High Schools?
Bonus Exercise
In our example on schools, limit the scope to high schools and add information on the added value of high schools available here.
Bonus Exercise 2: Where are we going out tonight?
Finding a common place to meet friends is always a subject of tough negotiations. What if we let geography guide us?
- Create a DataFrame recording a series of addresses and postal codes, like the example below.
- Adapt the code from the exercise on the BAN API, using its documentation, to geolocate these addresses.
- Assuming your geolocated data is named adresses_geocoded, use the proposed code to transform them into a polygon.
- Calculate the centroid and display it on an interactive Folium map as before.
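As a reminder of the geocoding step, here is a sketch of how such a query can be built with requests. The endpoint and the q, postcode, and limit parameters follow the api-adresse.data.gouv.fr documentation; the request is only prepared here, not sent:

```python
import requests

# Build (but do not send) a BAN geocoding query for the first address
req = requests.Request(
    "GET",
    "https://api-adresse.data.gouv.fr/search/",
    params={"q": "10 Rue de Rivoli", "postcode": "75004", "limit": 1},
).prepare()
print(req.url)
```

Sending this request with requests.get returns a GeoJSON whose features contain the longitude and latitude used in the rest of the exercise.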
You forgot there's a couple in the group… Take into account the poids variable to calculate the barycenter and find out where to meet tonight.
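The weighted barycenter is just the poids-weighted mean of the longitudes and latitudes. A pure-Python sketch, using the coordinates from the geocoded example table below:

```python
# Coordinates and weights from the geocoded example table
lons = [2.360410, 2.343614, 2.334511, 2.302859, 2.370213]
lats = [48.855500, 48.851852, 48.863787, 48.871285, 48.853711]
poids = [2, 1, 1, 1, 1]  # the couple counts double

total = sum(poids)
bary_lon = sum(w * x for w, x in zip(poids, lons)) / total
bary_lat = sum(w * y for w, y in zip(poids, lats)) / total
print(round(bary_lon, 6), round(bary_lat, 6))  # 2.345336 48.858606
```

With equal weights this reduces to the plain mean of the coordinates, which is exactly the centroid of the point cloud.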
Create the polygon from the geolocations:

from shapely.geometry import Polygon

coordinates = list(zip(adresses_geocoded['longitude'], adresses_geocoded['latitude']))
polygon = Polygon(coordinates)
polygon = gpd.GeoDataFrame(index=[0], crs='epsg:4326', geometry=[polygon])
polygon
The example DataFrame:
adresses_text = pd.DataFrame(
    {
        "adresse": [
            "10 Rue de Rivoli",
            "15 Boulevard Saint-Michel",
            "8 Rue Saint-Honoré",
            "20 Avenue des Champs-Élysées",
            "Place de la Bastille",
        ],
        "cp": ["75004", "75005", "75001", "75008", "75011"],
        "poids": [2, 1, 1, 1, 1],
    }
)
adresses_text
|   | adresse | cp | poids |
|---|---|---|---|
| 0 | 10 Rue de Rivoli | 75004 | 2 |
| 1 | 15 Boulevard Saint-Michel | 75005 | 1 |
| 2 | 8 Rue Saint-Honoré | 75001 | 1 |
| 3 | 20 Avenue des Champs-Élysées | 75008 | 1 |
| 4 | Place de la Bastille | 75011 | 1 |
|   | adresse | poids | cp | result_score | latitude | longitude |
|---|---|---|---|---|---|---|
| 0 | 10 Rue de Rivoli | 2 | 75004 | 0.970320 | 48.855500 | 2.360410 |
| 1 | 15 Boulevard Saint-Michel | 1 | 75005 | 0.973425 | 48.851852 | 2.343614 |
| 2 | 8 Rue Saint-Honoré | 1 | 75001 | 0.866401 | 48.863787 | 2.334511 |
| 3 | 20 Avenue des Champs-Élysées | 1 | 75008 | 0.872191 | 48.871285 | 2.302859 |
| 4 | Place de la Bastille | 1 | 75011 | 0.965357 | 48.853711 | 2.370213 |
The geolocation obtained for this example
Here is the map obtained from the example dataset. We might stay drier with the barycenter than with the centroid.
Additional information
Environment used to build this page:
Latest built version: 2025-03-19
Python version used:
'3.12.6 | packaged by conda-forge | (main, Sep 30 2024, 18:08:52) [GCC 13.3.0]'
Citation
BibTeX citation:
@book{galiana2023,
author = {Galiana, Lino},
title = {Python Pour La Data Science},
date = {2023},
url = {https://pythonds.linogaliana.fr/},
doi = {10.5281/zenodo.8229676},
langid = {en}
}
For attribution, please cite this work as:
Galiana, Lino. 2023. Python Pour La Data Science. https://doi.org/10.5281/zenodo.8229676.