Lino Galiana

2025-03-19

Data science with Python

Website for the course Python for Data Science, an introduction to Python taught in the second year of the engineering curriculum at ENSAE (Master 1).


All content of this course is freely available on this website or on GitHub and can be tested in the form of Jupyter notebooks.


Example with the introduction to Pandas

On the agenda:

Overall, this course offers comprehensive content suited both to beginners in data science and to those looking for more advanced material:

  1. Data Manipulation: standard data manipulation (Pandas), geographic data (Geopandas), data retrieval (web scraping, APIs)…
  2. Data Visualization: classic visualizations (Matplotlib, Seaborn), cartography, interactive visualizations (Plotly, Folium)
  3. Modeling: machine learning (Scikit-Learn), econometrics
  4. Text Data Processing (NLP): introduction to tokenization with NLTK and SpaCy, modeling…
  5. Introduction to Modern Data Science: cloud computing, ElasticSearch, continuous integration…

All content on this site relies on open data, whether French data (mainly from the central platform data.gouv or the Insee website) or American data. The program is presented in order at the top of this page (👆️) or in no particular order below (👇️).

A good complement to this website is the course taught with Romain Avouac in the final year at ENSAE, which focuses more on putting data science projects into production: https://ensae-reproductibilite.github.io/


Topics in no particular order

To discover Python in no particular order. The ordered version is in the upper part of this page (👆️).


Citation

BibTeX citation:
@book{galiana2023,
  author = {Galiana, Lino},
  title = {Python Pour La Data Science},
  date = {2023},
  url = {https://pythonds.linogaliana.fr/},
  doi = {10.5281/zenodo.8229676},
  langid = {en}
}
For attribution, please cite this work as:
Galiana, Lino. 2023. Python Pour La Data Science. https://doi.org/10.5281/zenodo.8229676.