HAB796B9: Software development for data science

(Almost) everything an applied mathematician or statistician needs to know about coding and system administration.

Teachers

Prerequisite

Some basic coding experience (if, for, while, functions) is helpful but not mandatory. Students are expected to know basic notions of statistics.

Course description

This course introduces good coding practices for professional software development. The main language is Python, but some elements of bash and git will also be useful. Data processing and visualization are at the heart of the course. We will mostly cover basic programming concepts and the Python scientific libraries, including numpy, scipy, pandas, matplotlib, and seaborn. Beyond pandas ninja skills, we will also introduce modern coding practices: unit tests, version control, documentation generation, etc.

Date        Teacher  Details
25/09/2023  BC       Command-line tools, Version control with Git
02/10/2023  BC       Version control with Git
16/10/2023  BC       Version control with Git, IDE / Python virtual environment
23/10/2023  BC       Python basics
06/11/2023  BC       SciPy, Pandas
14/11/2023  BC       Project snapshots
12/12/2023  BC       The end: Project presentations

Grading

Grading for this course is based on a single group project.

Books and other resources

The resources for the course are available in this GitHub repository. Additional introductory material on Python (in French) is available in the course python for biologists.

Moodle webpage

The Moodle web page is available to registered students only.

Additional resources

Authors

This course material was improved with the help of some students including:
