Basics for the use of Python in Official Statistics | |
Course Leader | Christian Kauth |
Target Group | Statistical production units and methodologists of NSIs. |
Entry Qualifications |
|
Objective(s) | The main objectives of the course are:
|
Contents | FIRST PART – Theory (3 days)
Python fundamentals through examples
• Why Python? History, needs, advantages, disadvantages, …
• Python data structures
  o data types and variables, mutable and immutable,
  o strings: formatting, slicing, concatenation, repetition, …
  o lists, tuples: indexing, slicing, concatenation, repetition, shallow and deep copy, …
  o dictionaries, sets: accessing, merging, iterating, …
  o *args and **kwargs.
• Basic programming
  o core syntax and semantics,
  o comments and documentation,
  o flow control: conditional statements, iterative constructs (loops), sequences and enumeration, …
  o arithmetic and comparison operators, True and False,
  o iterators and generators,
  o list comprehensions,
  o functions, parameter passing (arbitrary, optional, keyword parameters), global and local variables, returning values,
  o file management,
  o lambda, filter, reduce, map and zip operators.
• Brief tour of standard libraries.
Programming in Python through examples
• Using online resources (e.g., python.org, Stack Overflow, realpython.com, etc.)
• Running a Python script
  o interpreter and compiler, understanding modules and packages,
  o importing modules,
  o interactive shell, executable and script files.
• Programming in Python
  o procedural vs. modular programming,
  o namespaces and scopes,
  o memoization and decoration,
  o general introduction to object-oriented programming in Python: inheritance, polymorphism, encapsulation,
  o classes and instances, methods, instance attributes and properties,
  o memory management and garbage collection,
  o error and exception handling,
  o testing, debugging and logging,
  o duck typing and monkey patching.
Advanced Python programming through projects
• Virtual environments and packages
  o managing packages with pip,
  o creating virtual environments with pipenv.
• Python and Jupyter notebooks
• Introduction to data analytics
  o basics with numpy and scipy: math, matrices, arrays,
  o data fetching from open APIs (e.g., Eurostat REST API),
  o data handling with pandas: time series, dataframes, …
  o data visualisation with matplotlib,
  o basic statistical analysis and machine learning methods with scikit-learn.
• Dealing with databases: SQLite
SECOND PART – Practice (2 days)
• Quick recap of the first part
• Questions and answers
• Bring your own project |
Expected Outcome | Participants should gain a good understanding of the basics of the Python language and its ecosystem, so that they can use it proficiently for Official Statistics purposes:
• Familiarity with the syntax of Python
• Knowledge of the individual stages of a data processing pipeline: reading a file, processing data, modelling, aggregation, visualisation and saving results
• Experience with creating, manipulating and converting common data structures
• Experience with writing functions and using pre-existing functions
• Basic knowledge of important packages such as NumPy, pandas, Matplotlib, Seaborn, scikit-learn and GeoPandas |
Training Methods | The course consists of alternately:
For the practical hands-on parts of the course, Jupyter notebooks will be used, and possible solutions to the exercises assigned to participants will be discussed. Participants will be encouraged to write Python code from scratch under the tutor's supervision. |
Required Reading | None |
Suggested Reading |
|
Required Preparation | Register for free with Google Colaboratory |
Trainer(s)/ Lecturer(s) | Christian Kauth (Independent expert) |
Practical Information | ||||
When | Duration | Where | Organiser | Application via National Contact Point |
24 – 27 March 2025 | 4 days | Cologne, Germany | ICON-INSTITUT Public Sector | Deadline: 02.12.2024 |