Course Leader | Christian Kauth |
Target Group | Staff of statistical production units and methodologists of NSIs with at least basic to intermediate knowledge of Python. Applicants can demonstrate this knowledge by having attended the course "Basic Python for Official Statistics" or by showing equivalent experience. |
Entry Qualifications |
|
Objective(s) | Master advanced Python techniques for robust, reproducible statistical production through software engineering practices, advanced analytics, and modern AI integration. By course end, participants will:
|
Contents | Building on Python fundamentals, this course equips NSI staff with advanced skills for statistical production. Participants learn software engineering practices, advanced data handling, and modern analytics, from machine learning to geospatial applications, so that they can develop robust, reproducible solutions for complex challenges in official statistics.
The course emphasizes practical application through real statistical challenges, ensuring participants can immediately apply learned techniques to their institutional work.
Day 1: Advanced Python Programming & Software Engineering
Build reusable classes with OOP and dataclasses, create data transformation pipelines with functional programming and decorators, organize code into packages with proper dependencies, write test suites with pytest, and ensure code quality with black/ruff.
Day 2: Advanced Data Handling for Official Statistics
Process large datasets with advanced pandas techniques, explore modern tools like Polars for performance, automate data collection through web scraping (BeautifulSoup, Selenium), and integrate PostgreSQL databases with SQLAlchemy ORM.
Day 3: Statistical Computing, Machine Learning & AI Integration
Apply numerical computing with NumPy and statistical methods with statsmodels, build ML pipelines with scikit-learn, and interpret models with SHAP. Explore AI integration: create MCP servers to expose Python functions to LLM agents and understand AI agent orchestration with Semantic Kernel. Develop interactive Plotly visualizations and animated geospatial choropleth maps with geopandas.
Day 4: Reproducible Research & Project Launch
Morning: Generate reproducible reports with Quarto and automate workflows with papermill, followed by a technical Q&A covering all topics. Afternoon: Reverse-pitch hackathon, in which participants pitch ambitious projects (MCP tools, AI agents, interactive dashboards, ML systems, geospatial apps), form expert teams, and begin implementation.
Day 5: Advanced Project Development & Technical Showcase
Morning: Intensive development with architecture coaching to build production-ready demos with testing and documentation. Afternoon: Technical presentations (live system demos, architecture highlights, design decisions), peer voting for innovation and production-readiness, and celebration of achievements. |
Expected outcome | Participants gain both technical proficiency and strategic understanding to leverage Python for modern statistical production:
Technical Skills:
Practical Capabilities:
Professional Development:
|
Training Methods |
|
Required Reading | None |
Suggested Reading |
|
Required Preparation | Software to install (detailed instructions provided during course)
Free accounts to create
Recommended skills (not required but helpful)
Project preparation (critical for 2-day hackathon)
|
Trainer(s)/ | Christian Kauth (Independent expert) |
Practical Information |
Start date | End Date | Duration | Where | Address | APPLICATION VIA National Contact Point |
22 June 2026 | 26 June 2026 | 5 days | ICON-INSTITUT Public Sector GmbH | Von-Groote-Str. 28 50968 Cologne, Germany | Deadline for application: 22/04/2026 |