R, Python and Julia: Do you know them all? – Basic

Magda CHMIEL • 20 January 2026

Course Leader

Christian Kauth 

Target Group

This webinar is designed for statisticians, methodologists, and statistical production units from National Statistical Institutes (NSIs) who have an interest in data science tools and possess a basic understanding of software development.

 

Entry Qualifications

Sound command of English (passive and active). Participants should be able to make short interventions and presentations and to participate actively in discussions and group exercises.

Objective(s)

This course will enable you to make informed, evidence-based decisions when selecting languages for official statistics workflows.

  • Understand advanced features of Python, R, and Julia through identical workflow structures
  • Implement statistical workflows across all three languages using the same Eurostat datasets
  • Master advanced data manipulation: multi-indexing, reshaping, time series handling, and multiple data formats
  • Compare performance, ecosystem maturity, and deployment characteristics objectively
  • Build statistical models, machine learning workflows, and production services in each language
  • Create interactive dashboards and automated reporting systems for data dissemination
  • Leverage performance optimization, parallel processing, and concurrency patterns appropriately
  • Master bidirectional cross-language interoperability for hybrid workflows
  • Identify language-specific strengths: Python for AI/ML, R for statistical rigor, Julia for performance
  • Design decision frameworks for language selection based on project requirements

Contents

The course is structured around three 2-hour modules, each focusing on one language. All modules follow an identical structure to enable direct comparison: Language Fundamentals & Environment, Advanced Data Manipulation, Statistical Computing & Visualization, Performance & Production, and Language-Specific Strengths & Interoperability.

Common Dataset Theme: All modules work with comparable Eurostat data (population statistics, GDP indicators, labor market data) to ensure meaningful cross-language comparisons and demonstrate how different languages approach identical statistical challenges.

Module 1 - Python for Versatile Statistical Workflows

Created by Guido van Rossum in 1991, Python has evolved into the dominant language for data science, AI/ML, and general-purpose programming with a "batteries included" philosophy.

  • Language fundamentals: syntax essentials, virtual environments, and dependency management

  • Advanced data manipulation: multi-indexing, reshaping, time series handling with Eurostat APIs

  • Statistical modeling, machine learning workflows, and AI/LLM integration

  • Data visualization (static and interactive) and dashboards for dissemination

  • Performance optimization, parallel processing, and API development

  • Python's unique strengths: THE language of AI/ML, vast ecosystem, rapid prototyping

  • Showcase: building AI-powered data analysis workflows with ease

  • Cross-language interoperability: calling R and Julia from Python

  • When to choose Python: AI/ML integration, breadth over depth, general-purpose needs
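
To illustrate the data manipulation topics listed above (multi-indexing, reshaping, and a simple time series step), here is a minimal sketch using pandas. It is not course material: the column names (`geo`, `year`, `indicator`, `value`) and all figures are made-up placeholders rather than real Eurostat data, and no Eurostat API is called.

```python
import pandas as pd

# Hypothetical, made-up figures in long format (NOT real Eurostat values).
records = [
    {"geo": "AT", "year": 2020, "indicator": "population", "value": 100.0},
    {"geo": "AT", "year": 2021, "indicator": "population", "value": 101.0},
    {"geo": "BE", "year": 2020, "indicator": "population", "value": 200.0},
    {"geo": "BE", "year": 2021, "indicator": "population", "value": 202.0},
    {"geo": "AT", "year": 2020, "indicator": "gdp", "value": 10.0},
    {"geo": "AT", "year": 2021, "indicator": "gdp", "value": 11.0},
    {"geo": "BE", "year": 2020, "indicator": "gdp", "value": 20.0},
    {"geo": "BE", "year": 2021, "indicator": "gdp", "value": 22.0},
]
long_df = pd.DataFrame(records)

# Multi-indexing: use (geo, year, indicator) as a hierarchical row index.
indexed = long_df.set_index(["geo", "year", "indicator"])["value"]

# Reshaping: move the indicator level into columns (long -> wide).
wide = indexed.unstack("indicator")

# Select all years for one country via the first index level.
at_only = wide.loc["AT"]

# Simple time series step: year-on-year change within each country.
growth = wide.groupby(level="geo").diff()

print(wide)
print(growth)
```

The same long-to-wide pattern is what the R and Julia modules approach with their own tools, which is what makes a side-by-side comparison possible.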

Module 2 - R for Reproducible Official Statistics

Developed in 1993 by Ross Ihaka and Robert Gentleman, R is the statistician's language of choice, built specifically for statistical computing and graphics with deep roots in academic research.

  • Language fundamentals: syntax essentials, package management, and reproducible project environments

  • Advanced data wrangling: tidyverse patterns, multi-indexing, reshaping, time series with Eurostat APIs

  • Statistical modeling for official statistics and machine learning workflows

  • Professional visualization and interactive dashboards for dissemination

  • Performance optimization, parallel processing, and API development

  • R's unique strengths: statistical depth, reproducible research, domain expertise

  • Automated reporting workflows and publication-quality graphics

  • Cross-language interoperability: calling Python and Julia from R

  • When to choose R: statistical rigor, automated reporting, specialized statistical packages

Module 3 - Julia for High-Performance Statistical Computing

Created in 2012 at MIT to solve the "two-language problem," Julia combines Python-like ease of use with C-like performance, designed specifically for scientific and numerical computing.

  • Language fundamentals: syntax essentials including multiple dispatch, package management, and project environments

  • Advanced data manipulation: DataFrame operations, multi-indexing, reshaping, time series with Eurostat APIs

  • Statistical modeling, numerical computing, and machine learning workflows

  • Visualization and dashboard creation approaches

  • Performance optimization, benchmarking vs. Python/R, and parallel/distributed computing

  • Julia's unique strengths: execution speed, multiple dispatch, composability

  • Comparative performance demonstrations on identical datasets

  • Cross-language interoperability: calling Python and R from Julia

  • When to choose Julia: performance-critical workflows, scientific computing, large-scale data

Expected Outcome

Participants will be able to:

  • Objectively assess the strengths and limitations of Python, R and Julia for different official statistics tasks

  • Implement the same analytical workflow in all three languages using identical Eurostat datasets

  • Navigate each language's syntax, environment management, and package ecosystems

  • Apply advanced data manipulation techniques: multi-indexing, reshaping, time series, multiple formats

  • Build statistical models, machine learning pipelines, and production-grade services

  • Create interactive dashboards and automated reporting systems for dissemination

  • Optimize performance using language-appropriate techniques: vectorization, parallel processing, concurrency

  • Design and implement hybrid workflows leveraging multiple languages' strengths

  • Make evidence-based technology choices using clear decision frameworks

  • Communicate technical trade-offs to stakeholders when proposing solutions

Training Methods

  • Presentations and lectures (80%)

  • Exercises (20%)


 

Required Reading 

None

Suggested Reading

Official documentation for Python, R, and Julia

Required Preparation

Register for free on Google Colab (https://colab.research.google.com/)

Trainer(s)/Lecturer(s)

Christian Kauth (Independent expert)

 

Practical Information

When: 03, 04 and 05 March 2026 (between 14h and 16h)

Duration: 3 sessions, 2h each (3 delivery days)

Where: Online (Zoom)

Application: via National Contact Point

Deadline for application: 03/02/2026